Unbalanced tablets or extra rfiles

2016-06-07 Thread Andrew Hulbert
Hi all, A few questions on behavior if you have any time... 1. When looking in Accumulo's HDFS directories I'm seeing a situation where "tablets" aka "directories" for a table have more than the default 1G split threshold worth of rfiles in them. In one large instance, we have 400G worth of

Re: Unbalanced tablets or extra rfiles

2016-06-07 Thread Andrew Hulbert
2016 at 4:03 PM, Andrew Hulbert <ahulb...@ccri.com> wrote: Hi all, A few questions on behavior if you have any time... 1. When looking in Accumulo's HDFS directories I'm seeing a situation where "tablets" aka "direct

Re: Recovery file versus directory

2016-03-19 Thread Andrew Hulbert
errors since (3 days). Not sure why that would be a problem except for the few times that the metadata table was involved. Andrew On 03/18/2016 09:43 AM, Andrew Hulbert wrote: I'll tar them up and see what I can find! Thanks. On 03/17/2016 08:18 PM, Michael Wall wrote: Andrew, Sounds a lot

Re: Optimize Accumulo scan speed

2016-04-10 Thread Andrew Hulbert
of the same range but different shards put on the same machine and disk. Anyway, performance was better than not having sharding, so I will re-enable it and do some tests with the number of shards. On Sun, Apr 10, 2016 at 5:25 PM, Andrew Hulbert <ahulb...@ccri.com>

Re: Optimize Accumulo scan speed

2016-04-10 Thread Andrew Hulbert
Mario, Are you using a Scanner or a BatchScanner? One thing we did in the past with a geohash-based schema was to prefix a shard ID in front of the geohash that allows you to involve all the tservers in the scan. You'd multiply your ranges by the number of tservers you have but if the client
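The shard-prefix idea described above can be sketched in plain Java. This is a hypothetical illustration, not code from the thread: the class and constant names (ShardedRanges, NUM_SHARDS) are invented, and the assumed scheme is that a shard ID (row hash modulo the shard count) is prepended to each geohash row at write time, so that at query time every geohash range is expanded once per shard and handed to a BatchScanner, involving all tservers in parallel.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of shard-prefixed geohash row keys.
// Assumption: NUM_SHARDS is roughly the tserver count, and the
// expanded ranges would be passed to BatchScanner.setRanges(...).
public class ShardedRanges {
    static final int NUM_SHARDS = 4; // illustrative; typically the tserver count

    // Shard assigned to a geohash at ingest time.
    static int shardFor(String geohash) {
        return Math.floorMod(geohash.hashCode(), NUM_SHARDS);
    }

    // Row key actually written: zero-padded shard prefix + geohash.
    static String shardedRow(String geohash) {
        return String.format("%02d_%s", shardFor(geohash), geohash);
    }

    // Expand one geohash range [start, end) into one range per shard.
    // This is the "multiply your ranges by the number of tservers" step.
    static List<String[]> expand(String start, String end) {
        List<String[]> ranges = new ArrayList<>();
        for (int s = 0; s < NUM_SHARDS; s++) {
            String prefix = String.format("%02d_", s);
            ranges.add(new String[] {prefix + start, prefix + end});
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (String[] r : expand("dqb0", "dqb1")) {
            System.out.println(r[0] + " -> " + r[1]);
        }
    }
}
```

The trade-off noted in the thread applies: more shards means more parallelism per query, but also more ranges for the client to manage.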

Re: Accumulo 1.7.1 on Docker

2016-03-22 Thread Andrew Hulbert
+1 for Josh's suggestion. Not sure if it's the same problem, but I too had to add some sleeps for single-node Accumulo startup scripts directly before and after init'ing Accumulo. I dug in once and it seemed that the datanode needed more time to let other processes know the files in HDFS

Re: Accumulo 1.7.1 on Docker

2016-03-22 Thread Andrew Hulbert
situations like this. Very frustrating. I would love to know what's happening and get this working reliably. Having an "Apache-owned" docker image for Accumulo would be nice (and would help us internally as well as externally, IMO). Andrew Hulbert wrote: +1

Recovery file versus directory

2016-03-08 Thread Andrew Hulbert
Hi folks, We experienced a problem this morning with a recovery on 1.6.1 that went something like this: FileNotFoundException: File does not exist: hdfs:///accumulo/recovery//failed/data at Tablet.java:1410 at Tablet.java:1233 etc. at TabletServer:2923 Interestingly enough, at

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-18 Thread Andrew Hulbert
Mike, For one of the WALs I backed up the recovery directory and that initiated a new recovery attempt as indicated in the tserver debug log... Then the exception was thrown: Unable to find recovery files for extent xx logentry x hdfs://path/to/wal/ Any ideas? I figure we can

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-18 Thread Andrew Hulbert
I'll try to dig up the full error from the tserver On 10/18/2016 10:30 AM, Josh Elser wrote: Do you have the full exception for the "Expected protocol id.." error? That looks like it might be incorrect usage of Thrift on our part.. Andrew Hulbert wrote: Mike, So backing up and

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-18 Thread Andrew Hulbert
Note that the error is more like this: Expected protocol id ff82 but got 35 (0!;38\\;82,:9997, ) On 10/18/2016 10:28 AM, Andrew Hulbert wrote: Mike, So backing up and then later deleting the recovery directories a few times did the trick. It seemed that removing the initial bad one

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-18 Thread Andrew Hulbert
n see some actual logs? Are all the WALs there? Do you find any of the WALs deleted by GC in the gc logs? Do you find any duplicate WALs in the HDFS trash? On Tue, Oct 18, 2016 at 9:32 AM, Andrew Hulbert <ahulb...@ccri.com> wrote: Mike, For one o

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-18 Thread Andrew Hulbert
I think it is attempting to do migrations at the moment FYI On 10/18/2016 10:40 AM, Andrew Hulbert wrote: Yes, it looks similar. Esp these parts: 2015-11-19 22:43:05,998 [impl.TabletServerBatchReaderIterator] DEBUG: org.apache.thrift.protocol.TProtocolException: Expected protocol id

VFS class reloading

2016-11-17 Thread Andrew Hulbert
Hi all, I've been noticing some weird behavior that occurs when trying to update jars for custom classpaths using VFS. After updating jars in HDFS they seem to still be cached even though I have deleted the namespace, classpath context and jar in HDFS. After putting the jar at a different

Re: VFS version in 1.6.6 binary release

2016-12-02 Thread Andrew Hulbert
See https://issues.apache.org/jira/browse/ACCUMULO-3470 Mike On Fri, Dec 2, 2016 at 10:27 AM, Andrew Hulbert <ahulb...@ccri.com> wrote: Hi all, It appears that the commons-vfs2 jar that ships with the 1.6.6 binary tar.gz is sti

Re: upgrading from 1.8.x to 1.9.x

2019-06-17 Thread Andrew Hulbert
We had no major issues upgrading to 1.9.x from 1.8.x. We did have some WAL issues with 1.9.2 that I think may now be fixed in 1.9.3 (we just upgraded). On 6/10/19 8:02 PM, Adam J. Shook wrote: 1.9.x is effectively the continuation of the 1.8.x bug fix releases. I've also upgraded several