https://issues.apache.org/jira/browse/ACCUMULO-3603
-Eric

On Wed, Feb 18, 2015 at 7:12 PM, Denis <[email protected]> wrote:

> On 2/18/15, Christopher <[email protected]> wrote:
>
> > To rule out some scenarios, is it possible that your clients are writing
> > to the wrong tables?
>
> That was the first idea, so I added assert()s to the writer code a few
> days ago. No assert was triggered, but some invalid values appeared
> after a new tserver failure.
>
> > Have you ever seen a failure affecting a table which does not exist
> > (like what might happen if there's an off-by-one error in the WAL
> > code)? Or affecting the metadata tables?
>
> No.
> Also, no tables were created or deleted during the last two months.
>
> > Can you reproduce this error reliably, or can you share the relevant
> > ingest code which can reproduce this failure?
>
> I will think about how to reproduce it.
> What could be special about the code: inserts are performed to a few
> (5..8) tables at once (one data table + a few index tables), but no
> MultiTableBatchWriter is used. Several BatchWriters (one per table) are
> created and flushed consecutively, in the same thread. For Accumulo
> 1.4 it was a performance optimization; it worked faster than
> MultiTableBatchWriter. Not sure if that is still so for 1.6.1; this code
> was not changed after the migration to 1.6.1.
> In all cases with invalid values the index tables were affected (one
> of the index tables had values typical for another of the index
> tables).
>
> > Also, what kind of tablet server failures are you experiencing when
> > this happens?
>
> Spontaneous power-offs. There is something wrong with the power units,
> so every 2-3 days one of the servers suddenly turns off and reboots.
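
For readers following the thread, the write pattern Denis describes (one writer per table, a guard that each entry reaches its intended table, and all writers flushed one after another in the same thread) can be sketched roughly as below. This is a plain-Java illustration under stated assumptions: `Entry`, `TableWriter`, `openWriters`, `writeRecord`, and `flushAll` are hypothetical in-memory stand-ins, not the Accumulo BatchWriter API and not the actual ingest code from this report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PerTableWriters {

    /** Hypothetical entry that remembers which table it is meant for. */
    public static final class Entry {
        public final String table;
        public final String value;

        public Entry(String table, String value) {
            this.table = table;
            this.value = value;
        }
    }

    /** Hypothetical in-memory stand-in for a writer bound to one table. */
    public static final class TableWriter {
        public final String table;
        public final List<String> buffered = new ArrayList<>();
        public final List<String> flushed = new ArrayList<>();

        public TableWriter(String table) {
            this.table = table;
        }

        /** Buffers an entry, failing fast if it was meant for another table. */
        public void add(Entry e) {
            if (!table.equals(e.table)) {
                throw new IllegalStateException(
                    "entry for " + e.table + " sent to writer for " + table);
            }
            buffered.add(e.value);
        }

        /** Moves everything buffered so far into the flushed list. */
        public void flush() {
            flushed.addAll(buffered);
            buffered.clear();
        }
    }

    /** Creates one writer per table, preserving the given table order. */
    public static Map<String, TableWriter> openWriters(List<String> tables) {
        Map<String, TableWriter> writers = new LinkedHashMap<>();
        for (String t : tables) {
            writers.put(t, new TableWriter(t));
        }
        return writers;
    }

    /** Routes each entry of one logical record to the writer for its table. */
    public static void writeRecord(Map<String, TableWriter> writers, List<Entry> entries) {
        for (Entry e : entries) {
            TableWriter w = writers.get(e.table);
            if (w == null) {
                throw new IllegalArgumentException("no writer for table " + e.table);
            }
            w.add(e);
        }
    }

    /** Flushes the writers one after another, in the calling thread. */
    public static void flushAll(Map<String, TableWriter> writers) {
        for (TableWriter w : writers.values()) {
            w.flush();
        }
    }

    public static void main(String[] args) {
        Map<String, TableWriter> writers =
            openWriters(Arrays.asList("data", "idx_a", "idx_b"));
        writeRecord(writers, Arrays.asList(
            new Entry("data", "row1"),
            new Entry("idx_a", "a:row1"),
            new Entry("idx_b", "b:row1")));
        flushAll(writers);
        for (TableWriter w : writers.values()) {
            System.out.println(w.table + " -> " + w.flushed);
        }
    }
}
```

A guard like the one in `TableWriter.add` only catches client-side routing mistakes. It would stay silent if values were mixed up after the client handed them off, which matches the observation that no assert fired.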
