Re: Mutation Rejected exception with server Error 1

2015-12-23 Thread Josh Elser
Eric Newton wrote:
> Failure to talk to zookeeper is *really* unexpected. Have you noticed your nodes using any significant swap?
Emphasis on this. Failing to connect to ZooKeeper for 60s (2*30) is a very long time (although I think I have seen JVM GC pauses longer than that before). A couple of gene

Re: Mutation Rejected exception with server Error 1

2015-12-23 Thread Eric Newton
I was simplifying a bit too much. If an error propagates all the way to an Accumulo client call, then it has stopped retrying for you. An example:
- create a batchwriter; this creates an update session within the tserver
- mutations are sent against this session id
- mutations are pushed

Re: Read and writing rfiles

2015-12-23 Thread Jeff Kubina
> Are the hadoop nodes handling your map-reduce job also running tservers?
Yes.
> Do the Accumulo log files show the exception? If so, can you post it?
Yes, but nothing helpful for tracking down the cause; it was a very sparse error message. I will try to post the full error messages.

Re: Read and writing rfiles

2015-12-23 Thread David Medinets
Are the hadoop nodes handling your map-reduce job also running tservers? Do the Accumulo log files show the exception? If so, can you post it? On Wed, Dec 23, 2015 at 9:12 AM, Jeff Kubina wrote: > I have a mapreduce job that reads rfiles as Accumulo key/value > pairs using FileSKVIterator wi

Read and writing rfiles

2015-12-23 Thread Jeff Kubina
I have a mapreduce job that reads rfiles as Accumulo key/value pairs using FileSKVIterator within a RecordReader, partitions and shuffles them based on the byte string of the key, and writes them out as new rfiles using the AccumuloFileOutputFormat. The objective is to create larger rfiles for bulk i

Re: Mutation Rejected exception with server Error 1

2015-12-23 Thread mohit.kaushik
Thanks for the beautiful explanation, Eric. So this means that if I get a mutations rejected exception due to a tablet server failure, the batchwriter will resend the mutations to some other server and I do not have to worry about them. Great... But what is the case when we get mutations rejected exception a

Re: Mutation Rejected exception with server Error 1

2015-12-23 Thread Eric Newton
By default, Accumulo traces major and minor compactions. Distributed tracing is one way we try to figure out where time is being spent. You can read the Google Dapper paper to get a better description of the framework. The tracing framework pushes the trace information into the trace table by for

Re: Mutation Rejected exception with server Error 1

2015-12-23 Thread Eric Newton
The Accumulo batch writer will re-send mutations if a tablet server fails, or rejects the mutations because the tablet has moved. There's nothing you have to do to recover from fail-overs and re-balancing. I'm not a kernel expert, but I believe that a swappiness setting of "1" is equivalent to "0
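
The swap question and the swappiness setting raised in this thread can both be checked directly on a Linux node. The following is a minimal sketch, not something posted in the thread: it assumes the standard Linux `/proc/meminfo` and `/proc/sys/vm/swappiness` files, and the helper names `parse_meminfo` and `swap_used_kb` are hypothetical, introduced only for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value.

    Values are in kB for most fields; lines without a leading integer are skipped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_used_kb(text):
    """Return swap in use (SwapTotal - SwapFree), or 0 if the fields are absent."""
    info = parse_meminfo(text)
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


if __name__ == "__main__":
    # Assumption: running on a Linux host where these /proc files exist.
    with open("/proc/meminfo") as f:
        print("swap used (kB):", swap_used_kb(f.read()))
    with open("/proc/sys/vm/swappiness") as f:
        print("vm.swappiness:", f.read().strip())
```

Run on each tablet server host, a non-zero and growing "swap used" figure would be one hint that the node is swapping, which is the condition asked about earlier in the thread.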