I am experiencing an issue where bulk importing the results of a MapReduce job causes me to lose one or more tservers. After the job finishes and the bulk import is kicked off, I see the following in the lost tserver's logs:
2014-01-10 23:14:21,312 [zookeeper.DistributedWorkQueue] INFO : Got unexpected zookeeper event: None for /accumulo/f76cacfa-e117-4999-893a-1eba79920f2c/recovery
2014-01-10 23:14:21,312 [zookeeper.DistributedWorkQueue] INFO : Got unexpected zookeeper event: None for /accumulo/f76cacfa-e117-4999-893a-1eba79920f2c/bulk_failed_copyq
2014-01-10 23:14:21,369 [zookeeper.DistributedWorkQueue] ERROR: Failed to look for work
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /accumulo/f76cacfa-e117-4999-893a-1eba79920f2c/bulk_failed_copyq

However, the bulk import actually succeeded and all is well with the data in the table. I have to restart the tserver each time this happens, which is not a viable solution for production. I am using Accumulo 1.5.0. The tservers have 12G of RAM, and index caching, CF bloom filters, and locality groups are enabled for the table in question. Any ideas why this might be happening?

Thanks,
Anthony
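P.S. In case it helps, this is roughly how the table is configured and how the import is kicked off. It is a simplified sketch, not the actual job driver; the instance name, credentials, table name, column family, and HDFS paths below are placeholders.

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.admin.TableOperations;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.hadoop.io.Text;

public class BulkImportSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder instance and credentials -- real values differ.
        Connector conn = new ZooKeeperInstance("myInstance", "zk1,zk2,zk3")
                .getConnector("user", new PasswordToken("pass"));
        TableOperations ops = conn.tableOperations();
        String table = "mytable";

        // Table tuning mentioned above: index cache, CF bloom filters, locality groups.
        ops.setProperty(table, "table.cache.index.enable", "true");
        ops.setProperty(table, "table.bloom.enabled", "true");
        Map<String, Set<Text>> groups = new HashMap<String, Set<Text>>();
        groups.put("grp1", Collections.singleton(new Text("cf1")));
        ops.setLocalityGroups(table, groups);

        // Bulk import the RFiles written by the MapReduce job.
        // /data/bulk/files and /data/bulk/failures are placeholder HDFS paths.
        ops.importDirectory(table, "/data/bulk/files", "/data/bulk/failures", false);
    }
}

The failures directory comes back empty each time, which matches what I said above about the data in the table looking fine after the import.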
