Hi All,  

I’m experiencing an issue on multiple hosts with ZooKeeper 4.6 where Apache Solr 
created so many children under the /overseer/queue node that it can no longer 
read from it, and now I’m trying to “rmr /overseer/queue” to get things working 
again. Both systems have 200k+ child nodes under the affected node.

On both systems I set -Djute.maxbuffer=5242880 in zkServer.sh on every node in 
the cluster and -Djute.maxbuffer=10000000 in zkCli.sh. On one system I couldn’t 
get this to work until I set zkCli’s value substantially higher than the 
server’s, but I *did* get it to work and have since cleared the queue on that 
system.
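
As a rough sanity check on those buffer sizes, the Python sketch below estimates 
how large the child list of a node with 200k children would be on the wire. It 
assumes each child name is sent as a 4-byte length prefix plus its UTF-8 bytes, 
with a 4-byte element count in front of the list, and it assumes the queue 
entries are named like "qn-0000000001". Both are assumptions for illustration, 
not values read from the cluster, and packet headers are ignored.

```python
# Back-of-the-envelope estimate of the serialized child list for one znode.
# Assumptions (hypothetical, for illustration only):
#   - each name is written as a 4-byte length followed by its UTF-8 bytes
#   - the list itself starts with a 4-byte element count
#   - child names look like "qn-0000000001" (13 bytes)

def estimate_children_payload(num_children: int, name: str = "qn-0000000001") -> int:
    """Return the approximate size in bytes of a child-name list."""
    per_child = 4 + len(name.encode("utf-8"))  # length prefix + name bytes
    return 4 + num_children * per_child        # element count + all entries


if __name__ == "__main__":
    payload = estimate_children_payload(200_000)
    print(f"estimated child list size: {payload} bytes")
    # Compare against the two jute.maxbuffer values mentioned above.
    print("fits in 5242880: ", payload <= 5_242_880)
    print("fits in 10000000:", payload <= 10_000_000)
```

Under these assumptions, 200,000 children come to roughly 3.4 MB. That is a 
lower bound, since the real count is 200k+ and the real names may be longer.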

However, I’m beating my head against a wall with our other system. I’ve applied 
exactly the same settings and am having no luck rmr’ing the node. I’ve tried 
raising the maxbuffer settings to 2-4x those values and still no luck. Every 
attempt from zkCli results in "ConnectionLossException: KeeperErrorCode = 
ConnectionLoss for /overseer/queue"  

I’m at my wits’ end here. I’ve checked everything over and over and cannot see 
any reason why this should not be working. The flag shows up as a correctly set 
JVM argument when I grep the ZooKeeper process. Any advice from anyone is 
appreciated!

--  
James Hardwick
