Ouch, I forgot to put the Zookeeper logs on the web. Since they do not
include timestamps and I have restarted MCF after a few changes, I guess
it will be difficult to pick out the relevant lines. I'll do that next
time it hangs, probably at the end of the day.
I will add the new Zookeeper configuration settings Lalit suggested
the next time I restart MCF.
How many worker threads are you using? Roughly how many documents do
you crawl before things hang?
Throttling -> max connections: 30
Throttling -> Max fetches/min: 100
Bandwidth -> max connections: 25
Bandwidth -> max kbytes/sec: 8000
Bandwidth -> max fetches/min: 20
I have four jobs configured. The one I'm running now has 100,000
documents configured. In total, around 110,000 documents across all
four jobs. I guess more documents are actually involved, since the
largest job excludes a lot of documents based on complex filtering
rules. Maybe 50% more, even though they are not added to Solr (they are
of course still fetched).
Erlend
You may also want to try increasing the maxClientCnxns parameter in
zookeeper.cfg to something bigger, if you have a lot of worker threads.
I'm thinking 1000 or so. See if it makes a difference for you.
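
For reference, a minimal sketch of what that change might look like in
zookeeper.cfg. The value 1000 is just the figure floated above, not a
tested recommendation, and the rest of the file is assumed to stay as
it is. maxClientCnxns caps the number of concurrent connections a
single client IP address may open to one ZooKeeper server, and the
server has to be restarted for the change to take effect.

```
# zookeeper.cfg (fragment, other settings unchanged)
# Per-client-IP limit on concurrent connections to this server.
# 1000 is an illustrative value taken from the suggestion above;
# setting it to 0 would remove the limit entirely.
maxClientCnxns=1000
```
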
I'll try that at the next restart.
Erlend