Believe it or not, I was able to reproduce this here with a crawl of 100000
documents.  I get this in the Zookeeper server-side log, hundreds of times:

>>>>>>
[SyncThread:0] ERROR org.apache.zookeeper.server.NIOServerCnxn - Unexpected Exception:
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:153)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1076)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:170)
        at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:167)
        at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:101)
[SyncThread:0] ERROR org.apache.zookeeper.server.NIOServerCnxn - Unexpected Exception:
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:153)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1076)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:170)
        at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:167)
        at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:101)
<<<<<<

... and then everything locks up.  I have no idea what is happening; it seems
to be an NIO exception that ZooKeeper is not expecting.
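
For what it's worth, the top two frames of that trace can be reproduced
outside ZooKeeper. Below is a minimal sketch using only the JDK (the class
name CancelledKeyDemo is made up for illustration; this is not ZooKeeper
code). Once a SelectionKey has been cancelled, which also happens when its
channel is closed, calling interestOps() on it throws
CancelledKeyException. That would be consistent with the server trying to
send a response on a connection whose key is already gone, but that part is
only a guess on my side.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Hypothetical demo class, not part of ZooKeeper or ManifoldCF.
public class CancelledKeyDemo {

    // Returns true if interestOps() on a cancelled key throws
    // CancelledKeyException, false if the call unexpectedly succeeds.
    static boolean throwsOnCancelledKey() {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            // The channel does not need to be bound to be registered.
            server.configureBlocking(false);
            SelectionKey key = server.register(selector, SelectionKey.OP_ACCEPT);

            // Cancel the key explicitly; closing the channel has the same effect.
            key.cancel();

            try {
                // Same call as the second frame in the trace.
                key.interestOps(SelectionKey.OP_ACCEPT);
                return false;
            } catch (CancelledKeyException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("CancelledKeyException thrown: " + throwsOnCancelledKey());
    }
}
```

This only shows where the exception type comes from; it says nothing about
why the keys get cancelled during the crawl or why everything hangs afterwards.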

Karl


On Tue, Sep 16, 2014 at 7:52 AM, Erlend Garåsen <[email protected]>
wrote:

>
> Ouch, I forgot to put the ZooKeeper logs on the web. Since they do not
> include timestamps and I have restarted MCF after a few changes, I guess it
> will be difficult to find the relevant lines. I'll do that next time it
> hangs, probably at the end of the day.
>
> I will add the new ZooKeeper configuration settings that Lalit suggested
> the next time I restart MCF.
>
>> How many worker threads are you using?  How many documents (about) do
>> you crawl before things hang?
>>
>
> Throttling -> max connections: 30
> Throttling -> Max fetches/min: 100
> Bandwidth -> max connections: 25
> Bandwidth -> max kbytes/sec: 8000
> Bandwidth -> max fetches/min: 20
>
> I have four jobs configured. The one I'm running now has 100,000 documents
> configured. In total, around 110,000 documents across all four jobs.
>
> I guess there are more documents involved since the largest job excludes a
> lot of documents based on sophisticated and complex filtering rules. Maybe
> 50% more even though they are not added to Solr (but they are of course
> fetched).
>
> Erlend
>
>
>> You may also want to try to increase the parameter: maxClientCnxns in
>> zookeeper.cfg to something bigger, if you have a lot of worker threads.
>> I'm thinking 1000 or some such.  See if it makes a difference for you.
>>
>
> I'll try that at the next restart.
>
> Erlend
>
