Thanks Karl,

The crawling is very slow and is taking a long time, which is a bit
frustrating. As I have multiple high-volume jobs running in parallel, this
does not seem to be a good thing.

I have also raised it on the ZooKeeper forums at
http://zookeeper-user.578899.n2.nabble.com/Getting-errors-in-zookeeper-logs-td7580260.html
but I am still waiting for a reply.
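
For reference, here is a minimal sketch of how the session timeout appears
to be settled, given the settings quoted below (tickTime=2000,
sessiontimeout=4000). It assumes the ZooKeeper server defaults, where the
requested timeout is clamped between 2 x tickTime and 20 x tickTime when
minSessionTimeout and maxSessionTimeout are not set in zoo.cfg. The function
name and the 30000/60000 values are only illustrative, and whether a longer
timeout actually helps on a flaky network is untested here.

```python
def negotiated_session_timeout(requested_ms, tick_time_ms,
                               min_ms=None, max_ms=None):
    """Clamp the client's requested session timeout to the server's bounds.

    Assumption: when min_ms / max_ms are not given, the server defaults of
    2 x tickTime and 20 x tickTime apply.
    """
    lower = 2 * tick_time_ms if min_ms is None else min_ms
    upper = 20 * tick_time_ms if max_ms is None else max_ms
    return max(lower, min(requested_ms, upper))


TICK_TIME_MS = 2000  # tickTime from the quoted zoo.cfg

# 4000 is the value currently configured; the other two are hypothetical.
for requested in (4000, 30000, 60000):
    granted = negotiated_session_timeout(requested, TICK_TIME_MS)
    print(requested, "->", granted)
```

With tickTime=2000, the configured 4000 ms is already the lowest value the
server would grant under these defaults, which matches the "negotiated
timeout = 4000" line in the log.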

Regards.

On Mon, Sep 15, 2014 at 6:51 PM, Karl Wright <[email protected]> wrote:

> Hi Lalit,
>
> When MCF cannot reach zookeeper, MCF crawls will pause until the zookeeper
> connections are reestablished.  Then the crawls should resume.  This should
> *not* abort your crawls, but it will make them very slow.
>
> I am not a zookeeper expert, so I would post on their message boards to
> see if there is any adjustment that can be made to zookeeper parameters
> that would improve zookeeper behavior when you have a flaky network.
> However, since the obvious solution is to fix your network, they may not
> have a code solution for you.
>
> Thanks,
> Karl
>
>
> On Mon, Sep 15, 2014 at 9:15 AM, lalit jangra <[email protected]>
> wrote:
>
>> Thanks Karl,
>>
>> Ideally, resetting connections should be taken care of by ZooKeeper
>> itself, as I can see connections being re-established later in the logs.
>>
>> Can you suggest any way to overcome this, in addition to resolving the
>> network issue, as my crawls keep stalling again and again? Anything in the
>> config files, etc.?
>>
>> Regards.
>>
>>
>> On Mon, Sep 15, 2014 at 6:39 PM, Karl Wright <[email protected]> wrote:
>>
>>> Hi Lalit,
>>>
>>> Zookeeper will keep working, but you should understand that you are
>>> dropping connections to your zookeeper members for unknown reasons, which
>>> is causing your crawl to stall when it happens.  This argues that perhaps
>>> you have some network flakiness of some kind.
>>>
>>> Karl
>>>
>>>
>>> On Mon, Sep 15, 2014 at 8:59 AM, lalit jangra <[email protected]>
>>> wrote:
>>>
>>>>
>>>> Hi,
>>>>
>>>> I am running a cluster of two Apache ManifoldCF nodes on two separate
>>>> machines, each of which has 3 ZooKeeper instances (6 instances in total
>>>> in the cluster). When I start the ManifoldCF agents, I see the warnings
>>>> below during startup.
>>>>
>>>> [http-bio-80-exec-2-SendThread(iwdc1preecma03.iwater.ie:2181)] INFO
>>>> org.apache.zookeeper.ClientCnxn - Unable to read additional data from
>>>> server sessionid 0x0, likely server has closed socket, closing socket
>>>> connection and attempting reconnect
>>>>
>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
>>>> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
>>>> iwdc2preecma04.iwater.ie/10.231.72.25:2182. Will not attempt to
>>>> authenticate using SASL (unknown error)
>>>>
>>>>
>>>> I also see the errors below in the logs while the agents are running.
>>>>
>>>> [http-bio-80-exec-2] INFO org.apache.zookeeper.ZooKeeper - Initiating
>>>> client connection,
>>>> connectString=iwdc1preecma03:2181,iwdc1preecma03:2182,iwdc1preecma03:2183,iwdc2preecma04:2181,iwdc2preecma04:2182,iwdc2preecma04:2183
>>>> sessionTimeout=4000
>>>> watcher=org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection$ZooKeeperWatcher@51d83fd7
>>>>
>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
>>>> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
>>>> iwdc2preecma04.iwater.ie/10.231.72.25:2182. Will not attempt to
>>>> authenticate using SASL (unknown error)
>>>>
>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
>>>> org.apache.zookeeper.ClientCnxn - Socket connection established to
>>>> iwdc2preecma04.iwater.ie/10.231.72.25:2182, initiating session
>>>>
>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] WARN
>>>> org.apache.zookeeper.ClientCnxn - Session 0x0 for server
>>>> iwdc2preecma04.iwater.ie/10.231.72.25:2182, unexpected error, closing
>>>> socket connection and attempting reconnect
>>>>
>>>> java.io.IOException: Connection reset by peer
>>>>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>>>>         at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>>>>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225)
>>>>         at sun.nio.ch.IOUtil.read(IOUtil.java:193)
>>>>         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375)
>>>>         at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
>>>>         at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
>>>>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
>>>>
>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2183)] INFO
>>>> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
>>>> iwdc2preecma04.iwater.ie/10.231.72.25:2183. Will not attempt to
>>>> authenticate using SASL (unknown error)
>>>>
>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2183)] INFO
>>>> org.apache.zookeeper.ClientCnxn - Socket connection established to
>>>> iwdc2preecma04.iwater.ie/10.231.72.25:2183, initiating session
>>>>
>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2183)] INFO
>>>> org.apache.zookeeper.ClientCnxn - Session establishment complete on server
>>>> iwdc2preecma04.iwater.ie/10.231.72.25:2183, sessionid =
>>>> 0x6487851bd330078, negotiated timeout = 4000
>>>>
>>>>
>>>> Below are the configurations for (1) the ZooKeeper nodes and (2) the
>>>> ZooKeeper settings on the MCF nodes.
>>>>
>>>>
>>>> *zoo.cfg :  Same for all six zookeeper nodes.*
>>>>
>>>>
>>>> # The number of milliseconds of each tick
>>>> tickTime=2000
>>>> dataDir=/app/IW/zookeeper/data/data.1
>>>> dataLogDir=/app/IW/zookeeper/logs/log.1
>>>> clientPort=2181
>>>> server.1=iwdc1preecma03:2888:3888
>>>> server.2=iwdc1preecma03:2889:3889
>>>> server.3=iwdc1preecma03:2890:3890
>>>> server.4=iwdc2preecma04:2891:3891
>>>> server.5=iwdc2preecma04:2892:3892
>>>> server.6=iwdc2preecma04:2893:3893
>>>> # The number of ticks that the initial
>>>> # synchronization phase can take
>>>> initLimit=10
>>>> # The number of ticks that can pass between
>>>> # sending a request and getting an acknowledgement
>>>> syncLimit=5
>>>> # the directory where the snapshot is stored.
>>>> # do not use /tmp for storage, /tmp here is just
>>>> # example sakes.
>>>> #dataDir=/tmp/zookeeper
>>>> # the port at which the clients will connect
>>>> #clientPort=2181
>>>> # the maximum number of client connections.
>>>> # increase this if you need to handle more clients
>>>> #maxClientCnxns=60
>>>> #
>>>> # Be sure to read the maintenance section of the
>>>> # administrator guide before turning on autopurge.
>>>> #
>>>> # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
>>>> #
>>>> # The number of snapshots to retain in dataDir
>>>> autopurge.snapRetainCount=3
>>>> # Purge task interval in hours
>>>> # Set to "0" to disable auto purge feature
>>>> autopurge.purgeInterval=1
>>>>
>>>>
>>>>
>>>> *ManifoldCF configurations : same for both ManifoldCF nodes.*
>>>>
>>>>
>>>> <property name="org.apache.manifoldcf.lockmanagerclass"
>>>> value="org.apache.manifoldcf.core.lockmanager.ZooKeeperLockManager"/>
>>>> <property name="org.apache.manifoldcf.zookeeper.connectstring"
>>>> value="iwdc1preecma03:2181,iwdc1preecma03:2182,iwdc1preecma03:2183,iwdc2preecma04:2181,iwdc2preecma04:2182,iwdc2preecma04:2183"/>
>>>> <property name="org.apache.manifoldcf.zookeeper.sessiontimeout"
>>>> value="4000"/>
>>>>
>>>>
>>>>
>>>> *I want to know whether, given the above warnings/errors, ZooKeeper will
>>>> stop working, or whether it will keep working and these messages are
>>>> non-fatal. I ask because the ManifoldCF jobs are stuck while I see these
>>>> errors.*
>>>>
>>>> Please suggest.
>>>>
>>>> Regards,
>>>> Lalit.
>>>>
>>>>
>>>
>>
>>
>> --
>> Regards,
>> Lalit.
>>
>
>


-- 
Regards,
Lalit.
