Hi Lalit,

If you have more than one Java process running without an explicit heap
size, EACH ONE will default to a maximum heap of 25% of physical memory.  So
you will have to do more than just free up some MCF memory to get this to work.
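
To make the arithmetic concrete, here is a minimal sketch in Java. It assumes
the figures mentioned in this thread (12G of RAM, Solr at 5G, MCF reduced to
1.5G, three ZooKeeper JVMs per machine started without -Xmx) and the usual
default of a maximum heap of one quarter of physical memory. The class and
method names are made up for illustration.

```java
// Illustrative only: HeapBudget and its methods are hypothetical names,
// and the figures are the ones quoted in this thread.
public class HeapBudget {

    // Assumed default: a JVM started without -Xmx may grow its heap
    // to one quarter of physical memory.
    public static double defaultMaxHeapGb(double physicalGb) {
        return physicalGb / 4.0;
    }

    // Worst-case heap demand: explicitly sized heaps plus one
    // default-sized heap for every JVM that has no -Xmx setting.
    public static double worstCaseGb(double physicalGb, double explicitGb, int unsizedJvms) {
        return explicitGb + unsizedJvms * defaultMaxHeapGb(physicalGb);
    }

    public static void main(String[] args) {
        double physicalGb = 12.0;       // RAM on the machine
        double explicitGb = 5.0 + 1.5;  // Solr + MCF, both given explicit heaps
        int unsizedJvms = 3;            // ZooKeeper instances without -Xmx

        double worst = worstCaseGb(physicalGb, explicitGb, unsizedJvms);
        System.out.println("Worst-case heap demand (G): " + worst);
        System.out.println("Physical memory (G):        " + physicalGb);

        // For comparison, the maximum heap this JVM itself was started with.
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap of this JVM (MB):  " + maxBytes / (1024 * 1024));
    }
}
```

With those numbers the worst case comes to 15.5G against 12G of RAM, before
counting the OS or any non-heap JVM memory, so giving each ZooKeeper JVM an
explicit -Xmx looks like the safer route.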

Karl


On Mon, Sep 15, 2014 at 12:29 PM, lalit jangra <[email protected]>
wrote:

> Thanks Karl,
>
> I think this is why my ZooKeeper nodes are resetting connections due to
> instability. In the meantime I will reduce MCF memory to 1.5G and leave the
> rest unassigned, so that 5.5G is left for Java itself, more than the 25%
> rule calls for, and see if it works.
>
> I also checked the ZooKeeper documentation but could not find any specific
> guidance in it.
>
> Regards.
>
> On Mon, Sep 15, 2014 at 9:52 PM, Karl Wright <[email protected]> wrote:
>
>> Hi Lalit,
>>
>> I can't speak for Solr's memory consumption, but you absolutely need to
>> give Solr enough memory to avoid OOM errors or things will not work
>> properly.
>>
>> As for MCF, 3G is more than enough; probably you could give it 1G and be
>> fine.
>>
>> For Zookeeper, remember that it is a Java process.  On 64-bit unix
>> machines, Java by default takes 25% of the total system memory.  I would
>> look at their documentation to figure out what they need, and assign
>> precisely that amount, otherwise zk will obviously not be stable.
>>
>> Thanks,
>> Karl
>>
>>
>> On Mon, Sep 15, 2014 at 12:17 PM, lalit jangra <[email protected]>
>> wrote:
>>
>>> Hi Karl,
>>>
>>> Out of 12G, I have assigned 5G to Solr because I was seeing a lot of
>>> OutOfMemory / Java heap space errors while crawling large jobs; since then
>>> it seems to be OK. I have also assigned 3G to MCF, where it is quite
>>> comfortable. I am assuming the remaining 4G is enough for the OS and the
>>> ZooKeeper nodes. I am currently running a job for 35K documents and can see
>>> more than 500MB of memory free.
>>>
>>> Any thoughts?
>>>
>>> Regards.
>>>
>>> On Mon, Sep 15, 2014 at 8:45 PM, Karl Wright <[email protected]> wrote:
>>>
>>>> Hi Lalit,
>>>>
>>>> The best way in Java to assess memory usage is to turn on JVM garbage
>>>> collection verbose output.  Then you can see how often the system garbage
>>>> collects etc, and whether post-GC usage grows over time.
>>>>
>>>> 12G should be more than enough, so if you find you are running into
>>>> memory limits with that configuration, it would be worth trying to figure
>>>> out why that is happening.
>>>>
>>>> Karl
>>>>
>>>>
>>>> On Mon, Sep 15, 2014 at 11:04 AM, lalit jangra <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi Karl,
>>>>>
>>>>> Could I be seeing ZooKeeper connection reset messages because the system
>>>>> is running at its memory limit? I have 12G of RAM and can see it using
>>>>> 11.5G while the job is running.
>>>>>
>>>>> Is there a way to determine how much memory to give the ZooKeeper nodes,
>>>>> and if so, is there any yardstick?
>>>>>
>>>>> Regards.
>>>>>
>>>>> On Mon, Sep 15, 2014 at 7:16 PM, Karl Wright <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi Lalit,
>>>>>>
>>>>>> Looks like this is the result of a tomcat shutdown, and is a probable
>>>>>> race condition bug in Zookeeper:
>>>>>>
>>>>>>
>>>>>> http://mail-archives.apache.org/mod_mbox/tomcat-users/201306.mbox/%[email protected]%3E
>>>>>>
>>>>>> Karl
>>>>>>
>>>>>>
>>>>>> On Mon, Sep 15, 2014 at 9:41 AM, lalit jangra <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi Karl,
>>>>>>>
>>>>>>> Along with this, I can see the errors below in Tomcat's catalina.out.
>>>>>>>
>>>>>>> Sep 15, 2014 1:06:14 PM org.apache.catalina.loader.WebappClassLoader loadClass
>>>>>>> INFO: Illegal access: this web application instance has been stopped already.  Could not load org.apache.zookeeper.server.ZooTrace.  The eventual following stack trace is caused by an error thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access, and has no functional impact.
>>>>>>> java.lang.IllegalStateException
>>>>>>>         at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1612)
>>>>>>>         at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1571)
>>>>>>>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1115)
>>>>>>>
>>>>>>> [http-bio-80-exec-1-SendThread(iwdc2preecma04.iwater.ie:2183)] ERROR org.apache.zookeeper.ClientCnxn - from http-bio-80-exec-1-SendThread(iwdc2preecma04.iwater.ie:2183)
>>>>>>> java.lang.NoClassDefFoundError: org/apache/zookeeper/server/ZooTrace
>>>>>>>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1115)
>>>>>>> Caused by: java.lang.ClassNotFoundException: org.apache.zookeeper.server.ZooTrace
>>>>>>>         at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1720)
>>>>>>>         at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1571)
>>>>>>>         ... 1 more
>>>>>>>
>>>>>>> [http-bio-80-exec-1-SendThread(iwdc2preecma04.iwater.ie:2182)] ERROR org.apache.zookeeper.ClientCnxn - from http-bio-80-exec-1-SendThread(iwdc2preecma04.iwater.ie:2182)
>>>>>>>
>>>>>>> Sep 15, 2014 1:06:14 PM org.apache.coyote.AbstractProtocol destroy
>>>>>>> INFO: Destroying ProtocolHandler ["http-bio-80"]
>>>>>>>
>>>>>>> java.lang.NoClassDefFoundError: org/apache/zookeeper/server/ZooTrace
>>>>>>>
>>>>>>> Regards.
>>>>>>>
>>>>>>> On Mon, Sep 15, 2014 at 7:05 PM, lalit jangra <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Thanks Karl,
>>>>>>>>
>>>>>>>> Crawling is very slow and takes a long time, which is a bit
>>>>>>>> frustrating. Since I have multiple high-volume jobs running in
>>>>>>>> parallel, this is not a good situation.
>>>>>>>>
>>>>>>>> I have also raised it on the ZooKeeper forums at
>>>>>>>> http://zookeeper-user.578899.n2.nabble.com/Getting-errors-in-zookeeper-logs-td7580260.html
>>>>>>>> but am still waiting for a reply.
>>>>>>>>
>>>>>>>> Regards.
>>>>>>>>
>>>>>>>> On Mon, Sep 15, 2014 at 6:51 PM, Karl Wright <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Lalit,
>>>>>>>>>
>>>>>>>>> When MCF cannot reach zookeeper, MCF crawls will pause until the
>>>>>>>>> zookeeper connections are reestablished.  Then the crawls should
>>>>>>>>> resume.  This should *not* abort your crawls, but it will make them
>>>>>>>>> very slow.
>>>>>>>>>
>>>>>>>>> I am not a zookeeper expert, so I would post on their message
>>>>>>>>> boards to see if there is any adjustment that can be made to zookeeper
>>>>>>>>> parameters that would improve zookeeper behavior when you have a flaky
>>>>>>>>> network.  However, since the obvious solution is to fix your network,
>>>>>>>>> they may not have a code solution for you.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Karl
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Sep 15, 2014 at 9:15 AM, lalit jangra <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks Karl,
>>>>>>>>>>
>>>>>>>>>> Ideally, connection resets should be handled by ZooKeeper itself,
>>>>>>>>>> as I can see the connections being re-established later in the logs.
>>>>>>>>>>
>>>>>>>>>> Apart from resolving the network issue, can you suggest any way to
>>>>>>>>>> overcome this, since my crawls repeatedly stop working? Anything in
>>>>>>>>>> the config files, etc.?
>>>>>>>>>>
>>>>>>>>>> Regards.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Sep 15, 2014 at 6:39 PM, Karl Wright <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Lalit,
>>>>>>>>>>>
>>>>>>>>>>> Zookeeper will keep working, but you should understand that you
>>>>>>>>>>> are dropping connections to your zookeeper members for unknown
>>>>>>>>>>> reasons, which is causing your crawl to stall when it happens.
>>>>>>>>>>> This argues that perhaps you have some network flakiness of some
>>>>>>>>>>> kind.
>>>>>>>>>>>
>>>>>>>>>>> Karl
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Sep 15, 2014 at 8:59 AM, lalit jangra <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I am running a cluster of two Apache ManifoldCF nodes on two
>>>>>>>>>>>> separate machines, each of which has 3 ZooKeeper instances (6
>>>>>>>>>>>> instances in total in the cluster). When I start the ManifoldCF
>>>>>>>>>>>> agents, I see the warnings below during startup.
>>>>>>>>>>>>
>>>>>>>>>>>> [http-bio-80-exec-2-SendThread(iwdc1preecma03.iwater.ie:2181)] INFO org.apache.zookeeper.ClientCnxn - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
>>>>>>>>>>>>
>>>>>>>>>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server iwdc2preecma04.iwater.ie/10.231.72.25:2182. Will not attempt to authenticate using SASL (unknown error)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I can also see the errors below in the logs while the agents are
>>>>>>>>>>>> running.
>>>>>>>>>>>>
>>>>>>>>>>>> [http-bio-80-exec-2] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=iwdc1preecma03:2181,iwdc1preecma03:2182,iwdc1preecma03:2183,iwdc2preecma04:2181,iwdc2preecma04:2182,iwdc2preecma04:2183 sessionTimeout=4000 watcher=org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection$ZooKeeperWatcher@51d83fd7
>>>>>>>>>>>>
>>>>>>>>>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server iwdc2preecma04.iwater.ie/10.231.72.25:2182. Will not attempt to authenticate using SASL (unknown error)
>>>>>>>>>>>>
>>>>>>>>>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to iwdc2preecma04.iwater.ie/10.231.72.25:2182, initiating session
>>>>>>>>>>>>
>>>>>>>>>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] WARN org.apache.zookeeper.ClientCnxn - Session 0x0 for server iwdc2preecma04.iwater.ie/10.231.72.25:2182, unexpected error, closing socket connection and attempting reconnect
>>>>>>>>>>>> java.io.IOException: Connection reset by peer
>>>>>>>>>>>>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>>>>>>>>>>>>         at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>>>>>>>>>>>>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225)
>>>>>>>>>>>>         at sun.nio.ch.IOUtil.read(IOUtil.java:193)
>>>>>>>>>>>>         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375)
>>>>>>>>>>>>         at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
>>>>>>>>>>>>         at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
>>>>>>>>>>>>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
>>>>>>>>>>>>
>>>>>>>>>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2183)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server iwdc2preecma04.iwater.ie/10.231.72.25:2183. Will not attempt to authenticate using SASL (unknown error)
>>>>>>>>>>>>
>>>>>>>>>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2183)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to iwdc2preecma04.iwater.ie/10.231.72.25:2183, initiating session
>>>>>>>>>>>>
>>>>>>>>>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2183)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server iwdc2preecma04.iwater.ie/10.231.72.25:2183, sessionid = 0x6487851bd330078, negotiated timeout = 4000
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Below are configurations for 1. zookeeper nodes & 2. MCF nodes
>>>>>>>>>>>> for zookeeper.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> *zoo.cfg :  Same for all six zookeeper nodes.*
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> # The number of milliseconds of each tick
>>>>>>>>>>>> tickTime=2000
>>>>>>>>>>>> dataDir=/app/IW/zookeeper/data/data.1
>>>>>>>>>>>> dataLogDir=/app/IW/zookeeper/logs/log.1
>>>>>>>>>>>> clientPort=2181
>>>>>>>>>>>> server.1=iwdc1preecma03:2888:3888
>>>>>>>>>>>> server.2=iwdc1preecma03:2889:3889
>>>>>>>>>>>> server.3=iwdc1preecma03:2890:3890
>>>>>>>>>>>> server.4=iwdc2preecma04:2891:3891
>>>>>>>>>>>> server.5=iwdc2preecma04:2892:3892
>>>>>>>>>>>> server.6=iwdc2preecma04:2893:3893
>>>>>>>>>>>> # The number of ticks that the initial
>>>>>>>>>>>> # synchronization phase can take
>>>>>>>>>>>> initLimit=10
>>>>>>>>>>>> # The number of ticks that can pass between
>>>>>>>>>>>> # sending a request and getting an acknowledgement
>>>>>>>>>>>> syncLimit=5
>>>>>>>>>>>> # the directory where the snapshot is stored.
>>>>>>>>>>>> # do not use /tmp for storage, /tmp here is just
>>>>>>>>>>>> # example sakes.
>>>>>>>>>>>> #dataDir=/tmp/zookeeper
>>>>>>>>>>>> # the port at which the clients will connect
>>>>>>>>>>>> #clientPort=2181
>>>>>>>>>>>> # the maximum number of client connections.
>>>>>>>>>>>> # increase this if you need to handle more clients
>>>>>>>>>>>> #maxClientCnxns=60
>>>>>>>>>>>> #
>>>>>>>>>>>> # Be sure to read the maintenance section of the
>>>>>>>>>>>> # administrator guide before turning on autopurge.
>>>>>>>>>>>> #
>>>>>>>>>>>> # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
>>>>>>>>>>>> #
>>>>>>>>>>>> # The number of snapshots to retain in dataDir
>>>>>>>>>>>> autopurge.snapRetainCount=3
>>>>>>>>>>>> # Purge task interval in hours
>>>>>>>>>>>> # Set to "0" to disable auto purge feature
>>>>>>>>>>>> autopurge.purgeInterval=1
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> *ManifoldCF configurations : same for both ManifoldCF nodes.*
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> <property name="org.apache.manifoldcf.lockmanagerclass" value="org.apache.manifoldcf.core.lockmanager.ZooKeeperLockManager"/>
>>>>>>>>>>>> <property name="org.apache.manifoldcf.zookeeper.connectstring" value="iwdc1preecma03:2181,iwdc1preecma03:2182,iwdc1preecma03:2183,iwdc2preecma04:2181,iwdc2preecma04:2182,iwdc2preecma04:2183"/>
>>>>>>>>>>>> <property name="org.apache.manifoldcf.zookeeper.sessiontimeout" value="4000"/>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> *I want to know whether, given the above warnings/errors, ZooKeeper
>>>>>>>>>>>> will stop working, or whether it will keep working and these are
>>>>>>>>>>>> non-fatal messages. I ask because my ManifoldCF jobs are stuck
>>>>>>>>>>>> while I see these errors.*
>>>>>>>>>>>>
>>>>>>>>>>>> Please suggest.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Lalit.
>>>>>>>>>>>>
