[ 
https://issues.apache.org/jira/browse/CONNECTORS-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15907305#comment-15907305
 ] 

Karl Wright commented on CONNECTORS-1395:
-----------------------------------------

[~guystanden]: Have a look at the end of the log:

{code}
[SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x15ab972b0fb0475, timeout of 10000ms exceeded
[SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x15ab972b0fb0478, timeout of 10000ms exceeded
[SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x15ab972b0fb0471, timeout of 10000ms exceeded
[SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x15ab972b0fb047c, timeout of 10000ms exceeded
[SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x15ab972b0fb047e, timeout of 10000ms exceeded
[SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x15ab972b0fb0479, timeout of 10000ms exceeded
[SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x15ab972b0fb0476, timeout of 10000ms exceeded
[SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x15ab972b0fb047b, timeout of 10000ms exceeded
[SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x15ab972b0fb0473, timeout of 10000ms exceeded
[SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x15ab972b0fb0472, timeout of 10000ms exceeded
[SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x15ab972b0fb047a, timeout of 10000ms exceeded
[SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x15ab972b0fb047d, timeout of 10000ms exceeded
[SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x15ab972b0fb0474, timeout of 10000ms exceeded
[SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x15ab972b0fb0477, timeout of 10000ms exceeded
[SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x15ab972b0fb047f, timeout of 10000ms exceeded
{code}

and

{code}
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] WARN org.apache.zookeeper.server.NIOServerCnxn - Exception causing close of session 0x15ab972b0fb0411 due to java.io.IOException: An established connection was aborted by the software in your host machine
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] INFO org.apache.zookeeper.server.NIOServerCnxn - Closed socket connection for client /0:0:0:0:0:0:0:1:49337 which had sessionid 0x15ab972b0fb0411
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] WARN org.apache.zookeeper.server.NIOServerCnxn - Exception causing close of session 0x15ab972b0fb0481 due to java.io.IOException: An established connection was aborted by the software in your host machine
[SyncThread:0] INFO org.apache.zookeeper.server.ZooKeeperServer - Established session 0x15ab972b0fb0484 with negotiated timeout 10000 for client /0:0:0:0:0:0:0:1:49357
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] INFO org.apache.zookeeper.server.NIOServerCnxn - Closed socket connection for client /127.0.0.1:49346 which had sessionid 0x15ab972b0fb0481
{code}

These are not good, and from what I can see they will cause locks to be 
dropped.  The question is why you are seeing them.

It might be configuration and/or the load on your machine, but it is clearly 
the smoking gun as far as your failures are concerned.
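
If you want to see how many distinct sessions were expired, something like the following could be run over the ZooKeeper console log. This is only a rough sketch: the class and method names are made up for illustration, and the pattern assumes the message format shown in the excerpts above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ManifoldCF or ZooKeeper.
final class ExpiredSessionScan {
    // \s+ is used between words so that a log line wrapped by a mail client
    // or console still matches.
    private static final Pattern EXPIRED = Pattern.compile(
        "Expiring\\s+session\\s+(0x[0-9a-fA-F]+),\\s+timeout\\s+of\\s+(\\d+)ms\\s+exceeded");

    // Returns session id -> timeout in milliseconds, in order of first appearance.
    static Map<String, Integer> expiredSessions(String logText) {
        Map<String, Integer> result = new LinkedHashMap<>();
        Matcher m = EXPIRED.matcher(logText);
        while (m.find()) {
            result.put(m.group(1), Integer.parseInt(m.group(2)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two entries copied from the log excerpt above.
        String sample =
            "[SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring \n"
            + "session 0x15ab972b0fb0475, timeout of 10000ms exceeded\n"
            + "[SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring \n"
            + "session 0x15ab972b0fb0478, timeout of 10000ms exceeded\n";
        System.out.println(expiredSessions(sample));
    }
}
```

A burst of expirations with consecutive session ids would suggest that one client process lost all of its sessions at once, rather than individual threads timing out independently.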

Let's look at the client-side session timeout.  There is a ManifoldCF property 
which controls this (in the general properties file):

{code}
name = org.apache.manifoldcf.zookeeper.sessiontimeout, value = time in milliseconds
{code}

However, the default in the code is already 5 minutes (300000 ms):

{code}
        int sessionTimeout = ManifoldCF.getIntProperty(zookeeperSessionTimeoutParameter, 300000);
{code}

But the example properties.xml sets the value to only 2000 ms (2 seconds):

{code}
  <property name="org.apache.manifoldcf.zookeeper.sessiontimeout" value="2000"/>
{code}
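
To spell out why the 5-minute default never comes into play here: a default passed to a property lookup is only used when the property is absent. The sketch below illustrates that with a plain map; the class and its getIntProperty are stand-ins written for this comment and are an assumption about the behavior, not ManifoldCF's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a property lookup with a default value.
final class SessionTimeoutLookup {
    static final String PARAM = "org.apache.manifoldcf.zookeeper.sessiontimeout";
    static final int DEFAULT_MS = 300000; // the 5-minute default from the code

    // Returns the configured value if present, otherwise the supplied default.
    static int getIntProperty(Map<String, String> props, String name, int defaultValue) {
        String raw = props.get(name);
        return (raw == null) ? defaultValue : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        // Property absent: the default applies.
        System.out.println(getIntProperty(props, PARAM, DEFAULT_MS));
        // Property present, as in the example properties.xml: the file value wins.
        props.put(PARAM, "2000");
        System.out.println(getIntProperty(props, PARAM, DEFAULT_MS));
    }
}
```

So anyone who started from the example properties.xml is running with the 2000 ms request, not the 300000 ms default.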

I therefore suggest that you increase the value of this property to something 
like 300000 and see what happens.  This won't fix the sessions that get 
closed underneath ZooKeeper, but I hope it will prevent the session timeouts.
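
Concretely, assuming your properties.xml has the same line as the example above, the change would look something like this (300000 simply mirrors the code default; it is not a tuned value):

```xml
  <property name="org.apache.manifoldcf.zookeeper.sessiontimeout" value="300000"/>
```

The ManifoldCF processes would presumably need to be restarted for the new value to be picked up.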

Thanks!


> Unexpected jobqueue status - record id 1488898668325, expecting active 
> status, saw 4
> ------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-1395
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1395
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Framework core
>    Affects Versions: ManifoldCF 2.6
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 2.7
>
>         Attachments: Agents Stack 4.txt, Manifoldcf log4.txt, MCF Log 2.txt, 
> PostgreSQL Log Extract 4.txt, WebApps Stack 4.txt, ZK Console 2.txt, ZK 
> Console Output.txt, ZK log 4.txt
>
>
> User saw this in the log, after which the system hung:
> {code}
> ERROR 2017-03-08 00:25:30,433 (Worker thread '14') - Exception tossed: 
> Unexpected jobqueue status - record id 1488898668325, expecting active 
> status, saw 4
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unexpected 
> jobqueue status - record id 1488898668325, expecting active status, saw 4
>                 at 
> org.apache.manifoldcf.crawler.jobs.JobQueue.updateCompletedRecord(JobQueue.java:1019)
>                 at 
> org.apache.manifoldcf.crawler.jobs.JobManager.markDocumentCompletedMultiple(JobManager.java:3271)
>                 at 
> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:710)
> WARN 2017-03-08 00:25:30,449 (Worker thread '23') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:30,449 (Worker thread '24') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:30,464 (Worker thread '9') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:30,464 (Worker thread '0') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:31,900 (Worker thread '11') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:31,900 (Worker thread '29') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:32,867 (Worker thread '10') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:32,867 (Worker thread '2') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:33,335 (Worker thread '8') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:36,642 (Worker thread '20') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:37,422 (Worker thread '21') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:38,280 (Worker thread '22') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:38,280 (Worker thread '3') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:38,280 (Worker thread '5') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:38,826 (Worker thread '28') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:39,045 (Worker thread '13') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:45,425 (Worker thread '4') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:45,425 (Worker thread '15') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:45,425 (Worker thread '17') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:46,392 (Worker thread '25') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:25:46,392 (Worker thread '27') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:26:11,043 (Worker thread '1') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:26:35,817 (Worker thread '19') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:26:35,817 (Worker thread '26') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:26:36,753 (Worker thread '7') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:26:39,248 (Worker thread '6') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:26:39,248 (Worker thread '18') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> WARN 2017-03-08 00:26:43,129 (Worker thread '16') - Service interruption 
> reported for job 1488898090224 connection 'web': Job no longer active
> FATAL 2017-03-08 00:32:24,819 (Idle cleanup thread) - Error tossed: Can't 
> release lock we don't hold
> java.lang.IllegalStateException: Can't release lock we don't hold
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
