[jira] [Commented] (CONNECTORS-1592) Found long running query in manifold scheduled job

2019-04-10 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814326#comment-16814326
 ] 

Karl Wright commented on CONNECTORS-1592:
-

[~goovaertsr] Yes, if you have no intention of doing hopcount filtering ever, 
then disable hop count filtering forever.  It's far easier on the database.

Having said that, I'm pretty sure you have other problems too.


> Found long running query in manifold scheduled job
> --------------------------------------------------
>
>                 Key: CONNECTORS-1592
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1592
>             Project: ManifoldCF
>          Issue Type: Bug
>    Affects Versions: ManifoldCF 2.12
>            Reporter: Subasini Rath
>            Priority: Major
>         Attachments: LongRunningWithPlan_thread39.txt, 
> SELECT_blocked_queries.txt, postgresql.conf, properties.xml
>
>
> Hi Karl,
>    I am also facing the above-mentioned issue (similar to Connector-880).
> I am using the manifold 2.12 binary version, with the Solr output connector 
> and a Web repository connection. Manifold is using the default configuration.
> When I run the jobs manually, they run fine. The same jobs have been 
> scheduled to run every day.
> I am getting the exceptions below, and the job hangs or goes into a waiting state.
> Could you please help me resolve this?
> The errors I am getting are below.
> Scenario-1
> WARN 2019-03-08T23:58:20,338 (qtp550147359-413) - Found a long-running query 
> (2706114 ms): [SELECT 
> t0.id,t0.description,t0.status,t0.starttime,t0.endtime,t0.errortext FROM jobs 
> t0 ORDER BY description ASC]
>  WARN 2019-03-08T23:58:20,337 (Document delete stuffer thread) - Found a 
> long-running query (2737370 ms): [SELECT id FROM jobs WHERE status=? LIMIT 1]
>  WARN 2019-03-08T23:58:20,339 (Job reset thread) - Found a long-running query 
> (2770133 ms): [SELECT id FROM jobs WHERE status IN (?,?)]
>  WARN 2019-03-08T23:58:20,386 (Document delete stuffer thread) - Parameter 0: 
> 'e'
>  WARN 2019-03-08T23:58:20,337 (Set priority thread) - Found a long-running 
> query (2732379 ms): [SELECT id,dochash,docid,jobid FROM jobqueue WHERE 
> needpriority=? LIMIT 1000]
>  WARN 2019-03-08T23:58:20,386 (Set priority thread) - Parameter 0: 'T'
>  WARN 2019-03-08T23:58:20,386 (Job reset thread) - Parameter 0: 'I'
>  WARN 2019-03-08T23:58:20,386 (Job reset thread) - Parameter 1: 'i'
>  WARN 2019-03-08T23:58:20,372 (Seeding thread) - Parameter 2: '1552047176062'
>  WARN 2019-03-08T23:58:20,474 (Document cleanup stuffer thread) - Found a 
> long-running query (2737524 ms): [SELECT id FROM jobs WHERE status=? LIMIT 1]
>  WARN 2019-03-08T23:58:20,474 (Document cleanup stuffer thread) - Parameter 
> 0: 'S'
>  WARN 2019-03-08T23:58:20,474 (Finisher thread) - Found a long-running query 
> (2752034 ms): [SELECT id FROM jobs WHERE status IN (?,?,?) FOR UPDATE]
>  WARN 2019-03-08T23:58:20,474 (Finisher thread) - Parameter 0: 'A'
>  WARN 2019-03-08T23:58:20,475 (Finisher thread) - Parameter 1: 'W'
>  WARN 2019-03-08T23:58:20,475 (Finisher thread) - Parameter 2: 'R'
>  WARN 2019-03-08T23:58:20,475 (Delete startup thread) - Found a long-running 
> query (2752036 ms): [SELECT id FROM jobs WHERE status=? FOR UPDATE]
>  WARN 2019-03-08T23:58:20,475 (Delete startup thread) - Parameter 0: 'E'
>  WARN 2019-03-08T23:58:20,483 (qtp550147359-4339) - Found a long-running 
> query (2496641 ms): [SELECT 
> t0.id,t0.description,t0.status,t0.starttime,t0.endtime,t0.errortext FROM jobs 
> t0 ORDER BY description ASC]
>  WARN 2019-03-08T23:58:20,492 (Finisher thread) - Plan: 
> isDistinctSelect=[false]
>  WARN 2019-03-08T23:58:20,492 (Finisher thread) - Plan: isGrouped=[false]
>  WARN 2019-03-08T23:58:20,492 (Finisher thread) - Plan: isAggregated=[false]
>  WARN 2019-03-08T23:58:20,492 (Finisher thread) - Plan: columns=[ COLUMN: 
> PUBLIC.JOBS.ID not nullable
>  WARN 2019-03-08T23:58:20,492 (qtp550147359-4346) - Found a long-running 
> query (2435908 ms): [SELECT 
> t0.id,t0.description,t0.status,t0.starttime,t0.endtime,t0.errortext FROM jobs 
> t0 ORDER BY description ASC]
>  WARN 2019-03-08T23:58:20,492 (Finisher thread) - Plan: 
>  WARN 2019-03-08T23:58:20,492 (Finisher thread) - Plan: ]
>  WARN 2019-03-08T23:58:20,499 (Finisher thread) - Plan: [range variable 1
>  WARN 2019-03-08T23:58:20,499 (Finisher thread) - Plan: join type=INNER
>  WARN 2019-03-08T23:58:20,499 (Finisher thread) - Plan: table=SYSTEM_SUBQUERY
>  WARN 2019-03-08T23:58:20,499 (Finisher thread) - Plan: cardinality=0
>  WARN 2019-03-08T23:58:20,499 (Finisher thread) - Plan: access=FULL SCAN
>  WARN 2019-03-08T23:58:20,500 (Finisher thread) - Plan: join condition = 
> [index=SYS_IDX_13329
>  WARN 2019-03-08T23:58:20,500 (Finisher thread) - Plan: ]
>  WARN 2019-03-08T23:58:20,500 (Finisher thread) - Plan: ][range variable 2
>  WARN 2019-03-08T23:58:20,500 (Finisher thread) - Plan: join 

[jira] [Commented] (CONNECTORS-1592) Found long running query in manifold scheduled job

2019-04-10 Thread roel goovaerts (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814313#comment-16814313
 ] 

roel goovaerts commented on CONNECTORS-1592:


The hop count mode was left at this setting because of our requirements. 
But I think I follow you now; I had interpreted it as disabling the whole 
'tab'.
If I understand correctly, by "disabling hop count filtering" you mean 
setting it to "keep unreachable documents, forever"?


[jira] [Commented] (CONNECTORS-1592) Found long running query in manifold scheduled job

2019-04-10 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814306#comment-16814306
 ] 

Karl Wright commented on CONNECTORS-1592:
-

{quote}
the largest was 223673ms, the minimum time spent was 172416ms, the others are 
distributed between these extrema
{quote}

I saw a longer-running query than that in the log you posted, some 200ms.  
But the plan was fine.  Once again, locking would have been the only 
explanation.  But if you are seeing no queries running in less than 172416ms, 
then I think you may well have found your problem.  The lion's share of 
Postgresql queries should be executing in well under a second. Times around 
20ms would be typical.  Something is very wrong with your Postgresql 
configuration or installation given that.

{quote}
Just one more question, considering what you said of the hopcount filtering; In 
the "Hop Filters"-tab we have nothing of configuration except for "hop count 
mode" is set to "delete unreachable", which I had interpreted as being the 
default. Is this correct that it is the default, and is there something else we 
could do to disable hop count filtering?
{quote}

That is the default; it's also the most inefficient.  From the manual:

{quote}
On this same tab, you can tell the Framework what to do should there be changes 
in the distance from the root to a document. The choice "Delete unreachable 
documents" requires the Framework to recalculate the distance to every 
potentially affected document whenever a change takes place. This may require 
expensive bookkeeping, however, so you also have the option of ignoring such 
changes. There are two varieties of this latter option - you can ignore the 
changes for now, with the option of turning back on the aggressive bookkeeping 
at a later time, or you can decide not to ever allow changes to propagate, in 
which case the Framework will discard the necessary bookkeeping information 
permanently. This last option is the most efficient.
{quote}



[jira] [Commented] (CONNECTORS-1592) Found long running query in manifold scheduled job

2019-04-10 Thread roel goovaerts (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814287#comment-16814287
 ] 

roel goovaerts commented on CONNECTORS-1592:


Hi Karl,
 
I have not yet seen any "very long-running" queries. Looking at the logs 
(a bunch of long-running queries were logged about an hour ago), there is 
no 'extreme' maximum of time spent on a query: the largest was 223673ms, 
the minimum time spent was 172416ms, the others are distributed between these 
extrema. From this I suppose this is not really the issue.
 
I of course understand that it's not easy to commit to a conference 
call; thanks for considering it.
 
Just one more question, considering what you said of the hopcount filtering; In 
the "Hop Filters"-tab we have nothing of configuration except for "hop count 
mode" is set to "delete unreachable", which I had interpreted as being the 
default. Is this correct that it is the default, and is there something else we 
could do to disable hop count filtering?
 
We will continue to look for other possible external influences.
It is possible that the postgres settings were automatically reverted to 
the defaults (which would include autovacuum being on), so we are 
looking into this now.
Thanks again for the info and the quick replies.
 
Regards,
Roel


[jira] [Commented] (CONNECTORS-1592) Found long running query in manifold scheduled job

2019-04-09 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813400#comment-16813400
 ] 

Karl Wright commented on CONNECTORS-1592:
-

{quote}
one 'root' query doesn't get committed, this keeps a lock on the job-, 
intrinsiclink- or jobQueue-table and cascades into the bulk of locked queries. 
Main question here is how one query could get stuck; can a query be waiting for 
something from manifold until it is committed?
{quote}

That is certainly possible, but you should see that query logged as a 
very-long-running query in that case.  What is the longest-running query you 
see logged?
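One way to answer that question from the ManifoldCF log is sketched below. It is a minimal Python example, assuming the warning format shown in the quoted log ("Found a long-running query (N ms): [...]"); the function name `longest_queries` and the sample text are made up for illustration.

```python
import re

# Pattern for the warning format seen in the quoted log, e.g.
# "(Job reset thread) - Found a long-running query (2770133 ms): [SELECT ...]"
LONG_QUERY = re.compile(
    r"\((?P<thread>[^()]+)\) - Found a long-running query "
    r"\((?P<ms>\d+) ms\): \[(?P<sql>[^\]]*)\]"
)

def longest_queries(log_text, top=3):
    """Return up to `top` (milliseconds, thread, sql) tuples, slowest first."""
    # Collapse all whitespace so hard-wrapped log entries still match.
    flat = re.sub(r"\s+", " ", log_text)
    hits = [(int(m["ms"]), m["thread"], m["sql"]) for m in LONG_QUERY.finditer(flat)]
    return sorted(hits, reverse=True)[:top]

if __name__ == "__main__":
    # Hypothetical excerpt in the same format as the log quoted in this issue.
    sample = """
    WARN 2019-03-08T23:58:20,337 (Document delete stuffer thread) - Found a
    long-running query (2737370 ms): [SELECT id FROM jobs WHERE status=? LIMIT 1]
    WARN 2019-03-08T23:58:20,339 (Job reset thread) - Found a long-running query
    (2770133 ms): [SELECT id FROM jobs WHERE status IN (?,?)]
    WARN 2019-03-08T23:58:20,386 (Job reset thread) - Parameter 0: 'I'
    """
    for ms, thread, sql in longest_queries(sample):
        print(ms, thread, sql)
```

If the slowest entry is far longer than the rest and touches a table the other queries depend on, it is a candidate for the blocking query; if all entries have similar durations, the cause is more likely external to ManifoldCF.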

{quote}
there is a locking conflict that arises from the jobID being a foreign key 
constraint for both the jobQueue and intrinsiclinks. From debugging we have the 
impression that postgres locks the whole intrinsiclink-table in a query which 
is specified to have one specific jobId.
{quote}

It may do that, but the way PostgreSQL works, a SQL exception is then thrown 
and the ManifoldCF code will retry the query.  So this situation cannot cause 
the symptom you are seeing.
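The retry behavior described above can be illustrated with a generic sketch. This is not ManifoldCF's actual code: `TransientDatabaseError`, `run_with_retry` and the back-off parameters are hypothetical names chosen only to show why a conflict that surfaces as an exception leads to a retry rather than an indefinitely blocked query.

```python
import random
import time

class TransientDatabaseError(Exception):
    """Hypothetical stand-in for a driver error such as a deadlock
    or serialization failure reported by the database."""

def run_with_retry(operation, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Call `operation` until it succeeds or `max_attempts` is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientDatabaseError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            # Back off with jitter so competing transactions are less
            # likely to collide again at the same moment.
            sleep(base_delay * attempt * (1 + random.random()))

if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_query():
        # Fails twice, then succeeds, to imitate a short-lived conflict.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise TransientDatabaseError("deadlock detected")
        return "rows"

    print(run_with_retry(flaky_query), "after", attempts["count"], "attempts")
```

With a pattern like this, each conflicting statement fails fast and is retried, so a conflict of that kind would show up as repeated short attempts rather than as one query waiting for minutes.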

The *only* way you can get into this situation is to have one particular query, 
which hits tables that all the other queries depend on, take a very long time.  
And that should show in the log.
If it doesn't show in the log, that means that the locks are being caused 
externally, which is why I pointed at VACUUM FULL as being a potential cause.

{quote}
could using the multi process-functionality of 
org.apache.manifoldcf.usejettyparentclassloader be used to improve this issue?
{quote}

No, won't help.

{quote}
I have read that disabling swap can be good for intensive db-interactions; do 
you have experience with disabling swap improving manifold?
{quote}

Once again, probably immaterial, EXCEPT if your postgresql instances are 
swapping.  That would be bad.

{quote}
is there a possibility that we could set-up a conference call with someone from 
the manifold team?
{quote}

I work full time on an entirely unrelated task and probably there is nobody 
else who would be in a position to go deep on this issue.  So this is unlikely.

One thing I notice, though, is that you are seeing a lot of intrinsiclink 
activity.  If you are not using hopcount filtering, you can disable that 
entirely at the job level.  It might help you (can't be sure until the blocking 
culprit is found though).



[jira] [Commented] (CONNECTORS-1592) Found long running query in manifold scheduled job

2019-04-09 Thread roel goovaerts (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813384#comment-16813384
 ] 

roel goovaerts commented on CONNECTORS-1592:


Hi Karl, 

We've had some time to analyze and debug in detail. First of all we ran the 
databasemaintenance script when manifold was shut down; after a restart and 2 
hours of crawling it started to log long-running queries again.

While monitoring, we noticed that there are frequent locks following queries; 
normally these locks are resolved quickly (as you would expect). But every once 
in a while the locks start stacking up until postgres is using 100% of CPU yet 
is shown as being idle, and manifold is idle as well. After a while things pick 
up again and manifold starts logging long-running queries.

SELECT_blocked_queries.txt contains an incomplete list of queries that are 
blocked by another process. This list was captured while we were monitoring 
such an inactive moment.

When looking into the tables created in postgres we saw that jobID is a primary 
key of the jobs table, and this is a foreign key for the intrinsiclink-table 
and jobqueue-table. There are 21 entries in the job-table and 1400+ entries in 
the jobQueue-table.

From our analysis we have some hypotheses:
- one 'root' query doesn't get committed, this keeps a lock on the job-, 
intrinsiclink- or jobQueue-table and cascades into the bulk of locked queries. 
Main question here is how one query could get stuck; can a query be waiting for 
something from manifold until it is committed?
- there is a locking conflict that arises from the jobID being a foreign key 
constraint for both the jobQueue and intrinsiclinks. From debugging we have the 
impression that postgres locks the whole intrinsiclink-table in a query which 
is specified to have one specific jobId.

Your input in this issue would be appreciated. 
Based on the "performance tuning" and "building ManifoldCF" resources we have 
verified that our properties are within the correct database limits. The only 
thing we were wondering, concerning the formula 'manifoldcf_db_pool_size * 
number_of_manifoldcf_processes <= maximum_postgresql_database_handles - 2', is 
whether manifoldcf_db_pool_size is postgres.max_connections?
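The formula quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function names are invented, the numbers are hypothetical, and it assumes that maximum_postgresql_database_handles is the connection limit configured on the PostgreSQL side while the pool size is the per-process setting on the ManifoldCF side.

```python
def pool_size_fits(db_pool_size, process_count, max_db_handles):
    """True if db_pool_size * process_count <= max_db_handles - 2."""
    return db_pool_size * process_count <= max_db_handles - 2

def largest_pool_size(process_count, max_db_handles):
    """Largest per-process pool size that still satisfies the formula."""
    return (max_db_handles - 2) // process_count

if __name__ == "__main__":
    # Hypothetical values: 2 ManifoldCF processes, 100 database handles.
    processes, handles = 2, 100
    print(pool_size_fits(50, processes, handles))   # 50 * 2 = 100 > 98
    print(pool_size_fits(49, processes, handles))   # 49 * 2 = 98 <= 98
    print(largest_pool_size(processes, handles))    # (100 - 2) // 2
```

Under these assumptions the two quantities are distinct settings: the pool size is multiplied by the number of processes and must stay two below the database-side limit.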

 

Some additional questions:
- could using the multi process-functionality of 
org.apache.manifoldcf.usejettyparentclassloader be used to improve this issue?
- I have read that disabling swap can be good for intensive db-interactions; do 
you have experience with disabling swap improving manifold?
- is there a possibility that we could set-up a conference call with someone 
from the manifold team?

Many thanks for your time.


[jira] [Commented] (CONNECTORS-1592) Found long running query in manifold scheduled job

2019-04-04 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809709#comment-16809709
 ] 

Karl Wright commented on CONNECTORS-1592:
-

[~goovaertsr], this is a perfectly fine plan, as you see in the execution 
estimates here:

{code}
WARN 2019-04-03T14:09:04,328 (Worker thread '39') -  Plan: Planning time: 0.706 
ms
 WARN 2019-04-03T14:09:04,328 (Worker thread '39') -  Plan: Execution time: 
0.382 ms
{code}

And yet the time (in this case) is 2 seconds for execution, which is still not 
bad actually, given that MCF is pounding on the database.

As I said before, there is no indication of actual bad plans.  Instead, the 
database as a whole is going offline or is being locked down for an extended 
period of time.




[jira] [Commented] (CONNECTORS-1592) Found long running query in manifold scheduled job

2019-04-04 Thread roel goovaerts (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809654#comment-16809654
 ] 

roel goovaerts commented on CONNECTORS-1592:


Hi Karl, 
We are analyzing the plans of the long-running queries and still have some 
questions/uncertainties.
If you would be so kind, I have attached a truncated log of one thread with the 
description and plan of one long-running query.

The main questions at this time are:
 * is this indeed a bad plan?
 * is this query influenced by the optimization of the db? (or by how up to date 
the db is, as referred to at one point in the documentation)
 * would the query be different if the db had been optimized/vacuumed?

[^LongRunningWithPlan_thread39.txt]

 


[jira] [Commented] (CONNECTORS-1592) Found long running query in manifold scheduled job

2019-04-03 Thread roel goovaerts (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808658#comment-16808658
 ] 

roel goovaerts commented on CONNECTORS-1592:


Thanks for the quick reply, I will get back to you if this information is not 
enough to fix it.


[jira] [Commented] (CONNECTORS-1592) Found long running query in manifold scheduled job

2019-04-03 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808652#comment-16808652
 ] 

Karl Wright commented on CONNECTORS-1592:
-

[~goovaertsr], when a large number of queries are blocked and do not get 
executed for a while (in this case 132000 ms or so), then when all of them 
finally fire they are all reported as "long-running queries".  The question is: 
why are all of these queries blocked?
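
One way to check this pattern in a log is to subtract each reported duration from its warning timestamp and see whether the queries all started and finished at about the same moments. The Python sketch below does that; the helper names are invented for illustration, and it assumes unwrapped log lines in the format quoted in this ticket.

```python
import re
from datetime import datetime, timedelta

# e.g. "WARN 2019-04-02T18:29:17,988 (Worker thread '94') - Found a
# long-running query (132676 ms): [...]" on a single line.
WARNING = re.compile(
    r"WARN (\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3}) \((.+?)\) - "
    r"Found a long-running query \((\d+) ms\)"
)


def query_windows(log_text):
    """Return (thread, start, end) for every long-running query warning."""
    windows = []
    for stamp, thread, ms in WARNING.findall(log_text):
        end = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S,%f")
        start = end - timedelta(milliseconds=int(ms))
        windows.append((thread, start, end))
    return windows


def spread_seconds(windows):
    """Return (spread of start times, spread of end times) in seconds."""
    starts = [start for _, start, _ in windows]
    ends = [end for _, _, end in windows]
    return (
        (max(starts) - min(starts)).total_seconds(),
        (max(ends) - min(ends)).total_seconds(),
    )


SAMPLE = """\
WARN 2019-04-02T18:29:17,988 (Worker thread '94') - Found a long-running query (132676 ms): [UPDATE hopcount ...]
WARN 2019-04-02T18:29:17,988 (Worker thread '15') - Found a long-running query (131477 ms): [UPDATE hopcount ...]
WARN 2019-04-02T18:29:18,036 (Worker thread '45') - Found a long-running query (133329 ms): [SELECT id,status,checktime FROM jobqueue ...]
"""

if __name__ == "__main__":
    start_spread, end_spread = spread_seconds(query_windows(SAMPLE))
    print("start spread (s):", start_spread)
    print("end spread (s):", end_spread)
```

If unrelated queries on different tables all end within a few tens of milliseconds of each other after minutes of waiting, that points to a single event releasing them rather than to individually slow plans.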

Tuple bloat just makes the database generally get slower and slower, so that is 
not it.

If you execute "VACUUM FULL" while ManifoldCF is running, that *could* do it, 
since tables get completely locked one at a time and are recreated.  It is 
recommended that you either shut ManifoldCF down during this time, or create a 
"signalling file" which tells ManifoldCF to not do any real work until it goes 
away.  Your choice.  If you want to know more about the latter option, please 
let me know.

If this isn't due to a concurrent "VACUUM FULL", then we're left with finding 
some other cause.  While it is taking place, there may be a way of getting 
Postgresql's state across all requests; that would be the ideal way to figure 
it out.
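
As a rough sketch of what such a snapshot could look like: PostgreSQL exposes per-session state in the pg_stat_activity view, and from version 9.6 on, pg_blocking_pids() reports which sessions are holding a given session up. The Python below wraps one such query; the function name is hypothetical, the connection is assumed to be any DB-API 2.0 connection to the ManifoldCF database (for example one opened with psycopg2), and PostgreSQL 9.6 or later is assumed.

```python
# Assumes PostgreSQL 9.6+ (wait_event columns and pg_blocking_pids()).
ACTIVITY_SQL = """
SELECT pid,
       state,
       wait_event_type,
       wait_event,
       pg_blocking_pids(pid) AS blocked_by,
       now() - query_start   AS running_for,
       query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY query_start
"""


def snapshot_activity(connection):
    """Return one dict per non-idle session, oldest query first.

    `connection` is any DB-API 2.0 connection; it is not closed here.
    """
    cursor = connection.cursor()
    try:
        cursor.execute(ACTIVITY_SQL)
        columns = [description[0] for description in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
    finally:
        cursor.close()
```

Running this while the stall is in progress and looking at the sessions whose blocked_by list is non-empty (and at what the blocking pids are executing) should show whether a maintenance statement such as VACUUM FULL is holding the lock.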



[jira] [Commented] (CONNECTORS-1592) Found long running query in manifold scheduled job

2019-04-03 Thread roel goovaerts (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808643#comment-16808643
 ] 

roel goovaerts commented on CONNECTORS-1592:


Hi,

I am also experiencing the same (or a similar) issue; from time to time we 
notice that ManifoldCF gets stuck and shows no activity (by way of scheduled 
crawling) whatsoever. 
As far as I know, this could be related to 'dead tuple bloat' and could be 
resolved by 'vacuum full', but the database maintenance script is already run 
daily.

An example log from such a moment:


{noformat}
logs/manifoldcf.log: WARN 2019-04-02T18:29:17,988 (Worker thread '94') - Found 
a long-running query (132676 ms): [UPDATE hopcount SET distance=?,deathmark=? 
WHERE id IN(SELECT ownerid FROM hopdeletedeps t0 WHERE t0.jobid=? AND 
t0.childidhash=? AND EXISTS(SELECT 'x' FROM intrinsiclink t1 WHERE 
t1.jobid=t0.jobid AND t1.linktype=t0.linktype AND 
t1.parentidhash=t0.parentidhash AND t1.childidhash=t0.childidhash AND 
t1.isnew=?))] logs/manifoldcf.log: WARN 2019-04-02T18:29:17,988 (Worker thread 
'15') - Found a long-running query (131477 ms): [UPDATE hopcount SET 
distance=?,deathmark=? WHERE id IN(SELECT ownerid FROM hopdeletedeps t0 WHERE 
t0.jobid=? AND t0.childidhash=? AND EXISTS(SELECT 'x' FROM intrinsiclink t1 
WHERE t1.jobid=t0.jobid AND t1.linktype=t0.linktype AND 
t1.parentidhash=t0.parentidhash AND t1.childidhash=t0.childidhash AND 
t1.isnew=?))] logs/manifoldcf.log: WARN 2019-04-02T18:29:17,989 (Worker thread 
'23') - Found a long-running query (133229 ms): [UPDATE intrinsiclink SET 
processid=?,isnew=? WHERE jobid=? AND parentidhash=? AND linktype=? AND 
childidhash=?] logs/manifoldcf.log: WARN 2019-04-02T18:29:17,989 (Worker thread 
'8') - Found a long-running query (133217 ms): [SELECT parentidhash FROM 
intrinsiclink WHERE jobid=? AND (parentidhash=? OR parentidhash=? OR 
parentidhash=? OR parentidhash=? OR parentidhash=? OR parentidhash=? OR 
parentidhash=? OR parentidhash=? OR parentidhash=? OR parentidhash=? OR 
parentidhash=? OR parentidhash=? OR parentidhash=? OR parentidhash=? OR 
parentidhash=? OR parentidhash=? OR parentidhash=? OR parentidhash=? OR 
parentidhash=? OR parentidhash=?) AND linktype=? AND childidhash=? FOR UPDATE] 
logs/manifoldcf.log: WARN 2019-04-02T18:29:17,989 (Worker thread '36') - Found 
a long-running query (133212 ms): [UPDATE intrinsiclink SET processid=?,isnew=? 
WHERE jobid=? AND parentidhash=? AND linktype=? AND childidhash=?] 
logs/manifoldcf.log: WARN 2019-04-02T18:29:17,989 (Worker thread '29') - Found 
a long-running query (133168 ms): [SELECT parentidhash FROM intrinsiclink WHERE 
jobid=? AND (parentidhash=? OR parentidhash=? OR parentidhash=? OR 
parentidhash=? OR parentidhash=? OR parentidhash=?) AND linktype=? AND 
childidhash=? FOR UPDATE] logs/manifoldcf.log: WARN 2019-04-02T18:29:17,993 
(Worker thread '55') - Found a long-running query (132950 ms): [UPDATE hopcount 
SET distance=?,deathmark=? WHERE id IN(SELECT ownerid FROM hopdeletedeps t0 
WHERE t0.jobid=? AND t0.childidhash=? AND EXISTS(SELECT 'x' FROM intrinsiclink 
t1 WHERE t1.jobid=t0.jobid AND t1.linktype=t0.linktype AND 
t1.parentidhash=t0.parentidhash AND t1.childidhash=t0.childidhash AND 
t1.isnew=?))] logs/manifoldcf.log: WARN 2019-04-02T18:29:17,993 (Worker thread 
'31') - Found a long-running query (133216 ms): [SELECT parentidhash FROM 
intrinsiclink WHERE jobid=? AND (parentidhash=? OR parentidhash=? OR 
parentidhash=? OR parentidhash=? OR parentidhash=? OR parentidhash=? OR 
parentidhash=? OR parentidhash=? OR parentidhash=? OR parentidhash=? OR 
parentidhash=? OR parentidhash=? OR parentidhash=? OR parentidhash=? OR 
parentidhash=? OR parentidhash=? OR parentidhash=? OR parentidhash=? OR 
parentidhash=? OR parentidhash=?) AND linktype=? AND childidhash=? FOR UPDATE] 
logs/manifoldcf.log: WARN 2019-04-02T18:29:17,994 (Worker thread '79') - Found 
a long-running query (133228 ms): [UPDATE intrinsiclink SET processid=?,isnew=? 
WHERE jobid=? AND parentidhash=? AND linktype=? AND childidhash=?] 
logs/manifoldcf.log: WARN 2019-04-02T18:29:18,005 (Worker thread '88') - Found 
a long-running query (133234 ms): [UPDATE intrinsiclink SET 
processid=NULL,isnew=? WHERE jobid=? AND childidhash=? AND isnew IN (?,?)] 
logs/manifoldcf.log: WARN 2019-04-02T18:29:18,036 (Worker thread '45') - Found 
a long-running query (133329 ms): [SELECT id,status,checktime FROM jobqueue 
WHERE dochash=? AND jobid=? FOR UPDATE] logs/manifoldcf.log: WARN 
2019-04-02T18:29:18,036 (Worker thread '60') - Found a long-running query 
(133264 ms): [SELECT id,status,checktime FROM jobqueue WHERE dochash=? AND 
jobid=? FOR UPDATE] logs/manifoldcf.log: WARN 2019-04-02T18:29:18,037 (Worker 
thread '38') - Found a long-running query (133468 ms): [SELECT 
id,status,checktime FROM jobqueue WHERE dochash=? AND jobid=? FOR UPDATE] 
logs/manifoldcf.log: WARN 2019-04-02T18:29:18,037 (Worker