Hi,

I need some help to better understand what I am experiencing with a job using a 
JDBC connector. The job crawls a table containing approximately 9M of 
documents. I am currently unable to complete this job as it randomly fails with 
the Following error :

Error : Unexpected jobqueue status - record id 15588697113928, expecting active 
status, saw 0

The logs are showing a lot of long running queries, here is an example but 
there are a lot and on different query kind, but all of them concern the 
jobqueue table  :

WARN 2019-05-27T01:16:56,048 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db|Found a long-running query (92568 ms): 
[SELECT t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount 
FROM jobqueue t0  WHERE t0.docpriority<? AND  (t0.status=? OR t0.status=?) AND 
t0.checkaction=? AND t0.checktime<=? AND EXISTS(SELECT 'x' FROM jobs t1 WHERE  
(t1.status=? OR t1.status=?) AND t1.id=t0.jobid AND t1.priority=?) AND NOT 
EXISTS(SELECT 'x' FROM jobqueue t2 WHERE t2.dochash=t0.dochash AND  
(t2.status=? OR t2.status=? OR t2.status=? OR t2.status=? OR t2.status=? OR 
t2.status=?) AND t2.jobid!=t0.jobid) AND NOT EXISTS(SELECT 'x' FROM 
prereqevents t3,events t4 WHERE t0.id=t3.owner AND t3.eventname=t4.name) ORDER 
BY t0.docpriority ASC LIMIT 40000]
 WARN 2019-05-27T01:16:56,053 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db|  Parameter 0: '1.000000001E9'
 WARN 2019-05-27T01:16:56,053 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db|  Parameter 1: 'P'
 WARN 2019-05-27T01:16:56,053 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db|  Parameter 2: 'G'
 WARN 2019-05-27T01:16:56,053 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db|  Parameter 3: 'R'
 WARN 2019-05-27T01:16:56,053 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db|  Parameter 4: '1558919723463'
 WARN 2019-05-27T01:16:56,053 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db|  Parameter 5: 'A'
 WARN 2019-05-27T01:16:56,053 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db|  Parameter 6: 'a'
 WARN 2019-05-27T01:16:56,053 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db|  Parameter 7: '5'
 WARN 2019-05-27T01:16:56,053 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db|  Parameter 8: 'A'
 WARN 2019-05-27T01:16:56,053 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db|  Parameter 9: 'F'
 WARN 2019-05-27T01:16:56,053 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db|  Parameter 10: 'a'
 WARN 2019-05-27T01:16:56,053 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db|  Parameter 11: 'f'
 WARN 2019-05-27T01:16:56,053 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db|  Parameter 12: 'D'
 WARN 2019-05-27T01:16:56,053 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db|  Parameter 13: 'd'
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan: Limit  (cost=89494.71..89495.11 
rows=160 width=91) (actual time=15468.494..15481.412 rows=40000 loops=1)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:   ->  Sort  (cost=89494.71..89495.11 
rows=160 width=91) (actual time=15468.493..15477.328 rows=40000 loops=1)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:         Sort Key: t0.docpriority
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:         Sort Method: external sort  
Disk: 102584kB
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:         ->  Nested Loop Anti Join  
(cost=42423.35..89488.86 rows=160 width=91) (actual time=234.622..13792.727 
rows=1164882 loops=1)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:               ->  Hash Anti Join  
(cost=42422.79..88190.34 rows=160 width=91) (actual time=234.593..1316.236 
rows=1164882 loops=1)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                     Hash Cond: (t0.id 
= t3.owner)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                     ->  Nested Loop  
(cost=42395.08..88160.62 rows=160 width=91) (actual time=234.568..1002.111 
rows=1164882 loops=1)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                           Join 
Filter: (t0.jobid = t1.id)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                           ->  Seq 
Scan on jobs t1  (cost=0.00..11.40 rows=1 width=8) (actual time=0.015..0.016 
rows=1 loops=1)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                                 
Filter: ((priority = '5'::bigint) AND ((status = 'A'::bpchar) OR (status = 
'a'::bpchar)))
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                                 Rows 
Removed by Filter: 6
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                           ->  Bitmap 
Heap Scan on jobqueue t0  (cost=42395.08..87989.60 rows=12769 width=91) (actual 
time=234.548..656.692 rows=1164882 loops=1)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                                 
Recheck Cond: (((docpriority < '1000000001'::double precision) AND (status = 
'P'::bpchar) AND (checkaction = 'R'::bpchar) AND (checktime <= 
'1558919723463'::bigint)) OR ((status = 'G'::bpchar) AND (checkaction = 
'R'::bpchar) AND (checktime <= '1558919723463'::bigint)))
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                                 
Filter: (docpriority < '1000000001'::double precision)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                                 Heap 
Blocks: exact=20094
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                                 ->  
BitmapOr  (cost=42395.08..42395.08 rows=12769 width=0) (actual 
time=229.219..229.219 rows=0 loops=1)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                                       
->  Bitmap Index Scan on i1556284697879  (cost=0.00..42384.25 rows=12769 
width=0) (actual time=229.176..229.176 rows=1164882 loops=1)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                                       
      Index Cond: ((docpriority < '1000000001'::double precision) AND (status = 
'P'::bpchar) AND (checkaction = 'R'::bpchar) AND (checktime <= 
'1558919723463'::bigint))
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                                       
->  Bitmap Index Scan on i1556284697880  (cost=0.00..4.45 rows=1 width=0) 
(actual time=0.041..0.041 rows=93 loops=1)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                                       
      Index Cond: ((status = 'G'::bpchar) AND (checkaction = 'R'::bpchar) AND 
(checktime <= '1558919723463'::bigint))
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                     ->  Hash  
(cost=26.08..26.08 rows=130 width=8) (actual time=0.007..0.007 rows=0 loops=1)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                           Buckets: 
1024  Batches: 1  Memory Usage: 8kB
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                           ->  Hash 
Join  (cost=12.93..26.08 rows=130 width=8) (actual time=0.006..0.006 rows=0 
loops=1)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                                 Hash 
Cond: ((t3.eventname)::text = (t4.name)::text)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                                 ->  
Seq Scan on prereqevents t3  (cost=0.00..11.40 rows=140 width=524) (actual 
time=0.005..0.005 rows=0 loops=1)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                                 ->  
Hash  (cost=11.30..11.30 rows=130 width=516) (never executed)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                                       
->  Seq Scan on events t4  (cost=0.00..11.30 rows=130 width=516) (never 
executed)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:               ->  Index Scan using 
i1556284697878 on jobqueue t2  (cost=0.56..8.11 rows=1 width=49) (actual 
time=0.010..0.010 rows=0 loops=1164882)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                     Index Cond: 
((dochash)::text = (t0.dochash)::text)
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                     Filter: ((jobid 
<> t0.jobid) AND ((status = 'A'::bpchar) OR (status = 'F'::bpchar) OR (status = 
'a'::bpchar) OR (status = 'f'::bpchar) OR (status = 'D'::bpchar) OR (status = 
'd'::bpchar)))
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan:                     Rows Removed by 
Filter: 1
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan: Planning time: 1.455 ms
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Plan: Execution time: 15518.479 ms
 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db|
 WARN 2019-05-27T01:17:11,749 (Stuffer thread) - 
MCF|MCF-agent|apache.manifoldcf.db| Stats: n_distinct=3.0 
most_common_vals={C,P,A} 
most_common_freqs={0.779200017,0.220666662,0.000133333335}

The same happens if I split the job into several jobs of 1.5M of docs at the 
difference that if I restart the jobs, some may end but with no guarantee.

I don’t really understand what is causing those long running queries and the 
fact that the job/jobs can fail on an Error : Unexpected status…

I tried to add the following parameter to the properties.xml to have better 
logs but it changed nothing : <property 
name="org.apache.manifoldcf.diagnostics" value="DEBUG"/>

The only thing I did that can explain a lack of performances is that I raised 
the org.apache.manifoldcf.ui.maxstatuscount from 500k to 2M. But I was only 
expecting low performances on job status UI. Is it possible that it is 
responsible of the long running queries ?

Thanks for the help,
Julien


---
L'absence de virus dans ce courrier électronique a été vérifiée par le logiciel 
antivirus Avast.
https://www.avast.com/antivirus

Reply via email to