For MCF I am using a Postgres 10.1 database which is located on the same machine that performs the crawl, so no reason there is a latency problem.
The MCF threads are configured like this : <property name="org.apache.manifoldcf.database.maxhandles" value="200"/> <property name="org.apache.manifoldcf.crawler.threads" value="50"/> <property name="org.apache.manifoldcf.crawler.expirethreads" value="10"/> <property name="org.apache.manifoldcf.crawler.cleanupthreads" value="10"/> <property name="org.apache.manifoldcf.crawler.deletethreads" value="10"/> For info, following the recommended settings from the MCF documentation, I applied the following Postgres properties : max_connections = 405 shared_buffers = 1024MB checkpoint_timeout = 900s max_wal_size = 14GB autovacuum = off Julien De : Karl Wright Envoyé le :mardi 28 mai 2019 15:02 À : [email protected] Objet :Re: Long running queries on jobqueue Hi Julien, "Error : Unexpected jobqueue status - record id 15588697113928, expecting active status, saw 0" As you said, the only thing that can be done here is to turn on diagnostic logging. Essentially, the status returned is not possible if the database is truly honoring transactional integrity, which is why the system gives up at that point. I have noted that this may be due to network latency reaching the database from the machine that is doing the crawling, but no way to tell for sure. Since it's a database-related issue, it's essential also to know which database you are using here. "I don’t really understand what is causing those long running queries..." The plan that is being executed looks OK but not great, which is bad news for a database that is large. The solution there is to hand-analyze the jobqueue and jobs tables. ManifoldCF does try to keep up with that but it is always possible that at some point the plan will go bad; it's heuristic at best. Karl On Tue, May 28, 2019 at 8:37 AM Julien <[email protected]> wrote: Hi, I need some help to better understand what I am experiencing with a job using a JDBC connector. The job crawls a table containing approximately 9M of documents. I am currently unable to complete this job as it randomly fails with the Following error : Error : Unexpected jobqueue status - record id 15588697113928, expecting active status, saw 0 The logs are showing a lot of long running queries, here is an example but there are a lot and on different query kind, but all of them concern the jobqueue table : WARN 2019-05-27T01:16:56,048 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db|Found a long-running query (92568 ms): [SELECT t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount FROM jobqueue t0 WHERE t0.docpriority<? AND (t0.status=? OR t0.status=?) AND t0.checkaction=? AND t0.checktime<=? AND EXISTS(SELECT 'x' FROM jobs t1 WHERE (t1.status=? OR t1.status=?) AND t1.id=t0.jobid AND t1.priority=?) AND NOT EXISTS(SELECT 'x' FROM jobqueue t2 WHERE t2.dochash=t0.dochash AND (t2.status=? OR t2.status=? OR t2.status=? OR t2.status=? OR t2.status=? OR t2.status=?) AND t2.jobid!=t0.jobid) AND NOT EXISTS(SELECT 'x' FROM prereqevents t3,events t4 WHERE t0.id=t3.owner AND t3.eventname=t4.name) ORDER BY t0.docpriority ASC LIMIT 40000] WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 0: '1.000000001E9' WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 1: 'P' WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 2: 'G' WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 3: 'R' WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 4: '1558919723463' WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 5: 'A' WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 6: 'a' WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 7: '5' WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 8: 'A' WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 9: 'F' WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 10: 'a' WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 11: 'f' WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 12: 'D' WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 13: 'd' WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Limit (cost=89494.71..89495.11 rows=160 width=91) (actual time=15468.494..15481.412 rows=40000 loops=1) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Sort (cost=89494.71..89495.11 rows=160 width=91) (actual time=15468.493..15477.328 rows=40000 loops=1) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Sort Key: t0.docpriority WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Sort Method: external sort Disk: 102584kB WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Nested Loop Anti Join (cost=42423.35..89488.86 rows=160 width=91) (actual time=234.622..13792.727 rows=1164882 loops=1) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Hash Anti Join (cost=42422.79..88190.34 rows=160 width=91) (actual time=234.593..1316.236 rows=1164882 loops=1) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Hash Cond: (t0.id = t3.owner) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Nested Loop (cost=42395.08..88160.62 rows=160 width=91) (actual time=234.568..1002.111 rows=1164882 loops=1) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Join Filter: (t0.jobid = t1.id) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Seq Scan on jobs t1 (cost=0.00..11.40 rows=1 width=8) (actual time=0.015..0.016 rows=1 loops=1) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Filter: ((priority = '5'::bigint) AND ((status = 'A'::bpchar) OR (status = 'a'::bpchar))) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Rows Removed by Filter: 6 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Bitmap Heap Scan on jobqueue t0 (cost=42395.08..87989.60 rows=12769 width=91) (actual time=234.548..656.692 rows=1164882 loops=1) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Recheck Cond: (((docpriority < '1000000001'::double precision) AND (status = 'P'::bpchar) AND (checkaction = 'R'::bpchar) AND (checktime <= '1558919723463'::bigint)) OR ((status = 'G'::bpchar) AND (checkaction = 'R'::bpchar) AND (checktime <= '1558919723463'::bigint))) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Filter: (docpriority < '1000000001'::double precision) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Heap Blocks: exact=20094 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> BitmapOr (cost=42395.08..42395.08 rows=12769 width=0) (actual time=229.219..229.219 rows=0 loops=1) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Bitmap Index Scan on i1556284697879 (cost=0.00..42384.25 rows=12769 width=0) (actual time=229.176..229.176 rows=1164882 loops=1) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Index Cond: ((docpriority < '1000000001'::double precision) AND (status = 'P'::bpchar) AND (checkaction = 'R'::bpchar) AND (checktime <= '1558919723463'::bigint)) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Bitmap Index Scan on i1556284697880 (cost=0.00..4.45 rows=1 width=0) (actual time=0.041..0.041 rows=93 loops=1) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Index Cond: ((status = 'G'::bpchar) AND (checkaction = 'R'::bpchar) AND (checktime <= '1558919723463'::bigint)) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Hash (cost=26.08..26.08 rows=130 width=8) (actual time=0.007..0.007 rows=0 loops=1) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Buckets: 1024 Batches: 1 Memory Usage: 8kB WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Hash Join (cost=12.93..26.08 rows=130 width=8) (actual time=0.006..0.006 rows=0 loops=1) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Hash Cond: ((t3.eventname)::text = (t4.name)::text) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Seq Scan on prereqevents t3 (cost=0.00..11.40 rows=140 width=524) (actual time=0.005..0.005 rows=0 loops=1) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Hash (cost=11.30..11.30 rows=130 width=516) (never executed) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Seq Scan on events t4 (cost=0.00..11.30 rows=130 width=516) (never executed) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Index Scan using i1556284697878 on jobqueue t2 (cost=0.56..8.11 rows=1 width=49) (actual time=0.010..0.010 rows=0 loops=1164882) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Index Cond: ((dochash)::text = (t0.dochash)::text) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Filter: ((jobid <> t0.jobid) AND ((status = 'A'::bpchar) OR (status = 'F'::bpchar) OR (status = 'a'::bpchar) OR (status = 'f'::bpchar) OR (status = 'D'::bpchar) OR (status = 'd'::bpchar))) WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Rows Removed by Filter: 1 WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Planning time: 1.455 ms WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Execution time: 15518.479 ms WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| WARN 2019-05-27T01:17:11,749 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Stats: n_distinct=3.0 most_common_vals={C,P,A} most_common_freqs={0.779200017,0.220666662,0.000133333335} The same happens if I split the job into several jobs of 1.5M of docs at the difference that if I restart the jobs, some may end but with no guarantee. I don’t really understand what is causing those long running queries and the fact that the job/jobs can fail on an Error : Unexpected status… I tried to add the following parameter to the properties.xml to have better logs but it changed nothing : <property name="org.apache.manifoldcf.diagnostics" value="DEBUG"/> The only thing I did that can explain a lack of performances is that I raised the org.apache.manifoldcf.ui.maxstatuscount from 500k to 2M. But I was only expecting low performances on job status UI. Is it possible that it is responsible of the long running queries ? Thanks for the help, Julien Garanti sans virus. www.avast.com --- L'absence de virus dans ce courrier électronique a été vérifiée par le logiciel antivirus Avast. https://www.avast.com/antivirus
