Oh, and you might want to check the JDBC driver to be sure it's rated as compatible with the version of the POstgresql database you are using. I imagine that can matter too.
Karl On Tue, May 28, 2019 at 9:35 AM Karl Wright <[email protected]> wrote: > When it fails again, I expect that the diagnostics will demonstrate > conclusively that we had a database transactional integrity failure, as I > have always seen when I look at the generated logs from this debug mode. > In the past this has happened most often with mysql, but once in a great > while with postgresql too. It hasn't been generally worth pursuing because > it is so infrequent, but in your case it sounds like it might be worth > opening a Postgresql ticket, provided we can describe the failure. Be > prepared to furnish a test case they can use to reproduce it. > > Bugs of this kind generally take forever to fix, which is why I'm not > thrilled with the prospect of having to push it through the process with > the Postgresql people. > > Karl > > > On Tue, May 28, 2019 at 9:24 AM Julien <[email protected]> > wrote: > >> For MCF I am using a Postgres 10.1 database which is located on the same >> machine that performs the crawl, so no reason there is a latency problem. >> >> >> >> The MCF threads are configured like this : >> >> <property name="org.apache.manifoldcf.database.maxhandles" value="200"/> >> >> <property name="org.apache.manifoldcf.crawler.threads" value="50"/> >> >> <property name="org.apache.manifoldcf.crawler.expirethreads" >> value="10"/> >> >> <property name="org.apache.manifoldcf.crawler.cleanupthreads" >> value="10"/> >> >> <property name="org.apache.manifoldcf.crawler.deletethreads" >> value="10"/> >> >> >> >> For info, following the recommended settings from the MCF documentation, >> I applied the following Postgres properties : >> >> >> >> max_connections = 405 >> >> shared_buffers = 1024MB >> >> checkpoint_timeout = 900s >> >> max_wal_size = 14GB >> >> autovacuum = off >> >> >> >> Julien >> >> >> >> *De : *Karl Wright <[email protected]> >> *Envoyé le :*mardi 28 mai 2019 15:02 >> *À : *[email protected] >> *Objet :*Re: Long running queries on jobqueue >> >> >> >> Hi Julien, >> >> >> >> "Error : Unexpected jobqueue status - record id 15588697113928, expecting >> active status, saw 0" >> >> As you said, the only thing that can be done here is to turn on >> diagnostic logging. Essentially, the status returned is not possible if >> the database is truly honoring transactional integrity, which is why the >> system gives up at that point. I have noted that this may be due to >> network latency reaching the database from the machine that is doing the >> crawling, but no way to tell for sure. Since it's a database-related >> issue, it's essential also to know which database you are using here. >> >> >> >> "I don’t really understand what is causing those long running queries..." >> >> The plan that is being executed looks OK but not great, which is bad news >> for a database that is large. The solution there is to hand-analyze the >> jobqueue and jobs tables. ManifoldCF does try to keep up with that but it >> is always possible that at some point the plan will go bad; it's heuristic >> at best. >> >> >> >> Karl >> >> >> >> >> >> On Tue, May 28, 2019 at 8:37 AM Julien <[email protected]> >> wrote: >> >> Hi, >> >> >> >> I need some help to better understand what I am experiencing with a job >> using a JDBC connector. The job crawls a table containing approximately 9M >> of documents. I am currently unable to complete this job as it randomly >> fails with the Following error : >> >> >> >> Error : Unexpected jobqueue status - record id 15588697113928, expecting >> active status, saw 0 >> >> >> >> The logs are showing a lot of long running queries, here is an example >> but there are a lot and on different query kind, but all of them concern >> the jobqueue table : >> >> >> >> WARN 2019-05-27T01:16:56,048 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db|Found a long-running query (92568 ms): >> [SELECT t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount >> FROM jobqueue t0 WHERE t0.docpriority<? AND (t0.status=? OR t0.status=?) >> AND t0.checkaction=? AND t0.checktime<=? AND EXISTS(SELECT 'x' FROM jobs t1 >> WHERE (t1.status=? OR t1.status=?) AND t1.id=t0.jobid AND >> t1.priority=?) AND NOT EXISTS(SELECT 'x' FROM jobqueue t2 WHERE >> t2.dochash=t0.dochash AND (t2.status=? OR t2.status=? OR t2.status=? OR >> t2.status=? OR t2.status=? OR t2.status=?) AND t2.jobid!=t0.jobid) AND NOT >> EXISTS(SELECT 'x' FROM prereqevents t3,events t4 WHERE t0.id=t3.owner >> AND t3.eventname=t4.name) ORDER BY t0.docpriority ASC LIMIT 40000] >> >> WARN 2019-05-27T01:16:56,053 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Parameter 0: '1.000000001E9' >> >> WARN 2019-05-27T01:16:56,053 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Parameter 1: 'P' >> >> WARN 2019-05-27T01:16:56,053 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Parameter 2: 'G' >> >> WARN 2019-05-27T01:16:56,053 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Parameter 3: 'R' >> >> WARN 2019-05-27T01:16:56,053 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Parameter 4: '1558919723463' >> >> WARN 2019-05-27T01:16:56,053 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Parameter 5: 'A' >> >> WARN 2019-05-27T01:16:56,053 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Parameter 6: 'a' >> >> WARN 2019-05-27T01:16:56,053 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Parameter 7: '5' >> >> WARN 2019-05-27T01:16:56,053 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Parameter 8: 'A' >> >> WARN 2019-05-27T01:16:56,053 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Parameter 9: 'F' >> >> WARN 2019-05-27T01:16:56,053 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Parameter 10: 'a' >> >> WARN 2019-05-27T01:16:56,053 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Parameter 11: 'f' >> >> WARN 2019-05-27T01:16:56,053 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Parameter 12: 'D' >> >> WARN 2019-05-27T01:16:56,053 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Parameter 13: 'd' >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: Limit (cost=89494.71..89495.11 >> rows=160 width=91) (actual time=15468.494..15481.412 rows=40000 loops=1) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Sort >> (cost=89494.71..89495.11 rows=160 width=91) (actual >> time=15468.493..15477.328 rows=40000 loops=1) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: Sort Key: t0.docpriority >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: Sort Method: external >> sort Disk: 102584kB >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Nested Loop Anti >> Join (cost=42423.35..89488.86 rows=160 width=91) (actual >> time=234.622..13792.727 rows=1164882 loops=1) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Hash Anti Join >> (cost=42422.79..88190.34 rows=160 width=91) (actual time=234.593..1316.236 >> rows=1164882 loops=1) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: Hash Cond: ( >> t0.id = t3.owner) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Nested >> Loop (cost=42395.08..88160.62 rows=160 width=91) (actual >> time=234.568..1002.111 rows=1164882 loops=1) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: Join >> Filter: (t0.jobid = t1.id) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Seq >> Scan on jobs t1 (cost=0.00..11.40 rows=1 width=8) (actual >> time=0.015..0.016 rows=1 loops=1) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: >> Filter: ((priority = '5'::bigint) AND ((status = 'A'::bpchar) OR (status = >> 'a'::bpchar))) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: >> Rows Removed by Filter: 6 >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: -> >> Bitmap Heap Scan on jobqueue t0 (cost=42395.08..87989.60 rows=12769 >> width=91) (actual time=234.548..656.692 rows=1164882 loops=1) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: >> Recheck Cond: (((docpriority < '1000000001'::double precision) AND (status >> = 'P'::bpchar) AND (checkaction = 'R'::bpchar) AND (checktime <= >> '1558919723463'::bigint)) OR ((status = 'G'::bpchar) AND (checkaction = >> 'R'::bpchar) AND (checktime <= '1558919723463'::bigint))) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: >> Filter: (docpriority < '1000000001'::double precision) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: >> Heap Blocks: exact=20094 >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: >> -> BitmapOr (cost=42395.08..42395.08 rows=12769 width=0) >> (actual time=229.219..229.219 rows=0 loops=1) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| >> Plan: -> Bitmap Index Scan on >> i1556284697879 (cost=0.00..42384.25 rows=12769 width=0) (actual >> time=229.176..229.176 rows=1164882 loops=1) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| >> Plan: Index Cond: ((docpriority >> < '1000000001'::double precision) AND (status = 'P'::bpchar) AND >> (checkaction = 'R'::bpchar) AND (checktime <= '1558919723463'::bigint)) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| >> Plan: -> Bitmap Index Scan on >> i1556284697880 (cost=0.00..4.45 rows=1 width=0) (actual time=0.041..0.041 >> rows=93 loops=1) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: >> Index Cond: ((status = 'G'::bpchar) >> AND (checkaction = 'R'::bpchar) AND (checktime <= '1558919723463'::bigint)) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Hash >> (cost=26.08..26.08 rows=130 width=8) (actual time=0.007..0.007 rows=0 >> loops=1) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: >> Buckets: 1024 Batches: 1 Memory Usage: 8kB >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: -> >> Hash Join (cost=12.93..26.08 rows=130 width=8) (actual time=0.006..0.006 >> rows=0 loops=1) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: >> Hash Cond: ((t3.eventname)::text = (t4.name)::text) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: >> -> Seq Scan on prereqevents t3 (cost=0.00..11.40 rows=140 width=524) >> (actual time=0.005..0.005 rows=0 loops=1) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: >> -> Hash (cost=11.30..11.30 rows=130 width=516) (never executed) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| >> Plan: -> Seq Scan on events t4 >> (cost=0.00..11.30 rows=130 width=516) (never executed) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Index Scan >> using i1556284697878 on jobqueue t2 (cost=0.56..8.11 rows=1 width=49) >> (actual time=0.010..0.010 rows=0 loops=1164882) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: Index Cond: >> ((dochash)::text = (t0.dochash)::text) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: Filter: >> ((jobid <> t0.jobid) AND ((status = 'A'::bpchar) OR (status = 'F'::bpchar) >> OR (status = 'a'::bpchar) OR (status = 'f'::bpchar) OR (status = >> 'D'::bpchar) OR (status = 'd'::bpchar))) >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: Rows Removed >> by Filter: 1 >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: Planning time: 1.455 ms >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Plan: Execution time: 15518.479 ms >> >> WARN 2019-05-27T01:17:11,635 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| >> >> WARN 2019-05-27T01:17:11,749 (Stuffer thread) - >> MCF|MCF-agent|apache.manifoldcf.db| Stats: n_distinct=3.0 >> most_common_vals={C,P,A} >> most_common_freqs={0.779200017,0.220666662,0.000133333335} >> >> >> >> The same happens if I split the job into several jobs of 1.5M of docs at >> the difference that if I restart the jobs, some may end but with no >> guarantee. >> >> >> >> I don’t really understand what is causing those long running queries and >> the fact that the job/jobs can fail on an Error : Unexpected status… >> >> >> >> I tried to add the following parameter to the properties.xml to have >> better logs but it changed nothing : <property name=" >> org.apache.manifoldcf.diagnostics" value="DEBUG"/> >> >> >> >> The only thing I did that can explain a lack of performances is that I >> raised the org.apache.manifoldcf.ui.maxstatuscount from 500k to 2M. But I >> was only expecting low performances on job status UI. Is it possible that >> it is responsible of the long running queries ? >> >> >> >> Thanks for the help, >> Julien >> >> >> >> [image: >> https://ipmcdn.avast.com/images/icons/icon-envelope-tick-round-orange-animated-no-repeat-v1.gif] >> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=emailclient> >> >> Garanti sans virus. www.avast.com >> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=emailclient> >> >> >> >> >> >
