Hi Karl,
Thanks for coming back so quickly. Unfortunately I wasn’t using
a JCIFS connection. One of the issues I was seeing was between a crawl of an
intranet site (no explicit throttling other than the number of connections) and
a scheduled crawl (every 5 mins) of a relational DB via the JDBC connector, again
with no explicit throttling. To simplify things, both jobs are using a NULL output
connection. Sometimes the Web crawl and the JDBC crawl can run
together, but at other times one or both jobs will appear to lock up with just a
few active documents showing. When I get a lock-up, the MCF log contains errors
like:
“DEBUG 2017-03-03 15:28:20,466 (Worker thread '5') - Exception Database
exception: SQLException doing query (40001): ERROR: could not serialize access
due to read/write dependencies among transactions”
See the attached log extract for a little more detail.
Any idea why this might be happening?
Best Regards,
Guy
From: Karl Wright [mailto:[email protected]]
Sent: 03 March 2017 11:27
To: [email protected]
Subject: Re: Advice on which PostgreSQL to use with ManifoldCF 2.6
Hi Guy,
An issue with concurrent jobs is known for jobs sharing the same JCIFS
connection. Is that what you are using? This has nothing to do with the
version of PostgreSQL you are using; it has to do with what "bins" documents
are thought to come from. There has been a recent improvement for this issue,
which will be released in April. See
https://issues.apache.org/jira/browse/CONNECTORS-1364.
The current version of MCF (2.6) supports Solr 6.x.
Thanks,
Karl
On Fri, Mar 3, 2017 at 5:27 AM, Standen Guy
<[email protected]> wrote:
Hi Karl,
I am currently using MCF 2.0.1 with PostgreSQL 9.3.5 on Windows and have had
some issues with multiple jobs running concurrently.
I am considering upgrading to MCF 2.6 and to a newer version of PostgreSQL.
Would you be able to advise which version of PostgreSQL I should consider using
with MCF 2.6 (e.g. PostgreSQL 9.3.16 or 9.6.2)?
I am also considering upgrading from SOLR 4.10.3 to a newer version. The MCF
compatibility matrix mentions that compatibility has been tested to SOLR
version 4.5.1. Do you have any advice about compatibility with the newer
versions of SOLR, e.g. 6.4.1?
Best Regards
Guy
Unless otherwise stated, this email has been sent from Fujitsu Services Limited
(registered in England No 96056); Fujitsu EMEA PLC (registered in England No
2216100) both with registered offices at: 22 Baker Street, London W1U 3BW; PFU
(EMEA) Limited, (registered in England No 1578652) and Fujitsu Laboratories of
Europe Limited (registered in England No. 4153469) both with registered offices
at: Hayes Park Central, Hayes End Road, Hayes, Middlesex, UB4 8FE.
This email is only for the use of its intended recipient. Its contents are
subject to a duty of confidence and may be privileged. Fujitsu does not
guarantee that this email has not been intercepted and amended or that it is
virus-free.
DEBUG 2017-03-03 15:28:20,466 (Worker thread '0') - Successfully obtained multiple critical sections!
DEBUG 2017-03-03 15:28:20,466 (Thread-1760) - Actual query: [SELECT id,status,checktime FROM jobqueue WHERE dochash=? AND jobid=? FOR UPDATE]
DEBUG 2017-03-03 15:28:20,466 (Thread-1760) - Parameter 0: 'EA77DCC9751FA69A1A2612E31AFD38ADD509D639'
DEBUG 2017-03-03 15:28:20,466 (Thread-1760) - Parameter 1: '1488381381650'
DEBUG 2017-03-03 15:28:20,466 (Worker thread '5') - Reinterpreting exception 'Database exception: SQLException doing query (40001): ERROR: could not serialize access due to read/write dependencies among transactions
  Detail: Reason code: Canceled on identification as a pivot, during commit attempt.
  Hint: The transaction might succeed if retried.'. The exception type is 4
DEBUG 2017-03-03 15:28:20,466 (Worker thread '5') - Exception Database exception: SQLException doing query (40001): ERROR: could not serialize access due to read/write dependencies among transactions
  Detail: Reason code: Canceled on identification as a pivot, during commit attempt.
  Hint: The transaction might succeed if retried. is possibly a transaction abort signal
DEBUG 2017-03-03 15:28:20,466 (Worker thread '5') - Aborted transaction adding 20 docs and hopcounts for job 1488381381650 parent identifier hash F7538448F9FCBDCA64265BC89FAAB332F8A2D9A2: ERROR: could not serialize access due to read/write dependencies among transactions
  Detail: Reason code: Canceled on identification as a pivot, during commit attempt.
  Hint: The transaction might succeed if retried.; sleeping for 11610 ms
org.apache.manifoldcf.core.interfaces.ManifoldCFException: ERROR: could not serialize access due to read/write dependencies among transactions
  Detail: Reason code: Canceled on identification as a pivot, during commit attempt.
  Hint: The transaction might succeed if retried.
    at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.reinterpretException(DBInterfacePostgreSQL.java:625)
    at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.commitCurrentTransaction(DBInterfacePostgreSQL.java:1275)
    at org.apache.manifoldcf.core.database.Database.performCommit(Database.java:313)
    at org.apache.manifoldcf.crawler.jobs.JobManager.addDocuments(JobManager.java:5116)
    at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.processDocumentReferences(WorkerThread.java:1863)
    at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.addDocumentReference(WorkerThread.java:1278)
    at org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector$ProcessActivityLinkHandler.noteDiscoveredLink(WebcrawlerConnector.java:6002)
    at org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector$ProcessActivityHTMLHandler.noteAHREF(WebcrawlerConnector.java:6126)
    at org.apache.manifoldcf.crawler.connectors.webcrawler.LinkParseState.noteNonscriptTag(LinkParseState.java:47)
    at org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState.noteNonscriptTag(FormParseState.java:53)
    at org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState.noteTag(ScriptParseState.java:53)
    at org.apache.manifoldcf.connectorcommon.fuzzyml.HTMLParseState.noteTag(HTMLParseState.java:53)
    at org.apache.manifoldcf.connectorcommon.fuzzyml.TagParseState.dealWithCharacter(TagParseState.java:626)
    at org.apache.manifoldcf.connectorcommon.fuzzyml.SingleCharacterReceiver.dealWithCharacters(SingleCharacterReceiver.java:51)
    at org.apache.manifoldcf.connectorcommon.fuzzyml.DecodingByteReceiver.dealWithBytes(DecodingByteReceiver.java:48)
    at org.apache.manifoldcf.connectorcommon.fuzzyml.Parser.parseWithoutCharsetDetection(Parser.java:99)
    at org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.handleHTML(WebcrawlerConnector.java:7017)
    at org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.extractLinks(WebcrawlerConnector.java:5959)
    at org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.processDocuments(WebcrawlerConnector.java:741)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:379)
Caused by: org.postgresql.util.PSQLException: ERROR: could not serialize access due to read/write dependencies among transactions
  Detail: Reason code: Canceled on identification as a pivot, during commit attempt.
  Hint: The transaction might succeed if retried.
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:374)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:366)
    at org.apache.manifoldcf.core.database.Database.execute(Database.java:863)
    at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:683)
DEBUG 2017-03-03 15:28:20,467 (Worker thread '5') - Ending transaction
DEBUG 2017-03-03 15:28:20,467 (Worker thread '5') - Rolling transaction back!
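
For readers unfamiliar with the error in the log above: SQLSTATE 40001 is PostgreSQL's serialization failure, and the hint says the transaction might succeed if retried. The "sleeping for 11610 ms" entry suggests ManifoldCF already backs off and retries on its own. The Java sketch below only illustrates that general retry-with-backoff pattern; it is not ManifoldCF's code. The names SerializationRetry, SqlWork and withRetry are hypothetical, and the attempt limit and sleep formula are arbitrary assumptions.

```java
import java.sql.SQLException;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, not part of ManifoldCF or the PostgreSQL JDBC driver.
final class SerializationRetry {

    /** A unit of transactional work that may fail with an SQLException. */
    @FunctionalInterface
    interface SqlWork<T> {
        T run() throws SQLException;
    }

    /** SQLSTATE reported for serialization failures (seen in the log above). */
    static final String SERIALIZATION_FAILURE = "40001";

    /**
     * Runs the work, retrying only when the failure carries SQLSTATE 40001.
     * Any other SQLException, or a failure on the last attempt, is rethrown.
     */
    static <T> T withRetry(SqlWork<T> work, int maxAttempts, long baseSleepMillis)
            throws SQLException, InterruptedException {
        if (maxAttempts < 1 || baseSleepMillis < 0) {
            throw new IllegalArgumentException("maxAttempts >= 1 and baseSleepMillis >= 0 required");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return work.run();
            } catch (SQLException e) {
                boolean retryable = SERIALIZATION_FAILURE.equals(e.getSQLState());
                if (!retryable || attempt >= maxAttempts) {
                    throw e;
                }
                // Linear backoff plus random jitter so that competing
                // transactions are less likely to collide again.
                long jitter = ThreadLocalRandom.current().nextLong(baseSleepMillis + 1);
                Thread.sleep(baseSleepMillis * attempt + jitter);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated work: fails twice with a serialization failure, then succeeds.
        String result = withRetry(() -> {
            if (calls[0]++ < 2) {
                throw new SQLException("could not serialize access", SERIALIZATION_FAILURE);
            }
            return "committed";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In real use the work passed to withRetry would have to roll back and re-run the whole transaction from the start, not just the failed statement, because the earlier reads are no longer valid once the transaction has been cancelled.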