[ https://issues.apache.org/jira/browse/CONNECTORS-764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739451#comment-13739451 ]
Karl Wright commented on CONNECTORS-764:
----------------------------------------
Hmm, my tests are indicating a problem:
{code}
ERROR 2013-08-14 05:38:48,455 (Startup thread) - Startup thread aborting and restarting due to database connection reset: Database exception: SQLException doing query (23503): ERROR: update or delete on table "jobqueue" violates foreign key constraint "prereqevents_owner_fkey" on table "prereqevents"
  Detail: Key (id)=(1376473092294) is still referenced from table "prereqevents".
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Database exception: SQLException doing query (23503): ERROR: update or delete on table "jobqueue" violates foreign key constraint "prereqevents_owner_fkey" on table "prereqevents"
  Detail: Key (id)=(1376473092294) is still referenced from table "prereqevents".
    at org.apache.manifoldcf.core.database.Database.executeViaThread(Database.java:717)
    at org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:745)
    at org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1430)
    at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:144)
    at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:186)
    at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:646)
    at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performDelete(DBInterfacePostgreSQL.java:280)
    at org.apache.manifoldcf.core.database.BaseTable.performDelete(BaseTable.java:91)
    at org.apache.manifoldcf.crawler.jobs.JobQueue.prepareFullScan(JobQueue.java:577)
    at org.apache.manifoldcf.crawler.jobs.JobManager.prepareFullScan(JobManager.java:5592)
    at org.apache.manifoldcf.crawler.jobs.JobManager.prepareJobScan(JobManager.java:5506)
    at org.apache.manifoldcf.crawler.system.StartupThread.run(StartupThread.java:142)
Caused by: org.postgresql.util.PSQLException: ERROR: update or delete on table "jobqueue" violates foreign key constraint "prereqevents_owner_fkey" on table "prereqevents"
  Detail: Key (id)=(1376473092294) is still referenced from table "prereqevents".
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:334)
    at org.apache.manifoldcf.core.database.Database.execute(Database.java:876)
    at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:677)
{code}
So I think there's another problem buried here as well. Digging now...
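
For readers unfamiliar with this class of error, here is a minimal sketch of what the trace reports: a delete on "jobqueue" is rejected because a "prereqevents" row still references the deleted id. It is only an illustration, not ManifoldCF code. It uses Python's sqlite3 as a stand-in for PostgreSQL, and the "jobid" and "eventname" columns, the helper names, and the idea of deleting dependent rows first are all assumptions made for the example, not the fix for this ticket.

```python
import sqlite3


def build_db():
    """Create a toy jobqueue/prereqevents pair (schema is an assumption)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE jobqueue (id INTEGER PRIMARY KEY, jobid INTEGER)")
    conn.execute(
        "CREATE TABLE prereqevents ("
        " owner INTEGER NOT NULL REFERENCES jobqueue(id),"
        " eventname TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO jobqueue (id, jobid) VALUES (1376473092294, 1)")
    conn.execute(
        "INSERT INTO prereqevents (owner, eventname) VALUES (1376473092294, 'evt')"
    )
    conn.commit()
    return conn


def delete_parent_first(conn):
    """Delete the jobqueue row while a prereqevents row still points at it."""
    try:
        conn.execute("DELETE FROM jobqueue WHERE jobid = 1")
        return "deleted"
    except sqlite3.IntegrityError as exc:
        # Comparable in spirit to PostgreSQL's SQLSTATE 23503 in the trace.
        return "rejected: " + str(exc)


def delete_children_first(conn):
    """Remove the referencing rows, then the parent; return rows left in jobqueue."""
    conn.execute(
        "DELETE FROM prereqevents WHERE owner IN"
        " (SELECT id FROM jobqueue WHERE jobid = 1)"
    )
    conn.execute("DELETE FROM jobqueue WHERE jobid = 1")
    return conn.execute("SELECT COUNT(*) FROM jobqueue").fetchone()[0]
```

In this sketch, delete_parent_first() is rejected by the constraint, while delete_children_first() goes through because nothing references the row any more. Whether the real problem is ordering, a missing cleanup, or something else is exactly what still needs digging.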
> Hopcount logic fails to notice when the max number of hops is increased
> between crawls
> --------------------------------------------------------------------------------------
>
> Key: CONNECTORS-764
> URL: https://issues.apache.org/jira/browse/CONNECTORS-764
> Project: ManifoldCF
> Issue Type: Bug
> Components: Framework crawler agent
> Affects Versions: ManifoldCF 1.3
> Reporter: Karl Wright
> Assignee: Karl Wright
> Fix For: ManifoldCF 1.4
>
>
> When you do something like the following:
> (1) Set the max hops for a job relatively low
> (2) Crawl
> (3) Increase the max hops
> (4) Crawl again
> ... the documents that are labeled with the state "Hop count exceeded" at the
> end of the first crawl are never touched again. This is because there are no
> additional links added to the intrinsiclink table during the second crawl,
> and thus the method reactivateHopcountRemovedRecords() is never called,
> leaving the documents in an incorrect state.
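
The sequence described above can be reproduced with a small toy model. This is a sketch under stated assumptions, not ManifoldCF's implementation: the state names, the in-memory link_table standing in for the intrinsiclink table, the function names, and the always_reactivate flag are all hypothetical. The only behaviour taken from the description is that reactivation runs only when a new link is recorded.

```python
from collections import deque

COMPLETE, EXCEEDED, PENDING = "complete", "hopcount exceeded", "pending"


def hop_distances(links, seed):
    """Breadth-first hop distance from the seed over a {doc: [targets]} map."""
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        doc = queue.popleft()
        for target in links.get(doc, []):
            if target not in dist:
                dist[target] = dist[doc] + 1
                queue.append(target)
    return dist


def reactivate(states, dist, max_hops):
    """Un-park documents that now fall within the hop limit."""
    for doc, state in list(states.items()):
        if state == EXCEEDED and dist[doc] <= max_hops:
            states[doc] = PENDING


def scan(links, dist, max_hops, states, link_table):
    """Scan every non-parked document; return True if a new link was recorded."""
    added = False
    for doc in sorted(dist, key=dist.get):
        if states.get(doc) == EXCEEDED:
            continue  # parked documents are never rescanned
        if dist[doc] > max_hops:
            states[doc] = EXCEEDED
            continue
        states[doc] = COMPLETE
        for target in links.get(doc, []):
            if (doc, target) not in link_table:
                link_table.add((doc, target))
                added = True
    return added


def crawl(links, seed, max_hops, states, link_table, always_reactivate=False):
    """Run one crawl; reactivation normally fires only after a new link."""
    dist = hop_distances(links, seed)
    if always_reactivate:  # hypothetical remedy, for comparison only
        reactivate(states, dist, max_hops)
    while scan(links, dist, max_hops, states, link_table):
        reactivate(states, dist, max_hops)
    return states
```

With a chain a -> b -> c -> d, a crawl at max_hops=1 parks c and d. A second crawl at max_hops=3 records no new links, so reactivate() is never reached and c and d stay parked. Passing always_reactivate=True on a later crawl releases them, which is one possible direction for a fix rather than the one adopted here.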
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira