Hi Markus,

I've created a ticket for the exception: CONNECTORS-1114.

As for removal of a primary document that is not mentioned, do you mean that within processDocuments(), if you don't call any disposition method for a primary document, then that document is left around? If so, that behavior is intended -- it was necessary for backwards compatibility. The document should, of course, be cleaned up at the end of the job, as long as you are not doing a minimal crawl.

If you are seeing some other kind of behavior, please try to describe it more completely so that I have a better idea what you mean.

Thanks,
Karl

On Tue, Nov 25, 2014 at 3:25 AM, Markus Schuch <[email protected]> wrote:

> Hi Karl,
>
> the patch for CONNECTORS-1111 fixes the cleanup issue.
>
> Another question about primary documents and their components:
>
> I have ingested a primary document with some components.
> During the next processing the primary document should no longer be
> indexed, but the sub components of it should still be indexed.
>
> My understanding is, that not mentioned components are automatically
> removed.
> Since the primary document is the "null" component, i expected the
> framework would remove the primary document component if not mentioned, too.
>
> But this is not the case. Is this another bug or do i have to remove the
> primary document somehow manually?
>
> There is an activity method removeDocument(identifier) which seems related.
> But i do not fully understand the described usage scenario in the method's
> javadoc.
>
> I tried the method.
> The result was the following database exception:
> (Patches for CONNECTORS-1110 and CONNECTORS-1111 are applied)
>
> 2014-11-25 08:30:07,868 ERROR [Worker thread '1'] org.apache.manifoldcf.crawlerthreads: Worker thread aborting and restarting due to database connection reset: Database exception: SQLException doing query (HY0000): You need to set exactly 3 parameters on the prepared statement
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Database exception: SQLException doing query (HY0000): You need to set exactly 3 parameters on the prepared statement
>     at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.finishUp(Database.java:702)
>     at org.apache.manifoldcf.core.database.Database.executeViaThread(Database.java:728)
>     at org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:762)
>     at org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1435)
>     at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:146)
>     at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:191)
>     at org.apache.manifoldcf.core.database.DBInterfaceMySQL.performQuery(DBInterfaceMySQL.java:875)
>     at org.apache.manifoldcf.core.database.BaseTable.performQuery(BaseTable.java:221)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.findRowIdsForDocIds(IncrementalIngester.java:1518)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentRemoveMultiple(IncrementalIngester.java:1377)
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentRemove(IncrementalIngester.java:803)
>     at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.removeDocument(WorkerThread.java:1674)
>     at com.example.mcf.TestConnector.processDocuments(TestConnector.java:278)
>     at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:670)
>     at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:649)
>     at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:402)
>     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:380)
> Caused by: java.sql.SQLException: You need to set exactly 3 parameters on the prepared statement
>     at org.mariadb.jdbc.internal.SQLExceptionMapper.get(SQLExceptionMapper.java:149)
>     at org.mariadb.jdbc.internal.SQLExceptionMapper.throwException(SQLExceptionMapper.java:106)
>     at org.mariadb.jdbc.MySQLStatement.executeQueryEpilog(MySQLStatement.java:264)
>     at org.mariadb.jdbc.MySQLStatement.execute(MySQLStatement.java:288)
>     at org.mariadb.jdbc.MySQLStatement.executeQuery(MySQLStatement.java:302)
>     at org.mariadb.jdbc.MySQLPreparedStatement.executeQuery(MySQLPreparedStatement.java:112)
>     at org.apache.manifoldcf.core.database.Database.execute(Database.java:880)
>     at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:683)
> Caused by: org.mariadb.jdbc.internal.common.QueryException: You need to set exactly 3 parameters on the prepared statement
>     at org.mariadb.jdbc.internal.common.query.MySQLParameterizedQuery.validate(MySQLParameterizedQuery.java:117)
>     at org.mariadb.jdbc.internal.mysql.MySQLProtocol.executeQuery(MySQLProtocol.java:976)
>     at org.mariadb.jdbc.MySQLStatement.execute(MySQLStatement.java:281)
>
> Regards,
> Markus
