Hi,

Have you been modifying the framework code? If so, I really cannot help you.
If you haven't, it looks like your code is injecting incorrect document identifiers. I will need to see a full stack trace to be sure of that, though.

Thanks,
Karl

On Mon, Nov 12, 2018 at 4:06 AM Cheng Zeng <[email protected]> wrote:

> Hi Karl,
>
> I am developing my own repository connector, for which I borrowed some
> code from the file repository connector. I use my repository connector to
> crawl documents from an IBM Domino system. I managed to retrieve all the
> files in Domino; however, when I restart my job to recrawl the Domino
> database, I run into a problem with the following code, where
> previousDocuments.get(documentIdentifierHash)
> in WorkerThread.java (org.apache.manifoldcf.crawler.system) returns null
> for some of the document IDs. As a result, the job gets stuck on those
> document IDs.
>
> Could you please tell me how I could fix the problem?
>
> protected IPipelineSpecificationWithVersions
>   computePipelineSpecificationWithVersions(String documentIdentifierHash,
>     String componentIdentifierHash,
>     String documentIdentifier)
> {
>   QueuedDocument qd = previousDocuments.get(documentIdentifierHash);
>   // Returns null. The problem is here.
>   if (qd == null)
>     throw new IllegalArgumentException("Unrecognized document identifier: '"+documentIdentifier+"'");
>   return new PipelineSpecificationWithVersions(pipelineSpecification,qd,componentIdentifierHash);
> }
>
> Thanks a lot.
>
> Cheng
