Hi Karl,

I am developing my own repository connector, for which I borrowed some code
from the file repository connector. I use my connector to crawl documents from
an IBM Domino system. I managed to retrieve all the files in Domino. However,
when I restart my job to recrawl the Domino database, I run into a problem with
the code below: previousDocuments.get(documentIdentifierHash) in
WorkerThread.java (org.apache.manifoldcf.crawler.system) returns null for some
of the document IDs. As a result, the job gets stuck on those document IDs.

Could you please tell me how I could fix the problem?

    protected IPipelineSpecificationWithVersions computePipelineSpecificationWithVersions(
      String documentIdentifierHash,
      String componentIdentifierHash,
      String documentIdentifier)
    {
      // The problem is here: this lookup returns null.
      QueuedDocument qd = previousDocuments.get(documentIdentifierHash);
      if (qd == null)
        throw new IllegalArgumentException("Unrecognized document identifier: '"+documentIdentifier+"'");
      return new PipelineSpecificationWithVersions(pipelineSpecification,qd,componentIdentifierHash);
    }


Thanks a lot.

Cheng
