It sounds like you're sorting a segment index after dedup, rather than a merged index. Dedup leaves deleted documents in the segment index, and it looks like there's a bug in IndexSorter when the index contains deletions. You should be able to work around it by merging your segment indexes after deduping, so that the index you sort has no deletions.
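
To illustrate why merging first should help, here is a toy model in plain Java. It is only a sketch: ToyIndex, merge and sortNaively are hypothetical names made up for this example, not Lucene or Nutch APIs, and the naive sorter is just one plausible way such a failure can arise, not a description of the actual IndexSorter code.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model only: hypothetical class, not part of Lucene or Nutch.
final class ToyIndex {
    private final List<String> docs;  // stored documents, addressed by doc id
    private final BitSet deleted;     // doc ids marked as deleted (e.g. by dedup)

    ToyIndex(List<String> docs) {
        this.docs = new ArrayList<>(docs);
        this.deleted = new BitSet(docs.size());
    }

    int maxDoc() { return docs.size(); }

    boolean hasDeletions() { return !deleted.isEmpty(); }

    void delete(int id) { deleted.set(id); }

    // Like the failure in the trace: reading a deleted document is an error.
    String document(int id) {
        if (deleted.get(id)) {
            throw new IllegalArgumentException("attempt to access a deleted document");
        }
        return docs.get(id);
    }

    // Merging copies only live documents, so the result has no deletions.
    static ToyIndex merge(List<ToyIndex> segments) {
        List<String> live = new ArrayList<>();
        for (ToyIndex seg : segments) {
            for (int id = 0; id < seg.maxDoc(); id++) {
                if (!seg.deleted.get(id)) {
                    live.add(seg.docs.get(id));
                }
            }
        }
        return new ToyIndex(live);
    }

    // A naive sorter: reads every old doc id in the new order
    // without checking whether that document was deleted.
    static List<String> sortNaively(ToyIndex index, int[] newToOld) {
        List<String> out = new ArrayList<>();
        for (int oldId : newToOld) {
            out.add(index.document(oldId));
        }
        return out;
    }
}
```

In this model, sorting a segment that still has a deleted document throws the same kind of exception, while sorting the result of ToyIndex.merge does not, because the merged index contains only live documents.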

Please file a bug in Jira.

Doug

Michael wrote:
When I try to use IndexSorter, I get this error:

Exception in thread "main" java.lang.IllegalArgumentException: attempt to access a deleted document
        at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:282)
        at org.apache.lucene.index.FilterIndexReader.document(FilterIndexReader.java:104)
        at org.apache.nutch.indexer.IndexSorter$SortingReader.document(IndexSorter.java:170)
        at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:186)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:88)
        at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:579)
        at org.apache.nutch.indexer.IndexSorter.sort(IndexSorter.java:240)
        at org.apache.nutch.indexer.IndexSorter.main(IndexSorter.java:291)
Does anyone know how to fix this?
Michael
