On Thursday 26 January 2006 09:15, Chun Wei Ho wrote:
> I am attempting to prune an index by getting each document in turn and
> then checking/deleting it:
>
> IndexReader ir = IndexReader.open(path);
> for(int i=0;i<ir.numDocs();i++) {
> Document doc = ir.document(i);
> if(thisDocShouldBeDeleted(doc)) {
> ir.delete(docNum); // <- I need the docNum for doc.
> }
> }
>
> How do I get the docNum for IndexReader.delete() function in the above
> case? Is there a API function I am missing? I am working with a merged
The document number is the variable i in this case.
> index over different segments so the docNum might not be in running
> sequence with the counter i.
>
> In general, is there a better way to do this sort of thing?
This code:
Document doc = ir.document(i);
normally retrieves all the stored fields of the document and that is
quite costly. In case you know that the document(s) to be deleted
match(es) a Term, it's better to use IndexReader.delete(Term).
Regards,
Paul Elschot
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]