[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666694#action_12666694 ]
Michael McCandless commented on LUCENE-1476:
--------------------------------------------

For the perf tests, I would use an optimized index with maybe 2M Wikipedia docs in it. Then test with maybe 0, 1, 5, 10, 25, 50, 75 percent deletions, across various kinds of queries (single term, OR, AND, phrase/span). Baseline w/ trunk, and then test w/ this patch (keeps deletion access low (@ SegmentTermDocs) but switches to iterator API).

I'd love to also see numbers for deletion-applied-as-filter (high) eventually.

[Actually, if the deletion percentage is ever > 50%, we should presumably invert the bit set and work with that instead. And same with filters.]

You might want to start with the Python scripts attached to LUCENE-1483; with some small mods you could easily adapt them to run these tests.

> BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-1476
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1476
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 2.4
>            Reporter: Jason Rutherglen
>            Priority: Trivial
>         Attachments: LUCENE-1476.patch, LUCENE-1476.patch, LUCENE-1476.patch, quasi_iterator_deletions.diff, quasi_iterator_deletions_r2.diff
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> Update BitVector to implement DocIdSet. Expose deleted docs DocIdSet from IndexReader.
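
To make the bracketed aside in the comment concrete (invert the bit set once more than 50% of docs are deleted), here is a rough Python sketch. It is not Lucene code and not what the patch does: `DeletionSet` and `build_deletion_set` are hypothetical names, and plain Python sets stand in for BitVector. The idea is only that the smaller of the deleted or live doc-ID sets is stored, with a flag recording which one it is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeletionSet:
    """Hypothetical container for illustration; not a Lucene class."""
    max_doc: int         # number of docs in the segment
    doc_ids: frozenset   # the smaller of the deleted / live doc-ID sets
    inverted: bool       # True when doc_ids holds live docs, not deleted ones

    def is_deleted(self, doc: int) -> bool:
        # Membership means "deleted" normally and "live" when inverted.
        return (doc in self.doc_ids) != self.inverted


def build_deletion_set(deleted, max_doc: int) -> DeletionSet:
    """Store deleted docs, or live docs if more than 50% are deleted."""
    deleted = frozenset(deleted)
    if len(deleted) * 2 > max_doc:
        live = frozenset(range(max_doc)) - deleted
        return DeletionSet(max_doc, live, inverted=True)
    return DeletionSet(max_doc, deleted, inverted=False)
```

With 6 of 8 docs deleted, the sketch stores only the 2 live doc IDs and answers `is_deleted` through the flag; at exactly 50% it keeps the deleted set as-is, matching the "> 50%" wording in the comment.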