[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668675#action_12668675 ]
Michael McCandless commented on LUCENE-1476:
--------------------------------------------

{quote}
The main problem we're trying to solve is potential allocation of a large del docs BV byte array for the copy on write of a cloned reader.
{quote}

Right, as long as normal search performance does not get worse. Actually, I was hoping "deletes as iterator" and "deletes higher up as filter" might give us some gains in search performance.

{quote}
An option we haven't looked at is a MultiByteArray where multiple byte arrays make up a virtual byte array checked by BV.get. On deleteDocument, only the byte array chunks that are changed are replaced in the new version, while the previously copied chunks are kept. The overhead of the BV.get can be minimal, though in our tests with an int array version the performance can either be equal to or double based on factors we are not aware of.
{quote}

I think that'd be a good approach (it amortizes the copy on write cost), though it'd be a double deref per lookup with the straightforward impl so I think it'll hurt normal search perf too.

And I don't think we should give up on iterator access just yet... I think we should try list-of-sorted-ints?

> BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-1476
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1476
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 2.4
>            Reporter: Jason Rutherglen
>            Priority: Trivial
>         Attachments: LUCENE-1476.patch, LUCENE-1476.patch, LUCENE-1476.patch,
>                      LUCENE-1476.patch, LUCENE-1476.patch, quasi_iterator_deletions.diff,
>                      quasi_iterator_deletions_r2.diff, quasi_iterator_deletions_r3.diff,
>                      searchdeletes.alg, sortBench2.py, sortCollate2.py, TestDeletesDocIdSet.java
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> Update BitVector to implement DocIdSet.
> Expose deleted docs DocIdSet from IndexReader.
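
For illustration only, the chunked copy-on-write idea quoted in the comment (a "MultiByteArray" checked by BV.get) might look roughly like the sketch below. This is not code from any of the attached patches and not Lucene's BitVector: the class name ChunkedBitVector, the 1024-byte chunk size, and the methods withBitSet and sharesChunk are all hypothetical. It shows the two points raised in the thread: marking a doc deleted copies only the one chunk that changes, and each lookup goes through two array dereferences.

```java
/**
 * Hypothetical sketch of a chunked, copy-on-write bit vector.
 * All names and the chunk size are assumptions made for illustration.
 */
final class ChunkedBitVector {
    private static final int CHUNK_BYTES = 1024;           // assumed chunk size
    private static final int CHUNK_BITS = CHUNK_BYTES * 8; // bits per chunk

    // Chunk table. Individual chunks may be shared between versions.
    private final byte[][] chunks;
    private final int size;

    ChunkedBitVector(int size) {
        this.size = size;
        int n = (size + CHUNK_BITS - 1) / CHUNK_BITS;
        this.chunks = new byte[n][];
        for (int i = 0; i < n; i++) {
            chunks[i] = new byte[CHUNK_BYTES];
        }
    }

    private ChunkedBitVector(byte[][] chunks, int size) {
        this.chunks = chunks;
        this.size = size;
    }

    int size() {
        return size;
    }

    /** Lookup: one dereference for the chunk table, one for the chunk. */
    boolean get(int bit) {
        byte[] chunk = chunks[bit / CHUNK_BITS];
        int offset = bit % CHUNK_BITS;
        return (chunk[offset >> 3] & (1 << (offset & 7))) != 0;
    }

    /**
     * Returns a new version with the given bit set. Only the chunk table
     * (an array of references) and the single affected chunk are copied;
     * every other chunk is shared with this version, which stays unchanged.
     */
    ChunkedBitVector withBitSet(int bit) {
        int c = bit / CHUNK_BITS;
        int offset = bit % CHUNK_BITS;
        byte[][] newChunks = chunks.clone();  // shallow copy of references
        byte[] copy = newChunks[c].clone();   // copy only the changed chunk
        copy[offset >> 3] |= (byte) (1 << (offset & 7));
        newChunks[c] = copy;
        return new ChunkedBitVector(newChunks, size);
    }

    /** True if both versions point at the same chunk instance. */
    boolean sharesChunk(ChunkedBitVector other, int chunkIndex) {
        return chunks[chunkIndex] == other.chunks[chunkIndex];
    }
}
```

With a vector of 20000 bits (three chunks under the assumed chunk size), setting bit 9000 copies chunk 1 only; chunks 0 and 2 stay shared with the previous version, so the per-delete allocation is bounded by the chunk size rather than by the size of the whole vector. The extra indirection in get is the "double deref per lookup" cost mentioned above.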