[
https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664570#action_12664570
]
Michael McCandless commented on LUCENE-1476:
--------------------------------------------
bq. Why? The returned iterator can traverse the multiple bitvectors.
Whoops, sorry: I missed that it would return a DocIdSet (iterator only) rather
than the underlying (current) BitVector. So then MultiReader could return a
DocIdSet as well.
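To illustrate how one iterator could traverse several per-segment bit vectors, here is a minimal sketch. It is not Lucene's API: `DeletedDocsIterator`, `BitSetDeletedDocs`, `MultiDeletedDocs`, and the `docBases` array are hypothetical names, and `java.util.BitSet` stands in for BitVector.

```java
import java.util.BitSet;

/** Hypothetical iterator-only view of deleted docs (not a Lucene interface). */
interface DeletedDocsIterator {
    int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Returns the next deleted doc id in increasing order, or NO_MORE_DOCS. */
    int nextDoc();
}

/** Walks the set bits of one segment's bit set (a stand-in for BitVector). */
final class BitSetDeletedDocs implements DeletedDocsIterator {
    private final BitSet bits;
    private int current = -1;

    BitSetDeletedDocs(BitSet bits) {
        this.bits = bits;
    }

    @Override
    public int nextDoc() {
        if (current == NO_MORE_DOCS) {
            return NO_MORE_DOCS; // already exhausted
        }
        int next = bits.nextSetBit(current + 1);
        current = (next < 0) ? NO_MORE_DOCS : next;
        return current;
    }
}

/** Chains per-segment iterators, shifting each doc id by its segment's doc base. */
final class MultiDeletedDocs implements DeletedDocsIterator {
    private final DeletedDocsIterator[] subs;
    private final int[] docBases; // assumed: first global doc id of each segment
    private int idx = 0;

    MultiDeletedDocs(DeletedDocsIterator[] subs, int[] docBases) {
        this.subs = subs;
        this.docBases = docBases;
    }

    @Override
    public int nextDoc() {
        while (idx < subs.length) {
            int doc = subs[idx].nextDoc();
            if (doc != NO_MORE_DOCS) {
                return docBases[idx] + doc;
            }
            idx++; // this segment is exhausted; move on to the next one
        }
        return NO_MORE_DOCS;
    }
}
```

Callers see only the iterator, so the multi-segment reader never has to merge the underlying bit vectors into one.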
bq. If the segment is large, tombstones can solve this.
Right; I was saying, as things are today (single BitVector holds all deleted
docs), one limitation of the realtime approach we are moving towards is the
copy-on-write cost of the first delete on a freshly cloned reader for a large
segment.
If we moved to using only an iterator API for accessing deleted docs within
Lucene, then we could explore fixes for the copy-on-write cost without changing
the on-disk representation of deletes. I.e., tombstones are perhaps overkill for
Lucene, since we're not using the filesystem as the intermediary for
communicating deletes to a reopened reader. We only need an in-RAM incremental
solution.
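One possible shape for such an in-RAM incremental solution, purely as a sketch: keep the large bit set shared and immutable, and record new deletes in a small per-reader overlay, so cloning copies only the overlay. `LayeredDeletes` and its methods are hypothetical, not anything in Lucene or in the attached patches, and `java.util.BitSet` again stands in for BitVector.

```java
import java.util.BitSet;
import java.util.TreeSet;

/** Hypothetical layered deleted-docs structure: shared base plus private overlay. */
final class LayeredDeletes {
    private final BitSet base;             // shared between clones, never mutated here
    private final TreeSet<Integer> recent; // deletes made since the base was built

    /** Takes ownership of base; the caller must not modify it afterwards. */
    LayeredDeletes(BitSet base) {
        this(base, new TreeSet<>());
    }

    private LayeredDeletes(BitSet base, TreeSet<Integer> recent) {
        this.base = base;
        this.recent = recent;
    }

    /** Records a delete in the overlay only; the base is left untouched. */
    void delete(int doc) {
        if (!base.get(doc)) {
            recent.add(doc);
        }
    }

    boolean isDeleted(int doc) {
        return base.get(doc) || recent.contains(doc);
    }

    /** Clone cost is proportional to the overlay size, not the segment size. */
    LayeredDeletes cloneForReopen() {
        return new LayeredDeletes(base, new TreeSet<>(recent));
    }

    /** Iterator-style access: smallest deleted doc id >= from, or -1 if none. */
    int nextDeleted(int from) {
        int inBase = base.nextSetBit(from);
        Integer inRecent = recent.ceiling(from);
        if (inBase < 0) {
            return inRecent == null ? -1 : inRecent;
        }
        if (inRecent == null) {
            return inBase;
        }
        return Math.min(inBase, inRecent);
    }
}
```

This only works if callers go through `isDeleted`/`nextDeleted` rather than touching the bit set directly, which is the point of moving to an iterator-only API. A real version would also need to fold the overlay back into a new base once it grows large; that step is omitted here.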
> BitVector implement DocIdSet
> ----------------------------
>
> Key: LUCENE-1476
> URL: https://issues.apache.org/jira/browse/LUCENE-1476
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Affects Versions: 2.4
> Reporter: Jason Rutherglen
> Priority: Trivial
> Attachments: LUCENE-1476.patch, quasi_iterator_deletions.diff
>
> Original Estimate: 12h
> Remaining Estimate: 12h
>
> BitVector can implement DocIdSet. This is for making
> SegmentReader.deletedDocs pluggable.