[
https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653889#action_12653889
]
Michael McCandless commented on LUCENE-1476:
--------------------------------------------
bq. It seemed wrong to pay the method call overhead for IndexReader.isDeleted()
on each iter in NOTScorer.next() or MatchAllScorer.next(), when we could just
store the next deletion:
Nice! This is what I had in mind.
I think we could [almost] do this across the board for Lucene.
SegmentTermDocs would similarly store nextDeleted and apply the same
"AND NOT" logic.
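To make the "store the next deletion" idea concrete, here is a minimal sketch of a match-all iterator that compares each candidate doc against a cached nextDeleted value instead of calling isDeleted() per doc. The class name MatchAllSkippingDeletes, the sorted int[] of deleted doc ids, and the -1 "exhausted" convention are assumptions for illustration only; this is not the actual MatchAllScorer or SegmentTermDocs code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: iterates all doc ids in [0, maxDoc) and skips deleted
// ones by remembering only the next deleted doc id ("AND NOT" logic).
final class MatchAllSkippingDeletes {
    private final int maxDoc;
    private final int[] deletedDocs; // assumed sorted ascending, no duplicates
    private int delIdx = 0;
    private int nextDeleted;         // -1 means "no more deletions"
    private int doc = -1;

    MatchAllSkippingDeletes(int maxDoc, int[] sortedDeletedDocs) {
        this.maxDoc = maxDoc;
        this.deletedDocs = sortedDeletedDocs;
        this.nextDeleted = sortedDeletedDocs.length > 0 ? sortedDeletedDocs[0] : -1;
    }

    // Returns the next live doc id, or -1 once the iterator is exhausted.
    int next() {
        if (doc >= maxDoc) {
            return -1;
        }
        doc++;
        // One int comparison per candidate doc; advance the deletion cursor
        // only when we actually land on a deleted doc.
        while (doc == nextDeleted) {
            delIdx++;
            nextDeleted = delIdx < deletedDocs.length ? deletedDocs[delIdx] : -1;
            doc++;
        }
        if (doc >= maxDoc) {
            doc = maxDoc;
            return -1;
        }
        return doc;
    }

    // Convenience helper for the usage example: drains the iterator.
    static List<Integer> collect(int maxDoc, int[] sortedDeletedDocs) {
        MatchAllSkippingDeletes it = new MatchAllSkippingDeletes(maxDoc, sortedDeletedDocs);
        List<Integer> live = new ArrayList<>();
        for (int d = it.next(); d != -1; d = it.next()) {
            live.add(d);
        }
        return live;
    }

    public static void main(String[] args) {
        // Docs 1, 2 and 5 are deleted out of 6.
        System.out.println(collect(6, new int[] {1, 2, 5}));
    }
}
```

The same pattern would apply to any doc iterator: as long as deletions can be consumed in increasing doc order, random access to the deleted set is not needed.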
bq. that's because IndexReader.isDeleted() isn't exposed and because
IndexReader.fetchDoc(int docNum) returns the doc even if it's deleted
Hmm -- that is very nicely enabling.
bq. I've actually been trying to figure out a new design for deletions because
writing them out for big segments is our last big write bottleneck
One approach would be to use a "segmented" model: when a few
deletions are added, write them to a new "deletes segment", so that a
single "normal segment" would then have multiple deletion files
associated with it. These would have to be merged (via an iterator)
when used during searching, and periodically coalesced.
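A rough sketch of what merging several deletion files at search time could look like, assuming each "deletes segment" is simply a sorted list of doc ids. The MergedDeletes class, its in-memory int[] inputs, and the coalesce helper are hypothetical names for illustration; no on-disk format is implied.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Hypothetical sketch: presents several sorted "deletes segments" as a single
// ascending stream of deleted doc ids (a k-way merge with de-duplication).
final class MergedDeletes {
    // Cursor over one deletes segment.
    private static final class Cursor {
        final int[] docs;
        int pos;
        Cursor(int[] docs) { this.docs = docs; }
        int current() { return docs[pos]; }
    }

    private final PriorityQueue<Cursor> queue =
        new PriorityQueue<>((a, b) -> Integer.compare(a.current(), b.current()));
    private int last = -1; // doc ids are assumed to be non-negative

    MergedDeletes(int[]... deleteSegments) {
        for (int[] seg : deleteSegments) {
            if (seg.length > 0) {
                queue.add(new Cursor(seg));
            }
        }
    }

    // Returns the next deleted doc id in ascending order, or -1 when exhausted.
    int nextDeleted() {
        while (!queue.isEmpty()) {
            Cursor c = queue.poll();
            int doc = c.current();
            c.pos++;
            if (c.pos < c.docs.length) {
                queue.add(c); // re-insert with its new head value
            }
            if (doc != last) { // the same doc may appear in several files
                last = doc;
                return doc;
            }
        }
        return -1;
    }

    // "Coalescing" is then just draining the merged stream into one array.
    static int[] coalesce(int[]... deleteSegments) {
        MergedDeletes merged = new MergedDeletes(deleteSegments);
        int[] out = new int[Arrays.stream(deleteSegments).mapToInt(s -> s.length).sum()];
        int n = 0;
        for (int d = merged.nextDeleted(); d != -1; d = merged.nextDeleted()) {
            out[n++] = d;
        }
        return Arrays.copyOf(out, n);
    }

    public static void main(String[] args) {
        int[] coalesced = coalesce(new int[] {2, 9}, new int[] {4, 9, 11}, new int[] {1});
        System.out.println(Arrays.toString(coalesced));
    }
}
```

The per-doc cost of the merge grows with the number of deletion files, which is one reason periodic coalescing would be needed.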
bq. if we only need iterator access, we can use vbyte encoding instead
Right: if there are relatively few deletes against a segment, encoding
the positions of the "on bits" directly (or as deltas between them)
should be a decent win since iteration is much faster.
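As an illustration of the delta encoding, the sketch below writes the gaps between consecutive deleted doc ids as variable-length bytes (7 payload bits per byte, high bit set when more bytes follow). The VByteDeletes class and its encode/decode methods are hypothetical and operate on in-memory arrays; they are not a proposed file format.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical sketch: stores a sorted list of deleted doc ids as
// vbyte-encoded deltas, suitable for iterator-only access.
final class VByteDeletes {
    // Encodes ascending doc ids as gaps from the previous id (first gap is from 0).
    static byte[] encode(int[] sortedDocs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int doc : sortedDocs) {
            int delta = doc - prev;
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80); // low 7 bits, continuation bit set
                delta >>>= 7;
            }
            out.write(delta); // final byte, continuation bit clear
            prev = doc;
        }
        return out.toByteArray();
    }

    // Decodes the full list; a real iterator would decode one value per call.
    static int[] decode(byte[] bytes) {
        int count = 0;
        for (byte b : bytes) {
            if ((b & 0x80) == 0) {
                count++; // each value ends with a byte whose high bit is clear
            }
        }
        int[] docs = new int[count];
        int pos = 0;
        int prev = 0;
        for (int i = 0; i < count; i++) {
            int b = bytes[pos++];
            int delta = b & 0x7F;
            for (int shift = 7; (b & 0x80) != 0; shift += 7) {
                b = bytes[pos++];
                delta |= (b & 0x7F) << shift;
            }
            prev += delta;
            docs[i] = prev;
        }
        return docs;
    }

    public static void main(String[] args) {
        int[] deleted = {3, 130, 131, 100000};
        byte[] encoded = encode(deleted);
        System.out.println(encoded.length + " bytes: " + Arrays.toString(decode(encoded)));
    }
}
```

For the four deletions in the usage example the encoding takes 6 bytes, whereas a bit vector covering 100,001 docs needs about 12.5 KB. The trade-off is that random-access isDeleted() lookups are no longer cheap.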
> BitVector implement DocIdSet
> ----------------------------
>
> Key: LUCENE-1476
> URL: https://issues.apache.org/jira/browse/LUCENE-1476
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Affects Versions: 2.4
> Reporter: Jason Rutherglen
> Priority: Trivial
> Attachments: LUCENE-1476.patch
>
> Original Estimate: 12h
> Remaining Estimate: 12h
>
> BitVector can implement DocIdSet. This is for making
> SegmentReader.deletedDocs pluggable.