[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs

Michael McCandless (JIRA) Fri, 23 Jan 2009 12:46:23 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666694#action_12666694
 ]


Michael McCandless commented on LUCENE-1476:
--------------------------------------------


For the perf tests, I would use an optimized index with maybe 2M
Wikipedia docs in it.

Then test with maybe 0, 1, 5, 10, 25, 50, 75 percent deletions, across
various kinds of queries (single term, OR, AND, phrase/span).
Use trunk as the baseline, then test with this patch (which keeps
deletion access low, at SegmentTermDocs, but switches to the iterator
API).  I'd love to eventually also see numbers for deletions applied
as a filter (high).
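The test matrix described above could be enumerated roughly as
follows. This is only a sketch: the names (VARIANTS, QUERY_KINDS,
test_matrix, docs_to_delete) are hypothetical and are not taken from
the LUCENE-1483 scripts, and treating phrase and span as separate
query kinds is an assumption.

```python
import random
from itertools import product

# Deletion percentages and query kinds as listed in the comment.
DELETION_PERCENTS = [0, 1, 5, 10, 25, 50, 75]
QUERY_KINDS = ["term", "or", "and", "phrase", "span"]  # assumed split of phrase/span
VARIANTS = ["trunk", "patch"]  # baseline vs. the patched build


def test_matrix():
    """Return one (variant, deletion percent, query kind) tuple per run."""
    return list(product(VARIANTS, DELETION_PERCENTS, QUERY_KINDS))


def docs_to_delete(num_docs, percent, seed=0):
    """Pick a reproducible random sample of doc IDs to delete.

    A fixed seed keeps the deleted set identical across variants, so
    trunk and the patch are measured against the same index state.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_docs), num_docs * percent // 100))
```

Each tuple from test_matrix() would correspond to one timed run;
docs_to_delete(2_000_000, 25) would give the IDs to delete for the 25%
case on a 2M-doc index.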

[Actually, if the deletion percentage is ever > 50%, we should
presumably invert the bit set and work with that instead.  And the
same goes for filters.]
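A minimal sketch of that inversion idea, assuming a sparse
representation: once more than half of the docs are deleted, store the
live docs instead and flip the membership test. The class and method
names here are hypothetical and this is not how BitVector works; it
only illustrates the trade-off.

```python
class DeletedDocs:
    """Sparse deleted-docs set that inverts itself past 50% deletions."""

    def __init__(self, max_doc, deleted):
        deleted = set(deleted)
        self.max_doc = max_doc
        # Invert when deletions are the majority, so we always store
        # the smaller of the two sets (deleted vs. live).
        self.inverted = len(deleted) * 2 > max_doc
        if self.inverted:
            self.docs = set(range(max_doc)) - deleted  # store live docs
        else:
            self.docs = deleted  # store deleted docs

    def is_deleted(self, doc):
        # Membership means "deleted" normally, "live" when inverted.
        return (doc in self.docs) != self.inverted

    def stored_size(self):
        """Number of doc IDs actually held in memory."""
        return len(self.docs)
```

With 8 of 10 docs deleted, such a structure would hold only the 2 live
IDs while still answering is_deleted() the same way for callers.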

You might want to start with the Python scripts attached to
LUCENE-1483; with some small mods you could easily adapt them to run
these tests.


> BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-1476
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1476
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 2.4
>            Reporter: Jason Rutherglen
>            Priority: Trivial
>         Attachments: LUCENE-1476.patch, LUCENE-1476.patch, LUCENE-1476.patch, 
> quasi_iterator_deletions.diff, quasi_iterator_deletions_r2.diff
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> Update BitVector to implement DocIdSet.  Expose deleted docs DocIdSet from 
> IndexReader.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
