[
https://issues.apache.org/jira/browse/LUCENE-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655849#action_12655849
]
Jason Rutherglen commented on LUCENE-1485:
------------------------------------------
Grant wrote: "Is the "slightly" in the noise?"
Seems to be. Perhaps it needs more performance tests. It is somewhat
surprising given that OpenBitSet is supposed to be the "fastest" bitset. It
seems that Lucene should have a way to incorporate new bitset implementations
in the future, for example through interfaces. That being said, it would be
great if in Lucene 3.0 the entire IndexReader class tree were rewritten so
that it is not such a mess with the locking, reopen, and ref counting. Marvin
is proposing some good ideas to make it all more pluggable. I need to spend
some time with folks figuring out which APIs would be optimal, so that they
are not all tied together in the twisted mess we have now. For example,
IndexReader shouldn't have a static open method attached to it. It seems like
new index features such as column stride fields, implemented in today's
system, would exacerbate the problem because then there's more code that is
impossible to customize if desired. SegmentMerger needs to be pluggable, as
today it cannot be customized without possibly breaking the entirety of
Lucene, and the custom code cannot be checked in as a contrib. There's more to
write, but I should save it for a more structured and timely discussion.
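
To make the "pluggable bitset" point above concrete, here is a minimal
sketch of what hiding the deleted-docs structure behind an interface might
look like. Everything in it is an assumption made for illustration:
DeletedDocs, BitSetDeletedDocs, SortedArrayDeletedDocs and LiveDocCounter
are hypothetical names, not Lucene APIs, and java.util.BitSet stands in for
BitVector or OpenBitSet so the example runs without Lucene on the classpath.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical read-only view of deleted documents; not a Lucene API. */
interface DeletedDocs {
    boolean isDeleted(int docId);
    int size();
}

/** Dense implementation; java.util.BitSet is a stand-in for BitVector/OpenBitSet. */
final class BitSetDeletedDocs implements DeletedDocs {
    private final BitSet bits;
    private final int size;

    BitSetDeletedDocs(BitSet bits, int size) {
        this.bits = (BitSet) bits.clone(); // defensive copy
        this.size = size;
    }

    @Override public boolean isDeleted(int docId) { return bits.get(docId); }
    @Override public int size() { return size; }
}

/** Sparse implementation: a sorted array of deleted ids, probed by binary search. */
final class SortedArrayDeletedDocs implements DeletedDocs {
    private final int[] deleted;
    private final int size;

    SortedArrayDeletedDocs(int[] deletedIds, int size) {
        this.deleted = deletedIds.clone();
        Arrays.sort(this.deleted); // binarySearch requires sorted input
        this.size = size;
    }

    @Override public boolean isDeleted(int docId) {
        return Arrays.binarySearch(deleted, docId) >= 0;
    }
    @Override public int size() { return size; }
}

/** A consumer that depends only on the interface, never on a concrete bitset. */
final class LiveDocCounter {
    static int countLive(DeletedDocs docs) {
        int live = 0;
        for (int i = 0; i < docs.size(); i++) {
            if (!docs.isDeleted(i)) live++;
        }
        return live;
    }
}
```

With something along these lines, a reader could be handed either
implementation (or a future one) without the calling code changing, which is
the kind of decoupling being asked for here.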
> Use OpenBitSet instead of BitVector in SegmentReader
> ----------------------------------------------------
>
> Key: LUCENE-1485
> URL: https://issues.apache.org/jira/browse/LUCENE-1485
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Affects Versions: 2.4
> Reporter: Jason Rutherglen
> Priority: Minor
> Attachments: TestDeletedDocsSpeed.java
>
> Original Estimate: 96h
> Remaining Estimate: 96h
>
> Tried out BitVector.get vs. OpenBitSet.get. Here are the results, in
> milliseconds, which are about the same after running 25 times. It is assumed
> that implementing DocIdSetIterator in SegmentTermDocs will speed things up more.
> bit set size: 10,485,760
> set bits count: 524,032
> openbitset: 68
> bitvector: 89
> OpenBitSet took about 24% less time (68 ms vs. 89 ms).
> I will implement a patch that adds the WriteableBitSet interface and makes a
> subclass of OpenBitSet that is writeable to disk. We're working on an
> isSparse method for OpenBitSet.
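
The WriteableBitSet idea from the issue description could be sketched roughly
as follows. The patch itself is not shown in this message, so the method
names and the on-disk layout here (a bit count followed by the raw 64-bit
words) are assumptions, and LongArrayBitSet is a hypothetical stand-in for
the planned OpenBitSet subclass. It uses java.io.DataOutput and DataInput to
stay self-contained; a real patch would presumably go through Lucene's own
Directory and IndexOutput abstractions instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical shape of the interface; the real signature is not in the source. */
interface WriteableBitSet {
    boolean get(int index);
    void set(int index);
    void write(DataOutput out) throws IOException;
}

/** Bitset backed by a long[]; stands in for the planned OpenBitSet subclass. */
final class LongArrayBitSet implements WriteableBitSet {
    private final long[] words;
    private final int numBits;

    LongArrayBitSet(int numBits) {
        this(numBits, new long[(numBits + 63) >>> 6]);
    }

    private LongArrayBitSet(int numBits, long[] words) {
        this.numBits = numBits;
        this.words = words;
    }

    @Override public boolean get(int index) {
        return (words[index >>> 6] & (1L << (index & 63))) != 0;
    }

    @Override public void set(int index) {
        words[index >>> 6] |= 1L << (index & 63);
    }

    /** Number of set bits. */
    int cardinality() {
        int count = 0;
        for (long w : words) count += Long.bitCount(w);
        return count;
    }

    /** Assumed layout: bit count as an int, then each 64-bit word in order. */
    @Override public void write(DataOutput out) throws IOException {
        out.writeInt(numBits);
        for (long w : words) out.writeLong(w);
    }

    static LongArrayBitSet read(DataInput in) throws IOException {
        int numBits = in.readInt();
        long[] words = new long[(numBits + 63) >>> 6];
        for (int i = 0; i < words.length; i++) words[i] = in.readLong();
        return new LongArrayBitSet(numBits, words);
    }

    /** Serializes to an in-memory buffer and reads it back, for demonstration. */
    static LongArrayBitSet roundTrip(LongArrayBitSet source) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            source.write(new DataOutputStream(buffer));
            return read(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Writing the raw words keeps the read path to a straight copy. An isSparse
check like the one mentioned above could later choose a more compact
encoding when only a few bits are set, but that is not attempted here.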