[ 
https://issues.apache.org/jira/browse/LUCENE-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621959#comment-14621959
 ] 

Adrien Grand commented on LUCENE-6645:
--------------------------------------

Sorry David, I should indeed have mentioned it. When I looked at 
IntersectsRPTVerifyQuery, I saw it was using the produced bits, so I thought it 
actually needed bit sets, but maybe it doesn't and we could just use advance()? 
Regarding isDefinitelyEmpty, I'm wondering if we could keep the builders 
initially null and then instantiate them the first time we need to add 
data? Then we could use a null check to know whether they have any content at 
all. Would that work?
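
A minimal sketch of the null-check idea, assuming nothing about the actual 
patch: LazyDocIdCollector is a hypothetical class, and java.util.BitSet stands 
in for whatever Lucene builder the query would really use. The backing 
structure is only allocated on the first add(), so emptiness is a null check.

```java
import java.util.BitSet;

/** Hypothetical stand-in for a doc id set builder; not a Lucene class. */
final class LazyDocIdCollector {
    private final int maxDoc;
    private BitSet bits; // stays null until the first doc id is added

    LazyDocIdCollector(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    /** Records a matching doc id, allocating the backing set on first use. */
    void add(int docId) {
        if (docId < 0 || docId >= maxDoc) {
            throw new IllegalArgumentException("docId out of range: " + docId);
        }
        if (bits == null) {
            bits = new BitSet(maxDoc);
        }
        bits.set(docId);
    }

    /** True when nothing was ever added, so no allocation happened. */
    boolean isDefinitelyEmpty() {
        return bits == null;
    }

    /** Returns the collected bits, or null when no doc id was added. */
    BitSet build() {
        return bits;
    }
}

final class LazyDocIdCollectorDemo {
    public static void main(String[] args) {
        LazyDocIdCollector collector = new LazyDocIdCollector(1000);
        System.out.println("empty before add: " + collector.isDefinitelyEmpty());
        collector.add(42);
        System.out.println("empty after add: " + collector.isDefinitelyEmpty());
    }
}
```

Callers of build() have to handle the null result, which is the price of 
skipping the allocation for queries that match nothing.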

bq. Shouldn't QueryBitSetProducer in the "join" module use RoaringDocIdSet for 
its cached docIdSets instead of the Fixed/Sparse choice chosen by the new 
BitSet.of method added in this patch? RoaringDocIdSet is ideal for caches; no?

RoaringDocIdSet is indeed our best option for caching. However, the join module 
needs random access and in particular nextSetBit/prevSetBit operations which we 
can't provide with RoaringDocIdSet. RoaringDocIdSet could potentially use 
binary search on blocks that are represented with a short[] for random-access, 
which ought to be fast given that short[] blocks can only contain 4096 
documents at most (when there are more docs, we use a bit set), but it was still 
much slower than random-access on SparseFixedBitSet and FixedBitSet so I 
preferred not to expose random access which might be a performance trap.
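
To illustrate what binary search on a short[] block would look like, here is a 
rough sketch. It is not RoaringDocIdSet code: ShortBlock and its methods are 
hypothetical, the block is assumed to hold the low 16 bits of each doc id 
sorted by unsigned value, and -1 is returned when there is no match (as 
java.util.BitSet does), which may differ from Lucene's conventions.

```java
/** Hypothetical sorted short[] block; not part of Lucene. */
final class ShortBlock {
    // Low 16 bits of each doc id, sorted by unsigned value.
    private final short[] docs;

    ShortBlock(short[] sortedDocs) {
        this.docs = sortedDocs;
    }

    /** Smallest stored value >= target, or -1 if there is none. */
    int nextSetBit(int target) {
        int lo = 0;
        int hi = docs.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int value = docs[mid] & 0xFFFF; // read the short as unsigned
            if (value < target) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        // lo is now the first index whose value is >= target.
        return lo < docs.length ? (docs[lo] & 0xFFFF) : -1;
    }

    /** Largest stored value <= target, or -1 if there is none. */
    int prevSetBit(int target) {
        int lo = 0;
        int hi = docs.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int value = docs[mid] & 0xFFFF;
            if (value <= target) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        // hi is now the last index whose value is <= target.
        return hi >= 0 ? (docs[hi] & 0xFFFF) : -1;
    }
}

final class ShortBlockDemo {
    public static void main(String[] args) {
        ShortBlock block = new ShortBlock(new short[] {3, 10, (short) 40000, (short) 65535});
        System.out.println(block.nextSetBit(11));
        System.out.println(block.prevSetBit(39999));
    }
}
```

Each lookup costs a logarithmic number of probes into the array, whereas a 
bit set can answer the same question by scanning words, which is consistent 
with the slowdown described above.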

> BKD tree queries should use BitDocIdSet.Builder
> -----------------------------------------------
>
>                 Key: LUCENE-6645
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6645
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>             Fix For: 5.3, Trunk
>
>         Attachments: LUCENE-6645.patch, LUCENE-6645.patch, LUCENE-6645.patch, 
> LUCENE-6645.patch, LUCENE-6645.patch, LUCENE-6645.patch
>
>
> When I was iterating on BKD tree originally I remember trying to use this 
> builder (which makes a sparse bit set at first and then upgrades to dense if 
> enough bits get set) and being disappointed with its performance.
> I wound up just making a FixedBitSet every time, but this is obviously 
> wasteful for small queries.
> It could be the perf was poor because I was always .or'ing in DISIs that had 
> 512 - 1024 hits each time (the size of each leaf cell in the BKD tree)?  I 
> also had to make my own DISI wrapper around each leaf cell... maybe that was 
> the source of the slowness, not sure.
> I also sort of wondered whether the SmallDocSet in spatial module (backed by 
> a SentinelIntSet) might be faster ... though it'd need to be sorted in the 
> end after building before returning to Lucene.


