[jira] [Commented] (LUCENE-7254) DocIDSetBuilder is no good for points

Robert Muir (JIRA) Tue, 26 Apr 2016 05:35:34 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-7254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257989#comment-15257989
 ]


Robert Muir commented on LUCENE-7254:
-------------------------------------

These are structures geared toward range queries on spatial data. They are
not meant for unique IDs, nor tuned for them.

If you want to tune them for unique IDs you must do much more. It's way
more than fucking around with a bitset, and much, much more than specializing
exactQuery. For example, remove the per-segment clone() in intersect: that
clone alone is enough to make ID performance too slow.

But I'm sure you guys know all this already.

> DocIDSetBuilder is no good for points
> -------------------------------------
>
>                 Key: LUCENE-7254
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7254
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>         Attachments: LUCENE-7254.patch, LUCENE-7254.patch
>
>
> For the postings lists, I think this approach works well in dense cases (e.g. 
> whole DISI's are added, things are coming in order, etc).
> However in the points case, it holds back range performance significantly. 
> There are a couple of problems here:
> * expensive cardinality computation (this is a 2% hit) when it's totally
> unnecessary. We can use index statistics to help here.
> * lots of conditional stuff in add(). This includes growing checks / bitset 
> switching checks and so on (which happens even if you are smart and call 
> grow, but this stuff all adds up). 
> I don't think we should try to create a magical shared API that is both
> efficient for postings lists of unstructured stuff and at the same time point
> collection for structured fields. Instead, we should just do things
> differently for points and iterate from there.
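
To illustrate the "do things differently for points" idea in the description
above, here is a minimal sketch of a points-only collector. It is not the
attached patch and not Lucene's DocIdSetBuilder; the class PointDocIdBuffer and
its methods are hypothetical names. It assumes an upper bound on the number of
collected doc IDs is known up front (for example from index statistics), so
add() needs no growth or bitset-switching checks, and no cardinality is
computed along the way.

```java
import java.util.Arrays;

/** Hypothetical points-only doc ID collector; not part of Lucene. */
final class PointDocIdBuffer {
    private final int[] docs;
    private int size;

    /** maxDocs is an assumed upper bound, e.g. derived from index statistics. */
    PointDocIdBuffer(int maxDocs) {
        this.docs = new int[maxDocs];
    }

    /** Plain append: capacity was fixed in the constructor, so no checks here. */
    void add(int docId) {
        docs[size++] = docId;
    }

    /** Sorts and de-duplicates once at the end; returns doc IDs in increasing order. */
    int[] build() {
        Arrays.sort(docs, 0, size);
        int unique = 0;
        for (int i = 0; i < size; i++) {
            // Keep a value only if it differs from the last one kept.
            if (unique == 0 || docs[i] != docs[unique - 1]) {
                docs[unique++] = docs[i];
            }
        }
        return Arrays.copyOf(docs, unique);
    }
}
```

The trade-off in this sketch is memory: the buffer is sized for the worst case
instead of growing on demand, which is only reasonable when the bound from the
statistics is tight.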



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
