[jira] [Commented] (LUCENE-7304) Doc values based block join implementation

Paul Elschot (JIRA) Mon, 30 May 2016 12:11:38 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15306894#comment-15306894
 ]


Paul Elschot commented on LUCENE-7304:
--------------------------------------

When heap space is the only problem, one could also leave the index unchanged 
and create an EliasFanoSequence based on long[]'s because that is a little 
faster than the one based on a BytesRef.
One sequence per block level would be needed.

> Doc values based block join implementation
> ------------------------------------------
>
>                 Key: LUCENE-7304
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7304
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Martijn van Groningen
>            Priority: Minor
>         Attachments: LUCENE-5092-20140313.patch, LUCENE_7304.patch
>
>
> At query time the block join relies on a bitset for finding the previous 
> parent doc during advancing the doc id iterator. On large indices these 
> bitsets can consume large amounts of jvm heap space.  Also typically due the 
> nature how these bitsets are set, the 'FixedBitSet' implementation is used.
> The idea I had was to replace the bitset usage by a numeric doc values field 
> that stores offsets. Each child doc stores how many docids it is from its 
> parent doc and each parent stores how many docids it is apart from its first 
> child. At query time this information can be used to perform the block join.
> I think another benefit of this approach is that external tools can now 
> easily determine if a doc is part of a block of documents and perhaps this 
> also helps index time sorting?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (LUCENE-7304) Doc values based block join implementation

Reply via email to