Unfortunately, seeking in the terms dict is quite costly, so e.g. calling
TermsEnum.seekCeil inside DocsEnum.advance will probably really hurt
performance.

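For reference, the two lookups that the quoted proposal relies on can be sketched in plain Java, without any Lucene classes. This is only an illustration under assumptions: the class name BlockParentsSketch and its methods are hypothetical, a TreeSet stands in for a terms dictionary whose terms equal parent docnums, and each block is assumed to hold its children first with the parent as the last document.

```java
import java.util.TreeSet;

public class BlockParentsSketch {
    // Sorted parent doc IDs; a stand-in for terms that equal parent docnums.
    private final TreeSet<Integer> parents = new TreeSet<>();

    public BlockParentsSketch(int... parentDocs) {
        for (int p : parentDocs) {
            parents.add(p);
        }
    }

    // Parent of a document: the smallest parent doc ID >= doc
    // (a ceiling lookup, loosely analogous to a seekCeil on the terms).
    // Returns -1 when no parent follows the document.
    public int parentOf(int doc) {
        Integer p = parents.ceiling(doc);
        return p == null ? -1 : p;
    }

    // First child of the block that ends at the given parent:
    // previous parent + 1, or 0 when this is the first block.
    // This is the "jump back to the previous parent" step needed by advance().
    public int firstChildOf(int parent) {
        Integer prev = parents.lower(parent);
        return prev == null ? 0 : prev + 1;
    }

    public static void main(String[] args) {
        // Blocks: [0..3], [4..7], [8..9]; parents are docs 3, 7 and 9.
        BlockParentsSketch blocks = new BlockParentsSketch(3, 7, 9);
        System.out.println(blocks.parentOf(5));     // prints 7
        System.out.println(blocks.firstChildOf(7)); // prints 4
    }
}
```

Both lookups here are ordered-set searches performed on every call, so the sketch shows the shape of the idea rather than its cost inside a real terms dictionary.
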
Mike McCandless

http://blog.mikemccandless.com


On Wed, Feb 12, 2014 at 4:12 PM, Mikhail Khludnev
<[email protected]> wrote:
> Hello,
>
> Some time ago Uwe defined the problem of making block-join more cute.
> https://issues.apache.org/jira/browse/LUCENE-5092?focusedCommentId=13736713&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13736713
> I'm not sure I got him right, but recently, while thinking about what to talk
> about in Washington, I was considering comprehensive relation-modeling cases.
> Anyway, I started with a simple test for an alternative block join
> implementation. The overall idea is:
>  - to keep blocks as-is, they are cute;
>  - to use a terms enum for looping over parents while enumerating children on
> nextDoc(), hence these terms should be equal to docnums;
>  - to use a single-element doclist to jump back to the previous parent for
> advance().
>
> Now you can see that I just tried to reuse trendy Lucene data structures to
> get rid of the rewindable bit-set. Right now the code is ugly because I am
> reusing them via plain document indexing; later it can be done better with a
> specialized codec/enum API. It makes no sense as just a block join
> replacement, but it might work out as a general modelling approach.
>
> Here is the code
> https://github.com/m-khl/solr-patches/blob/af089475ec122630e231dbba397d5639013668e5/lucene/join/src/test/org/apache/lucene/search/join/TestBlockRelations.java?source=cc#L131
>
> Here are the sketches, which might explain the current implementation
> http://goo.gl/yS1VZN
>
> Your feedback is appreciated.
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
>

