Unfortunately, terms dict lookups are quite costly, so e.g. doing a TermsEnum.seekCeil inside a DocsEnum.advance will probably really hurt performance?
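
To make the concern concrete, here is a rough sketch in plain Java of the two ways a parent could be located for a given child doc. It uses no Lucene APIs: the class and method names are hypothetical, and a binary search over a sorted int[] is only a stand-in for a terms dict seek, which typically does more work per call. The bit-set variant stands in for the existing parent bit-set lookup.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical illustration only; not Lucene code.
class ParentLookupSketch {

    // Stand-in for a seekCeil-style lookup: smallest parent docID >= target,
    // or -1 if there is none. Costs a binary search on every call.
    static int parentViaSeekCeil(int[] sortedParents, int target) {
        int idx = Arrays.binarySearch(sortedParents, target);
        if (idx < 0) {
            idx = -idx - 1; // insertion point: first element greater than target
        }
        return idx < sortedParents.length ? sortedParents[idx] : -1;
    }

    // Stand-in for the bit-set approach: next set bit at or after target,
    // or -1 if there is none.
    static int parentViaBitSet(BitSet parentBits, int target) {
        return parentBits.nextSetBit(target);
    }

    public static void main(String[] args) {
        // Assumed layout: children come before their parent in each block,
        // so docs 0-2 belong to parent 3, docs 4-6 to parent 7, and so on.
        int[] parents = {3, 7, 12};
        BitSet bits = new BitSet();
        for (int p : parents) {
            bits.set(p);
        }
        for (int child : new int[] {0, 4, 8, 12, 13}) {
            System.out.println(child + " -> "
                + parentViaSeekCeil(parents, child) + " / "
                + parentViaBitSet(bits, child));
        }
    }
}
```

Both lookups return the same parent for the same child; the difference is how much work is done per call, and that is the part that would run inside every advance.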

Mike McCandless

http://blog.mikemccandless.com

On Wed, Feb 12, 2014 at 4:12 PM, Mikhail Khludnev <[email protected]> wrote:
> Hello,
>
> Some time ago Uwe defined the problem of making block-join more cute.
> https://issues.apache.org/jira/browse/LUCENE-5092?focusedCommentId=13736713&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13736713
> I'm not sure I got him right, but recently I thought (what to talk about in
> Washington) about comprehensive relations modeling cases. Anyway, I started
> from a simple test for an alternative block join implementation. The overall
> idea is:
> - to keep blocks as-is, they are cute;
> - to use a term enum for looping over parents while enumerating children on
>   nextDoc(), hence these terms should be equal to docnums;
> - to use a single-element doclist to jump back to the previous parent for
>   advance().
>
> Now you can see that I just tried to reuse trendy Lucene data structures to
> get rid of the rewindable bit-set. Right now the code is ugly because I am
> reusing them via plain document indexing; later it can be done better with a
> specialized codec/enum API. It makes no sense as just a block join
> replacement, but it might work out as a general modelling approach.
>
> Here is the code:
> https://github.com/m-khl/solr-patches/blob/af089475ec122630e231dbba397d5639013668e5/lucene/join/src/test/org/apache/lucene/search/join/TestBlockRelations.java?source=cc#L131
>
> Here are the scratches which might explain the current implementation:
> http://goo.gl/yS1VZN
>
> Your feedback is appreciated.
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
