[ 
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715117#action_12715117
 ] 

Michael McCandless commented on LUCENE-1614:
--------------------------------------------


bq. But then, I don't see the benefit of adding that advance() call, since in 
the while loop you'd still need to check whether filterDoc > scorerDoc. So why 
do it? Is it just a matter of 'not relying on the above instruction, since at 
that point I know both should be advanced'?

Since you know both should be advanced, why loop back and do an
unecessary if check?  Ie, I the CPU will do more work (computing a
redundant if) if we don't advance both.

However, my solution (also advancing the scorer) is still not right,
because on cycling back you know scorerDoc >= filterDoc, so we've also
wasted an if check there... hmmm.

So actually let's just stick w/ your current approach, since it's a
straight conversion of what trunk currently does, except can we stop
checking NO_MORE_DOCS on each advance/nextDoc, and only check it when
we're about to collect a doc?

And, anyway, maybe we shouldn't fret so much about it; we know this
filter loop needs to change for 2.9 anyway.  I think we want something
like this:

  * If filter is random access, it should be pushed all the way down
    and applied just like deleted docs

  * If filter is "relatively" sparse compared to query, then we should
    to use the (to-be-added) DISI.check() method

  * Else, add filter as clause on BQ.  If incoming Query is already a
    BQ, clone that and tack filter on as clause; else, make a BQ and
    add both.

bq. Can you explain me what that means? Maybe I should also change it in my 
eclipse?

We're supposed to do this (tell svn to use native EOL -- "svn propset
svn:eol-style native") for all our sources, to avoid issues like
this.

bq. What if we document that you shouldn't call nextDoc() after it return 
NO_MORE_DOCS for now

OK, let's do that.  We can leave "taking advantage of this" (start
method) to another issue.  But let's be crystal clear on the semantics
of DISI for this issue.


> Add next() and skipTo() variants to DocIdSetIterator that return the current 
> doc, instead of boolean
> ----------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1614
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1614
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Shai Erera
>             Fix For: 2.9
>
>         Attachments: LUCENE-1614.patch, LUCENE-1614.patch, LUCENE-1614.patch, 
> LUCENE-1614.patch, LUCENE-1614.patch, LUCENE-1614.patch, LUCENE-1614.patch, 
> LUCENE-1614.patch, LUCENE-1614.patch, LUCENE-1614.patch
>
>
> See 
> http://www.nabble.com/Another-possible-optimization---now-in-DocIdSetIterator-p23223319.html
>  for the full discussion. The basic idea is to add variants to those two 
> methods that return the current doc they are at, to save successive calls to 
> doc(). If there are no more docs, return -1. A summary of what was discussed 
> so far:
> # Deprecate those two methods.
> # Add nextDoc() and skipToDoc(int) that return doc, with default impl in DISI 
> (calls next() and skipTo() respectively, and will be changed to abstract in 
> 3.0).
> #* I actually would like to propose an alternative to the names: advance() 
> and advance(int) - the first advances by one, the second advances to target.
> # Wherever these are used, do something like '(doc = advance()) >= 0' instead 
> of comparing to -1 for improved performance.
> I will post a patch shortly

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to