Re: Why does TermFreqDoubleValuesSource check for PostingsEnum docId?

2019-01-28 Thread MarcoR
Thanks a lot Mikhail, that clarifies it for me. 

Cheers
M




--
Sent from: http://lucene.472066.n3.nabble.com/Lucene-Java-Developer-f564358.html

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Why does TermFreqDoubleValuesSource check for PostingsEnum docId?

2019-01-28 Thread Mikhail Khludnev
Here is the case: the first doc in pe is 100.
- The first call, advanceExact(10), advances pe to 100 and returns false.
- The second call, advanceExact(20), returns false without attempting to
advance pe; it only checks the current doc (100 > 20).
Another edge case is when pe is exhausted: docID() then returns NO_MORE_DOCS
(Integer.MAX_VALUE), so there will be no further attempts to advance it.
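
The two cases above can be replayed without Lucene. What follows is only a minimal sketch: FakePostings is a hypothetical stand-in for PostingsEnum, not Lucene code, and its advance() always steps forward at least one position, which is an assumption about how an iterator may behave when asked for a target at or behind its current doc. The advanceExact logic is copied from the snippet under discussion.

```java
class AdvanceExactSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Hypothetical stand-in for PostingsEnum over a sorted list of doc IDs.
    static final class FakePostings {
        private final int[] docs;
        private int idx = -1;      // -1 means "not positioned yet"
        int advanceCalls = 0;      // counts how often advance() was invoked

        FakePostings(int... docs) {
            this.docs = docs;
        }

        int docID() {
            if (idx < 0) return -1;
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }

        // Always moves at least one step, then skips to the first doc >= target.
        int advance(int target) {
            advanceCalls++;
            do {
                idx++;
            } while (idx < docs.length && docs[idx] < target);
            return docID();
        }
    }

    // Same logic as the advanceExact method quoted in this thread.
    static boolean advanceExact(FakePostings pe, int doc) {
        if (pe.docID() > doc)
            return false;          // pe is already past doc: do not touch it
        return pe.docID() == doc || pe.advance(doc) == doc;
    }

    public static void main(String[] args) {
        FakePostings pe = new FakePostings(100, 200);
        System.out.println(advanceExact(pe, 10));   // pe moves to 100
        System.out.println(advanceExact(pe, 20));   // guard hit, pe stays at 100
        System.out.println(advanceExact(pe, 100));  // match on the current doc
        System.out.println(advanceExact(pe, 300));  // pe becomes exhausted
        System.out.println(advanceExact(pe, 400));  // guard hit via NO_MORE_DOCS
        System.out.println("advance() calls: " + pe.advanceCalls);
    }
}
```

Without the guard, the call advanceExact(20) would invoke advance(20) while pe sits on 100; in this sketch that would step pe to 200 and the later advanceExact(100) would miss a doc that really has the term.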


-- 
Sincerely yours
Mikhail Khludnev


Why does TermFreqDoubleValuesSource check for PostingsEnum docId?

2019-01-27 Thread MarcoR
Hi, 

TermFreqDoubleValuesSource is used to expose a particular IndexReader
statistic, as per https://issues.apache.org/jira/browse/LUCENE-7736

I noticed that the advanceExact method checks for the case in
which the PostingsEnum has moved ahead of the doc ID of the
FilterScorer that calls advanceExact (I'm referring to the
implementation of the FunctionScoreWeight.scorer method).

The code I'm talking about is this:

return new DoubleValues() {
  @Override
  public double doubleValue() throws IOException {
    return pe.freq();
  }

  @Override
  public boolean advanceExact(int doc) throws IOException {
    if (pe.docID() > doc)
      return false;
    return pe.docID() == doc || pe.advance(doc) == doc;
  }
};

What I'm curious about is why the check on pe.docID() is necessary. Given that
the PostingsEnum is created specifically for building the
DoubleValues instance and its reference does not escape the enclosing
TermFreqDoubleValuesSource.getValues(..) method, how can it get ahead of
the scorer's doc ID?
I can imagine the scorer's doc ID getting ahead of the PostingsEnum, since
the scorer may skip documents that do not fit some criteria. But since
nothing else could be moving the PostingsEnum ahead, why this
check?

An obvious extension of the question is: why do we return false if the
checked condition is true? Is it really the case that DoubleValues cannot be
provided if the PostingsEnum's doc ID is ahead? How can we safely assume that
this holds until the scorer catches up with the PostingsEnum?

I'm working on implementing some custom logic based on
TermFreqDoubleValuesSource and I would like to understand its mechanics
properly.


