[jira] [Commented] (LUCENE-9114) Add FunctionValues.cost

2020-05-16 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109086#comment-17109086
 ] 

David Smiley commented on LUCENE-9114:
--

Yep, I know we tried and then avoided making ValueSourceScorer mutable for the 
cost, and ultimately settled on the way it is now.  I should have tried it 
myself as a final Q/A at that time.  I can be forgiven; I was on vacation at 
the time :)  

RE "returning a weird Float.NEGATIVE_INFINITY" I don't see how this comes into 
play if there's an optional interface.  If there is no optional interface, then 
ValueSourceScorer would have to cast the weight to FunctionRangeWeight in 
particular, which isn't cool.

> Add FunctionValues.cost
> ---
>
> Key: LUCENE-9114
> URL: https://issues.apache.org/jira/browse/LUCENE-9114
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/query
> Reporter: David Smiley
> Assignee: Atri Sharma
> Priority: Major
> Time Spent: 4h 10m
> Remaining Estimate: 0h
>
> The FunctionRangeQuery uses FunctionValues.getRangeScorer, which returns a 
> subclass of ValueSourceScorer.  VSC's TwoPhaseIterator has a matchCost impl 
> that returns a constant 100.  This is pretty terrible; the cost should vary 
> based on the complexity of the ValueSource provided to FRQ.  ValueSources 
> are typically nested a number of levels, so their costs should aggregate.
> BTW there is a parallel concern for FunctionMatchQuery, which works with 
> DoubleValuesSource, which doesn't have a cost either, and unsurprisingly 
> there is a TPI with matchCost 100 there.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9114) Add FunctionValues.cost

2020-05-16 Thread Atri Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108885#comment-17108885
 ] 

Atri Sharma commented on LUCENE-9114:
-

[~dsmiley] If you recall, that was one of the approaches I tried in an earlier 
iteration of this PR :)

 

I agree with allowing the Weight to define an internal cost that 
ValueSourceScorer.matchCost can delegate to – it can return a weird value 
(Float.NEGATIVE_INFINITY) to signal that it is not implemented, and then it is 
the matchCost's job to ensure that it does the right thing?
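
A minimal sketch of the sentinel idea described above, using stand-in types 
rather than the real Lucene classes.  SentinelWeight, CostedSentinelWeight, 
SentinelScorer and FALLBACK_COST are hypothetical names chosen for 
illustration; only Float.NEGATIVE_INFINITY as the "not implemented" marker and 
the constant 100 come from this thread.

```java
// Stand-in types for illustration; these are not the real Lucene classes.
class SentinelWeight {
  // Sentinel meaning "no custom cost supplied", as floated in the comment.
  static final float NOT_IMPLEMENTED = Float.NEGATIVE_INFINITY;

  float matchCost() {
    return NOT_IMPLEMENTED;
  }
}

// A weight that does supply its own cost.
class CostedSentinelWeight extends SentinelWeight {
  private final float cost;

  CostedSentinelWeight(float cost) {
    this.cost = cost;
  }

  @Override
  float matchCost() {
    return cost;
  }
}

class SentinelScorer {
  // The constant cited in the issue description; assumed here as the fallback.
  static final float FALLBACK_COST = 100f;

  private final SentinelWeight weight;

  SentinelScorer(SentinelWeight weight) {
    this.weight = weight;
  }

  float matchCost() {
    float fromWeight = weight.matchCost();
    // The scorer must recognise the sentinel and substitute its own default.
    return fromWeight == SentinelWeight.NOT_IMPLEMENTED ? FALLBACK_COST : fromWeight;
  }
}
```

The trade-off in this shape is that every caller of matchCost on the weight 
has to know about the sentinel and handle it.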




[jira] [Commented] (LUCENE-9114) Add FunctionValues.cost

2020-05-15 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108878#comment-17108878
 ] 

David Smiley commented on LUCENE-9114:
--

[~atris] I noticed you forgot the "fix version" here which apparently should be 
"master (9.0)" based on the fact that you didn't port to branch_8x.  But why 
not back-port?  AFAICT it's backwards compatible.

I was looking at this tonight to see how difficult it may be to supply the cost 
to FunctionRangeQuery (in Lucene) somehow.  It's pretty difficult – requiring a 
delegating ValueSource and worse a delegating FunctionValues which is a huge 
interface and would add some overhead to per-document evaluation.  I thought of 
another approach to customize the cost:  What if there was an interface 
HasMatchCost (perhaps declared within the ValueSourceScorer interface to 
clearly associate where it's used) that can be implemented by the Weight, in 
this case, FunctionRangeWeight.  ValueSourceRangeScorer.matchCost could check 
its Weight to see if it implements this, and if so then cast and call matchCost 
on that to return.
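
The optional-interface idea above could look roughly like the following.  
This is a sketch under assumptions: the types are stand-ins, not the real 
Lucene classes, and apart from HasMatchCost (the name suggested above) every 
name here is hypothetical.  The fallback of 100 is the constant mentioned in 
the issue description.

```java
// Stand-in types for illustration; these are not the real Lucene classes.
interface HasMatchCost {
  float matchCost();
}

// A weight that does not opt in to custom costing.
class SketchWeight {}

// A weight that opts in by implementing the optional interface.
class SketchRangeWeight extends SketchWeight implements HasMatchCost {
  private final float cost;

  SketchRangeWeight(float cost) {
    this.cost = cost;
  }

  @Override
  public float matchCost() {
    return cost;
  }
}

class SketchScorer {
  static final float DEFAULT_MATCH_COST = 100f;

  private final SketchWeight weight;

  SketchScorer(SketchWeight weight) {
    this.weight = weight;
  }

  float matchCost() {
    // Check for the optional capability instead of casting to one concrete weight.
    if (weight instanceof HasMatchCost) {
      return ((HasMatchCost) weight).matchCost();
    }
    return DEFAULT_MATCH_COST;
  }
}
```

Because the scorer only tests for the interface, it stays unaware of any 
particular Weight subclass, and no sentinel return value is needed.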

I took a peek at the similar FunctionMatchQuery class to see what's different 
there, and I see the matchCost is defined within that file, so it should be 
very easy to customize the cost.  Given that the ValueSource API is largely 
legacy and DoubleValuesSource & LongValuesSource (used by FunctionMatchQuery) 
are the future, maybe I should just go this route instead.

 




[jira] [Commented] (LUCENE-9114) Add FunctionValues.cost

2020-03-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17051790#comment-17051790
 ] 

ASF subversion and git services commented on LUCENE-9114:
-

Commit d751cf626ec639d38b955d3962ae347aea00c0ac in lucene-solr's branch 
refs/heads/master from Atri Sharma
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d751cf6 ]

LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation (#1303)

This commit makes ValueSourceScorer's costing algorithm also take the delegated 
FunctionValues's cost into consideration when calculating its cost. 
FunctionValues now exposes a cost method which is used by ValueSourceScorer's 
default matchCost method. In addition, ValueSourceScorer exposes a matchCost 
method which can be overridden to specify a custom costing mechanism
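
As a rough illustration of the shape this commit message describes, and not 
the committed code itself: a values object exposes a cost, the scorer's 
default matchCost defers to it, and a subclass may override that.  All type 
names below are hypothetical stand-ins, and the doubling in the override is 
an arbitrary example.

```java
// Stand-in types for illustration; these are not the real Lucene classes.
abstract class CostedValues {
  abstract float cost();
}

class FixedCostValues extends CostedValues {
  private final float cost;

  FixedCostValues(float cost) {
    this.cost = cost;
  }

  @Override
  float cost() {
    return cost;
  }
}

class CostAwareScorer {
  protected final CostedValues values;

  CostAwareScorer(CostedValues values) {
    this.values = values;
  }

  // Default behaviour: defer to the cost of the delegated values.
  float matchCost() {
    return values.cost();
  }
}

// A custom costing mechanism supplied by overriding matchCost.
class DoubledCostScorer extends CostAwareScorer {
  DoubledCostScorer(CostedValues values) {
    super(values);
  }

  @Override
  float matchCost() {
    return 2 * values.cost();
  }
}
```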




[jira] [Commented] (LUCENE-9114) Add FunctionValues.cost

2020-03-01 Thread Atri Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17048492#comment-17048492
 ] 

Atri Sharma commented on LUCENE-9114:
-

[~dsmiley] I have raised a PR for this -- it is a minimal change that lets 
ValueSourceScorer (VSS) incorporate the delegated FunctionValues's cost into 
its own cost.  Would that unblock you, by letting you stack FunctionValues 
with custom costing functions?




[jira] [Commented] (LUCENE-9114) Add FunctionValues.cost

2020-02-29 Thread Atri Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17048261#comment-17048261
 ] 

Atri Sharma commented on LUCENE-9114:
-

I strongly believe that this is the right approach and we should be pursuing 
this. I am actively working on this and will post a patch by Monday morning




[jira] [Commented] (LUCENE-9114) Add FunctionValues.cost

2020-02-28 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047928#comment-17047928
 ] 

David Smiley commented on LUCENE-9114:
--

If this is too time-consuming, we could reduce the scope to merely make the 
cost settable by a query parser (this issue not touching any query parser, 
however).  That's the minimum I need to unblock using the costs on the Solr 
side (its QParsers), which I want to do after this.




[jira] [Commented] (LUCENE-9114) Add FunctionValues.cost

2020-01-29 Thread Atri Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026083#comment-17026083
 ] 

Atri Sharma commented on LUCENE-9114:
-

+1 - I am on it. I am a bit groggy from sleep right now -- will post my 
thoughts tomorrow.




[jira] [Commented] (LUCENE-9114) Add FunctionValues.cost

2020-01-29 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026082#comment-17026082
 ] 

David Smiley commented on LUCENE-9114:
--

I'd love it if you take it [~atris]; thanks!  I have no WIP.

I was thinking that the typical cost could be something like DEF_COST + the sum 
of costs of delegated FunctionValues.  DEF_COST being say 5, perhaps less.  
This way you don't think too much; you can mechanically add a bunch of cost 
impls in this way.  And if you or someone writes a different more thoughtful 
cost, it'll be more apparent that it's not the default.  
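
A small sketch of the "DEF_COST + sum of delegated costs" rule, assuming 
DEF_COST is 5 as suggested above.  NestedValues, LeafValues and 
CompositeValues are hypothetical stand-ins, not the real FunctionValues 
hierarchy.

```java
// Stand-in types for illustration; these are not the real Lucene classes.
abstract class NestedValues {
  static final float DEF_COST = 5f;

  abstract float cost();
}

// A leaf has no delegates, so it costs just DEF_COST.
class LeafValues extends NestedValues {
  @Override
  float cost() {
    return DEF_COST;
  }
}

// A composite costs DEF_COST plus the sum of its delegates' costs.
class CompositeValues extends NestedValues {
  private final NestedValues[] delegates;

  CompositeValues(NestedValues... delegates) {
    this.delegates = delegates;
  }

  @Override
  float cost() {
    float total = DEF_COST;
    for (NestedValues delegate : delegates) {
      total += delegate.cost();
    }
    return total;
  }
}
```

With these definitions a composite over two leaves costs 5 + 5 + 5 = 15, and 
wrapping that composite together with one more leaf gives 5 + 15 + 5 = 25, so 
the cost grows with nesting depth as the issue description asks.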

Maybe DoubleFieldSource should have an impl defaulting to its NumericDocValues' 
cost(), maybe plus some constant.  But no, NumericDocValues is a 
DocIdSetIterator and the cost of that is the # of docs (hundreds of thousands 
maybe), not the cost-per-lookup.  Be mindful of the distinction.

I think the PR should be against master and make cost abstract, thus forcing 
implementations to choose something sensible.  The 8x backport should provide a 
default implementation, though, for back-compat.




[jira] [Commented] (LUCENE-9114) Add FunctionValues.cost

2020-01-29 Thread Atri Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026048#comment-17026048
 ] 

Atri Sharma commented on LUCENE-9114:
-

[~dsmiley] This looks interesting -- I am happy to hack on this one unless you 
are planning to. Please let me know.
