[jira] [Commented] (SOLR-12581) add a "min_popularity" option to relatedness() aggregation that forces scores to -Inf if fg/bg pops don't meet a threshold

2018-07-25 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556036#comment-16556036
 ] 

ASF subversion and git services commented on SOLR-12581:


Commit 71c0bddd149b7c0364fbba8d31494dcd9f57f1ef in lucene-solr's branch 
refs/heads/master from Chris Hostetter
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=71c0bdd ]

SOLR-12581: the JSON Facet 'relatedness()' aggregate function now supports a 
'min_popularity' option using the extended type:func syntax
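
For readers unfamiliar with the "extended type:func syntax" mentioned in the commit message, here is a rough sketch of what a request body using it might look like, built in Python. The field name (hobby), the facet and aggregation labels (hobby, r1), the fore/back queries, and the 0.001 threshold are all hypothetical values chosen for illustration; only relatedness(), min_popularity, and the type:func form come from this issue.

```python
import json


def relatedness_facet(field, fore_param="fore", back_param="back", min_popularity=None):
    """Build a terms facet whose buckets are sorted by a relatedness() aggregation.

    The aggregation uses the extended {"type": "func", ...} form so that
    extra options such as min_popularity can sit next to the function string.
    """
    agg = {"type": "func", "func": f"relatedness(${fore_param},${back_param})"}
    if min_popularity is not None:
        agg["min_popularity"] = min_popularity
    return {
        "type": "terms",
        "field": field,
        "sort": "r1 desc",      # "r1" is a hypothetical label for the aggregation
        "facet": {"r1": agg},
    }


# Hypothetical request: foreground/background queries are passed as params
# and referenced from the function string as $fore and $back.
request_body = {
    "query": "*:*",
    "params": {"fore": "age:[35 TO *]", "back": "*:*"},
    "facet": {"hobby": relatedness_facet("hobby", min_popularity=0.001)},
}

print(json.dumps(request_body, indent=2))
```

Omitting min_popularity in this sketch simply leaves the key out of the aggregation, which is meant to mirror the option being optional.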


> add a "min_popularity" option to relatedness() aggregation that forces scores 
> to -Inf if fg/bg pops don't meet a threshold
> --
>
> Key: SOLR-12581
> URL: https://issues.apache.org/jira/browse/SOLR-12581
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
> Attachments: SOLR-12581.patch, SOLR-12581.patch
>
>
> as discussed in SOLR-9480 and noted in TODO comments, the original SKG code 
> base had a "min_pop" option that would completely ignore "terms" if the fg/bg 
> popularities weren't above a user specified threshold.  with the 
> implementation of SKG as a {{relatedness()}} aggregation function, we need to 
> leave any actual filtering of buckets by an aggregation result to a future 
> generalized JSON facet enhancement, but what we can do today is implement an 
> optional {{min_popularity}} option on {{relatedness()}} that forces the 
> {{relatedness}} score to -Infinity so buckets that don't meet the threshold 
> at least score "as low as possible"
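
The behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual Solr code: the function name, the bucket data, and the 0.005 threshold are made up, and the sketch assumes the threshold counts as unmet when either the foreground or the background popularity falls below it.

```python
import math


def apply_min_popularity(score, fg_pop, bg_pop, min_popularity=0.0):
    """Return the score unchanged, or -Infinity if the threshold is not met.

    Assumption: the threshold is unmet when EITHER popularity is below it.
    A min_popularity of 0.0 disables the check.
    """
    if min_popularity > 0.0 and (fg_pop < min_popularity or bg_pop < min_popularity):
        return -math.inf
    return score


# Hypothetical buckets: (term, raw relatedness score, fg popularity, bg popularity)
buckets = [
    ("golf", 0.42, 0.120, 0.009),
    ("curling", 0.91, 0.001, 0.00003),  # high raw score, but very rare
    ("chess", 0.10, 0.080, 0.070),
]

scored = [
    (term, apply_min_popularity(raw, fg_pop, bg_pop, min_popularity=0.005))
    for term, raw, fg_pop, bg_pop in buckets
]

# Sorting by score descending pushes below-threshold buckets to the end;
# they are NOT removed, since bucket filtering is left to a future enhancement.
ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
for term, score in ranked:
    print(term, score)
```

Note that the rare bucket still appears in the output, only at the bottom, which matches the "score as low as possible" wording rather than true filtering.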



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12581) add a "min_popularity" option to relatedness() aggregation that forces scores to -Inf if fg/bg pops don't meet a threshold

2018-07-25 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556035#comment-16556035
 ] 

ASF subversion and git services commented on SOLR-12581:


Commit cf9c3c11a28deff188f4edb5ee5cdd0637cdb958 in lucene-solr's branch 
refs/heads/branch_7x from Chris Hostetter
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=cf9c3c1 ]

SOLR-12581: the JSON Facet 'relatedness()' aggregate function now supports a 
'min_popularity' option using the extended type:func syntax

(cherry picked from commit 71c0bddd149b7c0364fbba8d31494dcd9f57f1ef)





[jira] [Commented] (SOLR-12581) add a "min_popularity" option to relatedness() aggregation that forces scores to -Inf if fg/bg pops don't meet a threshold

2018-07-23 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553510#comment-16553510
 ] 

Hoss Man commented on SOLR-12581:
-

bq. Sort of a side-question, but this work seems to overlap/complement the 
significantTerms ..

That does seem very tangential, but important to consider – so i spun it off 
into SOLR-12582. One key aspect of my comments there that is directly 
significant to this issue is the question of the name "min_popularity" and 
whether it makes sense to re-think that...

{quote}as far as trying to maintain the same option names – i don't know that 
that is feasible or really makes sense – at least in so much as adding new 
options to relatedness() using the same names as the existing options on 
significantTerms.  notably the existing {{minDocFreq}} option on 
significantTerms is similar _in concept_ to the {{min_popularity}} option 
proposed in SOLR-12581 for relatedness(), but it would not really make sense to 
use {{minDocFreq}} as the option name in SOLR-12581 since the relatedness() 
function isn't tied to "terms" the way significantTerms is – so "docFreq" has 
no real meaning, and the more general "popularity" makes more sense (i suppose 
we could change the option name in significantTerms – but even then 
significantTerms doesn't produce the same concept of "popularity" that 
relatedness() does, and even if it did, because that expression focuses 
exclusively on "terms" the concept of "DocFreq" is very appropriate).
{quote}







[jira] [Commented] (SOLR-12581) add a "min_popularity" option to relatedness() aggregation that forces scores to -Inf if fg/bg pops don't meet a threshold

2018-07-23 Thread Alexandre Rafalovitch (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553241#comment-16553241
 ] 

Alexandre Rafalovitch commented on SOLR-12581:
--

Sort of a side-question, but this work seems to overlap/complement the 
significantTerms work done for streaming/QueryParser: 
[http://lucene.apache.org/solr/guide/7_4/stream-source-reference.html#significantterms]

Are we saying SignificantTerms is for simpler use cases (as fore/back queries 
are corpus-wide) and that users should move to relatedness() for more complex 
analysis? 

Should the options be roughly compatible where it makes sense and/or similarly 
named?

Just wondering because I could see this confusing newbies trying to see when to 
use which option.



