[jira] [Commented] (SOLR-12839) add a 'resort' option to JSON faceting

2018-11-30 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705390#comment-16705390
 ] 

Hoss Man commented on SOLR-12839:
-

{quote}my main objection was going to be the name "approximate" which highly 
suggests that an estimate is fine. "prelim_sort" seems fine.
{quote}
+1 ... i hadn't considered that.  I'm glad i disliked it for other reasons

> add a 'resort' option to JSON faceting
> --
>
> Key: SOLR-12839
> URL: https://issues.apache.org/jira/browse/SOLR-12839
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
> Fix For: master (8.0), 7.7
>
> Attachments: SOLR-12839.patch, SOLR-12839.patch, SOLR-12839.patch, 
> SOLR-12839.patch
>
>
> As discusssed in SOLR-9480 ...
> bq. Similar to how the {{rerank}} request param allows people to collect & 
> score documents using a "cheap" query, and then re-score the top N using a 
> ore expensive query, I think it would be handy if JSON Facets supported a 
> {{resort}} option that could be used on any FacetRequestSorted instance right 
> along side the {{sort}} param, using the same JSON syntax, so that clients 
> could have Solr internaly sort all the facet buckets by something simple 
> (like count) and then "Re-Sort" the top N=limit (or maybe ( 
> N=limit+overrequest ?) using a more expensive function like skg()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12839) add a 'resort' option to JSON faceting

2018-11-30 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705386#comment-16705386
 ] 

ASF subversion and git services commented on SOLR-12839:


Commit 5dc988f5eeff78464d852f54ce7f06a801dcbfee in lucene-solr's branch 
refs/heads/master from Chris Hostetter
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5dc988f ]

SOLR-12839: JSON 'terms' Faceting now supports a 'prelim_sort' option to use 
when initially selecting the top ranking buckets, prior to the final 'sort' 
option used after refinement.


> add a 'resort' option to JSON faceting
> --
>
> Key: SOLR-12839
> URL: https://issues.apache.org/jira/browse/SOLR-12839
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
> Attachments: SOLR-12839.patch, SOLR-12839.patch, SOLR-12839.patch, 
> SOLR-12839.patch
>
>
> As discusssed in SOLR-9480 ...
> bq. Similar to how the {{rerank}} request param allows people to collect & 
> score documents using a "cheap" query, and then re-score the top N using a 
> ore expensive query, I think it would be handy if JSON Facets supported a 
> {{resort}} option that could be used on any FacetRequestSorted instance right 
> along side the {{sort}} param, using the same JSON syntax, so that clients 
> could have Solr internaly sort all the facet buckets by something simple 
> (like count) and then "Re-Sort" the top N=limit (or maybe ( 
> N=limit+overrequest ?) using a more expensive function like skg()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12839) add a 'resort' option to JSON faceting

2018-11-30 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705385#comment-16705385
 ] 

ASF subversion and git services commented on SOLR-12839:


Commit 5763d8b194009cdf602e983d56115be9480f69b8 in lucene-solr's branch 
refs/heads/branch_7x from Chris Hostetter
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5763d8b ]

SOLR-12839: JSON 'terms' Faceting now supports a 'prelim_sort' option to use 
when initially selecting the top ranking buckets, prior to the final 'sort' 
option used after refinement.

(cherry picked from commit 5dc988f5eeff78464d852f54ce7f06a801dcbfee)

Conflicts:
solr/CHANGES.txt


> add a 'resort' option to JSON faceting
> --
>
> Key: SOLR-12839
> URL: https://issues.apache.org/jira/browse/SOLR-12839
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
> Attachments: SOLR-12839.patch, SOLR-12839.patch, SOLR-12839.patch, 
> SOLR-12839.patch
>
>
> As discusssed in SOLR-9480 ...
> bq. Similar to how the {{rerank}} request param allows people to collect & 
> score documents using a "cheap" query, and then re-score the top N using a 
> ore expensive query, I think it would be handy if JSON Facets supported a 
> {{resort}} option that could be used on any FacetRequestSorted instance right 
> along side the {{sort}} param, using the same JSON syntax, so that clients 
> could have Solr internaly sort all the facet buckets by something simple 
> (like count) and then "Re-Sort" the top N=limit (or maybe ( 
> N=limit+overrequest ?) using a more expensive function like skg()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12839) add a 'resort' option to JSON faceting

2018-11-30 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705178#comment-16705178
 ] 

Yonik Seeley commented on SOLR-12839:
-

Yeah, I think this is OK - my main objection was going to be the name 
"approximate" which highly suggests that an estimate is fine. "prelim_sort" 
seems fine.

> add a 'resort' option to JSON faceting
> --
>
> Key: SOLR-12839
> URL: https://issues.apache.org/jira/browse/SOLR-12839
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
> Attachments: SOLR-12839.patch, SOLR-12839.patch, SOLR-12839.patch, 
> SOLR-12839.patch
>
>
> As discusssed in SOLR-9480 ...
> bq. Similar to how the {{rerank}} request param allows people to collect & 
> score documents using a "cheap" query, and then re-score the top N using a 
> ore expensive query, I think it would be handy if JSON Facets supported a 
> {{resort}} option that could be used on any FacetRequestSorted instance right 
> along side the {{sort}} param, using the same JSON syntax, so that clients 
> could have Solr internaly sort all the facet buckets by something simple 
> (like count) and then "Re-Sort" the top N=limit (or maybe ( 
> N=limit+overrequest ?) using a more expensive function like skg()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12839) add a 'resort' option to JSON faceting

2018-11-30 Thread Lucene/Solr QA (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705018#comment-16705018
 ] 

Lucene/Solr QA commented on SOLR-12839:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
40s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | 
{color:green}  1m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | 
{color:green}  1m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | 
{color:green}  1m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate ref guide {color} | 
{color:green}  1m 37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 35m 
57s{color} | {color:green} core in the patch passed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 42m 56s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-12839 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12950083/SOLR-12839.patch |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  
validatesourcepatterns  validaterefguide  |
| uname | Linux lucene1-us-west 4.4.0-137-generic #163~14.04.1-Ubuntu SMP Mon 
Sep 24 17:14:57 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
 |
| git revision | master / 0491623 |
| ant | version: Apache Ant(TM) version 1.9.3 compiled on July 24 2018 |
| Default Java | 1.8.0_191 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-SOLR-Build/237/testReport/ |
| modules | C: solr/core solr/solr-ref-guide U: solr |
| Console output | 
https://builds.apache.org/job/PreCommit-SOLR-Build/237/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> add a 'resort' option to JSON faceting
> --
>
> Key: SOLR-12839
> URL: https://issues.apache.org/jira/browse/SOLR-12839
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
> Attachments: SOLR-12839.patch, SOLR-12839.patch, SOLR-12839.patch, 
> SOLR-12839.patch
>
>
> As discusssed in SOLR-9480 ...
> bq. Similar to how the {{rerank}} request param allows people to collect & 
> score documents using a "cheap" query, and then re-score the top N using a 
> ore expensive query, I think it would be handy if JSON Facets supported a 
> {{resort}} option that could be used on any FacetRequestSorted instance right 
> along side the {{sort}} param, using the same JSON syntax, so that clients 
> could have Solr internaly sort all the facet buckets by something simple 
> (like count) and then "Re-Sort" the top N=limit (or maybe ( 
> N=limit+overrequest ?) using a more expensive function like skg()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12839) add a 'resort' option to JSON faceting

2018-11-26 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699252#comment-16699252
 ] 

Hoss Man commented on SOLR-12839:
-

Pinging [~ysee...@gmail.com] – any thoughts/objections to the idea i mentioned 
on Oct 22 ...
{quote}... maybe the best way to more forward _now_ with a short term 
improvement for the sorting/resorting on an {{expensive_function()}} vs 
{{cheaper_aproximation_of_expensive_function()}} performance hint/optimization 
– in a way that wouldn't hinder us down the road with more full featured bucket 
processing – would be to "invert" the API i proposed/implemented and renamed 
the option...
 * instead of adding a {{resort}} option, add an {{approximate_sort}} option 
and flip the meaning from what i was originally thinking...
{noformat}
json.facet={
  categories : {
type : terms,
field : cat,
limit : 5,
sort : "consumer_value desc",
approximate_sort : "count desc",
facet : { consumer_value : "sum(div(popularity,price))" }
  }
}
{noformat}

 * instead of using {{sort}} in phase#1, and resorting on the {{resort}} option 
after merging, we would use {{approximate_sort}} in phase#1, and {{sort}} in 
phase#2
 * from an end user perspective {{sort}} is the most important thing:
 ** the buckets to be returned will be in {{sort}} order
 ** {{approximate_sort}} is an expert level option only used as an inexpensive 
approximation when picking the buckets to be returned, where the exact number 
of buckets considered in this way is essentially an implementation detail
 * down the road, when we add "tiebreaker" support to "sort type options", it 
should be fairly trivial to support it for both options (with some 
generalization of the helper methods added in the patch)
 * farther down the road, if we want a more robust "sort, resort, process, 
filter the buckets returned" type option/syntax to replace the {{sort}} param – 
then we can add that independent of the {{approximate_sort}} option, and 
{{approximate_sort}} can still be used as a way to indicate an "inexpensive" 
way to initially rank the buckets to consider before any of that new 
functionality.{quote}
...this would be a fairly meaty re-working of the existing patch i've got in 
progress (sort->approx_sort, resort->sort, & flipping the syntax in all the 
test queries) so i'd rather not bite it off if it's something you have 
objections to in principle.

> add a 'resort' option to JSON faceting
> --
>
> Key: SOLR-12839
> URL: https://issues.apache.org/jira/browse/SOLR-12839
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
> Attachments: SOLR-12839.patch, SOLR-12839.patch, SOLR-12839.patch
>
>
> As discusssed in SOLR-9480 ...
> bq. Similar to how the {{rerank}} request param allows people to collect & 
> score documents using a "cheap" query, and then re-score the top N using a 
> ore expensive query, I think it would be handy if JSON Facets supported a 
> {{resort}} option that could be used on any FacetRequestSorted instance right 
> along side the {{sort}} param, using the same JSON syntax, so that clients 
> could have Solr internaly sort all the facet buckets by something simple 
> (like count) and then "Re-Sort" the top N=limit (or maybe ( 
> N=limit+overrequest ?) using a more expensive function like skg()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12839) add a 'resort' option to JSON faceting

2018-10-24 Thread Lucene/Solr QA (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662398#comment-16662398
 ] 

Lucene/Solr QA commented on SOLR-12839:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
32s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | 
{color:green}  3m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | 
{color:green}  3m 46s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} Validate source patterns {color} | 
{color:red}  3m 47s{color} | {color:red} Validate source patterns 
validate-source-patterns failed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 56s{color} 
| {color:red} core in the patch failed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 93m 44s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | solr.search.QueryEqualityTest |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-12839 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12945318/SOLR-12839.patch |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  
validatesourcepatterns  |
| uname | Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP 
Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
 |
| git revision | master / e083b15 |
| ant | version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018 |
| Default Java | 1.8.0_172 |
| Validate source patterns | 
https://builds.apache.org/job/PreCommit-SOLR-Build/211/artifact/out/patch-validate-source-patterns-root.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-SOLR-Build/211/artifact/out/patch-unit-solr_core.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-SOLR-Build/211/testReport/ |
| modules | C: solr/core U: solr/core |
| Console output | 
https://builds.apache.org/job/PreCommit-SOLR-Build/211/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> add a 'resort' option to JSON faceting
> --
>
> Key: SOLR-12839
> URL: https://issues.apache.org/jira/browse/SOLR-12839
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
> Attachments: SOLR-12839.patch, SOLR-12839.patch, SOLR-12839.patch
>
>
> As discusssed in SOLR-9480 ...
> bq. Similar to how the {{rerank}} request param allows people to collect & 
> score documents using a "cheap" query, and then re-score the top N using a 
> ore expensive query, I think it would be handy if JSON Facets supported a 
> {{resort}} option that could be used on any FacetRequestSorted instance right 
> along side the {{sort}} param, using the same JSON syntax, so that clients 
> could have Solr internaly sort all the facet buckets by something simple 
> (like count) and then "Re-Sort" the top N=limit (or maybe ( 
> N=limit+overrequest ?) using a more expensive function like skg()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12839) add a 'resort' option to JSON faceting

2018-10-22 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659854#comment-16659854
 ] 

Hoss Man commented on SOLR-12839:
-

{quote}"foo desc, bar asc 50" was an example of a single sort with tiebreak and 
a limit (no resort). If one wanted a single string version ";" would be the 
divider. For example adding a resort with a tiebreak: "foo desc, bar asc 50; 
baz desc, qux asc 10"
{quote}
Ok ... i realize now that you were discussing 2 diff ideas and giving 2 diff 
examples and i was conflating them – but i'm still not certain what you're 
saying the *behavior* of these examples would be, particulalry because 
(independent of the idea of resorting *AND* independent of the idea of 
supporting tiebreakers on sort/resort syntax) you *ALSO* seem to be suggesting 
a numeric "limit" that would be inlined as part of the sort/resort syntax – and 
this confuses me in 2 orthoginal ways:
 * are you suggesting this would be an alternative for the existing {{limit}} 
param on these facets?
 ** if so, what would be behavior if someone tired to do both? use the "inline 
limit" and use a "limit" param?
 ** if not, then what do you mean by "limit" in the above sentence?
 * assuming you did mean as a replacement/override of the existing {{limit}} 
param, i don't understand your example and what the value add of asking solr to 
resort the "top 10" by criteria "baz desc, qux asc" if we're already returning 
the "top 50"

{quote}If there are use cases for starting with N sorted things and reducing 
that to K with a different sort, then it's just sort of recursive. Why would 
there be use cases for one resort and not two resorts?
 ...
 One use case that comes to mind are stock screens I've seen that consist of 
multiple sorting and "take top N" steps.

Example: Sort by current dividend yield and take the top 100, then sort those 
by low PE and take the top 50, then sort those by total return 1 year and take 
the top 10.
{quote}
...again: if this is a situation where solr is returning the top 100 buckets, 
what's the value add in having solr resort the top 50 (and then the top 10 
again) instead of just letting the client manipulate & re-order those same 
buckets?

I feel like maybe there is a disconnect in the _principle_ of the ideas we are 
discussing?

As I mentioned when i created this issue, the overall goal i'm trying to 
address is to mirror the concept of the "reranking query" at a facet bucket 
level ... for addressing the performance cost of sorting by something 
complex/expensive.
 * Today you can ask solr:
 ** Compute {{expensive_function()}} for every bucket that exists, and sort all 
the buckets by that function – then return the top {{$limit}} buckets"
 * I want to be able to tell solr:
 ** "Compute {{cheaper_aproximation_of_expensive_function()}} for every bucket 
that exists, sort all the buckets by that function, and compute 
{{expensive_function()}} only for the top candidate buckets – then (once 
refinement/merging is complete) resort just the fully populated buckets by 
{{expensive_function()}}

...note in particular that I'm not even suggesting any sort of new 
{{resort_limit}} option or any hard and fast guarantees on the number of 
buckets that are "resorted" – just a way to tell solr "during the first pass, 
you can use this cheap function instead of the final expensive function i 
really care about" ... in essence just a "performance hint" or "save some CPU 
cycles" type feature

What you're describing on the other hand seems to be more akin to a "i want 
specific operations to be performed on my buckets" type feature ... the 
examples you're describing sound almost like a subset of a more robust 
scripting type functionality, or at the very least a multi stage "post 
processing" that might include filtering or collapsing of buckets?

...Lemme come back to this conceptual disconnect in a minute...

{quote}Anyway we don't have to worry about multiple resorts now as long as we 
can unambiguously upgrade if desired later (i.e. whatever the resort spec looks 
like, if we can unambiguously wrap an array around it later and specify 
multiple of them, then we're good)
{quote}
right ... but if you're trying to future proof the API, there's also the 
question of "tiebreakers" look like when using the (existing) JSON object 
syntax for sorting instead of just the shorthand string syntax.

ie, if you completely ignore the concept of "resorting", today we support 
this...
{noformat}
json.facet={
  categories : {
type : terms,
field : cat,
limit : 5,
facet : { x : "sum(div(popularity,price))" },
// can use short hand of "x desc"
sort : { x : desc }, 
  }
}
{noformat}
...and if you assume you want to have multiple tiebreaker sorts then that would 
be something like...
{noformat}
json.facet={
  categories : {
type : terms,
field : cat,
limit : 5,
facet : { x : 

[jira] [Commented] (SOLR-12839) add a 'resort' option to JSON faceting

2018-10-14 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649387#comment-16649387
 ] 

Yonik Seeley commented on SOLR-12839:
-

bq. if "foo desc" is the primary sort, and "bar asc" is the tiebreaker, then 
what is being resorted on?

 "foo desc, bar asc 50" was an example of a single sort with tiebreak and a 
limit (no resort).
If one wanted a single string version ";" would be the divider.  For example 
adding a resort with a tiebreak: "foo desc, bar asc 50; baz desc, qux asc 10"

bq. why/how/when would it make sense to resort multiple times?
If there are use cases for starting with N sorted things and reducing that to K 
with a different sort, then it's just sort of recursive.  Why would there be 
use cases for one resort and not two resorts?

One use case that comes to mind are stock screens I've seen that consist of 
multiple sorting and "take top N" steps.
Example: Sort by current dividend yield and take the top 100, then sort those 
by low PE and take the top 50, then sort those by total return 1 year and take 
the top 10.

bq. or how it could work practically given the 2 phrase nature of distributed 
facet refinement.

Hmm, good point.  Over the long term I'd always imagined the number of phases 
could be variable, so It's more of a current implementation detail (albeit a 
very major one).  It would currently kill the usefulness in distributed though. 
 
Anyway we don't have to worry about multiple resorts now as long as we can 
unambiguously upgrade if desired later (i.e. whatever the resort spec looks 
like, if we can unambiguously wrap an array around it later and specify 
multiple of them, then we're good)

> add a 'resort' option to JSON faceting
> --
>
> Key: SOLR-12839
> URL: https://issues.apache.org/jira/browse/SOLR-12839
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
> Attachments: SOLR-12839.patch, SOLR-12839.patch
>
>
> As discusssed in SOLR-9480 ...
> bq. Similar to how the {{rerank}} request param allows people to collect & 
> score documents using a "cheap" query, and then re-score the top N using a 
> ore expensive query, I think it would be handy if JSON Facets supported a 
> {{resort}} option that could be used on any FacetRequestSorted instance right 
> along side the {{sort}} param, using the same JSON syntax, so that clients 
> could have Solr internaly sort all the facet buckets by something simple 
> (like count) and then "Re-Sort" the top N=limit (or maybe ( 
> N=limit+overrequest ?) using a more expensive function like skg()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12839) add a 'resort' option to JSON faceting

2018-10-13 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649265#comment-16649265
 ] 

Hoss Man commented on SOLR-12839:
-

{quote}We should perhaps think about how to extend to N sorts instead of 2.
{quote}
I'm confused ... why/how/when would it make sense to resort multiple times? Do 
you have an example of what you have in mind?

My usecase/idea here is directly analogous to the "rerank" concept of queries 
and sorting (docs) by score: start with a "simple" query/sort when doing the 
first pass to find matches/buckets, then "rerank/resort only the topN on a more 
complex query/function that you don't want to compute against every doc/bucket.

I can't really think of a situation where it would would be useful to resort 
buckets more then once – or how it could work practically given the 2 phrase 
nature of distributed facet refinement. (as long as we only do 2 phrases, it 
seems like it would just be wasted cycles to do anything but the first and last 
sort?)
{quote}Maybe just append a number to our existing sort syntax, so we would get 
something like "foo desc, bar asc 50" (bar would be a tiebreak in this case)
{quote}
I'm not really understanding what you're suggestion ... can you elaborate on 
what would happen in that example? if "foo desc" is the primary sort, and "bar 
asc" is the tiebreaker, then what is being resorted on? and where does the "50" 
in your example come into play,? (is that suppose to be how many of the top 
buckets we want to resort? the rerank parser has a {{reRankDocs}} local param 
but i didn't really see much point in having that concept for facet resorting 
since the shards will have already computed all the stat values for all the 
buckets once they've been refined, and so why bother artificially limiting the 
number of buckets being considered for resorting beyond that?)
{quote}Also keeping in mind that sort should be able to have tiebreaks someday.
{quote}
...
{quote}Or given that this is just an extension of the sort syntax, it could 
even just go in the "sort" param itself and not bother with "resort"
{quote}
I briefly considered adding resorting as a new option to the sort syntax, but 
quickly dismissed it for the precise reason that you mention: i expect the sort 
option to get more robust, and supporting secondary, tertiary, etc... tie 
breakers - so trying to shoe horn resorting into it as well seemed like a bad 
idea (plus i wanted to ensure that whatever features were added to the sort 
syntax down the road – tie breakers, etc... – would also be supported for 
resorting as well.)

> add a 'resort' option to JSON faceting
> --
>
> Key: SOLR-12839
> URL: https://issues.apache.org/jira/browse/SOLR-12839
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
> Attachments: SOLR-12839.patch, SOLR-12839.patch
>
>
> As discusssed in SOLR-9480 ...
> bq. Similar to how the {{rerank}} request param allows people to collect & 
> score documents using a "cheap" query, and then re-score the top N using a 
> ore expensive query, I think it would be handy if JSON Facets supported a 
> {{resort}} option that could be used on any FacetRequestSorted instance right 
> along side the {{sort}} param, using the same JSON syntax, so that clients 
> could have Solr internaly sort all the facet buckets by something simple 
> (like count) and then "Re-Sort" the top N=limit (or maybe ( 
> N=limit+overrequest ?) using a more expensive function like skg()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12839) add a 'resort' option to JSON faceting

2018-10-13 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649244#comment-16649244
 ] 

Yonik Seeley commented on SOLR-12839:
-

We should perhaps think about how to extend to N sorts instead of 2.
Also keeping in mind that sort should be able to have tiebreaks someday.

Brainstorming syntax:
Maybe just append a number to our existing sort syntax, so we would get 
something like "foo desc, bar asc 50" (bar would be a tiebreak in this case)
So two resorts in a row could be
  "field1 asc 100; field2 desc 10" or a slightly more decomposed array ["field1 
asc 100","field2 desc 10"]
Or given that this is just an extension of the sort syntax, it could even just 
go in the "sort" param itself and not bother with "resort"
sort:"count desc 5" could be a synonym for sort:"count desc",limit:5

It's late and my slides for Activate aren't done take it for what it's 
worth ;-)

> add a 'resort' option to JSON faceting
> --
>
> Key: SOLR-12839
> URL: https://issues.apache.org/jira/browse/SOLR-12839
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
> Attachments: SOLR-12839.patch, SOLR-12839.patch
>
>
> As discusssed in SOLR-9480 ...
> bq. Similar to how the {{rerank}} request param allows people to collect & 
> score documents using a "cheap" query, and then re-score the top N using a 
> ore expensive query, I think it would be handy if JSON Facets supported a 
> {{resort}} option that could be used on any FacetRequestSorted instance right 
> along side the {{sort}} param, using the same JSON syntax, so that clients 
> could have Solr internaly sort all the facet buckets by something simple 
> (like count) and then "Re-Sort" the top N=limit (or maybe ( 
> N=limit+overrequest ?) using a more expensive function like skg()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12839) add a 'resort' option to JSON faceting

2018-10-13 Thread Lucene/Solr QA (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648773#comment-16648773
 ] 

Lucene/Solr QA commented on SOLR-12839:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
21s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | 
{color:green}  3m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | 
{color:green}  3m  5s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} Validate source patterns {color} | 
{color:red}  3m  5s{color} | {color:red} Validate source patterns 
validate-source-patterns failed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 85m  
0s{color} | {color:green} core in the patch passed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 95m 35s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-12839 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943677/SOLR-12839.patch |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  
validatesourcepatterns  |
| uname | Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP 
Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
 |
| git revision | master / 9c8ffab |
| ant | version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018 |
| Default Java | 1.8.0_172 |
| Validate source patterns | 
https://builds.apache.org/job/PreCommit-SOLR-Build/198/artifact/out/patch-validate-source-patterns-root.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-SOLR-Build/198/testReport/ |
| modules | C: solr/core U: solr/core |
| Console output | 
https://builds.apache.org/job/PreCommit-SOLR-Build/198/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> add a 'resort' option to JSON faceting
> --
>
> Key: SOLR-12839
> URL: https://issues.apache.org/jira/browse/SOLR-12839
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
> Attachments: SOLR-12839.patch, SOLR-12839.patch
>
>
> As discusssed in SOLR-9480 ...
> bq. Similar to how the {{rerank}} request param allows people to collect & 
> score documents using a "cheap" query, and then re-score the top N using a 
> ore expensive query, I think it would be handy if JSON Facets supported a 
> {{resort}} option that could be used on any FacetRequestSorted instance right 
> along side the {{sort}} param, using the same JSON syntax, so that clients 
> could have Solr internaly sort all the facet buckets by something simple 
> (like count) and then "Re-Sort" the top N=limit (or maybe ( 
> N=limit+overrequest ?) using a more expensive function like skg()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org