[jira] [Commented] (SOLR-7707) Add StreamExpression Support to RollupStream
[ https://issues.apache.org/jira/browse/SOLR-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14679343#comment-14679343 ]

ASF subversion and git services commented on SOLR-7707:
---
Commit 1694910 from [~joel.bernstein] in branch 'dev/trunk'
[ https://svn.apache.org/r1694910 ]

SOLR-7707: Updated CHANGES.txt

Add StreamExpression Support to RollupStream

Key: SOLR-7707
URL: https://issues.apache.org/jira/browse/SOLR-7707
Project: Solr
Issue Type: Improvement
Components: SolrJ
Reporter: Dennis Gove
Priority: Minor
Fix For: Trunk
Attachments: SOLR-7707.patch, SOLR-7707.patch, SOLR-7707.patch, SOLR-7707.patch

This ticket is to add Stream Expression support to the RollupStream as discussed in SOLR-7560.

Proposed expression syntax for the RollupStream (copied from that ticket)
{code}
rollup(
  someStream(),
  over=fieldA, fieldB, fieldC,
  min(fieldA),
  max(fieldA),
  min(fieldB),
  mean(fieldD),
  sum(fieldC)
)
{code}

This requires making the *Metric types Expressible but I think that ends up as a good thing. Would make it real easy to support other options on metrics like excluding outliers, for example find the sum of values within 3 standard deviations from the mean could be
{code}
sum(fieldC, limit=standardDev(3))
{code}
(note, how that particular calculation could be implemented is left as an exercise for the reader, I'm just using it as an example of adding additional options on a relatively simple metric).

Another option example is what to do with null values. For example, in some cases a null should not impact a mean but in others it should. You could express those as
{code}
mean(fieldA, replace(null, 0))  // replace null values with 0 thus leading to an impact on the mean
mean(fieldA, includeNull=true)  // nulls are counted in the denominator but nothing added to numerator
mean(fieldA, includeNull=false) // nulls neither counted in denominator nor added to numerator
mean(fieldA, replace(null, fieldB), includeNull=true) // if fieldA is null replace it with fieldB, include null fieldB in mean
{code}
so on and so forth.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
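As an illustration of the null-handling options proposed in the ticket description, here is a minimal Java sketch of how the first three `mean(...)` variants could differ in their arithmetic. The class `NullAwareMean` and its `replacement` and `includeNull` parameters are hypothetical names made up for this example. They are not part of SolrJ or of the attached patches, and the `replace(null, fieldB)` variant is not modelled.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not SolrJ code and not taken from the patch. */
final class NullAwareMean {

    /**
     * Computes a mean over values that may contain nulls.
     *
     * @param replacement constant substituted for a null value, or null for no
     *                    substitution (assumed analogue of replace(null, 0))
     * @param includeNull whether a value that is still null counts in the
     *                    denominator (assumed analogue of includeNull=true/false)
     */
    static double mean(List<Double> values, Double replacement, boolean includeNull) {
        double sum = 0.0;
        long count = 0;
        for (Double value : values) {
            // Substitute the replacement only when the value itself is null.
            Double effective = (value != null) ? value : replacement;
            if (effective != null) {
                sum += effective;
                count++;
            } else if (includeNull) {
                // Null counted in the denominator, nothing added to the numerator.
                count++;
            }
        }
        // An empty denominator has no meaningful mean.
        return count == 0 ? Double.NaN : sum / count;
    }

    public static void main(String[] args) {
        List<Double> fieldA = Arrays.asList(2.0, null, 4.0);
        System.out.println(mean(fieldA, 0.0, false));  // replace(null, 0): prints 2.0
        System.out.println(mean(fieldA, null, true));  // includeNull=true: prints 2.0
        System.out.println(mean(fieldA, null, false)); // includeNull=false: prints 3.0
    }
}
```

In this sketch, replacing nulls with 0 and counting nulls in the denominator give the same number. The two options only diverge when the replacement value is non-zero.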
[jira] [Commented] (SOLR-7707) Add StreamExpression Support to RollupStream
[ https://issues.apache.org/jira/browse/SOLR-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613855#comment-14613855 ]

ASF subversion and git services commented on SOLR-7707:
---
Commit 1689168 from [~joel.bernstein] in branch 'dev/trunk'
[ https://svn.apache.org/r1689168 ]

SOLR-7707: Add StreamExpression Support to RollupStream
[jira] [Commented] (SOLR-7707) Add StreamExpression Support to RollupStream
[ https://issues.apache.org/jira/browse/SOLR-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613162#comment-14613162 ]

Joel Bernstein commented on SOLR-7707:
--
No problem, I'm still wrapping up SOLR-744.
[jira] [Commented] (SOLR-7707) Add StreamExpression Support to RollupStream
[ https://issues.apache.org/jira/browse/SOLR-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613374#comment-14613374 ]

Joel Bernstein commented on SOLR-7707:
--
Patch looks great! I switched over the SQLHandler to use Streaming Expressions as the parallel transport format and it's extremely compact compared to string encoded object serialization.
[jira] [Commented] (SOLR-7707) Add StreamExpression Support to RollupStream
[ https://issues.apache.org/jira/browse/SOLR-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612814#comment-14612814 ]

Dennis Gove commented on SOLR-7707:
---
Looks like I cut my branch from trunk before those changes were committed. I'll go through some rebasing tomorrow and post up a new patch. Sorry about that.
[jira] [Commented] (SOLR-7707) Add StreamExpression Support to RollupStream
[ https://issues.apache.org/jira/browse/SOLR-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612810#comment-14612810 ]

Joel Bernstein commented on SOLR-7707:
--
Looks like your patch is a commit or two behind svn trunk. Take a look at:

https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/SQLHandler.java

You'll see it already has the MultipleFieldComparator, StreamComparator incorporated. Wondering if the git repo is falling too far behind.
[jira] [Commented] (SOLR-7707) Add StreamExpression Support to RollupStream
[ https://issues.apache.org/jira/browse/SOLR-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608288#comment-14608288 ]

Joel Bernstein commented on SOLR-7707:
--
Ok, thanks Dennis. I'll take a look at the ParallelStream test.