[ 
https://issues.apache.org/jira/browse/SOLR-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-7707:
------------------------------
    Attachment: SOLR-7707.patch

I found the problem.

There is a test class called CountStream. In some of the test files 
(particularly 
solr/solrj/src/test-files/solrj/solr/collection1/conf/solrconfig-streaming.xml) 
the function name "count" was mapped to that Stream. With the new count metric 
I was also mapping the function name "count" to CountMetric, so the two 
mappings conflicted.

For the moment I have corrected this by renaming CountStream to 
RecordCountStream and commenting out the mapping in the solrconfig-streaming.xml 
file. I chose to change this one because it is a class in the test suite and, 
apparently, is not used outside of testing.

However, this brings up an interesting question: should we allow conflicting 
names across streams and metrics? Right now the mappings from function name to 
Stream and from function name to Metric are stored in the same Map, so we do 
not allow name conflicts; that is, a stream and a metric cannot share the same 
function name. Should we allow that?

I believe the answer, for clarity, is no. If you assign the string "count" to 
CountMetric then you cannot also assign it to CountStream. This lets users 
know what "count(....)" means without having to know the context. For example, 
allowing "count" to map to both could cause confusion in the following, where 
the first count would be a stream and the second a metric:

{code}
rollup(
  count(search(....)),
  min(fieldA),
  count(fieldB)
)
{code}
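
The single-Map behavior described above can be sketched roughly as follows. 
This is a minimal illustration, not the actual SolrJ code: FunctionNameRegistry, 
register, and lookup are hypothetical names, and the two empty classes are 
placeholders standing in for the real CountMetric and RecordCountStream.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry for illustration only; not the SolrJ API.
class FunctionNameRegistry {
    // One map shared by streams and metrics, so each name has a single meaning.
    private final Map<String, Class<?>> functionNames = new HashMap<>();

    // Registers a function name; fails if the name already maps to a different class.
    FunctionNameRegistry register(String name, Class<?> type) {
        Class<?> existing = functionNames.putIfAbsent(name, type);
        if (existing != null && !existing.equals(type)) {
            throw new IllegalArgumentException(
                "Function name '" + name + "' is already mapped to "
                    + existing.getSimpleName());
        }
        return this;
    }

    // Returns the class mapped to the name, or null if the name is unknown.
    Class<?> lookup(String name) {
        return functionNames.get(name);
    }
}

// Placeholder types (assumptions) standing in for the real classes.
class CountMetric {}
class RecordCountStream {}
```

With a registry like this, mapping "count" to CountMetric and then to a stream 
class fails at registration time instead of producing an ambiguous expression 
later.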

> Add StreamExpression Support to RollupStream
> --------------------------------------------
>
>                 Key: SOLR-7707
>                 URL: https://issues.apache.org/jira/browse/SOLR-7707
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrJ
>            Reporter: Dennis Gove
>            Priority: Minor
>         Attachments: SOLR-7707.patch, SOLR-7707.patch
>
>
> This ticket is to add Stream Expression support to the RollupStream as 
> discussed in SOLR-7560.
> Proposed expression syntax for the RollupStream (copied from that ticket)
> {code}
> rollup(
>   someStream(....),
>   over="fieldA, fieldB, fieldC",
>   min(fieldA),
>   max(fieldA),
>   min(fieldB),
>   mean(fieldD),
>   sum(fieldC)
> )
> {code}
> This requires making the *Metric types Expressible, but I think that ends up 
> as a good thing. It would make it easy to support other options on metrics, 
> such as excluding outliers. For example, finding the sum of values within 3 
> standard deviations from the mean could be 
> {code}
> sum(fieldC, limit=standardDev(3))
> {code}
>  (note, how that particular calculation could be implemented is left as an 
> exercise for the reader, I'm just using it as an example of adding additional 
> options on a relatively simple metric).
> Another option example is what to do with null values. For example, in some 
> cases a null should not impact a mean but in others it should. You could 
> express those as
> {code}
> // replace null values with 0, so they affect the mean
> mean(fieldA, replace(null, 0))
> // nulls are counted in the denominator but add nothing to the numerator
> mean(fieldA, includeNull="true")
> // nulls are neither counted in the denominator nor added to the numerator
> mean(fieldA, includeNull="false")
> // if fieldA is null, replace it with fieldB; a null fieldB is still included
> mean(fieldA, replace(null, fieldB), includeNull="true")
> {code}
> so on and so forth.
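
The includeNull semantics in the quoted description can be sketched as below. 
This is only an illustration under assumptions: MeanSketch and its mean method 
are hypothetical and not part of SolrJ, the replace(...) option is not 
modeled, and returning NaN for an empty denominator is my own choice.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of SolrJ.
class MeanSketch {
    // includeNull=true:  nulls count in the denominator but add nothing to the numerator.
    // includeNull=false: nulls are skipped entirely.
    static double mean(List<Double> values, boolean includeNull) {
        double sum = 0.0;
        long count = 0;
        for (Double value : values) {
            if (value == null) {
                if (includeNull) {
                    count++;
                }
                continue;
            }
            sum += value;
            count++;
        }
        // Assumption: an empty denominator yields NaN rather than an exception.
        return count == 0 ? Double.NaN : sum / count;
    }
}
```

For the values 2.0, null, 4.0, this gives 2.0 with includeNull set to true 
(6 / 3) and 3.0 with it set to false (6 / 2).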


