[
https://issues.apache.org/jira/browse/SOLR-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dennis Gove updated SOLR-7707:
------------------------------
Attachment: SOLR-7707.patch
I found the problem.
There is a test class called CountStream. In some of the test files
(particularly
solr/solrj/src/test-files/solrj/solr/collection1/conf/solrconfig-streaming.xml)
the function name "count" was mapped to that Stream. However, now with a count
metric I was also mapping the count function name to CountMetric.
For the moment I have corrected this by renaming CountStream to
RecordCountStream and commenting out the mapping in the
solrconfig-streaming.xml file. I chose to change this one because it is a
class in the test suite and is not, apparently, used outside of testing.
However, this brings up an interesting question: should we allow conflicting
names across streams and metrics? Right now the mappings from function name
to Stream and from function name to Metric are stored in the same Map, so we
do not allow names to conflict - i.e., a stream and a metric cannot share the
same function name. Should we allow that?
I believe the answer, for clarity, is no. If you assign the string "count" to
CountMetric then you cannot also assign it to CountStream. This will allow
users to know what "count(....)" means without having to know the context. For
example, allowing "count" to map to both could result in confusion in the
following:
{code}
rollup(
count(search(....)),
min(fieldA),
count(fieldB)
)
{code}
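To make the single-namespace behaviour concrete, here is a minimal sketch of
a registry that keeps stream and metric names in one Map and rejects a second
binding for the same name. This is only an illustration under my own
assumptions: FunctionRegistry, register and lookup are hypothetical names,
not the actual factory API in the patch, and CountMetric and
RecordCountStream are empty stand-ins for the real classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one shared namespace for stream and metric function
// names. This is NOT the Solr implementation, just an illustration of the rule.
final class FunctionRegistry {
    private final Map<String, Class<?>> functions = new HashMap<>();

    // Binds a function name to a class. A name that is already bound to a
    // different class is rejected, so "count" can only ever mean one thing.
    FunctionRegistry register(String name, Class<?> type) {
        Class<?> existing = functions.putIfAbsent(name, type);
        if (existing != null && !existing.equals(type)) {
            throw new IllegalArgumentException(
                "Function name '" + name + "' is already mapped to "
                    + existing.getSimpleName());
        }
        return this;
    }

    // Returns the class bound to the name, or null if the name is unknown.
    Class<?> lookup(String name) {
        return functions.get(name);
    }
}

// Empty stand-ins for the real metric and test stream classes.
final class CountMetric {}
final class RecordCountStream {}
```

With this, registering "count" for CountMetric and then again for
RecordCountStream throws, while registering RecordCountStream under a
different name (say "recordCount") works fine.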
> Add StreamExpression Support to RollupStream
> --------------------------------------------
>
> Key: SOLR-7707
> URL: https://issues.apache.org/jira/browse/SOLR-7707
> Project: Solr
> Issue Type: Improvement
> Components: SolrJ
> Reporter: Dennis Gove
> Priority: Minor
> Attachments: SOLR-7707.patch, SOLR-7707.patch
>
>
> This ticket is to add Stream Expression support to the RollupStream as
> discussed in SOLR-7560.
> Proposed expression syntax for the RollupStream (copied from that ticket)
> {code}
> rollup(
> someStream(....),
> over="fieldA, fieldB, fieldC",
> min(fieldA),
> max(fieldA),
> min(fieldB),
> mean(fieldD),
> sum(fieldC)
> )
> {code}
> This requires making the *Metric types Expressible, but I think that ends up
> being a good thing. It would make it easy to support other options on
> metrics, like excluding outliers. For example, finding the sum of values
> within 3 standard deviations of the mean could be
> {code}
> sum(fieldC, limit=standardDev(3))
> {code}
> (note, how that particular calculation could be implemented is left as an
> exercise for the reader, I'm just using it as an example of adding additional
> options on a relatively simple metric).
> Another option example is what to do with null values. For example, in some
> cases a null should not impact a mean but in others it should. You could
> express those as
> {code}
> // replace null values with 0, thus leading to an impact on the mean
> mean(fieldA, replace(null, 0))
> // nulls are counted in the denominator but nothing is added to the numerator
> mean(fieldA, includeNull="true")
> // nulls are neither counted in the denominator nor added to the numerator
> mean(fieldA, includeNull="false")
> // if fieldA is null replace it with fieldB, include null fieldB in mean
> mean(fieldA, replace(null, fieldB), includeNull="true")
> {code}
> so on and so forth.
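
As a rough illustration of the includeNull semantics described in the quoted
ticket, here is a small sketch of a mean where nulls either count in the
denominator or are skipped entirely. MeanSketch and its mean method are
hypothetical names of mine and do not correspond to any metric class in the
patch; returning NaN when nothing is counted is also my own assumption.

```java
import java.util.List;

// Hypothetical sketch of the includeNull option on a mean metric.
final class MeanSketch {
    // includeNull=true:  a null counts in the denominator and adds nothing
    //                    to the numerator.
    // includeNull=false: a null is skipped entirely.
    // Returns NaN when no values were counted (an assumption of this sketch).
    static double mean(List<Double> values, boolean includeNull) {
        double sum = 0.0;
        long count = 0;
        for (Double value : values) {
            if (value == null) {
                if (includeNull) {
                    count++;
                }
                continue;
            }
            sum += value;
            count++;
        }
        return count == 0 ? Double.NaN : sum / count;
    }
}
```

For the values 2.0, null, 4.0 this gives 6/3 with includeNull=true and 6/2
with includeNull=false.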