[ https://issues.apache.org/jira/browse/SOLR-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dennis Gove updated SOLR-7525:
------------------------------
Attachment: SOLR-7525.patch
This patch rebases off of trunk and adds a DistinctOperation for use in the
ReducerStream. The DistinctOperation ensures that only a single tuple is
returned for any given group. It currently returns the first tuple in the
group; a possible future enhancement is a parameter that selects some other
tuple in the group (such as the first in a sub-sorted list).
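The first-tuple behavior described above can be sketched roughly as follows.
This is an illustration only, not the code in the attached patch: the class
and method names (DistinctSketch, FirstTupleOperation, operate, reduce,
distinct, tuple) are hypothetical, tuples are modeled as plain maps rather
than Solr Tuple objects, and the input is assumed to arrive already sorted
on the grouping field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch; none of these names are taken from the Solr code base.
final class DistinctSketch {

    /** Per-group operation that remembers only the first tuple it is given. */
    static final class FirstTupleOperation {
        private Map<String, Object> first; // null until the group has a tuple

        void operate(Map<String, Object> tuple) {
            if (first == null) {
                first = tuple; // later tuples in the same group are ignored
            }
        }

        Map<String, Object> reduce() {
            Map<String, Object> result = first;
            first = null; // reset so the operation can serve the next group
            return result;
        }
    }

    /** Emits one tuple per group; assumes the input is sorted on groupField. */
    static List<Map<String, Object>> distinct(List<Map<String, Object>> sorted,
                                              String groupField) {
        List<Map<String, Object>> out = new ArrayList<>();
        FirstTupleOperation op = new FirstTupleOperation();
        Object currentKey = null;
        boolean groupOpen = false;
        for (Map<String, Object> t : sorted) {
            Object key = t.get(groupField);
            if (groupOpen && !Objects.equals(key, currentKey)) {
                out.add(op.reduce()); // group boundary: flush the previous group
            }
            op.operate(t);
            currentKey = key;
            groupOpen = true;
        }
        if (groupOpen) {
            out.add(op.reduce()); // flush the final group
        }
        return out;
    }

    /** Small helper that builds a two-field tuple for the demo. */
    static Map<String, Object> tuple(String id, int n) {
        Map<String, Object> t = new HashMap<>();
        t.put("id", id);
        t.put("n", n);
        return t;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> in = Arrays.asList(
            tuple("a", 1), tuple("a", 2), tuple("b", 3));
        System.out.println(distinct(in, "id"));
    }
}
```

Supporting "some other tuple in the group" would, in this sketch, amount to
replacing the body of operate() with a different selection rule.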
Also, while implementing this I realized that the UniqueStream can be
refactored to be just a type of ReducerStream with DistinctOperation. That
change is not included in this patch but will be done under a separate ticket.
Also of note, I'm not sure whether the getChildren() function declared in
TupleStream is still necessary. If I recall correctly, that function was used
by the StreamHandler when passing streams to workers. Since that has been
changed to pass the result of toExpression(....), I think we can get rid of
getChildren(). I will explore that possibility.
> Add ComplementStream to the Streaming API and Streaming Expressions
> -------------------------------------------------------------------
>
> Key: SOLR-7525
> URL: https://issues.apache.org/jira/browse/SOLR-7525
> Project: Solr
> Issue Type: New Feature
> Components: SolrJ
> Reporter: Joel Bernstein
> Priority: Minor
> Attachments: SOLR-7525.patch, SOLR-7525.patch
>
>
> This ticket adds a ComplementStream to the Streaming API and Streaming
> Expression language.
> The ComplementStream will wrap two TupleStreams (StreamA, StreamB) and emit
> Tuples from StreamA that are not in StreamB.
> Streaming API Syntax:
> {code}
> ComplementStream cstream = new ComplementStream(streamA, streamB, comp);
> {code}
> Streaming Expression syntax:
> {code}
> complement(search(...), search(...), on(...))
> {code}
> Internal implementation will rely on the ReducerStream. The ComplementStream
> can be parallelized using the ParallelStream.
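The complement semantics in the description above (emit what is in StreamA
but not in StreamB) can be illustrated with a plain sorted merge. This is a
sketch under stated assumptions, not the ReducerStream-based implementation
the ticket proposes: ComplementSketch and complement are hypothetical names,
the streams are modeled as in-memory lists of comparable keys, and both
inputs are assumed to be sorted ascending on the same key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; it does not use the Solr Streaming API.
final class ComplementSketch {

    /**
     * Returns the elements of a whose keys do not occur in b.
     * Both lists must be sorted ascending.
     */
    static <T extends Comparable<T>> List<T> complement(List<T> a, List<T> b) {
        List<T> out = new ArrayList<>();
        int j = 0; // cursor into b; only ever moves forward
        for (T x : a) {
            // Skip entries of b that sort before the current element of a.
            while (j < b.size() && b.get(j).compareTo(x) < 0) {
                j++;
            }
            boolean inB = j < b.size() && b.get(j).compareTo(x) == 0;
            if (!inB) {
                out.add(x);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(
            complement(Arrays.asList(1, 2, 2, 3, 5), Arrays.asList(2, 4, 5)));
    }
}
```

Because each input is read once and in order, the same idea carries over to
streams that cannot be held in memory, provided both are sorted on the
comparison key.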