[
https://issues.apache.org/jira/browse/SOLR-8281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joel Bernstein updated SOLR-8281:
---------------------------------
Description:
The RollupMergeStream merges the aggregate results emitted by the RollupStream
on *worker* nodes.
It is designed to be used in conjunction with the HashJoinStream to perform
rollup aggregations on the joined tuples. The HashJoinStream requires the
tuples to be partitioned on the join keys. To avoid repartitioning on
the *group by* fields for the RollupStream, we can instead merge the
rolled-up tuples coming from the workers.
The construct would look like this:
{code}
mergeRollup (...
  parallel (...
    rollup (...
      hashJoin (
        search(...),
        search(...),
        on="fieldA"
      )
    )
  )
)
{code}
The pseudo code above would push the *hashJoin* and *rollup* to the *worker*
nodes. The emitted rolled up tuples would be merged by the mergeRollup.
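To illustrate why the merge step is needed, here is a minimal Python sketch of the idea, not the Solr implementation. It assumes the joined tuples are partitioned on the join key (fieldA), so a single group value can show up on more than one worker. The names rollup, merge_rollup, fieldB and price are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def rollup(tuples, group_field, value_field):
    """Per-worker rollup: partial sum and count of value_field per group."""
    acc = defaultdict(lambda: {"sum": 0, "count": 0})
    for t in tuples:
        bucket = acc[t[group_field]]
        bucket["sum"] += t[value_field]
        bucket["count"] += 1
    return [
        {group_field: key, "sum": agg["sum"], "count": agg["count"]}
        for key, agg in acc.items()
    ]


def merge_rollup(worker_outputs, group_field):
    """Combine partial aggregates from all workers that share a group key."""
    merged = defaultdict(lambda: {"sum": 0, "count": 0})
    for partials in worker_outputs:
        for p in partials:
            m = merged[p[group_field]]
            m["sum"] += p["sum"]
            m["count"] += p["count"]
    return {key: dict(agg) for key, agg in merged.items()}


# Joined tuples, partitioned by the join key fieldA. The group-by field
# fieldB is NOT the partition key, so group "x" lands on both workers.
worker_1 = [
    {"fieldA": 1, "fieldB": "x", "price": 10},
    {"fieldA": 1, "fieldB": "y", "price": 5},
]
worker_2 = [
    {"fieldA": 2, "fieldB": "x", "price": 7},
]

partials = [rollup(w, "fieldB", "price") for w in (worker_1, worker_2)]
merged = merge_rollup(partials, "fieldB")
```

Sums and counts can be merged by simple addition. An average would have to be derived from the merged sum and count rather than by averaging the per-worker averages.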
was:
The RollupMergeStream merges the aggregate results emitted by the RollupStream
on *worker* nodes.
This is designed to be used in conjunction with the HashJoinStream to perform
rollup Aggregations on the joined Tuples. The HashJoinStream will require the
tuples to be partitioned on the Join keys. To avoid needing to repartition on
the *group by* fields for the RollupStream, we can perform a merge of the
rolled up Tuples coming from the workers.
The construct would look like this:
{code}
mergeRollup (...
parallel (...
hashJoin (
search(...),
search(...),
on="fieldA"
)
)
)
{code}
> Add RollupMergeStream to Streaming API
> --------------------------------------
>
> Key: SOLR-8281
> URL: https://issues.apache.org/jira/browse/SOLR-8281
> Project: Solr
> Issue Type: Bug
> Reporter: Joel Bernstein
>