[ https://issues.apache.org/jira/browse/SAMZA-355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jakob Homan updated SAMZA-355:
------------------------------
Attachment: SAMZA-355.patch
Patch that implements the above-described grouper, post SAMZA-123.
> Provide a SystemStreamPartitionGrouper that groups into N sets
> --------------------------------------------------------------
>
> Key: SAMZA-355
> URL: https://issues.apache.org/jira/browse/SAMZA-355
> Project: Samza
> Issue Type: Task
> Components: container
> Affects Versions: 0.8.0
> Reporter: Jakob Homan
> Attachments: SAMZA-355.patch
>
>
> As part of SAMZA-123, it was proposed to provide an SSPGrouper that hashes
> the SSPs into a fixed number of TaskNames. This would provide ungrouped
> behavior similar to that of the GroupByPartitionGrouper, but with only n
> TaskInstances created rather than one for each SSP. If n is tied to the
> number of containers, this is a conceptually simple way of processing
> large numbers of SSPs that need no co-grouping.
> As long as n is not changed, each SSP is guaranteed to hash to the same
> TaskName on every run.
> There was some concern over whether this is the correct approach, so it
> was agreed to post the patch separately, making it available to the
> community for further discussion.
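
For readers following the thread, the hashing idea described above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the code in SAMZA-355.patch: the function name group_into_n, the "Task-<i>" naming scheme, the tuple representation of an SSP, and the choice of CRC32 as the hash are all assumptions made for the example. The one property it tries to show is the one the description relies on: with a hash that is stable across processes and a fixed n, each SSP lands in the same bucket on every run.

```python
import zlib
from collections import defaultdict


def group_into_n(ssps, n):
    """Assign each (system, stream, partition) tuple to one of n task names.

    Hypothetical helper for illustration only; not Samza's API.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    groups = defaultdict(set)
    for system, stream, partition in ssps:
        key = f"{system}.{stream}.{partition}".encode("utf-8")
        # CRC32 is deterministic across processes, unlike Python's built-in
        # hash() on strings, which is randomized per interpreter run.
        bucket = zlib.crc32(key) % n
        groups[f"Task-{bucket}"].add((system, stream, partition))
    return dict(groups)


# Example: 16 partitions of one stream spread over at most 4 task names.
ssps = [("kafka", "PageView", p) for p in range(16)]
assignment = group_into_n(ssps, 4)
for task_name in sorted(assignment):
    print(task_name, sorted(assignment[task_name]))
```

Note that changing n changes the modulus, so most SSPs would move to a different task name, which is why the description requires n to stay fixed between runs. Nothing here guarantees an even spread either; some task names may receive more SSPs than others, or none at all.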
--
This message was sent by Atlassian JIRA
(v6.2#6252)