[
https://issues.apache.org/jira/browse/SAMZA-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986719#comment-13986719
]
Jay Kreps commented on SAMZA-123:
---------------------------------
So to "checkpoint" my comments:
- I am violently pro task id/name :-)
- I see the possible value of being able to use semantics of the topic name to
override the grouping as in the datacenter example, so I don't really object to
making the grouping pluggable.
- My real concern is around safety and semantics if the set of tasks changes. If
we introduce options that, if set incorrectly, corrupt your output, I think
people will tend to hurt themselves (and then we will spend a ton of time
trying to debug). So I would like to think through the strongest validations we
can do and how those would work. I think validation should be part of this
feature, though it need not be part of this patch.
- I agree that we need to think through the checkpointing/config log stuff.
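
To make the validation concern concrete, here is a minimal sketch of one possible check: compare the task-to-input assignment recorded on a previous run against the one computed now, and report every difference before any task starts. The comment does not say which validations Samza should perform or where the previous assignment would be stored, so the names `Assignment` and `validate_assignment` and the idea of a recorded previous assignment are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical representation: task name -> set of (topic, partition) inputs.
Assignment = Dict[str, FrozenSet[Tuple[str, int]]]


def validate_assignment(previous: Assignment, current: Assignment) -> List[str]:
    """Return a description of every difference between two assignments.

    An empty list means the set of tasks and their inputs are unchanged.
    """
    problems: List[str] = []
    # Tasks that existed before but are missing now.
    for task in sorted(previous.keys() - current.keys()):
        problems.append(f"task {task!r} no longer exists")
    # Tasks that appear now but have no recorded history.
    for task in sorted(current.keys() - previous.keys()):
        problems.append(f"task {task!r} is new")
    # Tasks present in both runs whose inputs differ.
    for task in sorted(previous.keys() & current.keys()):
        if previous[task] != current[task]:
            problems.append(f"task {task!r} has a different set of inputs")
    return problems
```

A job could refuse to start when the returned list is non-empty, which turns a silent output corruption into an explicit startup failure.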
> Move topic partition grouping to the AM and generalize
> ------------------------------------------------------
>
> Key: SAMZA-123
> URL: https://issues.apache.org/jira/browse/SAMZA-123
> Project: Samza
> Issue Type: Sub-task
> Components: container
> Affects Versions: 0.6.0
> Reporter: Jakob Homan
> Assignee: Jakob Homan
> Attachments: SAMZA-123-design-doc.md, SAMZA-123-design-doc.pdf
>
>
> Currently the AM sends a set of all the topics and partitions to the
> container, which then groups them by partition and assigns each set to a task
> instance. By moving the grouping to the AM, we can assign arbitrary groups to
> task instances, which will allow more partitioning strategies, as discussed
> in SAMZA-71.
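
The difference between grouping by partition and a pluggable grouping can be sketched as follows. This is only an illustration, not Samza's API: `TopicPartition`, `Grouper`, `assign`, and both grouping functions are hypothetical names, and the "datacenter prefix before the first dot" topic naming is an assumed convention chosen to mirror the datacenter example mentioned in the comment.

```python
from collections import defaultdict
from typing import Callable, Dict, FrozenSet, Iterable, NamedTuple


class TopicPartition(NamedTuple):
    topic: str
    partition: int


# Hypothetical plug-in point: a grouper maps one input to a task name.
Grouper = Callable[[TopicPartition], str]


def group_by_partition(tp: TopicPartition) -> str:
    """Behavior described in the issue: one task per partition number."""
    return f"partition-{tp.partition}"


def group_by_prefix_and_partition(tp: TopicPartition) -> str:
    """Illustrative alternative: the topic prefix also separates tasks.

    Assumes topics are named like "dc1.clicks"; a topic without a dot
    uses its whole name as the prefix.
    """
    prefix = tp.topic.split(".", 1)[0]
    return f"{prefix}-partition-{tp.partition}"


def assign(inputs: Iterable[TopicPartition],
           grouper: Grouper) -> Dict[str, FrozenSet[TopicPartition]]:
    """Group all inputs into tasks according to the supplied grouper."""
    groups = defaultdict(set)
    for tp in inputs:
        groups[grouper(tp)].add(tp)
    return {name: frozenset(members) for name, members in groups.items()}
```

With inputs `dc1.clicks` partition 0, `dc2.clicks` partition 0, and `dc1.clicks` partition 1, `group_by_partition` yields two tasks, while `group_by_prefix_and_partition` yields three because the two datacenters are kept apart.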