[ 
https://issues.apache.org/jira/browse/SAMZA-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983115#comment-13983115
 ] 

Martin Kleppmann commented on SAMZA-123:
----------------------------------------

I'm aware that adding another opinion here risks "design by committee" and just 
dragging things out further, so please feel free to ignore me. That said, FWIW 
in case it's useful:

* Personally I think the term "cohort" is fine; I agree with Jakob about 
preferring a term that doesn't already have other meanings. If there is 
objection to the term "cohort" specifically, how about "shard"? I don't think 
that's used anywhere in Kafka or YARN. The word is unfortunately a bit 
[tainted|http://www.mongodb-is-web-scale.com/] by MongoDB, but apart from that 
I think it has the right connotations.
* I didn't see anyone pick up [~jkreps]'s suggestion of storing the SSP/cohort 
mapping in ZooKeeper, but I think it would be worth considering. Kafka already 
requires ZK, so it wouldn't be new infrastructure (just a new library 
dependency in the AM). ZK would likely be too expensive for frequently changing 
things like checkpointed offsets, but is probably OK for rarely changing things 
like configuration and the SSP/cohort mapping.
* No strong opinions about backwards compatibility at this stage.

> Move topic partition grouping to the AM and generalize
> ------------------------------------------------------
>
>                 Key: SAMZA-123
>                 URL: https://issues.apache.org/jira/browse/SAMZA-123
>             Project: Samza
>          Issue Type: Sub-task
>          Components: container
>    Affects Versions: 0.6.0
>            Reporter: Jakob Homan
>            Assignee: Jakob Homan
>         Attachments: SAMZA-123-design-doc.md, SAMZA-123-design-doc.pdf
>
>
> Currently the AM sends a set of all the topics and partitions to the 
> container, which then groups them by partition and assigns each set to a task 
> instance. By moving the grouping to the AM, we can assign arbitrary groups to 
> task instances, which will allow more partitioning strategies, as discussed 
> in SAMZA-71.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
