[jira] [Commented] (SAMZA-123) Move topic partition grouping to the AM and generalize

Jay Kreps (JIRA) Wed, 30 Apr 2014 11:29:18 -0700

    [ 
https://issues.apache.org/jira/browse/SAMZA-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985879#comment-13985879
 ]


Jay Kreps commented on SAMZA-123:
---------------------------------

Let me also try to explain my objection to the groupByN strategy or any 
strategy that mutates the set of tasks.

The model I think we should be able to give is a state machine model. I.e. 
input comes in a deterministic manner to a deterministic task. In cases where 
we have multiple inputs we just need to stitch them together in a deterministic 
or repeatable manner to make this work.

The advantages of this model are
1. Allows the maintenance of state along with each task.
2. Allows a fault tolerance model where we eliminate duplicate output from the 
task to provide exact semantics.

If the identity of our tasks change in the wrong way, the right way to split up 
state (or whether that can be done at all) somewhat disappears.

Likewise the ability to suppress duplicate output is based on a notion of 
identity for the task that will give wrong answers if the inputs to that task 
are dramatically changed.

I think these are two really core and important features so I just really 
really want to make sure we think through the implications of any change here 
holistically.

What this raises though is something that we haven't really addressed head on 
in the current model, which is what kinds of partition assignment changes are 
actually okay?

Partition assignment changes can happen in only two cases: when a topic is 
added or deleted.

Interesting these cases both seem to work out okay. The semantics of adding a 
topic are that you add those partitions to the appropriate tasks and they start 
processing from the beginning. It is essentially the same as if they had always 
been assigned those partitions but data was extremely delayed in arrival. The 
delete case is also okay I think. In this case the effect is as if data just 
stops arriving from that topic.

What will certainly break things would be a case where a partition moves from 
one task to another or the total set of tasks changes. I actually really think 
the right thing to do is just disallow those kinds of changes. I agree that 
sometimes you may be able to reason that it is okay, but I think it is more 
dangerous than useful.

The GroupByN strategy does fix a real problem at LinkedIn for large jobs where 
we records metrics at the task level. But this problem is so specific to our 
environment that I really really would rather not bake that constraint into the 
system (the LinkedIn ingraphs people are actually trying to fix this anyway so 
the problem may disappear in the next few months).

Do we have other ideas for grouping strategies other than GroupBySSP and 
GroupByPartition? If we were able to otherwise scale the task count I think 
these are the only two semantically different cases I can think of. I would be 
most comfortable with an implementation that just exposed something like
{code}
  co.partition.streams=true
{code}
although I agree that is more limited.

> Move topic partition grouping to the AM and generalize
> ------------------------------------------------------
>
>                 Key: SAMZA-123
>                 URL: https://issues.apache.org/jira/browse/SAMZA-123
>             Project: Samza
>          Issue Type: Sub-task
>          Components: container
>    Affects Versions: 0.6.0
>            Reporter: Jakob Homan
>            Assignee: Jakob Homan
>         Attachments: SAMZA-123-design-doc.md, SAMZA-123-design-doc.pdf
>
>
> Currently the AM sends a set of all the topics and partitions to the 
> container, which then groups them by partition and assigns each set to a task 
> instance. By moving the grouping to the AM, we can assign arbitrary groups to 
> task instances, which will allow more partitioning strategies, as discussed 
> in SAMZA-71.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (SAMZA-123) Move topic partition grouping to the AM and generalize

Reply via email to