Hey Zdenek,

> Partition splitting or repartitioning is not supported.

This statement is too strong. Splitting, as you describe it, can be done: stop the job, add more partitions to the topic, then start it again.

> Why is that? Do you know internal reasons?

The reason we make this statement is that adding partitions can have a surprising effect on your Samza job's output in cases where your job is stateful, or relies on semantic partitioning (by key). If you have 4 partitions and are partitioning by member ID, then member ID 4 will always end up on one of the partitions (say, partition 0). If you add more partitions to the topic, new messages for member ID 4 might end up in a different partition (say, partition 4). This is because we hash-mod the key to determine partition assignment; if the mod (the number of partitions) changes, the partition assignment will change.
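To make the hash-mod arithmetic concrete, here is a tiny illustrative sketch. It treats the member ID itself as the key; a real partitioner hashes the key bytes first, but it is the mod step that moves keys around when the partition count changes:

    // Illustrative only: hash-mod assignment, using the member ID as the key.
    public class PartitionAssignment {
        static int partitionFor(int memberId, int numPartitions) {
            return memberId % numPartitions;
        }

        public static void main(String[] args) {
            System.out.println(partitionFor(4, 4)); // 0 -> the task holding the existing count
            System.out.println(partitionFor(4, 8)); // 4 -> a new task with no count for member ID 4
        }
    }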
If you have state, say you're counting the number of messages per member ID, then the task that is processing partition 0 will suddenly stop seeing any new messages for member ID 4. The new task, which would be created to process partition 4, will start seeing messages for member ID 4, but will have no count state associated with the ID, so it will start from 0. In essence, your state is trashed.

So, when we say we don't support splitting or repartitioning, all we're really saying is that we provide no facility for you to maintain your state when you add more partitions to your topic. If you're OK with that, or if your job is stateless, then you're fine.

> How do you scale without possibility to change partition's count?

The input topic's partition count provides an upper bound on parallelism, but it is not the only thing that dictates scaling. By default, every input partition gets its own StreamTask to process its messages. These tasks are grouped together into "containers", which is basically a way of saying that several StreamTask objects all run within a single Java process. Taking the example above, if you had 4 partitions but only one container, then all four StreamTasks are processing messages in just one Java process, on one machine. If you want to scale out, you add more containers (yarn.container.count). In our example, you can add up to four; adding a fifth makes no sense, because you have only four input partitions, so the fifth container would do nothing.
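As a rough sketch of what that looks like (the job name, task class, and topic below are made-up placeholders, not from a real job), the scale-out step is just a config change in the job's .properties file; yarn.container.count is the knob:

    # Hypothetical config for the 4-partition example above.
    job.factory.class=org.apache.samza.job.yarn.YarnJobFactory
    job.name=member-id-counter
    task.class=samza.examples.MemberIdCounterTask
    task.inputs=kafka.member-events
    # Raise this to scale out; values above the input partition count (4 here)
    # buy nothing, because the extra containers would get no tasks.
    yarn.container.count=4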
A lot of distributed systems work this way. The key is to "over-partition" your topics. Partitions are relatively cheap in Kafka, though not entirely free. If you pick a reasonably high partition count, you should be able to run your job for several years without having to worry about running out of capacity. When you do run out, you have two options: 1) expand the topic and trash any state that you had, or 2) manually migrate the state to a second job with the proper partition count, then start the job with the new repartitioned input topic.

> When we kill Samza job and run it again (the same partitions count),
> will Samza continue where it stopped?

Yes.

> why we can not kill Samza job, increase the partition's count and then
> rerun the same Samza job again?

You can, but doing so might "break" your state (see the first comment above).

Cheers,
Chris

On 12/18/14 12:29 AM, "[email protected]" <[email protected]> wrote:

>Hi,
>
>The Samza documentation says:
>"Samza currently assumes that a stream's partition count will never
>change. Partition splitting or repartitioning is not supported."
>
>Why is that? Do you know internal reasons?
>How do you scale without possibility to change partition's count?
>When we kill Samza job and run it again (the same partitions count), will
>Samza continue where it stopped?
>If YES, why we can not kill Samza job, increase the partition's count and
>then rerun the same Samza job again?
>
>Thanks for reply.
>
>Zdenek Tison
