Hi, Jan,

Thanks for the reply. A few more comments inlined below.

On Fri, Mar 23, 2018 at 7:15 AM, Jan Filipiak <jan.filip...@trivago.com>
wrote:

>
>> I agree that decoupling the number of tasks in a consumer group from the
>> number of partitions in the input topic is a good idea. This allows each
>> consumer group to change its degree of parallelism independently. There
>> are
>> different ways to achieve such decoupling. You suggested one approach,
>> which works when splitting partitions. Not sure if this works in general,
>> for example when merging partitions.
>>
> Merging is the same.
>
> topicOldMapping-0 : 25 => [0:50]
> topicOldMapping-1 : 25 => [1:50]
> topicOldMapping-2 : 15 => [0:50]
> topicOldMapping-3 : 45 => [1:50]
>
> Generating this mapping is trivial again if we do not copy data (end
> offsets point to start offsets).
> If we do copy data, the consumer will create the mapping.
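
[A minimal sketch of how a mapping with the shape Jan shows above could be generated. The function name, the argument names, and the tuple layout are all hypothetical; the only point is that each old partition's end offset points at a start offset in the new partition it folds into, so no data needs to be copied.]

```python
def merge_mapping(old_end_offsets, merge_into, new_start_offsets):
    """Build an old->new mapping for a partition merge without copying data.

    old_end_offsets:   end offset of each old partition
    merge_into:        new partition index each old partition folds into
    new_start_offsets: offset in the new partition where the old
                       partition's data logically begins
    """
    mapping = {}
    for old_p, end in enumerate(old_end_offsets):
        new_p = merge_into[old_p]
        mapping[f"topicOldMapping-{old_p}"] = (end, new_p, new_start_offsets[old_p])
    return mapping

# Reproduces the shape of the example above: 4 old partitions merged into 2.
m = merge_mapping(
    old_end_offsets=[25, 25, 15, 45],
    merge_into=[0, 1, 0, 1],
    new_start_offsets=[50, 50, 50, 50],
)
for name, (end, new_p, start) in m.items():
    print(f"{name} : {end} => [{new_p}:{start}]")
```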


For merging partitions, I was thinking of the following case. Suppose that
the input topic has 4 partitions, the consumer group has 4 instances, and
each instance is assigned 1 partition. Now, if we want to merge the 4
partitions into 2 partitions, we can't assign the 2 partitions to the 4
instances without changing the key space that each instance covers. To
avoid forcing the consumers to rebuild their states immediately, a more
general option is to just let the consumer re-shuffle the input.
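
[A toy sketch of the re-shuffle idea: 2 merged partitions feeding 4 tasks, with each record routed by key so that every task keeps the same key space it covered under the old 4-partition layout. The record shape is made up, and crc32 stands in for Kafka's murmur2 partitioner.]

```python
import zlib

NUM_TASKS = 4  # tasks kept from the old 4-partition layout

def task_for(key):
    # Deterministic key -> task routing; the same key always lands on the
    # same task, no matter which of the 2 merged partitions carries it.
    # (crc32 is illustrative; Kafka's default partitioner uses murmur2.)
    return zlib.crc32(key.encode()) % NUM_TASKS

# Records as they arrive from the 2 merged partitions: (key, value, partition)
records = [("k1", 1, 0), ("k2", 2, 1), ("k3", 3, 0)]

by_task = {t: [] for t in range(NUM_TASKS)}
for key, value, _src_partition in records:
    by_task[task_for(key)].append((key, value))
```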


>
>
>
>> Once a consumer group decides to change its degree of parallelism, it
>> needs
>> to rebuild the local state for the new tasks. We can either rebuild the
>> states from the input topic or from the local state (through change
>> capture
>> topic). I think we agreed that rebuilding from the local state is more
>> efficient. It also seems that it's better to let the KStreams library do
>> the state rebuilding, instead of each application. Would you agree?
>>
> If you want to do it by RPC, you are changing the running application
> without having a new good one. This is against the kappa architecture;
> I would not recommend that.
>
> If you replay the changelog and only poll records that are for your
> partition, you have the problem of knowing which offset from the input
> topic your current state relates to.
>
>
Hmm, my understanding is that the changelog data always reflects the state
up to the committed offset in the input topic. So, when rebuilding the
state, you keep reading from the current changelog and pulling the keys
belonging to the new task. When you have almost caught up on the changelog
topic, you pause the input consumption, drain the remaining data in the
changelog, and then resume the consumption. At this point, the new local
state should match the committed offset in the input topic.
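
[The catch-up/drain protocol described above can be sketched against an in-memory list standing in for the changelog topic. The function name, the `near_end` threshold, and the pause step are all hypothetical, not KStreams API.]

```python
def rebuild_state(changelog, owned_keys, near_end=2):
    """Replay changelog entries (key, value), keeping only owned keys."""
    state = {}
    pos = 0
    # Phase 1: catch up while the input topic is still being consumed
    # (and the changelog may still be growing).
    while len(changelog) - pos > near_end:
        key, value = changelog[pos]
        if key in owned_keys:
            state[key] = value
        pos += 1
    # Phase 2: input consumption is paused here (not modeled); drain the
    # tail so the state matches the last committed input offset exactly.
    while pos < len(changelog):
        key, value = changelog[pos]
        if key in owned_keys:
            state[key] = value
        pos += 1
    return state

log = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]
print(rebuild_state(log, owned_keys={"a", "c"}))  # → {'a': 3, 'c': 4}
```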



> If you rebuild you could leave the old running and just wait for the new
> to be _good_ then change who's output you
> show to the customers.
>
>
>> About the special consumer. I guess that's only needed if one wants to
>> recopy the existing data? However, if we can truly decouple the tasks in
>> the consumer group from the partitions in the input topic, it seems there
>> is no need for copying existing data? It's also not clear to me who will
>> start/stop this special consumer.
>>
> Who starts and stops it is also not very clear to me. I do not have strong
> opinions.
> The thing is that I am looking for an explanation of how you can have a
> log-compacted topic working without copying.
> I agree that for running consumers it's no problem, as they are already
> past the history. But the whole purpose of log compaction is to be able
> to bootstrap new consumers. They are completely lost with a topic
> expanded without repartitioning.
> The topic will be broken forever, and this is not acceptable.
>
>
Yes, one option here is to always reshuffle the compacted topic in the
consumer until the number of partitions matches the number of tasks. One
still pays the copying overhead, but only when it's needed, and it doesn't
affect the brokers.
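
[A sketch of that consumer-side bootstrap: a new consumer reads all existing partitions of the compacted topic, keeps the latest value per key (mirroring compaction semantics), and routes each surviving key to the task that owns it. The function name and record shape are illustrative, and crc32 again stands in for the real partitioner.]

```python
import zlib

def bootstrap_from_compacted(partitions, num_tasks):
    # Keep the latest value per key, mirroring log-compaction semantics.
    # Each key lives in exactly one source partition, so per-partition
    # record order is enough to determine its latest value.
    latest = {}
    for records in partitions:          # each partition's records, in order
        for key, value in records:
            latest[key] = value         # later record wins, as with compaction
    # Route every surviving key to the task that owns its key range.
    per_task = {t: {} for t in range(num_tasks)}
    for key, value in latest.items():
        per_task[zlib.crc32(key.encode()) % num_tasks][key] = value
    return per_task

# 2 old partitions reshuffled onto 2 tasks; "a" keeps only its latest value.
tasks = bootstrap_from_compacted([[("a", 1), ("a", 2)], [("b", 3)]], 2)
```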



> This is why I am so intrigued to model the problem as described because it
> has no overhead for the no copy path while it allows
> to also perform a copy.
>
> State-handling wise, one also has all the options, exactly the 3 you
> mentioned I guess. It's just that this state-store RPC is a bad idea;
> it was only invented to allow for optimisations that are better not
> done. Not to say they are premature ;)
>
>
Hmm, could you elaborate on that a bit? Recopying existing data needs RPCs
between consumers and brokers, which seems ok to you. To me, that's not any
different from RPCs across consumer instances.


> I hope it makes it clear
>
> best jan


Thanks,

Jun
