Some context: Each k-v store has a changelog topic. The # of partitions in
that changelog topic is equal to the # of tasks. Each task's K-V store will
be mapped to a particular partition of that changelog topic. This mapping
from taskNames-changeLogPartitionNumber is stored in coordinator stream.

Of course, you don't want this k-v changelog topic to keep growing. So,
people configure it with some expiration. The expiration can either be:
1. Time retention: Records older than the retention are purged.
2. Compaction: Newer key-values will over-write older keys and only the
most recent value is retained.

I'm not sure if offsets are always monotonically increasing in Kafka or
could change after a compaction/ a time based retention kicks in for the
topic partition.





On Sat, Jun 11, 2016 at 11:53 PM, David Yu <david...@optimizely.com> wrote:

> My understanding of store changelog is that, each task writes store changes
> to a particular changelog partition for that task. (Does that mean the
> changelog keys are task names?)
>
> One thing that confuses me is that, the last offsets of some changelog
> partitions do not move. I'm using the kafka GetOffsetShell tool to get the
> last offsets for each partition. The result looks like this:
>
> partition   offset
> 0 7090
> 1 3737937
> 2 3733222
> 3 3719065
> 4 3730208
> 5 3731128
> 6 3734669
> 7 3691461
> 8 3759133
> 9 7286
> 10 3690347
> 11 3722450
> 12 7376
> 13 3738454
> 14 3742316
> 15 3710512
> 16 3777267
> 17 3750596
> 18 3728185
> 19 3694470
>
> As you can see, three of the partitions barely got any updates. In fact,
> the offsets stopped moving for a while. The traffic for each task should be
> fairly balanced. I checked the task log and made sure that the stores for
> these partitions are actively updated.
>
> Any idea why this is happening? Or am I missing something?
>
> Thanks,
> David
>



-- 
Jagadish V,
Graduate Student,
Department of Computer Science,
Stanford University

Reply via email to