Thanks Eno ..

Regarding the merging part, I was talking about merging topics using
Kafka Streams only, so that is safe, as you mentioned.

Regarding the restore part, I have another question. Maybe it's a bit
naive too ..

During restore, why does Kafka Streams replay the whole topic / partition
to recreate the state in the local state store ? Isn't there any way to just
take the latest message as the current state ? Because that's what it is ..
right ? The last message in the topic / partition IS the latest state.
Maybe I am missing something obvious ?
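
To make the question concrete, here is a minimal sketch in plain Python (no
Kafka client involved; `restore_store`, the record shape and the sample data
are all made up for illustration) of how I picture the replay. It assumes the
changelog records are keyed by the store key and that a `None` value marks a
delete. Under that assumption, the last record would only be the latest state
for its own key, not for the whole store:

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical record shape: (key, value); a value of None stands for a delete.
ChangelogRecord = Tuple[str, Optional[int]]


def restore_store(changelog: Iterable[ChangelogRecord]) -> Dict[str, int]:
    """Rebuild a key-value store by replaying changelog records in offset order."""
    store: Dict[str, int] = {}
    for key, value in changelog:
        if value is None:
            store.pop(key, None)  # a delete marker removes the key if present
        else:
            store[key] = value    # a later record overwrites the earlier value
    return store


# Made-up changelog partition for a word-count style store.
changelog = [
    ("apple", 1),
    ("pear", 1),
    ("apple", 2),
    ("plum", 1),
    ("pear", None),
    ("apple", 3),
]

restored = restore_store(changelog)
last_record_only = dict([changelog[-1]])

print(restored)          # {'apple': 3, 'plum': 1}
print(last_record_only)  # {'apple': 3}
```

If the changelog is compacted, I would expect the replay to need only the
latest record per key, so the cost should depend on the number of keys rather
than on the full history.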

regards.

On Fri, Jul 14, 2017 at 6:23 PM, Eno Thereska <eno.there...@gmail.com>
wrote:

> Hi Debasish,
>
> Your intuition about the first part is correct. Kafka Streams
> automatically assigns a partition of a topic to
> a task in an instance. It will never be the case that the same partition
> is assigned to two tasks.
>
> About the merging or changing of partitions part, it would help if we knew
> more about what you are trying to do. For example, if behind the scenes you
> add or remove partitions, that would not work well with Kafka Streams.
> However, if you use Kafka Streams itself to create new topics (e.g., by
> merging two topics into one, or vice versa by taking one topic and
> splitting it into more topics), then that would work fine.
>
> Eno
>
> > On 13 Jul 2017, at 23:49, Debasish Ghosh <ghosh.debas...@gmail.com> wrote:
> >
> > Hi -
> >
> > I have a question, mostly to clarify some concepts regarding
> > state management and restore functionality using Kafka Streams ..
> >
> > When I have multiple instances of the same application running (same
> > application id for each of the instances), are the following assumptions
> > correct ?
> >
> >   1. each instance has a separate state store (local)
> >   2. all instances are backed up by a *single* changelog topic
> >
> > Now the question arises, how does restore work in the above case when we
> > have 1 changelog topic backing up multiple state stores ?
> >
> > Each instance of the application ingests data from specific partitions of
> > the topic. And there can be multiple topics too. e.g. if we have m topics
> > with n partitions in each, and p instances of the application, then all
> > the (m x n) partitions are distributed across the p instances of the
> > application. Is this true ?
> >
> > If so, then does the changelog topic also have (m x n) partitions, so that
> > Kafka knows which state to restore in which store in case of a restore
> > operation ?
> >
> > And finally, if we decide to merge topics / partitions in between without
> > a complete reset of the application, (a) will it work ? (b) will the
> > changelog topic get updated accordingly ? and (c) is this recommended ?
> >
> > regards.
> >
> > --
> > Debasish Ghosh
> > http://manning.com/ghosh2
> > http://manning.com/ghosh
> >
> > Twttr: @debasishg
> > Blog: http://debasishg.blogspot.com
> > Code: http://github.com/debasishg
>
>


-- 
Debasish Ghosh
