Hello Matthias,

No, it is not too late. I understand the complexity of changing the
topic size.

Over-partitioning has been our approach from day one, but since we have
multiple topics, one for each domain subject, I want to consider the case
of someone mis-sizing a specific domain.

I think the approach you mention, having 2 application ids on the same
topic, is the safer one, and even if costly it's a valid approach for other
use cases and not only partition resizing.

Thanks for the advice

On 9 Feb 2018 6:56 pm, "Matthias J. Sax" <matth...@confluent.io> wrote:

Changing source topic partitions for a stateful application is very
difficult.

The best thing is to overpartition source topics to avoid this issue in
the first place. (Too late for you I guess.)

If you really need to change the partition count, the cleanest solution
is to create a second input topic with the new number of partitions and
start a new copy of the application that reads from the new source topic
and uses a different application.id. You run both applications in
parallel until the new application is ready to take over (the new app
should write to different output topics, too). During this time, your
producers should write to both topics.
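
A minimal sketch of how the two parallel configurations could differ,
using only java.util.Properties. The keys "application.id" and
"bootstrap.servers" are real Kafka Streams config names; every value, the
"app.*" keys, and the class and method names are hypothetical placeholders
chosen for illustration.

```java
import java.util.Properties;

public class ParallelAppConfigs {
    // Builds the settings for one copy of the application.
    // The "app.*" keys are NOT Kafka configs; they stand in for however
    // your application is told which topics to read and write.
    static Properties configFor(String applicationId, String inputTopic, String outputTopic) {
        Properties props = new Properties();
        props.put("application.id", applicationId);       // must differ between old and new app
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("app.input.topic", inputTopic);
        props.put("app.output.topic", outputTopic);
        return props;
    }

    // Existing application: old topic, old application.id.
    static Properties oldApp() {
        return configFor("orders-app-v1", "orders", "orders-agg");
    }

    // New copy: new topic with more partitions, new id, separate output topic.
    static Properties newApp() {
        return configFor("orders-app-v2", "orders-v2", "orders-agg-v2");
    }

    public static void main(String[] args) {
        System.out.println(oldApp());
        System.out.println(newApp());
    }
}
```

Because the application.id differs, the new copy gets its own consumer
group, changelog topics, and local state, so it does not interfere with
the running application.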

It would also be possible to copy the data from the old to the new
topic (new app builder.stream("old").to("new")) to let the new
application catch up quicker. This requires a synchronized switch of the
producer applications from old to new topic though, i.e., run the "copy
app" until it reaches the end of the log. Stop producers; let the "copy
app" finish copying the latest writes; restart producers and point them
to the new topic.
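
One way to decide when the "copy app" has reached the end of the log is
to compare, per partition of the old topic, the log end offset with the
offset the copy app has committed. The sketch below only shows that
comparison. It assumes you have already fetched both offset maps (for
example with a consumer or admin client); the class and method names are
hypothetical.

```java
import java.util.Map;

public class CopyProgress {
    // Total number of records the copy app still has to read, summed over
    // all partitions of the old topic. A partition with no committed offset
    // is treated as not started (offset 0).
    static long totalLag(Map<Integer, Long> endOffsets, Map<Integer, Long> committedOffsets) {
        long lag = 0;
        for (Map.Entry<Integer, Long> e : endOffsets.entrySet()) {
            long committed = committedOffsets.getOrDefault(e.getKey(), 0L);
            lag += Math.max(0L, e.getValue() - committed);
        }
        return lag;
    }

    // True once every partition has been copied up to its end offset.
    // Only meaningful after the producers are stopped; otherwise the end
    // offsets keep moving.
    static boolean caughtUp(Map<Integer, Long> endOffsets, Map<Integer, Long> committedOffsets) {
        return totalLag(endOffsets, committedOffsets) == 0;
    }
}
```

With producers stopped, caughtUp(...) returning true would be the signal
to restart them against the new topic.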

If you can accept application downtime, it's of course simpler. You can
stop the app, reset it via the reset tool and clean local state. Change
the number of partitions. Restart the app.

All approaches are difficult... There might be other approaches too. But
you cannot change the number of input topic partitions on the fly while
the application is running. It would crash.
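
To illustrate why the state no longer lines up after a resize: with
key-based partitioning, a record's partition is derived from a hash of
its key modulo the partition count, so changing the count sends some
keys to a different partition (and task) than the one holding their
state. The sketch below uses String.hashCode() as a stand-in hash purely
to stay dependency-free; Kafka's default partitioner uses a murmur2 hash
of the serialized key, so the actual partition numbers will differ. The
class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class PartitionShift {
    // Stand-in for a key-hash partitioner: hash the key, then take it
    // modulo the partition count. NOT the hash Kafka itself uses.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Counts how many keys land on a different partition after resizing.
    static long movedKeys(List<String> keys, int oldCount, int newCount) {
        return keys.stream()
                .filter(k -> partitionFor(k, oldCount) != partitionFor(k, newCount))
                .count();
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e");
        for (String k : keys) {
            System.out.println(k + ": " + partitionFor(k, 4) + " -> " + partitionFor(k, 6));
        }
        System.out.println("moved: " + movedKeys(keys, 4, 6));
    }
}
```

Any key that moves would be processed by a task whose state store has
never seen it, which is the locality problem raised in the original
question.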


On 2/9/18 6:51 AM, Pegerto Fernandez Torres wrote:
> Hello,
> I have started to think about the problem of expanding our source
> partitions. It looks like a stateless stream will handle the expansion
> properly after rebalancing.
> In the stateful case the problem seems different. The first problem
> is the changelog itself: the expected definition will mismatch.
> The second problem is the locality of the data: as a result, a new
> partition will be processed in a task that doesn't have the associated
> state.
> Has someone designed a workaround, or is there any additional
> documentation I should look at?

