Hello Matthias,

No, it is not too late. I understand how complex it is to change the number of partitions of a topic.
Overpartitioning has been our approach from day one, but since we have multiple topics, one per domain subject, I want to consider the case of someone mis-sizing a specific domain.

I think the approach you mention, having two application ids on the same topic, is the safer one. Even if it is costly, it is a valid approach for other use cases too, not only partition resizing.

Thanks for the advice

On 9 Feb 2018 6:56 pm, "Matthias J. Sax" <matth...@confluent.io> wrote:

Changing source topic partitions for a stateful application is very difficult. The best thing is to overpartition source topics to avoid this issue in the first place. (Too late for you, I guess.)

If you really need to change the partition count, the cleanest solution is to create a second input topic with the new number of partitions and start a new copy of the application that reads from the new source topic and uses a different application.id. You run both applications in parallel until the new application is ready to take over (the new app should write to different output topics, too). During this time, your producers should write to both topics.

It would also be possible to copy the data from the old to the new topic (new app builder.stream("old").to("new")) to let the new application catch up quicker. This requires a synchronized switch of the producer applications from the old to the new topic, though. I.e., run the "copy app" until it reaches the end of the log. Stop the producers; let the "copy app" finish copying the latest writes; restart the producers and point them to the new topic.

If you can accept application downtime, it's of course simpler. You can stop the app, reset it via the reset tool, and clean the local state. Change the number of partitions. Restart the app.

All approaches are difficult... There might be other approaches too. But you cannot change the number of input topic partitions on the fly while the application is running. It would crash.

-Matthias

On 2/9/18 6:51 AM, Pegerto Fernandez Torres wrote:
> Hello,
>
> I have started to think about the problem of expanding our source
> partitions. It looks like a stateless stream will handle the expansion
> properly after rebalancing.
>
> For a stateful stream the problem seems different. The first problem is
> the changelog itself: its expected definition will mismatch.
>
> The second problem is the locality of the data: as a result, a new
> partition will be processed in a task that doesn't have the associated
> state.
>
> Has anyone designed a workaround, or is there any additional
> documentation I should look at?
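
The data-locality problem raised in the original question can be illustrated with a small sketch. It assumes keys are assigned to partitions by hashing the key modulo the partition count, and it uses String.hashCode() as a simplified stand-in for the hash Kafka's default partitioner applies to the serialized key bytes. The class name, method names, and key values are hypothetical and chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionDemo {

    // Simplified stand-in for a hash-based partitioner (an assumption for
    // this sketch): the real default partitioner hashes the serialized key
    // bytes, but the "hash modulo partition count" shape is the same.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Counts the keys whose partition changes when the partition count
    // goes from oldCount to newCount.
    static int countMoved(List<String> keys, int oldCount, int newCount) {
        int moved = 0;
        for (String key : keys) {
            if (partitionFor(key, oldCount) != partitionFor(key, newCount)) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        // Hypothetical keys, for illustration only.
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            keys.add("k" + i);
        }
        System.out.println("keys that change partition going from 4 to 8: "
                + countMoved(keys, 4, 8) + " of " + keys.size());
    }
}
```

Every key that moves would have its new records processed by a task whose local state store was built from a different partition, so the state already accumulated for that key would not be found. This is why the approaches discussed above rebuild state from scratch, either in a second application or after a reset.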