Hi, Thanks Gordon, should have read the announcement :) This might indeed be the case here, I will just use the workaround. At least this is a known issue, almost got a heart attack today :D
Cheers, Gyula Tzu-Li (Gordon) Tai <tzuli...@apache.org> ezt írta (időpont: 2018. jan. 8., H, 17:56): > Hi Gyula, > > Is your 1.3 savepoint from Flink 1.3.1 or 1.3.0? > In those versions, we had a critical bug that caused duplicate partition > assignments in corner cases, so the assignment logic was altered from 1.3.1 > to 1.3.2 (and therefore also 1.4.0). > > If you indeed was using 1.3.1 or 1.3.0, and you are certain that the > savepoint does not contain duplicate partition assignments caused by the > bug, then yes restoring with DOP 1 and then rescaling again is a good > workaround. > > Please see the 1.3.2 release announcement [1] for details. > > Best, > Gordon > > [1] http://flink.apache.org/news/2017/08/05/release-1.3.2.html > > > > On Jan 8, 2018 6:57 AM, "Gyula Fóra" <gyula.f...@gmail.com> wrote: > > Migrating the jobs by setting the sources to parallelism = 1 and then > scale back up after migration seems to be a good workaround, but I am > wondering if something I do made this happen or this is a bug. > > Gyula Fóra <gyula.f...@gmail.com> ezt írta (időpont: 2018. jan. 8., H, > 14:46): > >> Hi, >> >> Is it possible that the Kafka partition assignment logic has changed >> between Flink 1.3 and 1.4? I am trying to migrate some jobs using Kafka >> 0.8 sources and about half my jobs lost offset state for some partitions >> (but not all partitions). Jobs with parallelism 1 dont seem to be >> affected... >> >> Any ideas? >> >> Gyula >> > >