It's not a recommended approach, but if you're able to handle this "side effect" downstream (e.g. ignore the "wrong" / incomplete results), then you should be OK.
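For illustration, here is a minimal Beam sketch of the write path as I understand your setup: fixed 10-minute windows flushed to Parquet via FileIO. The output prefix, transform names, and the Avro schema are placeholders, not your actual code. As far as I remember, the default naming encodes the window (and pane) into each file name, so the extra pane fired by a drain just shows up as an additional, distinctly named file for the same 10-minute period - which matches the "two files" behavior you describe, and is why a consumer that lists everything under the period's prefix picks both partial files up transparently.

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.beam.sdk.io.FileIO;
    import org.apache.beam.sdk.io.parquet.ParquetIO;
    import org.apache.beam.sdk.transforms.windowing.FixedWindows;
    import org.apache.beam.sdk.transforms.windowing.Window;
    import org.apache.beam.sdk.values.PCollection;
    import org.joda.time.Duration;

    // 'records' would be the PCollection decoded from Kinesis upstream;
    // 'schema' is a placeholder Avro schema for the Parquet files.
    static void writeParquet(PCollection<GenericRecord> records, Schema schema) {
      records
          // Group elements into fixed 10-minute event-time windows.
          .apply("Window10m",
              Window.<GenericRecord>into(FixedWindows.of(Duration.standardMinutes(10))))
          // Write one Parquet file per window pane; the default naming
          // embeds the window into the file name, so a second pane for the
          // same window lands in a second, separate file.
          .apply("WriteParquet",
              FileIO.<GenericRecord>write()
                  .via(ParquetIO.sink(schema))
                  .to("s3://my-bucket/events/") // placeholder output prefix
                  .withNaming(FileIO.Write.defaultNaming("part", ".parquet"))
                  .withNumShards(1));
    }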
On Mon, Nov 8, 2021 at 4:36 PM Pedro Facal <ped...@empathy.co> wrote:

> Hi David,
>
> Thanks for the quick response. Say our app writes to disk every 10
> minutes: the difference is that in one case a single window is emitted,
> and thus a single file is written, whereas if we drain and restart the
> pipeline we end up with two files (because two windows are emitted for
> the 10-minute period in which the drain happened).
>
> In our particular case this is OK - everything consuming the files
> downstream will just use the two "wrong" (i.e. partial) files
> transparently. So if this is the extent of the drain side effects, I
> think it's safe for us to do this for pipeline upgrades. Or am I missing
> something?
>
> Thanks,
>
> Pedro
>
> On Mon, 8 Nov 2021 at 15:58, David Morávek <d...@apache.org> wrote:
>
>> Hi Pedro,
>>
>> Draining basically means that all of the sources finish and advance
>> their watermark to the end of the global window, which fires all of the
>> triggers as a result. In other words, it triggers the ON_TIME results
>> from all of the unfinished windows, even though they might not have
>> seen a complete input.
>>
>> This behavior doesn't really play well with pipeline upgrades, because
>> you've already emitted "wrong" results from the pipeline.
>>
>> Best,
>> D.
>>
>> On Mon, Nov 8, 2021 at 3:19 PM Pedro Facal <ped...@empathy.co> wrote:
>>
>>> Hello,
>>>
>>> We have an Apache Beam streaming application running on Flink native
>>> Kubernetes. It consolidates AWS Kinesis records into Parquet files
>>> every few minutes.
>>>
>>> To manage the lifecycle of this app, we use the REST API to stop the
>>> job with a savepoint and then restart the cluster/job from said
>>> savepoint. This normally works as expected, but we run into problems
>>> when the data schema changes. In that case, as expected, stopping the
>>> job using "drain": true instead results in a proper upgrade without
>>> issues.
>>>
>>> To avoid overcomplicating our release workflows, we are evaluating the
>>> possibility of doing a "drain" restart every time we do a new release.
>>> However, we have come across the following:
>>>
>>> > Use the --drain flag if you want to terminate the job permanently.
>>> If you want to resume the job at a later point in time, then do not
>>> drain the pipeline because it could lead to incorrect results when the
>>> job is resumed [1].
>>>
>>> It's not clear what kind of "incorrect results" we could face here -
>>> can anybody elaborate? Our own tests show that we do not lose events
>>> from the Kinesis queue after the restart.
>>>
>>> Thanks,
>>>
>>> Pedro
>>>
>>> [1]
>>> https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/cli/#terminating-a-job
>>>
>>> --
>>> Pedro Facal San Luis <ped...@empathy.co>
>>> Data Team Lead
>
> --
> Pedro Facal San Luis <ped...@empathy.co>
> Data Team Lead
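P.S. In case it's useful for your release automation, this is roughly what the stop-with-drain call looks like against the Flink REST API. A minimal Java 11 sketch, not your actual tooling: the /jobs/:jobid/stop endpoint and the "drain" / "targetDirectory" fields come from the documented REST API, while the class name, JobManager address, and savepoint bucket are placeholders.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class StopWithDrain {
      public static void main(String[] args) throws Exception {
        String restAddress = "http://localhost:8081"; // assumed JobManager REST address
        String jobId = args[0];                       // id of the job to stop

        // drain=true makes Flink emit a MAX_WATERMARK before taking the
        // savepoint, firing all unfinished windows; the target directory
        // below is a placeholder.
        String body =
            "{\"drain\": true, \"targetDirectory\": \"s3://my-bucket/savepoints\"}";

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(restAddress + "/jobs/" + jobId + "/stop"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();

        HttpResponse<String> response =
            HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());

        // The response carries a "request-id"; polling
        // /jobs/<jobId>/savepoints/<request-id> tells you when the savepoint
        // completed and where it was written, so the restart step can pick
        // the path up from there.
        System.out.println(response.body());
      }
    }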