Hi Pedro,

Draining basically means that all of the sources will finish and advance their watermark to the end of the global window, which fires all of the triggers as a result. In other words, it will emit the _ON_TIME_ results from all of the unfinished windows, even though they might not have seen their complete input.
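For concreteness, here is a minimal sketch in Beam's Java SDK of the kind of windowed aggregation this affects. It is not the pipeline from this thread; the source, keys, and window size are made up for illustration:

// Minimal sketch (Beam Java SDK); the source, keys, and window size are
// illustrative, not taken from the pipeline discussed in this thread.
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.GenerateSequence;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TypeDescriptors;
import org.joda.time.Duration;

public class DrainSketch {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Unbounded stand-in for the real source (e.g. KinesisIO): one element per second.
    PCollection<Long> events =
        p.apply(GenerateSequence.from(0).withRate(1, Duration.standardSeconds(1)));

    PCollection<KV<Long, Long>> counts =
        events
            // Bucket elements into a handful of keys.
            .apply(MapElements.into(TypeDescriptors.longs()).via((Long n) -> n % 5))
            // Five-minute fixed windows; the default trigger emits the ON_TIME pane
            // once the watermark passes the end of each window.
            .apply(Window.<Long>into(FixedWindows.of(Duration.standardMinutes(5))))
            .apply(Count.perElement());

    // On a plain stop-with-savepoint the watermark stays where it is, so an open
    // window's state is restored on restart and its pane fires later with the full
    // input. On drain, the sources advance the watermark to the end of time, so
    // every open window fires its ON_TIME pane immediately with whatever it has
    // seen so far.
    p.run();
  }
}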
This behavior doesn't play well with pipeline upgrades, because the pipeline has already emitted "wrong" results.

Best,
D.

On Mon, Nov 8, 2021 at 3:19 PM Pedro Facal <ped...@empathy.co> wrote:
> Hello,
>
> We have an Apache Beam streaming application, running under Flink
> native Kubernetes. It consolidates AWS Kinesis records into Parquet files
> every few minutes.
>
> To manage the lifecycle of this app, we use the REST API to stop the job
> with a savepoint and then restart the cluster/job from said savepoint. This
> normally works as expected, but we run into problems when the data schema
> changes. So far so good, though: as expected, even if the schema changes,
> stopping the job using "drain:true" results in a proper upgrade without
> issues.
>
> To avoid over-complicating our release workflows, we are evaluating the
> possibility of doing a "drain" restart every time we do a new release.
> However, we have come across the following:
>
> > Use the --drain flag if you want to terminate the job permanently. If
> > you want to resume the job at a later point in time, then do not drain
> > the pipeline because it could lead to incorrect results when the job is
> > resumed. [1]
>
> It's not clear what kind of "incorrect results" we could face here - can
> anybody elaborate? Our own tests show that we do not lose events from the
> Kinesis queue after the restart.
>
> Thanks,
>
> Pedro
>
> [1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/cli/#terminating-a-job
>
> --
> Pedro Facal San Luis <ped...@empathy.co>
> Data Team Lead
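For reference, a rough sketch of the stop-with-savepoint REST call mentioned in the quoted mail, written in plain Java; the JobManager address, job id, and savepoint directory are placeholders, not values from this thread:

// Rough sketch of Flink's stop-with-savepoint REST call using plain Java; the
// JobManager address, job id, and savepoint directory are placeholders.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class StopWithSavepoint {
  public static void main(String[] args) throws Exception {
    String jobManager = "http://localhost:8081"; // assumed JobManager address
    String jobId = "<job-id>";                   // placeholder job id

    // POST /jobs/:jobid/stop takes "drain" and "targetDirectory".
    // drain=true advances the watermark to the end of time before the savepoint
    // is taken; drain=false stops with a savepoint without draining.
    String body = "{\"drain\": true, \"targetDirectory\": \"s3://my-bucket/savepoints\"}";

    HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create(jobManager + "/jobs/" + jobId + "/stop"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build();

    // The response carries a trigger id that can be polled at
    // /jobs/:jobid/savepoints/:triggerid to track savepoint completion.
    HttpResponse<String> response =
        HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    System.out.println(response.body());
  }
}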