This is 1.4 BTW. I am not sure that I am reading this correctly, but the lifecycle of cancel/resume is 2 steps:

1. Cancel the job with a savepoint (SP). closeCurrentPartFile
   https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.java#L549
   is called from close()
   https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.java#L416
   and that moves files to the pending state. That, I would presume, is what is called when one does a cancel.

2. The restore on resume
   https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.java#L369
   calls handleRestoredBucketState
   https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.java#L704
   which clears the pending files from state without finalizing them?

That does not seem to be right. I must be reading the code totally wrong?

I am also not sure whether --allowNonRestoredState is skipping getting the state. At least
https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/cli.html#savepoints
is not exactly clear about what it does if we add an operator (GDF I think will add a new operator in the DAG without state even if stateful; in my case the Map operator is not even stateful).

Thanks and please bear with me if this is all something pretty simple.

Vishal

On Fri, Feb 9, 2018 at 11:54 AM, Vishal Santoshi <vishal.santo...@gmail.com> wrote:

> What should be the behavior of BucketingSink vis a vis state (pending,
> in-progress and finalization) when we suspend and resume?
>
> So I did this
>
> * I had a pipe writing to hdfs suspend and resume using
>   --allowNonRestoredState, as in I had added a harmless MapOperator
>   (stateless).
>
> * I see that a file on hdfs, the file being written to (before the cancel
>   with savepoint), goes into a pending state: _part-0-21.pending
>
> * I see a new file being written to in the resumed pipe:
>   _part-0-22.in-progress.
>
> What I do not see is the file _part-0-21.pending being finalized
> (as in renamed to just part-0-21). I would have assumed that would be the
> case in this controlled suspend/resume circumstance. Further, it is a rename,
> and hdfs mv is not an expensive operation.
>
> Am I understanding the process correctly, and if yes, any pointers?
>
> Regards,
>
> Vishal
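
To make the question concrete, here is a toy Java model of the part-file states discussed in this thread. It is only a sketch of the sequence as described above, not Flink code: the class PartFileLifecycle, the Bucket helper and its methods are hypothetical names made up for illustration, and the file naming simply mirrors the names seen on hdfs (_part-0-21.in-progress, _part-0-21.pending, part-0-21).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the part-file states discussed in this thread. Not Flink code. */
final class PartFileLifecycle {

    enum State { IN_PROGRESS, PENDING, FINAL }

    /** File name for a part in a given state, mirroring the names seen on hdfs. */
    static String nameFor(String part, State state) {
        switch (state) {
            case IN_PROGRESS: return "_" + part + ".in-progress";
            case PENDING:     return "_" + part + ".pending";
            default:          return part; // FINAL: plain part name
        }
    }

    /** Hypothetical stand-in for the sink's per-bucket bookkeeping. */
    static final class Bucket {
        String currentPart; // part currently being written, or null
        final List<String> pendingParts = new ArrayList<>();
        final List<String> finalParts = new ArrayList<>();

        /** Step 1 (cancel): closing the current part moves it to pending. */
        void closeCurrentPart() {
            if (currentPart != null) {
                pendingParts.add(currentPart);
                currentPart = null;
            }
        }

        /** Expected on resume: every pending part is renamed to its final name. */
        void finalizePending() {
            finalParts.addAll(pendingParts);
            pendingParts.clear();
        }

        /** Step 2 as read above: pending parts are dropped from state only. */
        void forgetPending() {
            pendingParts.clear();
        }
    }

    public static void main(String[] args) {
        Bucket bucket = new Bucket();
        bucket.currentPart = "part-0-21";
        System.out.println(nameFor(bucket.currentPart, State.IN_PROGRESS));
        bucket.closeCurrentPart();
        System.out.println(nameFor(bucket.pendingParts.get(0), State.PENDING));
        bucket.finalizePending();
        System.out.println(nameFor(bucket.finalParts.get(0), State.FINAL));
    }
}
```

In this model, calling forgetPending() instead of finalizePending() after closeCurrentPart() leaves finalParts empty, which corresponds to _part-0-21.pending staying on hdfs without ever being renamed.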