This is Flink 1.4, BTW. I am not sure I am reading this correctly, but the
cancel/resume lifecycle seems to have two steps:



1. Cancel the job with a savepoint (SP)


closeCurrentPartFile

https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.java#L549

is called from close()


https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.java#L416


and it moves files to the pending state. I presume that is what runs when
one does a cancel.



2. The restore on resume

https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.java#L369

calls

handleRestoredBucketState

https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.java#L704

which seems to clear the pending files from state without finalizing them?



That does not seem right. Am I reading the code totally wrong?
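
To make concrete what I expected, here is a minimal sketch of the part-file
lifecycle as I understand it from the file names I see on HDFS. This is not
BucketingSink code: it uses the local filesystem through java.nio, and the
class and method names (PartFileLifecycle, closeCurrentPart, finalizePending,
runDemo) are hypothetical. Only the "_" prefix and the ".in-progress" and
".pending" suffixes are taken from the files I observed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Toy model of the in-progress -> pending -> finished renames.
// Assumption: names below are illustrative, not the BucketingSink API.
public class PartFileLifecycle {

    public static Path inProgress(Path dir, String part) {
        return dir.resolve("_" + part + ".in-progress");
    }

    public static Path pending(Path dir, String part) {
        return dir.resolve("_" + part + ".pending");
    }

    public static Path finished(Path dir, String part) {
        return dir.resolve(part);
    }

    // Step 1 (on close/cancel): the file being written becomes pending.
    public static Path closeCurrentPart(Path dir, String part) {
        return rename(inProgress(dir, part), pending(dir, part));
    }

    // The step I expected on resume: the pending file gets its final name.
    public static Path finalizePending(Path dir, String part) {
        return rename(pending(dir, part), finished(dir, part));
    }

    private static Path rename(Path from, Path to) {
        try {
            return Files.move(from, to); // a plain rename, no data is copied
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs both renames in a temp directory and returns the final file name.
    public static String runDemo(String part) {
        try {
            Path dir = Files.createTempDirectory("bucket");
            Files.write(inProgress(dir, part),
                    "some records\n".getBytes(StandardCharsets.UTF_8));
            closeCurrentPart(dir, part);
            return finalizePending(dir, part).getFileName().toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo("part-0-21"));
    }
}
```

In my run the first rename happened (I see _part-0-21.pending) but nothing
equivalent to the second one ever did.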

I am also not sure whether --allowNonRestoredState causes the state to be
skipped. At least
https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/cli.html#savepoints
is not exactly clear about what happens if we add an operator (GDF, I think,
will add a new operator to the DAG without state even if it is stateful; in
my case the Map operator is not even stateful).
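
Here is a toy model of how I currently read the flag, in case it helps to
point out where I go wrong. It is not Flink code; RestoreModel, restore and
the operator IDs are all hypothetical. The assumption is that savepoint state
is matched to operators by operator ID, that state with no matching operator
fails the restore unless --allowNonRestoredState is given, and that operators
with no entry in the savepoint simply start empty.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Toy model of savepoint restore by operator ID (an assumption, not Flink code).
public class RestoreModel {

    // Returns the state handed to the operators of the resumed job.
    public static Map<String, String> restore(Map<String, String> savepointState,
                                              Set<String> jobOperatorIds,
                                              boolean allowNonRestoredState) {
        Map<String, String> assigned = new HashMap<>();
        for (Map.Entry<String, String> entry : savepointState.entrySet()) {
            if (jobOperatorIds.contains(entry.getKey())) {
                // ID matches an operator in the new job: state is restored.
                assigned.put(entry.getKey(), entry.getValue());
            } else if (!allowNonRestoredState) {
                // Unmatched state without the flag: the restore is rejected.
                throw new IllegalStateException(
                        "No operator for savepoint state " + entry.getKey());
            }
            // Unmatched state with the flag: silently skipped.
        }
        // Operators absent from the savepoint get no entry, i.e. empty state.
        return assigned;
    }

    public static void main(String[] args) {
        Map<String, String> savepoint = new HashMap<>();
        savepoint.put("sink-id", "pending: _part-0-21.pending");

        // Same sink ID plus a new stateless map: sink state is restored.
        System.out.println(
                restore(savepoint, Set.of("sink-id", "map-id"), false).size());

        // Sink ID changed after the DAG change: with the flag, state is dropped.
        System.out.println(
                restore(savepoint, Set.of("other-sink-id", "map-id"), true).size());
    }
}
```

If this reading is right, and if the sink has no explicit uid so that its
generated ID changed when I added the Map, the flag would let the job start
while skipping the sink's pending-file state. I have not verified that this
is what happened.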


Thanks and please bear with me if this is all something pretty simple.

Vishal












On Fri, Feb 9, 2018 at 11:54 AM, Vishal Santoshi <vishal.santo...@gmail.com>
wrote:

> What should be the behavior of BucketingSink vis-a-vis state (pending,
> in-progress and finalization) when we suspend and resume?
>
> So I did this:
>
> * I had a pipeline writing to HDFS, which I suspended and resumed using
> --allowNonRestoredState because I had added a harmless (stateless)
> MapOperator.
>
>
> * I see that the file on HDFS that was being written to (before the cancel
> with savepoint) goes into a pending state: _part-0-21.pending
>
>
> * I see a new file being written to in the resumed pipeline:
> _part-0-22.in-progress.
>
>
> What I do not see is the file _part-0-21.pending being finalized (as in
> renamed to just part-0-21). I would have assumed that would be the case in
> this controlled suspend/resume circumstance. Further, it is only a rename,
> and an HDFS mv is not an expensive operation.
>
>
>
> Am I understanding the process correctly, and if yes, any pointers?
>
>
> Regards,
>
>
> Vishal
>
