without --allowNonRestoredState, on a suspend/resume we do see the length
file along with the finalized file ( finalized during resume )

-rw-r--r--   3 root hadoop         10 2018-02-09 13:57
/vishal/sessionscid/dt=2018-02-09/_part-0-28.valid-length

that does makes much more sense.

I guess we should document --allowNonRestoredState better ? It seems it
actually drops state ?



On Fri, Feb 9, 2018 at 1:37 PM, Vishal Santoshi <vishal.santo...@gmail.com>
wrote:

> This is 1.4 BTW.  I am not sure that I am reading this correctly but the
> lifecycle of cancel/resume is 2 steps
>
>
>
> 1. Cancel job with SP
>
>
> closeCurrentPartFile
>
> https://github.com/apache/flink/blob/master/flink-
> connectors/flink-connector-filesystem/src/main/java/org/
> apache/flink/streaming/connectors/fs/bucketing/BucketingSink.java#L549
>
> is called from close()
>
>
> https://github.com/apache/flink/blob/master/flink-
> connectors/flink-connector-filesystem/src/main/java/org/
> apache/flink/streaming/connectors/fs/bucketing/BucketingSink.java#L416
>
>
> and that moves files to pending state.  That I would presume is called
> when one does a cancel.
>
>
>
> 2. The restore on resume
>
> https://github.com/apache/flink/blob/master/flink-
> connectors/flink-connector-filesystem/src/main/java/org/
> apache/flink/streaming/connectors/fs/bucketing/BucketingSink.java#L369
>
> calls
>
> handleRestoredBucketState
>
> https://github.com/apache/flink/blob/master/flink-
> connectors/flink-connector-filesystem/src/main/java/org/
> apache/flink/streaming/connectors/fs/bucketing/BucketingSink.java#L704
>
> clears the pending files from state without finalizing them?
>
>
>
> That does not seem to be right. I must be reading the code totally wrong ?
>
> I am not sure also whether --allowNonRestoredState is skipping getting
> the state . At least https://ci.apache.org/projects/flink/flink-docs-
> release-1.4/ops/cli.html#savepoints is not exactly clear what it does if
> we add an operator ( GDF I think will add a new operator in the DAG without
> state even if stateful, in my case the Map operator is not even stateful )
>
>
> Thanks and please bear with me if this is all something pretty simple.
>
> Vishal
>
>
>
>
>
>
>
>
>
>
>
>
> On Fri, Feb 9, 2018 at 11:54 AM, Vishal Santoshi <
> vishal.santo...@gmail.com> wrote:
>
>> What should be the behavior of BucketingSink vis a vis state ( pending ,
>> inprogess and finalization ) when we suspend and resume ?
>>
>> So I did this
>>
>> * I had a pipe writing to hdfs suspend and resume using
>>
>> --allowNonRestoredState as in I had added a harmless MapOperator (
>> stateless ).
>>
>>
>> * I see that a file on hdfs, the file being written to ( before the
>> cancel with save point )  go into a pending state  _part-0-21.pending
>>
>>
>> * I see a new file being written to in the resumed pipe
>> _part-0-22.in-progress.
>>
>>
>> What  I do not see is the file in  _part-0-21.pending being finalized (
>> as in renamed to a just part-0-21. I would have assumed that would be the
>> case in this controlled suspend/resume circumstance. Further it is a rename
>> and hdfs mv is not an expensive operation.
>>
>>
>>
>> Am I understanding the process correct and it yes any pointers ?
>>
>>
>> Regards,
>>
>>
>> Vishal
>>
>
>

Reply via email to