What should be the behavior of BucketingSink with respect to file state
(pending, in-progress, and finalized) when we suspend and resume a job?

So I did this

* I had a pipe writing to HDFS, which I suspended (cancel with savepoint)
and resumed using --allowNonRestoredState, because I had added a harmless,
stateless MapOperator.

* I see that the file on HDFS that was being written to (before the cancel
with savepoint) goes into a pending state: _part-0-21.pending

* I see a new file being written to in the resumed pipe

What I do not see is the file _part-0-21.pending being finalized (as in
renamed to just part-0-21). I would have assumed that would be the case in
this controlled suspend/resume scenario. Besides, finalization is only a
rename, and an HDFS mv is not an expensive operation.

Am I understanding the process correctly, and if yes, any pointers?


