kezhuw commented on issue #2033:
URL: https://github.com/apache/iceberg/issues/2033#issuecomment-784153620


   @stevenzwu I think you are right, at least partially; it is a Flink bug 
anyway. Before [FLIP-147 (Support Checkpoints After Tasks 
Finished)](https://cwiki.apache.org/confluence/x/mw-ZCQ), sink writers are in a 
dilemma: there is no reliable way to commit the final result to an 
external system. `endInput` was not designed for committing the final result, 
but it has in practice been used as a workaround/last resort. There is no 
strong guarantee that `endInput` will be invoked only once. So if sink 
writers want to commit the final result in `endInput`, they should be prepared 
for `endInput` running multiple times. Thus, @openinx, the test should not 
focus (or not focus only) on stop-with-savepoint, but on multiple runs of 
`endInput`.
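
To make "prepare for multiple runs" concrete, here is a minimal sketch in plain Java. It is not Flink's or Iceberg's actual API: the class `IdempotentEndInputCommitter`, its methods, and the in-memory `committedFiles` list (standing in for the external system) are all hypothetical names chosen for illustration. The only point is that a second call to `endInput` finds nothing pending and therefore commits nothing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a sink writer/committer; not a real Flink or Iceberg class.
final class IdempotentEndInputCommitter {
    // Files written but not yet committed.
    private final List<String> pendingFiles = new ArrayList<>();
    // Stands in for the external system (e.g. a table); an assumption for this sketch.
    private final List<String> committedFiles = new ArrayList<>();

    void write(String dataFile) {
        pendingFiles.add(dataFile);
    }

    // Safe to call more than once: only pending files are committed,
    // and the pending list is cleared right after the commit.
    void endInput() {
        if (pendingFiles.isEmpty()) {
            return; // nothing new since the last call, so do not commit again
        }
        committedFiles.addAll(pendingFiles);
        pendingFiles.clear();
    }

    List<String> committedFiles() {
        return new ArrayList<>(committedFiles);
    }

    public static void main(String[] args) {
        IdempotentEndInputCommitter committer = new IdempotentEndInputCommitter();
        committer.write("file-1");
        committer.write("file-2");
        committer.endInput();
        committer.endInput(); // second invocation must not duplicate the commit
        System.out.println(committer.committedFiles()); // prints [file-1, file-2]
    }
}
```

A test along these lines would call `endInput` twice on the same writer and assert that the committed result is unchanged by the second call.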
   
   I want to quote some comments from our discussions here.
   
   @aljoscha 
[said](https://issues.apache.org/jira/browse/FLINK-21132?focusedCommentId=17274459&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17274459)
 that:
   
   > The motivation for introducing endOfInput() were things like hash-join in 
the SQL runner where an operator would read from the build side until getting 
an end-of-input, at which point it would switch over to reading from the probe 
side. With these use cases in mind sending an endOfInput() is a bug. The same 
is true for sinks, which will do some bookkeeping based on knowing that all the 
input data has been read.
   
   @gaoyunhaii 
[said](https://issues.apache.org/jira/browse/FLINK-21132?focusedCommentId=17271999&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17271999)
 that:
   
   > First of all, I think there would be some problem for the current 
implementation that committed in endOfInput(), considering if the job is 
bounded (namely the first case). The problem is that if the failover happens 
right after commit() in the endOfInput, then the job will be restarted and 
fallback to the last checkpoint, which will cause replay of the data committed 
in endOfInput. FLIP-147: Support checkpoint after tasks finished is a precedent 
step to solve the commit problem of the sinks in the first case.
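
The failover case described in this quote can be sketched as follows. This is only an illustration under stated assumptions, not how Flink or any particular sink implements it: `ExternalTable`, `commitOnce`, and the commit id string are hypothetical. The idea shown is that if the external system remembers which commit ids it has applied, a commit replayed after a restart is detected and skipped instead of writing the data twice.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical external system that remembers applied commit ids; an assumption for this sketch.
final class ExternalTable {
    private final Set<String> appliedCommitIds = new HashSet<>();
    private final List<String> rows = new ArrayList<>();

    // Applies the commit only if this commit id has not been seen before.
    // Returns true if the rows were applied, false if the commit was a replay.
    boolean commitOnce(String commitId, List<String> newRows) {
        if (!appliedCommitIds.add(commitId)) {
            return false; // already applied, skip the replayed commit
        }
        rows.addAll(newRows);
        return true;
    }

    List<String> rows() {
        return new ArrayList<>(rows);
    }
}

final class ReplayAfterFailoverDemo {
    public static void main(String[] args) {
        ExternalTable table = new ExternalTable();
        List<String> finalBatch = List.of("row-1", "row-2");

        // First attempt: the commit issued from endInput succeeds.
        boolean first = table.commitOnce("job-42-final", finalBatch);

        // Simulated failover right after the commit: the job restarts from the
        // last checkpoint, replays the same data, and commits again with the same id.
        boolean replayed = table.commitOnce("job-42-final", finalBatch);

        System.out.println(first + " " + replayed + " " + table.rows().size()); // prints true false 2
    }
}
```

This only works if the commit id is stable across restarts, which is itself an assumption the sink would have to guarantee.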
   
   cc @pnowojski  @aljoscha @gaoyunhaii @rkhachatryan @becketqin @StephanEwen 
@tillrohrmann

