rkhachatryan commented on pull request #14024:
URL: https://github.com/apache/flink/pull/14024#issuecomment-725312076


   > why the previous fix was not sufficient. Could you please motivate it on 
the PR description or commit message?
   
   The previous fix was for a different problem: watermarks being emitted 
before the recovered state of result subpartitions.
   
   I've added this to the PR description:
   > If partitions are requested fast enough, the operator (chain) can 
receive data before it has been initialized. In particular, this can cause 
out-of-orderness: elements from the upstream are processed ahead of elements 
restored from operator state (not channel state).
   
   > I'd also like to see a test that shows why the previous way does not work. 
I'm assuming the modified UCITCase would show that, but it's also not easy to 
see. However, I'm also assuming that writing a dedicated test that depicts some 
race condition is probably quite hard, so I'd be fine without one.
   
   Yes, the modified UCITCase shows that, but only on the second or third run. 
And yes, there is (or rather was) a race condition between receiving a response 
from the upstream and initializing the state, which is difficult to test 
deterministically.
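
   For illustration only, here is a minimal sketch of the ordering problem and 
of one generic way to avoid it: hold back upstream elements until the restored 
state has been emitted. The class `InitGateSketch` and all of its methods are 
hypothetical names made up for this example. They are not Flink classes and 
this is not necessarily how the PR implements the fix.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical sketch of gating input on initialization; not Flink code. */
final class InitGateSketch {
    // Upstream elements that arrived before initialization finished.
    private final Queue<String> pending = new ArrayDeque<>();
    // Elements in the order the operator would process them.
    private final List<String> output = new ArrayList<>();
    private boolean initialized = false;

    /** Assumed to be called from a network thread when upstream data arrives. */
    synchronized void onUpstreamElement(String element) {
        if (initialized) {
            output.add(element);
        } else {
            pending.add(element); // too early: hold back until state is restored
        }
    }

    /** Assumed to be called once from the task thread with the restored state. */
    synchronized void initialize(List<String> restoredState) {
        output.addAll(restoredState); // restored elements go first
        output.addAll(pending);       // then anything that arrived early
        pending.clear();
        initialized = true;
    }

    /** Returns a copy of the processing order observed so far. */
    synchronized List<String> output() {
        return new ArrayList<>(output);
    }

    /** Simulates the race: upstream data arrives before initialization. */
    static List<String> demo() {
        InitGateSketch gate = new InitGateSketch();
        gate.onUpstreamElement("upstream-1");
        gate.initialize(List.of("state-1", "state-2"));
        gate.onUpstreamElement("upstream-2");
        return gate.output();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [state-1, state-2, upstream-1, upstream-2]
    }
}
```

   Without the `pending` buffer, `upstream-1` would be processed before 
`state-1`, which is the out-of-orderness described above.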
   



