rkhachatryan commented on pull request #14024: URL: https://github.com/apache/flink/pull/14024#issuecomment-725312076
> why the previous fix was not sufficient. Could you please motivate it on the PR description or commit message?

The previous fix was for a different problem: watermarks being emitted before the recovered state of result subpartitions. I've added this to the PR description:

> If partitions are requested fast enough then the operator (chain) can receive data before being initialized. In particular, this can cause out-of-orderness: elements from the upstream followed by elements from operator state (not channel state).

> I'd also like to see a test that shows why the previous way does not work. I'm assuming the modified UCITCase would show that, but it's also not easy to see. However, I'm also assuming that writing a dedicated test that depicts some race condition is probably quite hard, so I'd be fine without one.

Yes, the modified UCITCase shows that, but only on a 2nd-3rd run. And yes, there is (was) a race condition between getting a response from the upstream and initializing the state, which is difficult to test deterministically.
