morozov commented on PR #3873: URL: https://github.com/apache/flink-cdc/pull/3873#issuecomment-2666875989
I read more about the semantics of `InputStatus`, and I think the current logic is good. Please validate my understanding below.

There are three possible values of the status:

1. `MORE_AVAILABLE` means that a new record is immediately available (i.e. reading it won't block the reader).
2. `NOTHING_AVAILABLE` means that no new record is immediately available (so the reader may block).
3. `END_OF_INPUT` means end of input.

In this specific case, here's how we interpret them:

1. `END_OF_INPUT` is impossible in the binlog/streaming mode.
2. There's no difference for us between `MORE_AVAILABLE` and `NOTHING_AVAILABLE` because we don't care about blocking; we just want to read all the changes.

So, what happens in the test is:

1. Initially, the reader returns `NOTHING_AVAILABLE` because the connector hasn't captured anything yet.
2. Since the output is empty, the test keeps iterating in the do-while loop.
3. As soon as the changes become available, the reader returns `MORE_AVAILABLE` and then returns _all captured changes_ before switching back to `NOTHING_AVAILABLE`.
4. At this point, the output is no longer empty, so the loop breaks.

So if the assumption in step 3 doesn't hold, that may be the reason why the test fails.

I don't know of a better simple way to read a fixed expected number of changes from the reader without introducing a timeout or risking a never-terminating test. I think the current approach is fine.
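
To make the sequence above concrete, here is a minimal, self-contained sketch of such a polling loop. It does not use the Flink API: `Status`, `FakeReader`, and `readUntilNonEmpty` are hypothetical stand-ins, and the loop condition (keep polling while the output is empty or the reader reports `MORE_AVAILABLE`) is an assumption about what the test does, not a copy of it. The second call in `main` shows what could happen if step 3 doesn't hold, i.e. the reader reports `NOTHING_AVAILABLE` before all changes have been returned.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

final class PollLoopSketch {
    // Local stand-in for the three statuses discussed above; not Flink's own enum.
    enum Status { MORE_AVAILABLE, NOTHING_AVAILABLE, END_OF_INPUT }

    // Hypothetical reader. It reports NOTHING_AVAILABLE for the first `idlePolls`
    // calls, then emits one record per call. Within a batch it reports
    // MORE_AVAILABLE; after the last record of a batch it reports NOTHING_AVAILABLE.
    static final class FakeReader {
        private int idlePolls;
        private final Queue<Queue<String>> batches = new ArrayDeque<>();

        FakeReader(int idlePolls, List<List<String>> batches) {
            this.idlePolls = idlePolls;
            for (List<String> batch : batches) {
                if (!batch.isEmpty()) {
                    this.batches.add(new ArrayDeque<>(batch));
                }
            }
        }

        Status pollNext(List<String> output) {
            if (idlePolls > 0) {
                idlePolls--;
                return Status.NOTHING_AVAILABLE;
            }
            Queue<String> batch = batches.peek();
            if (batch == null) {
                return Status.NOTHING_AVAILABLE;
            }
            output.add(batch.remove());
            if (batch.isEmpty()) {
                batches.remove();
                return Status.NOTHING_AVAILABLE;
            }
            return Status.MORE_AVAILABLE;
        }
    }

    // Assumed loop shape: poll until something was captured and the reader no
    // longer reports MORE_AVAILABLE. Never terminates if the reader stays empty.
    static List<String> readUntilNonEmpty(FakeReader reader) {
        List<String> output = new ArrayList<>();
        Status status;
        do {
            status = reader.pollNext(output);
        } while (output.isEmpty() || status == Status.MORE_AVAILABLE);
        return output;
    }

    public static void main(String[] args) {
        // Step 3 holds: all changes arrive in one uninterrupted run.
        FakeReader contiguous =
                new FakeReader(2, Arrays.asList(Arrays.asList("a", "b", "c")));
        System.out.println(readUntilNonEmpty(contiguous)); // prints [a, b, c]

        // Step 3 does not hold: the reader pauses after the first change.
        FakeReader interrupted =
                new FakeReader(1, Arrays.asList(Arrays.asList("a"), Arrays.asList("b", "c")));
        System.out.println(readUntilNonEmpty(interrupted)); // prints [a]
    }
}
```

In the second case the loop exits with only part of the expected changes, which is the kind of failure the assumption in step 3 would explain.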
