morozov commented on PR #3873:
URL: https://github.com/apache/flink-cdc/pull/3873#issuecomment-2666875989

   I read more about the semantics of `InputStatus`, and I think the current 
logic is good. Please validate my understanding below.
   
   There are three possible values of the status:
   1. `MORE_AVAILABLE` means that a new record is immediately available (i.e. 
won't block the reader).
   2. `NOTHING_AVAILABLE` means that there's no new record immediately 
available (so the reader may block).
   3. `END_OF_INPUT` means end of input.
   
   In this specific case, here's how we interpret them:
   1. `END_OF_INPUT` is impossible in the binlog/streaming mode.
   2. There's no difference for us between `MORE_AVAILABLE` and 
`NOTHING_AVAILABLE` because we don't care about blocking; we just want to read 
all the changes.
   
   So, what happens in the test is:
   1. Initially, the reader returns `NOTHING_AVAILABLE` because the connector 
hasn't captured anything yet.
   2. Since the output is empty, the test keeps iterating in the do-while loop.
   3. As soon as the changes become available, the reader returns 
`MORE_AVAILABLE` and then returns _all captured changes_ before switching back 
to `NOTHING_AVAILABLE`.
   4. At this point, the output is no longer empty, so the loop breaks.
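
   To make the sequence above concrete, here's a minimal, self-contained sketch of the loop as I understand it. This is not the actual test code: `Status` is a stand-in for Flink's `InputStatus`, and `FakeReader`, `PollLoop`, and the loop condition are hypothetical, written only to mirror steps 1-4.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Stand-in for Flink's InputStatus, reduced to the three values discussed above.
enum Status { MORE_AVAILABLE, NOTHING_AVAILABLE, END_OF_INPUT }

// Hypothetical reader: reports NOTHING_AVAILABLE for a few polls (nothing captured yet),
// then hands out the captured changes one per poll.
class FakeReader {
    private int emptyPollsLeft;
    private final Queue<String> captured;

    FakeReader(int emptyPolls, List<String> changes) {
        this.emptyPollsLeft = emptyPolls;
        this.captured = new ArrayDeque<>(changes);
    }

    Status pollNext(List<String> output) {
        if (emptyPollsLeft > 0) {
            emptyPollsLeft--;
            return Status.NOTHING_AVAILABLE;
        }
        String next = captured.poll();
        if (next == null) {
            return Status.NOTHING_AVAILABLE;
        }
        output.add(next);
        // MORE_AVAILABLE while further changes are queued, NOTHING_AVAILABLE once drained.
        return captured.isEmpty() ? Status.NOTHING_AVAILABLE : Status.MORE_AVAILABLE;
    }
}

class PollLoop {
    // Assumed loop shape: keep polling while the output is empty or more records are ready.
    // Note that it never terminates if the reader never produces anything.
    static List<String> readAvailable(FakeReader reader) {
        List<String> output = new ArrayList<>();
        Status status;
        do {
            status = reader.pollNext(output);
        } while (status == Status.MORE_AVAILABLE || output.isEmpty());
        return output;
    }

    public static void main(String[] args) {
        FakeReader reader = new FakeReader(2, List.of("c1", "c2", "c3"));
        System.out.println(readAvailable(reader)); // prints [c1, c2, c3]
    }
}
```

   In this sketch the loop returns all three changes only because the fake reader keeps reporting `MORE_AVAILABLE` until its queue is drained. A reader that switched back to `NOTHING_AVAILABLE` after the first change would make the loop return early with a partial result.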
   
   So if the assumption in #3 doesn't hold, that may be why the test 
fails. I don't know of a better simple way to read a fixed expected number of 
changes from the reader without introducing a timeout or risking a 
never-terminating test. I think the current approach is fine.
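
   For reference, this is roughly what the timeout-based alternative mentioned above could look like. It's only a sketch under assumptions: `BoundedPoll`, `readExactly`, and the `pollOnce` callback are hypothetical names, not part of the PR or of Flink, and the callback stands in for a single call to the reader.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BoundedPoll {
    // Calls pollOnce repeatedly until `expected` records have been collected,
    // and fails instead of spinning forever once the timeout has elapsed.
    static <T> List<T> readExactly(int expected, Duration timeout, Consumer<List<T>> pollOnce) {
        List<T> output = new ArrayList<>();
        long deadline = System.nanoTime() + timeout.toNanos();
        while (output.size() < expected) {
            if (System.nanoTime() - deadline > 0) {
                throw new IllegalStateException(
                        "Timed out with " + output.size() + " of " + expected + " records");
            }
            pollOnce.accept(output); // one poll; may or may not add records
        }
        return output;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated reader: produces a record on every third poll.
        List<String> changes = readExactly(2, Duration.ofSeconds(5), out -> {
            if (++calls[0] % 3 == 0) {
                out.add("change-" + calls[0]);
            }
        });
        System.out.println(changes); // prints [change-3, change-6]
    }
}
```

   The cost is the one described above: the test gains a timeout value that has to be tuned, and the loop as written busy-waits between polls.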


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
