AHeise commented on pull request #14140:
URL: https://github.com/apache/flink/pull/14140#issuecomment-731379700


   > Hi @AHeise, I try to list existing alternatives below:
   > 
   >     1. Migrate dependent tests to use `StreamTaskMailboxTestHarness`.
   > 
   >     2. Introduce `TaskMailbox.isMailboxThreadBlocked` to let testing 
thread query status of mailbox thread concurrently.
   > 
   >     3. Use `MailboxExecutor` to query whether all input has been processed.
   >        a. Use 
`StreamTask.getInputOutputJointFuture(InputStatus.NOTHING_AVAILABLE).isDone()`
   >        b. Use `MailboxProcessor.isDefaultActionUnavailable()`.
   > 
   > 
   > First, comparing to `MailboxExecutor`, 
`TaskMailbox.isMailboxThreadBlocked` is intrusive and undermine the 
encapsulation mailbox modeling trying to provide. I think we have converged to 
avoid this approach.
   > 
   > Second, migration may require big hard work since there are almost 53 
dependent tests as you counted. Personally, I think it would be nice if we can 
solve unstable `StreamTaskTestHarness.waitForInputProcessing` with few changes 
before migration. But it is totally up to you and/or other committers to decide 
whether it is worthwhile or not.
   > 
   > Third, I think 3(a) or similar may be what you suggest in jira. Togather 
with all-queues-empty while-looping, 3(a) and 3(b) should have same effect. I 
notice that there are some optimizations in `UnionInputGate` which cause 
`UnionInputGate.getAvailableFuture().isDone()` returns `false` while there are 
cached data. If we drop all-queues-empty while-looping, 3(a) will fail due to 
above optimizations. I am kind of preferring 
`MailboxProcessor.isDefaultActionUnavailable()`, since it is resistance to 
these optimizations.
   
   Thank you very much for your deep investigation. I asked Roman to assess the 
solution and the alternatives as he is much more adept on threading issues than 
me.
   
   In theory, I'd go with the first approach, but I understand that this is 
hardly feasible. So I like your current fix in most regards (details may or may 
not be improved). 
   
   One more idea, couldn't we also inject `END_OF_INPUT` `StreamStatus` at the 
beginning of `allInputProcessed`? I was hoping that the thread would then 
eventually terminate itself and we could simply `join`.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to