lostluck commented on pull request #11864:
URL: https://github.com/apache/beam/pull/11864#issuecomment-636140338


   This PR does have the assumption baked in that we'll always get a data 
message and a control message for an instruction.
   
   It doesn't handle the less well-behaved cases: "only instructions, never any 
data" and "only data, never instructions". As you say, these probably require a 
time component to handle properly. 
   
   The instruction-only case is a bit of a waste, though it ends up using little 
to no CPU while it's waiting for a channel send. On the other hand, it will then 
never signal the runner half or self-terminate. So if the runner is waiting on 
that, it's the runner's own fault, since it isn't being well behaved. This may 
cause problems for a stage as a whole if the runner doesn't decide to disregard 
this bundle.
   
   The data-only case is a bigger risk for an individual worker, since it will 
eventually block the worker with what I call the Boulder problem: the channel 
buffer may fill and start applying pushback on the data channel, preventing data 
messages from reaching the other processing threads. This is desirable behavior 
overall, right up until it isn't, when the instruction for that data never comes.
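
   The pushback described above can be sketched with a plain buffered Go channel. 
This is an assumption-laden illustration, not the harness's data channel: 
`fillUntilPushback` is a hypothetical helper, and it uses a non-blocking send only 
so the example terminates. A plain blocking send would stall at the same point.

```go
package main

import "fmt"

// fillUntilPushback performs non-blocking sends onto a buffered channel
// that nobody reads, and reports how many elements were accepted before
// a send would have blocked. Hypothetical helper for illustration only.
func fillUntilPushback(buf chan<- int, attempts int) int {
	accepted := 0
	for i := 0; i < attempts; i++ {
		select {
		case buf <- i:
			accepted++
		default:
			// Buffer is full. A blocking send would park here and
			// hold up every message queued behind this one.
			return accepted
		}
	}
	return accepted
}

func main() {
	// No instruction ever arrives, so no reader drains this buffer.
	orphan := make(chan int, 4)
	n := fillUntilPushback(orphan, 10)
	fmt.Printf("accepted %d of 10 before pushback\n", n)
	// prints "accepted 4 of 10 before pushback"
}
```

   If the sender is a single goroutine demultiplexing data for several 
instructions, the stall on the orphaned buffer is what starves the others.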
   

