CrynetLogistics commented on pull request #18651:
URL: https://github.com/apache/flink/pull/18651#issuecomment-1049734372


   @fapaul Thanks. The purpose of the PR is to reduce the number flushes in 
case there are already too many in flight requests (blocking). Specifically 
this is when 
   1. there exists space in the buffer to add an element
   2. there are no free in flight requests slots are left, and,
   3. where the number of in flight requests that have not yet completed (and 
are waiting on `completeRequest`) is strictly smaller than the number of in 
flight requests that need to be freed in order to process the current flush in 
a non-blocking fashion.
   
   It's an interesting suggestion and would save some mailbox time on the write 
path. Another benefit is if checkpoints are frequent, we would save operator 
time in write. However, I'm just cautious about tying the sink behaviour to how 
checkpoints have been set up. My feeling is that if I can accurately determine 
whether the current write & flush is not blocking, it would be prudent to flush 
that batch as soon as I know (with the added side benefit that failed request 
entries have also been completed and added sooner). The other worry is if I 
leave more flushing responsibility with the checkpointing mechanism, fewer 
flushes will result in a build up of elements in the buffer and cause the 
`bufferedRequestEntries.size() >= maxBufferedRequests` condition to trigger in 
`write()` which would shift the completion of in flight request cycles back to 
the mailbox thread. This is okay, but I feel the buffer would spend most of its 
life being full, introducing an artificial latency especially
  if the buffer is large.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to