sijie commented on issue #1088: ISSUE #1086 (@bug W-4146427@) Client-side 
backpressure in netty (Fixes: io.netty.util.internal.OutOfDirectMemoryError 
under continuous heavy load)
URL: https://github.com/apache/bookkeeper/pull/1088#issuecomment-361807580
 
 
   @dlg99 
   
   I see the metrics do show great results. However I have one concern about 
this change, it is basically blocking the scheduler threads on waiting a slow 
netty channel, this can result in bad side-effects - e.g responses might not be 
able to be processed, other requests will also potentially be blocked as well.
   
   so I guess the benchmark itself is only testing writes and most likely one 
one or a few ledgers. In this case, the benchmark can easily show improvements, 
because there is effectively only one client/ledger handle is writing. However 
a real-world use case can be different, this change here will cause a situation 
- a slow bookie can potentially impact the whole application that use the 
bookkeeper client.
   
   Ideally I think the throttling/back-pressure should be pop back to the 
ledger handles who issue the write request, rather than asking 
PerChannelBookieClient to do the blocking-wait work. I am not sure what would 
be the best approach here. 
   
   One thought I have here:
   
   - when a ledger handle (effectively PendingAddOp) attempts to 
sendWriteRequest, PCBC checks if the channel is writable, if it is not 
writable, fail the write request with a special exception. PCBC itself doesn't 
block on waiting channel to become writable.
   
   - when PendingAddOp receives this special exception, it knows the channel is 
not available for write. then you can apply the throttling logic here:
   
   1) you can notify the ledger handle that a channel is not available, asking 
the ledger handle to slow down applications or block the application
   2) at the same time, you can listen on the channel state change. once the 
channel is writable, issue the write to the channel, and unblock the 
application for submitting add entries.
   
   In this way, you can block the application thread rather than the internal 
io scheduler which can be shared across multiple ledger handles.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

Reply via email to