On Wed, 2016-12-14 at 03:38 +0000, Idzerda, Edan wrote:
> 
> > On Dec 13, 2016, at 4:59 AM, Oleg Kalnichevski <[email protected]> wrote:
> > 
> > On Mon, 2016-12-12 at 21:15 +0000, Idzerda, Edan wrote:
> >> Hello!  Our reverse proxy uses the Async Client pool to handle connections 
> >> to backend servers.  We've been tracking a problem for a while where we 
> >> observe the initial TCP connection is made, but no thread is available to 
> >> handle the SSL setup before a 10 second timeout expires.  We get into 
> >> trouble because some of our backend servers are very slow, and some of our 
> >> clients download very slowly.
> >> 
> >> 
> >> I'm experimenting with a patch to 
> >> AbstractMultiworkerIOReactor.addChannel() to determine whether the next 
> >> dispatcher thread is "busy."  My first try was to look at bufferedSessions 
> >> from the BaseIOReactor, and go through the list of dispatchers one time to 
> >> see if I can find a free one.
> >> 
> >> 
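> >>        // Start from the next worker in round-robin order.
> >>        int i = Math.abs(this.currentWorker++ % this.workerCount);
> >> 
> >>        // Scan at most workerCount candidates for one with no sessions.
> >>        for (int j = 0; j < this.workerCount; j++) {
> >>            if (this.dispatchers[i].getSessionCount() == 0) {
> >>                break;
> >>            }
> >>            i = Math.abs(this.currentWorker++ % this.workerCount);
> >>        }
> >>        // If every worker was busy, i is simply the next round-robin pick.
> >>        this.dispatchers[i].addChannel(entry);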
> >> 
> >> This seems to help us in MOST of the cases we see this issue in 
> >> production, but there still seem to be a small number of threads which 
> >> collide.  I'm testing a different version which looks at AbstractIOReactor 
> >> "sessions" to determine thread busy state, but it never seems to show more 
> >> than "1" session if I look at the size after piling up slow connections on 
> >> top of each other.
> >> 
> >> I have two questions:
> >>    Is there a better way to determine whether a thread is busy?
> >>    Would you be willing to accept a patch to make the dispatchers array in 
> >> AbstractMultiworkerIOReactor "protected" so I can implement my own 
> >> ConnectingIOReactor that overrides addChannel() with my own thread 
> >> selection model?
> >> 
> >> Thanks a lot for your help and for providing such a great library to the 
> >> community!
> >> 
> >> - edan
> >> 
> > 
> > Hi Edan
> > 
> > What I do not quite understand is why i/o dispatch threads get blocked
> > for 10 seconds or longer. This sounds awfully suspicious.
> > 
> > I could imagine exposing the list of i/o dispatchers to subclasses of
> > AbstractMultiworkerIOReactor in 4.4.x branch but would rather prefer to
> > keep it as a last resort.
> > 
> > Oleg
> 
> Thanks..  I would prefer not to have to patch httpcore-nio like this if I 
> could work out the root cause.  Since I am still seeing connections failing 
> to complete SSL within 10 seconds with my first patch (above), I am trying a 
> new one now that uses an AtomicInteger for currentWorker.  We are seeing far 
> fewer connection problems with the patch, but there are still enough apparent 
> thread selection collisions that some requests fail.
> 
> The only way I have been able to reproduce this problem is by using an 
> artificially rate-limited connection (e.g., curl --limit-rate 1m) and 
> downloading a relatively large file.  If I use a small file, say 50K, I 
> notice that the dispatcher threads do not get stuck. I can download more 
> files than I have worker threads, and AbstractIOReactor’s “sessions” set 
> count stays at 0.  With a larger file, like 500k, the sessions size goes to 
> 1, and I can only download the same number of files as I have worker threads.
> 
> Does this make any sense to you?  Is it possible the higher level proxy 
> library is hanging on to the HttpResponse’s Entity too long?  I see they call 
> HttpEntity.getContent() and create an InputStream out of it…

This is likely to be the cause of your grief. InputStream / OutputStream
interfaces are inherently blocking and they do not mix well with event
driven i/o without quite a bit of effort and complex code. By using
blocking i/o to produce requests or consume responses, the higher level
proxy library likely blocks i/o dispatch threads and starves other
connections managed by the same dispatcher.
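
To make the starvation effect concrete, here is a minimal sketch that
uses only the JDK. It is not HttpCore code: a single-threaded executor
stands in for one i/o dispatch thread, and a latch stands in for a
blocking stream read on behalf of a slow client. The class and method
names are made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class DispatcherStarvationDemo {

    /**
     * Returns true if a second connection's event was left waiting while the
     * first connection's handler blocked the (simulated) dispatch thread.
     */
    public static boolean secondConnectionStarved() throws Exception {
        // One thread stands in for a single i/o dispatch thread (assumption).
        ExecutorService dispatcher = Executors.newSingleThreadExecutor();
        // Released when the simulated slow client finally finishes reading.
        CountDownLatch slowClientDone = new CountDownLatch(1);
        try {
            // Connection A: the handler blocks, as a blocking stream copy would.
            dispatcher.submit(() -> {
                try {
                    slowClientDone.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Connection B: a trivial event, e.g. the start of an SSL handshake.
            Future<String> b = dispatcher.submit(() -> "handshake started");
            boolean starved;
            try {
                b.get(200, TimeUnit.MILLISECONDS);
                starved = false;
            } catch (TimeoutException e) {
                starved = true; // B never ran: the only thread is stuck in A.
            }
            slowClientDone.countDown(); // A finishes, so B can now run.
            b.get(5, TimeUnit.SECONDS);
            return starved;
        } finally {
            dispatcher.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("second connection starved: " + secondConnectionStarved());
    }
}
```

Connection B's event is trivial, yet it cannot run until A's handler
returns. That would be consistent with a handshake timing out while a
slow download holds the thread.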

I would recommend rewriting your code on top of the native
HttpAsyncRequestProducer / HttpAsyncResponseConsumer interfaces for
better results.
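
As a rough illustration of the callback style this implies, the sketch
below consumes content in whatever chunks happen to be available and
returns immediately, so one thread can interleave several connections.
The ChunkConsumer interface and everything else in it are hypothetical
JDK-only stand-ins, not the HttpCore API; a real implementation would
go through HttpAsyncResponseConsumer instead.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class IncrementalConsumerSketch {

    /** Hypothetical callback interface; not part of HttpCore. */
    interface ChunkConsumer {
        void onContent(ByteBuffer chunk); // called whenever some bytes have arrived
        void onCompleted();               // called once the message is complete
    }

    /** Accumulates chunks without ever waiting for more input. */
    static final class BufferingConsumer implements ChunkConsumer {
        private final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        private boolean done;

        @Override
        public void onContent(ByteBuffer chunk) {
            // Drain only what is available right now, then return to the caller.
            byte[] tmp = new byte[chunk.remaining()];
            chunk.get(tmp);
            sink.write(tmp, 0, tmp.length);
        }

        @Override
        public void onCompleted() {
            done = true;
        }

        boolean isDone() {
            return done;
        }

        String result() {
            return new String(sink.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Simulates one dispatch thread interleaving events from two connections. */
    public static String[] run() {
        BufferingConsumer a = new BufferingConsumer();
        BufferingConsumer b = new BufferingConsumer();
        a.onContent(ByteBuffer.wrap("slow-".getBytes(StandardCharsets.UTF_8)));
        b.onContent(ByteBuffer.wrap("fast".getBytes(StandardCharsets.UTF_8)));
        b.onCompleted(); // B finishes while A is still mid-download.
        a.onContent(ByteBuffer.wrap("client".getBytes(StandardCharsets.UTF_8)));
        a.onCompleted();
        return new String[] { a.result(), b.result() };
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println(r[0] + " / " + r[1]);
    }
}
```

Because no callback waits for more data, connection B completes between
two chunks of connection A rather than queuing behind it.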

Oleg
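
On the AtomicInteger variant of currentWorker mentioned in the quoted
message, a standalone sketch of that selection logic might look like
the following. It is not the actual patch or HttpCore code: the class
name and the isIdle predicate are assumptions made for illustration,
with the predicate standing in for a check such as
getSessionCount() == 0.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntPredicate;

public class WorkerSelectorSketch {
    private final AtomicInteger currentWorker = new AtomicInteger();
    private final int workerCount;

    public WorkerSelectorSketch(int workerCount) {
        if (workerCount <= 0) {
            throw new IllegalArgumentException("workerCount must be positive");
        }
        this.workerCount = workerCount;
    }

    /** Round-robin index; floorMod keeps it non-negative after int overflow. */
    int next() {
        return Math.floorMod(currentWorker.getAndIncrement(), workerCount);
    }

    /**
     * Tries each worker at most once and returns the first idle one.
     * If all are busy, falls back to the next plain round-robin pick.
     */
    public int select(IntPredicate isIdle) {
        for (int j = 0; j < workerCount; j++) {
            int i = next();
            if (isIdle.test(i)) {
                return i;
            }
        }
        return next();
    }
}
```

The counter update is atomic, but the idle check and the later
assignment are still separate steps, so two concurrent callers can pick
the same idle worker. A selection scheme like this can reduce
collisions but does nothing about a dispatch thread that is blocked.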

>  But why would that make a worker thread become non-responsive until it 
> finishes?   I see a note on IOEventDispatch suggesting that “all methods of 
> this interface are executed on the dispatch thread of the I/O reactor … it is 
> important that processing that takes place in the event methods will not 
> block the dispatch thread for too long, as the I/O reactor will be unable to 
> react to other events”
> 
> Is that worth pursuing?  Any suggestions on how to debug this would be 
> appreciated!
> 
> - edan
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]



