> On Dec 13, 2016, at 4:59 AM, Oleg Kalnichevski <[email protected]> wrote:
> 
> On Mon, 2016-12-12 at 21:15 +0000, Idzerda, Edan wrote:
>> Hello!  Our reverse proxy uses the Async Client pool to handle connections 
>> to backend servers.  We've been tracking a problem for a while where we 
>> observe the initial TCP connection is made, but no thread is available to 
>> handle the SSL setup before a 10 second timeout expires.  We get into 
>> trouble because some of our backend servers are very slow, and some of our 
>> clients download very slowly.
>> 
>> 
>> I'm experimenting with a patch to AbstractMultiworkerIOReactor.addChannel() 
>> to determine whether the next dispatcher thread is "busy."  My first try was 
>> to look at bufferedSessions from the BaseIOReactor, and go through the list 
>> of dispatchers one time to see if I can find a free one.
>> 
>> 
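>>        // Start from the next round-robin slot.
>>        int i = Math.abs(this.currentWorker++ % this.workerCount);
>> 
>>        // Probe each dispatcher at most once; stop at the first with no sessions.
>>        for (int j = 0; j < this.workerCount; j++) {
>>            if (this.dispatchers[i].getSessionCount() == 0) {
>>                break;
>>            }
>>            i = Math.abs(this.currentWorker++ % this.workerCount);
>>        }
>>        // If no dispatcher was idle, the channel goes to the last slot computed above.
>>        this.dispatchers[i].addChannel(entry);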
>> 
>> This seems to help us in MOST of the cases we see this issue in production, 
>> but there still seem to be a small number of threads which collide.  I'm 
>> testing a different version which looks at AbstractIOReactor "sessions" to 
>> determine thread busy state, but it never seems to show more than "1" 
>> session if I look at the size after piling up slow connections on top of 
>> each other.
>> 
>> I have two questions:
>>    Is there a better way to determine whether a thread is busy?
>>    Would you be willing to accept a patch to make the dispatchers array in 
>> AbstractMultiworkerIOReactor "protected" so I can implement my own 
>> ConnectingIOReactor that overrides addChannel() with my own thread selection 
>> model?
>> 
>> Thanks a lot for your help and for providing such a great library to the 
>> community!
>> 
>> - edan
>> 
> 
> Hi Edan
> 
> What I do not quite understand is why i/o dispatch threads get blocked
> for 10 seconds or longer. This sounds awfully suspicious.
> 
> I could imagine exposing the list of i/o dispatchers to subclasses of
> AbstractMultiworkerIOReactor in 4.4.x branch but would rather prefer to
> keep it as a last resort.
> 
> Oleg

Thanks.  I would prefer not to have to patch httpcore-nio like this if I could 
work out the root cause.  Since I am still seeing connections fail to complete 
the SSL handshake within 10 seconds with my first patch (above), I am now 
trying a new one that uses an AtomicInteger for currentWorker.  We are seeing 
far fewer connection problems with the patch, but there are still enough 
apparent thread selection collisions that some requests fail.
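
A minimal, standalone sketch of that kind of selection follows.  It is not 
httpcore-nio code: WorkerSelector, nextIndex() and pick() are hypothetical 
names, and the sessionCounts array stands in for whatever "busy" signal the 
dispatchers expose.  It assumes addChannel() might be reached from more than 
one thread, which is the only reason to prefer AtomicInteger over a plain int.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the worker selection logic; not httpcore-nio code. */
final class WorkerSelector {
    private final AtomicInteger counter = new AtomicInteger();
    private final int workerCount;

    WorkerSelector(int workerCount) {
        if (workerCount <= 0) {
            throw new IllegalArgumentException("workerCount must be positive");
        }
        this.workerCount = workerCount;
    }

    /** Next round-robin slot; floorMod keeps the result non-negative after int overflow. */
    int nextIndex() {
        return Math.floorMod(counter.getAndIncrement(), workerCount);
    }

    /**
     * Scans once around the ring, starting at the next round-robin slot, and
     * returns the first worker with no sessions. If every worker is busy, the
     * starting slot is returned, which is plain round-robin behaviour.
     */
    int pick(int[] sessionCounts) {
        int start = nextIndex();
        for (int j = 0; j < workerCount; j++) {
            int i = (start + j) % workerCount;
            if (sessionCounts[i] == 0) {
                return i;
            }
        }
        return start;
    }
}
```

Unlike the first patch, this advances the shared counter once per channel 
rather than once per probe.  The counts are only a snapshot, so two channels 
assigned at nearly the same time can still land on the same worker.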

The only way I have been able to reproduce this problem is by using an 
artificially rate-limited connection (e.g., curl --limit-rate 1m) and 
downloading a relatively large file.  If I use a small file, say 50K, I notice 
that the dispatcher threads do not get stuck.  I can download more files than I 
have worker threads, and AbstractIOReactor’s “sessions” set count stays at 0.  
With a larger file, like 500K, the sessions size goes to 1, and I can only 
download the same number of files as I have worker threads.

Does this make any sense to you?  Is it possible the higher level proxy library 
is hanging on to the HttpResponse’s Entity too long?  I see they call 
HttpEntity.getContent() and create an InputStream out of it… But why would that 
make a worker thread become non-responsive until it finishes?   I see a note on 
IOEventDispatch suggesting that “all methods of this interface are executed on 
the dispatch thread of the I/O reactor … it is important that processing that 
takes place in the event methods will not block the dispatch thread for too 
long, as the I/O reactor will be unable to react to other events.”
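
The effect that note describes can be shown with a toy model.  This is only a 
sketch, not httpcore-nio code: a single-threaded executor stands in for one 
I/O dispatcher, a latch stands in for a handler stuck on a slow client, and 
DispatchDemo and the offload flag are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Toy model of one dispatch thread; not httpcore-nio code. */
final class DispatchDemo {

    /** Returns true if a second event was handled while the first handler was still blocked. */
    static boolean secondEventRanWhileFirstBlocked(boolean offload) {
        ExecutorService dispatch = Executors.newSingleThreadExecutor(); // one "I/O dispatcher"
        ExecutorService workers = Executors.newCachedThreadPool();      // hypothetical offload pool
        CountDownLatch release = new CountDownLatch(1);   // stays closed while the slow client drains
        CountDownLatch secondRan = new CountDownLatch(1);

        // Handler that blocks until released, like a blocking read or write.
        Runnable slowHandler = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        try {
            // Event 1: the slow handler, run inline on the dispatch thread or handed off.
            dispatch.execute(() -> {
                if (offload) {
                    workers.execute(slowHandler);
                } else {
                    slowHandler.run();
                }
            });
            // Event 2: e.g. a new connection that needs its SSL setup.
            dispatch.execute(secondRan::countDown);
            return secondRan.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            release.countDown(); // unblock the slow handler so both pools can shut down
            dispatch.shutdown();
            workers.shutdown();
        }
    }
}
```

With offload set to false the second event waits behind the blocked handler; 
with offload set to true it is handled straight away.  If the proxy library 
does blocking stream I/O inside an event method, each slow download would tie 
up one dispatcher in the same way.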

Is that worth pursuing?  Any suggestions on how to debug this would be 
appreciated!
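
One way to check whether the dispatchers are blocked inside an event method is 
to dump their stacks while downloads are stalled (jstack against the proxy 
process gives the same information).  The helper below is a sketch: 
DispatcherDump is a hypothetical name, and the "I/O dispatcher" thread-name 
prefix is an assumption about the default thread naming that should be 
adjusted if the threads are named differently.

```java
import java.util.Map;

/** Hypothetical debugging helper; not part of httpcore-nio. */
final class DispatcherDump {

    /** Returns the state and stack trace of every live thread whose name starts with namePrefix. */
    static String dump(String namePrefix) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (!t.getName().startsWith(namePrefix)) {
                continue;
            }
            sb.append(t.getName()).append(" [").append(t.getState()).append("]\n");
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }
}
```

Calling DispatcherDump.dump("I/O dispatcher") during a stalled download and 
looking for frames from the proxy library, rather than a selector wait, would 
show whether an event method is holding the thread.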

- edan

