On 12/03/2013 18:25, Remy Maucherat wrote:
> On Tue, 2013-03-12 at 13:25 +0000, Mark Thomas wrote:
>> Thanks for the pointers. I tried a few of them (the less invasive ones)
>> with no luck. I'm putting together a test case that demonstrates the
>> problem. That should make it very simple to determine if the root cause
>> lies in APR or my code.
> 
> Ok, too bad. I was thinking the timeout trick could do it since the
> "proper" non blocking option + EAGAIN didn't work that well for me then.
> But that was a while ago. Using a 0 timeout is simpler overall, but
> obviously it looks like a hack.
> 
> Yes, if you can write a simple test case, it would probably be much
> easier to isolate an issue in APR, it's likely far too complex
> otherwise.

And we have a test case. Run the latest TestSocket with:
LIMIT=100000
SSL_ENABLED=true
CHUNK_SIZE=2

The problem isn't 100% reproducible but:
- I have never been able to repeat it with SSL disabled
- I saw the problem once with a chunk size of 4
- I have never seen the problem with a chunk size > 4

Based on what this test does and my observations, I have a theory, but it
will need someone who understands the APR/native code better than I do
to confirm the theory and (if correct) produce a fix.

I am assuming that the SSL encryption processes the data in chunks where
the chunk size is a small number of bytes (e.g. 8).

What I think happens is this:

Lots of data is written to the socket.
The write buffers become 99.9% full.
The client writes a very small amount of data (e.g. 4 bytes).
This is enough for the SSL encryption to output a little more data so
the write buffers are now 100% full.
Not all of the 4 bytes from the client were read.
Socket gets added to the Poller.
Other end of the socket reads some data.
Poller fires write possible.
StartLoop:
Client writes remaining bytes (<4).
There is not enough new input for the SSL encryption to generate more
output so it reports 0 bytes written.
Client believes write buffers are still full.
Socket gets added to the Poller.
Poller instantly fires write possible.
Goto StartLoop

And an infinite loop is entered.
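
To make the suspected loop concrete, here is a rough Java sketch of it. This
is a toy model, not Tomcat or APR/native code. Everything in it is an
assumption made for illustration: the class and method names are invented,
the 8 byte block size is the "e.g. 8" guess from above, and the encoder is
assumed to report only whole blocks as written without retaining a short
remainder. A cap on the number of Poller wake-ups stands in for "forever".

```java
public class WritePollSpinSketch {
    // Assumed encryption chunk size, taken from the "e.g. 8" in the theory above.
    static final int BLOCK = 8;

    // Hypothetical encoder: reports only whole blocks as written and does not
    // retain a short remainder (an assumption of this sketch).
    static int reportedWritten(int offered) {
        return (offered / BLOCK) * BLOCK;
    }

    // Counts Poller wake-ups until the pending bytes are written or the cap is hit.
    static int wakeupsUntilDone(int pending, int cap) {
        int wakeups = 0;
        while (pending > 0 && wakeups < cap) {
            wakeups++;                              // Poller fires write possible
            int written = reportedWritten(pending); // client writes remaining bytes
            pending -= written;
            // written == 0 is read as "buffers still full", so the socket goes
            // back to the Poller, which fires again at once because the socket
            // really is writable.
        }
        return wakeups;
    }

    public static void main(String[] args) {
        System.out.println(wakeupsUntilDone(3, 1000)); // short remainder: hits the cap
        System.out.println(wakeupsUntilDone(8, 1000)); // full block: done in one pass
    }
}
```

With fewer than 8 bytes pending the model never makes progress and only stops
because of the cap. With a full block pending it finishes on the first
wake-up.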

I believe I can fix this by changing how I do the buffering in WebSocket
so I don't end up having to write short 4 byte chunks. I have been
meaning to do this for a while anyway.
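
The sort of buffering change I have in mind looks roughly like the sketch
below. It is not the actual WebSocket code. The class name, the buffer sizes
and the list that stands in for the socket are all made up for illustration.
The idea is only that small chunks are collected in a buffer and handed to
the socket in larger pieces.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class CoalescingWriterSketch {
    private final ByteBuffer buffer;
    // Stand-in for the socket: records the size of each write it would receive.
    final List<Integer> socketWrites = new ArrayList<>();

    CoalescingWriterSketch(int capacity) {
        buffer = ByteBuffer.allocate(capacity);
    }

    // Copies the chunk into the buffer, flushing whenever the buffer fills up.
    void write(byte[] chunk) {
        int offset = 0;
        while (offset < chunk.length) {
            int n = Math.min(buffer.remaining(), chunk.length - offset);
            buffer.put(chunk, offset, n);
            offset += n;
            if (!buffer.hasRemaining()) {
                flush();
            }
        }
    }

    // Hands whatever is buffered to the (stand-in) socket in a single write.
    void flush() {
        buffer.flip();
        if (buffer.hasRemaining()) {
            socketWrites.add(buffer.remaining());
        }
        buffer.clear();
    }

    public static void main(String[] args) {
        CoalescingWriterSketch w = new CoalescingWriterSketch(16);
        for (int i = 0; i < 10; i++) {
            w.write(new byte[2]); // ten 2 byte chunks, as with CHUNK_SIZE=2
        }
        w.flush();
        System.out.println(w.socketWrites);
    }
}
```

With a 16 byte buffer, ten 2 byte chunks reach the socket as one 16 byte
write and one 4 byte write rather than ten 2 byte writes. The final flush can
still produce a short write, so this reduces the exposure rather than
removing it.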

That said, this does look like an issue with APR/native and SSL.

Thoughts?

Mark
