On 12/03/2013 18:25, Remy Maucherat wrote:
> On Tue, 2013-03-12 at 13:25 +0000, Mark Thomas wrote:
>> Thanks for the pointers. I tried a few of them (the less invasive ones)
>> with no luck. I'm putting together a test case that demonstrates the
>> problem. That should make it very simple to determine if the root cause
>> lies in APR or my code.
>
> Ok, too bad. I was thinking the timeout trick could do it since the
> "proper" non blocking option + EAGAIN didn't work that well for me then.
> But that was a while ago. Using a 0 timeout is simpler overall, but
> obviously it looks like a hack.
>
> Yes, if you can write a simple test case, it would probably be much
> easier to isolate an issue in APR, it's likely far too complex
> otherwise.

And we have a test case. Run the latest TestSocket with:
LIMIT=100000 SSL_ENABLED=true CHUNK_SIZE=2

The problem isn't 100% reproducible but:
- I have never been able to repeat it with SSL disabled
- I saw the problem once with a chunk size of 4
- I have never seen the problem with a chunk size > 4

Based on what this test does and my observations I have a theory, but it
will need someone who understands the APR/native code better than I do
to confirm the theory and (if correct) produce a fix.

I am assuming that the SSL encryption processes the data in chunks where
the chunk size is a small number of bytes (e.g. 8).

What I think happens is this:

Lots of data is written to the socket.
The write buffers become 99.9% full.
The client writes a very small amount of data (e.g. 4 bytes).
This is enough for the SSL encryption to output a little more data so
the write buffers are now 100% full.
Not all of the 4 bytes from the client were read.
Socket gets added to the Poller.
Other end of the socket reads some data.
Poller fires write possible.
StartLoop:
Client writes remaining bytes (<4).
There is not enough new input for the SSL encryption to generate more
output so it reports 0 bytes written.
Client believes write buffers are still full.
Socket gets added to the Poller.
Poller instantly fires write possible.
Goto StartLoop

And an infinite loop is entered.

I believe I can fix this by changing how I do the buffering in WebSocket
so I don't end up having to write short 4 byte chunks. I have been
meaning to do this for a while anyway.

That said, this does look like an issue with APR/native and SSL.

Thoughts?

Mark
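
For illustration only, the loop described in the theory above can be modelled with a small toy sketch. Everything in it is an assumption made up for the example: ToyEncryptingSocket, writeAll, pollWritable and BLOCK_SIZE are hypothetical names, not the APR/native or Tomcat API, and the "consume input only in whole blocks" rule is just the simplest behaviour that matches the theory, not a claim about how the real SSL layer works. The wake-up cap exists only so the demo terminates.

```java
public class ZeroWriteLoopSketch {

    /** Hypothetical block size; 8 is taken from the "e.g. 8" guess in the theory. */
    static final int BLOCK_SIZE = 8;

    /** Toy stand-in for the encrypting socket. NOT the APR/native API. */
    static class ToyEncryptingSocket {
        /**
         * Assumed behaviour: input is consumed only in whole blocks, and the
         * return value is the number of input bytes consumed. Anything
         * shorter than BLOCK_SIZE is reported as 0 bytes written.
         */
        int write(int len) {
            return (len / BLOCK_SIZE) * BLOCK_SIZE;
        }

        /** The network buffer has room, so the toy poller fires immediately. */
        boolean pollWritable() {
            return true;
        }
    }

    /**
     * Writer that treats "bytes left over" as "buffers full, wait for the
     * poller". Returns the number of poller wake-ups used, or -1 if
     * maxWakeUps was reached with bytes still unwritten (the suspected loop).
     */
    static int writeAll(ToyEncryptingSocket socket, int remaining, int maxWakeUps) {
        int wakeUps = 0;
        while (remaining > 0) {
            int written = socket.write(remaining);
            remaining -= written;
            if (remaining > 0) {
                if (wakeUps == maxWakeUps) {
                    return -1; // cap reached: no progress is ever made
                }
                if (socket.pollWritable()) {
                    wakeUps++; // "write possible" fires at once, so retry
                }
            }
        }
        return wakeUps;
    }

    public static void main(String[] args) {
        ToyEncryptingSocket socket = new ToyEncryptingSocket();
        // Whole blocks go straight through without touching the poller.
        System.out.println("16 bytes: " + writeAll(socket, 16, 1000));
        // A 2 byte tail never makes progress and hits the cap.
        System.out.println(" 2 bytes: " + writeAll(socket, 2, 1000));
    }
}
```

In this model a writer that only ever hands over whole blocks never gets stuck, which is consistent with the idea of changing the WebSocket buffering so that short chunks are not written on their own.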