2015-02-24 16:33 GMT+01:00 Mark Thomas <ma...@apache.org>:

> On 24/02/2015 13:10, Rémy Maucherat wrote:
> > I'm having issues with the write timeout tests in
> > TestWsWebSocketContainer, which made me do some changes since there are
> > still things I don't understand:
>
> These appear to be OK for me at the moment with NIO and NIO2 but the
> very nature of timing issues means that doesn't count for much. I am
> seeing failures or crashes with APR/native so there is still work to be
> done there.
>
> > - In WsRemoteEndpointImplServer, onWritePossible appears to be able to be
> > invoked concurrently (doWrite calls it directly and changes the
> buffers). I
> > think it should be synced.
>
> Those calls should be nested. If you are seeing concurrent calls then
> there is probably still an issue around write registration.
>

I still think there is concurrency there, at least with the first write
notification (which is concurrent if the first read does write immediately,
just like our "big" failing test does). Without the read/write concurrency,
I think there wouldn't be any issue.

With the TestWebSocketFrameClient failure, the contending traces look like
(I used a semaphore to isolate them):
    [junit] java.lang.Exception: Stack trace
    [junit]     at java.lang.Thread.dumpStack(Thread.java:1329)
    [junit]     at
org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer.onWritePossible(WsRemoteEndpointImplServer.java:146)
    [junit]     at
org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer.doWrite(WsRemoteEndpointImplServer.java:87)
    [junit]     at
org.apache.tomcat.websocket.WsRemoteEndpointImplBase$OutputBufferSendHandler.write(WsRemoteEndpointImplBase.java:822)
    [junit]     at
org.apache.tomcat.websocket.WsRemoteEndpointImplBase.writeMessagePart(WsRemoteEndpointImplBase.java:447)
    [junit]     at
org.apache.tomcat.websocket.WsRemoteEndpointImplBase.startMessage(WsRemoteEndpointImplBase.java:338)
    [junit]     at
org.apache.tomcat.websocket.WsRemoteEndpointImplBase$TextMessageSendHandler.write(WsRemoteEndpointImplBase.java:730)
    [junit]     at
org.apache.tomcat.websocket.WsRemoteEndpointImplBase.sendPartialString(WsRemoteEndpointImplBase.java:250)
    [junit]     at
org.apache.tomcat.websocket.WsRemoteEndpointImplBase.sendString(WsRemoteEndpointImplBase.java:193)
    [junit]     at
org.apache.tomcat.websocket.WsRemoteEndpointBasic.sendText(WsRemoteEndpointBasic.java:37)
    [junit]     at
org.apache.tomcat.websocket.TesterFirehoseServer$Endpoint.onMessage(TesterFirehoseServer.java:121)
    [junit]     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
Method)
    [junit]     at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    [junit]     at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    [junit]     at java.lang.reflect.Method.invoke(Method.java:483)
    [junit]     at
org.apache.tomcat.websocket.pojo.PojoMessageHandlerWholeBase.onMessage(PojoMessageHandlerWholeBase.java:80)
    [junit]     at
org.apache.tomcat.websocket.WsFrameBase.sendMessageText(WsFrameBase.java:393)
    [junit]     at
org.apache.tomcat.websocket.WsFrameBase.processDataText(WsFrameBase.java:494)
    [junit]     at
org.apache.tomcat.websocket.WsFrameBase.processData(WsFrameBase.java:289)
    [junit]     at
org.apache.tomcat.websocket.WsFrameBase.processInputBuffer(WsFrameBase.java:130)
    [junit]     at
org.apache.tomcat.websocket.server.WsFrameServer.onDataAvailable(WsFrameServer.java:56)
    [junit]     at
org.apache.tomcat.websocket.server.WsHttpUpgradeHandler$WsReadListener.onDataAvailable(WsHttpUpgradeHandler.java:207)
    [junit]     at
org.apache.coyote.http11.upgrade.UpgradeServletInputStream.onDataAvailable(UpgradeServletInputStream.java:213)
    [junit]     at
org.apache.coyote.http11.upgrade.UpgradeProcessor.upgradeDispatch(UpgradeProcessor.java:108)
    [junit]     at
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:658)
    [junit]     at
org.apache.coyote.http11.Http11Nio2Protocol$Http11ConnectionHandler.process(Http11Nio2Protocol.java:130)
    [junit]     at
org.apache.tomcat.util.net.Nio2Endpoint$SocketProcessor.doRun(Nio2Endpoint.java:1694)
    [junit]     at
org.apache.tomcat.util.net.Nio2Endpoint$SocketProcessor.run(Nio2Endpoint.java:1653)
    [junit]     at
org.apache.tomcat.util.net.Nio2Endpoint.processSocket0(Nio2Endpoint.java:578)
    [junit]     at
org.apache.tomcat.util.net.Nio2Endpoint.processSocket(Nio2Endpoint.java:563)
    [junit]     at
org.apache.tomcat.util.net.Nio2Endpoint$Nio2SocketWrapper$3.completed(Nio2Endpoint.java:794)
    [junit]     at
org.apache.tomcat.util.net.Nio2Endpoint$Nio2SocketWrapper$3.completed(Nio2Endpoint.java:775)
    [junit]     at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
    [junit]     at sun.nio.ch.Invoker$2.run(Invoker.java:218)
    [junit]     at
sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112)
    [junit]     at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    [junit]     at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    [junit]     at
org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    [junit]     at java.lang.Thread.run(Thread.java:745)

    [junit] java.lang.Exception: Stack trace
    [junit]     at java.lang.Thread.dumpStack(Thread.java:1329)
    [junit]     at
org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer.onWritePossible(WsRemoteEndpointImplServer.java:146)
    [junit]     at
org.apache.tomcat.websocket.server.WsHttpUpgradeHandler$WsWriteListener.onWritePossible(WsHttpUpgradeHandler.java:252)
    [junit]     at
org.apache.coyote.http11.upgrade.UpgradeServletOutputStream.onWritePossible(UpgradeServletOutputStream.java:252)
    [junit]     at
org.apache.coyote.http11.upgrade.UpgradeProcessor.upgradeDispatch(UpgradeProcessor.java:110)
    [junit]     at
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:658)
    [junit]     at
org.apache.coyote.http11.Http11Nio2Protocol$Http11ConnectionHandler.process(Http11Nio2Protocol.java:130)
    [junit]     at
org.apache.tomcat.util.net.Nio2Endpoint$SocketProcessor.doRun(Nio2Endpoint.java:1694)
    [junit]     at
org.apache.tomcat.util.net.Nio2Endpoint$SocketProcessor.run(Nio2Endpoint.java:1649)
    [junit]     at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    [junit]     at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    [junit]     at
org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    [junit]     at java.lang.Thread.run(Thread.java:745)


> > - In Nio2Endpoint socket wrapper uses nestedWriteCompletionCount over the
> > inline flag that was used in 8. If the write completes inline, then
> isReady
> > should already be set back to true, and writing could continue. So the
> > change was IMO adding more write notifications which could hide some
> > issues. I tried changing that many times following the refactoring
> started,
> > but this is the first time I can do it without obviously breaking the
> > testsuite (where some of the non blocking write tests would hang due to
> > missing write notifications).
>
> This change was to prevent multiple write threads being triggered if
> there were multiple levels of nesting with the write completion handler.
> It was a fairly rare event but it did happen.
>

If it recurses, only one completion handler (the most nested one) will
notify. With the algorithm change, it would be the first one, which IMO
does not make any real difference but changes the timing of the call. So
not sure but it's not a priority.

>
> > - NPE guards in the NIO connector socket processor for concurrent closing
> > [NIO2 has them, somehow it wasn't needed earlier in NIO, which is also an
> > odd thing; I actually feel better having to add them].
> >
> > So this could improve on some possible timing related problems. I'll keep
> > on investigating though before committing anything.
>
> One thing to keep in mind that may simplify some of these issues is that
> once WebSocket moves to using the Tomcat I/O layer directly the
> requirement for one container thread reading and one container thread
> writing concurrently will go away. A number of the concurrency issues we
> have observed are triggered by these concurrent threads so switching
> back to a single thread should help.
>
> Yes, IMO the read/write concurrency is causing problems.

Rémy

Reply via email to