Re: Stabilizing the trunk (9.0.x) build

2015-03-02 Thread Mark Thomas
On 27/02/2015 15:01, Mark Thomas wrote:
 On 27/02/2015 14:42, Christopher Schultz wrote:
 On 2/27/15 7:00 AM, Mark Thomas wrote:

<snip/>

 There is also an issue with APR on Linux that I can reproduce (with some
 code changes) that triggers a crash every couple of runs.

 Next time it happens, can you give me the backtrace and register details
 (basically, the top of the Java hs_* file)?

 From my perspective, it should not be possible to crash tcnative if we
 can help it -- even if the Java code is all kinds of wrong. Throwing
 exceptions is fine, but taking-down the JVM is obnoxious :)
 
 I should be able to do this fairly easily. I'll open a BZ item with the
 info you requested when I have it.

As requested:

https://bz.apache.org/bugzilla/show_bug.cgi?id=57653

Mark


-
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org



Re: Stabilizing the trunk (9.0.x) build

2015-02-27 Thread Christopher Schultz
Mark,

On 2/27/15 7:00 AM, Mark Thomas wrote:
 Another update. I think I am getting close to being able to commit these
 changes[1].
 
 The current status is:
 - NIO appears to pass on Windows, OSX and Linux
 - APR appears to pass on OSX and Linux
 - APR unknown on Windows
 - NIO2 appears to pass on OSX and Linux
 - NIO2 hanging on Windows
 
 I say appears to pass since with timing issues one can never be sure.
 
 There is also an issue with APR on Linux that I can reproduce (with some
 code changes) that triggers a crash every couple of runs.

Next time it happens, can you give me the backtrace and register details
(basically, the top of the Java hs_* file)?

From my perspective, it should not be possible to crash tcnative if we
can help it -- even if the Java code is all kinds of wrong. Throwing
exceptions is fine, but taking-down the JVM is obnoxious :)

 I'm not sure if it is possible to trigger the error with the current
 code. I plan to look at this some more once the unit tests are
 passing.

-chris





Re: Stabilizing the trunk (9.0.x) build

2015-02-27 Thread Mark Thomas
On 27/02/2015 14:42, Christopher Schultz wrote:
 Mark,
 
 On 2/27/15 7:00 AM, Mark Thomas wrote:
 Another update. I think I am getting close to being able to commit these
 changes[1].

 The current status is:
 - NIO appears to pass on Windows, OSX and Linux
 - APR appears to pass on OSX and Linux
 - APR unknown on Windows
 - NIO2 appears to pass on OSX and Linux
 - NIO2 hanging on Windows

 I say appears to pass since with timing issues one can never be sure.

Cracked it (I think). Unit tests pass for all three connectors on all
three platforms.

 There is also an issue with APR on Linux that I can reproduce (with some
 code changes) that triggers a crash every couple of runs.
 
 Next time it happens, can you give me the backtrace and register details
 (basically, the top of the Java hs_* file)?
 
 From my perspective, it should not be possible to crash tcnative if we
 can help it -- even if the Java code is all kinds of wrong. Throwing
 exceptions is fine, but taking-down the JVM is obnoxious :)

I should be able to do this fairly easily. I'll open a BZ item with the
info you requested when I have it.

Mark





Re: Stabilizing the trunk (9.0.x) build

2015-02-27 Thread Rémy Maucherat
2015-02-27 13:00 GMT+01:00 Mark Thomas ma...@apache.org:

 Another update. I think I am getting close to being able to commit these
 changes[1].

 The current status is:
 - NIO appears to pass on Windows, OSX and Linux
 - APR appears to pass on OSX and Linux
 - APR unknown on Windows
 - NIO2 appears to pass on OSX and Linux
 - NIO2 hanging on Windows

The testsuite passes for me on Windows (with failures unrelated to
connectors or WebSocket) and on Linux (NIO2). Do I need a really slow thing
like the CI system to run into issues?

It's not related, but there's a glitch with some testsuites and CI systems:
the websocket client needs a lot of entropy if each test is run in a
separate JVM (this does not happen with the Tomcat testsuite).

Rémy


Re: Stabilizing the trunk (9.0.x) build

2015-02-27 Thread Mark Thomas
Another update. I think I am getting close to being able to commit these
changes[1].

The current status is:
- NIO appears to pass on Windows, OSX and Linux
- APR appears to pass on OSX and Linux
- APR unknown on Windows
- NIO2 appears to pass on OSX and Linux
- NIO2 hanging on Windows

I say appears to pass since with timing issues one can never be sure.

There is also an issue with APR on Linux that I can reproduce (with some
code changes) that triggers a crash every couple of runs. I'm not sure
if it is possible to trigger the error with the current code. I plan to
look at this some more once the unit tests are passing.

Mark


[1] https://github.com/markt-asf/tomcat/tree/markt-trunk




Re: Stabilizing the trunk (9.0.x) build

2015-02-26 Thread Mark Thomas
On 26/02/2015 21:58, Christopher Schultz wrote:
 Mark,
 
 On 2/23/15 4:16 AM, Mark Thomas wrote:
 Given that it is my changes that have triggered the problems I think I
 have a responsibility to fix them (and intend to do so over) but I'm not
 going to say no if anyone wants to pitch in. Therefore, I'm starting
 this thread so that we can co-ordinate work on fixing the various
 failures being reported.

 I'm going to start with why
 TestWsWebSocketContainer.testMaxMessageSize04() hangs on Windows.
 
 I'd like to commit Ognjen's patch for
 https://bz.apache.org/bugzilla/show_bug.cgi?id=55988 (patch is
 https://bz.apache.org/bugzilla/attachment.cgi?id=32407&action=diff).
 
 It's fairly innocuous, but since it will change the AbstractEndpoint
 class and you guys are trying to track-down irritating issues in there,
 would you prefer that I hold-off?

No objections to committing the patch from me.

Mark





Re: Stabilizing the trunk (9.0.x) build

2015-02-26 Thread Mark Thomas
On 26/02/2015 12:25, Rémy Maucherat wrote:
 2015-02-26 11:42 GMT+01:00 Mark Thomas ma...@apache.org:
 
 What I have at the moment is at:
 https://github.com/markt-asf/tomcat/tree/markt-trunk

 I'm currently running the unit tests.

 Looking good.

Better, certainly.

NIO tests pass on Windows, Linux and OSX.
I've found a bug in NIO2 + SSL that is fairly common on Linux/OSX that I
have fixed and am re-running the tests.

I haven't really looked at APR/native yet but there did appear to be
some unexpectedly long running tests on Windows (the only platform to
get to APR/native so far) so I suspect there is still more to do.

Overall I think things are heading in the right direction.

Mark





Re: Stabilizing the trunk (9.0.x) build

2015-02-26 Thread Rémy Maucherat
2015-02-26 17:40 GMT+01:00 Mark Thomas ma...@apache.org:

 Better, certainly.

 NIO tests pass on Windows, Linux and OSX.


Very good !


 I've found a bug in NIO2 + SSL that is fairly common on Linux/OSX that I
 have fixed and am re-running the tests.


Aw, *another* one ?


 I haven't really looked at APR/native yet but there did appear to be
 some unexpectedly long running tests on Windows (the only platform to
 get to APR/native so far) so I suspect there is still more to do.

 Overall I think things are heading in the right direction.

Rémy


Re: Stabilizing the trunk (9.0.x) build

2015-02-26 Thread Mark Thomas
On 26/02/2015 17:34, Rémy Maucherat wrote:
 2015-02-26 17:40 GMT+01:00 Mark Thomas ma...@apache.org:
 
 Better, certainly.

 NIO tests pass on Windows, Linux and OSX.

 
 Very good !
 
 
 I've found a bug in NIO2 + SSL that is fairly common on Linux/OSX that I
 have fixed and am re-running the tests.

 
 Aw, *another* one ?

Yes. Looks like it affects 8.0.x as well.

This fixes it:
https://github.com/markt-asf/tomcat/commit/f8eda8da61751b0b224d59dbd93ed9f5f1fa9441

 I haven't really looked at APR/native yet but there did appear to be
 some unexpectedly long running tests on Windows (the only platform to
 get to APR/native so far) so I suspect there is still more to do.

There was another issue, but it looked to be a fairly simple one: one of
the concurrent read/write fixes was breaking a bunch of stuff. Most
likely the fix wasn't right, but since we don't need it, removing it was
the simplest solution.

Mark




Re: Stabilizing the trunk (9.0.x) build

2015-02-26 Thread Mark Thomas
On 25/02/2015 19:32, Rémy Maucherat wrote:
 2015-02-25 19:36 GMT+01:00 Mark Thomas ma...@apache.org:
 
 I was planning on waiting until the build was stable but given that:
 - read/write concurrency is at the root of a lot of these issues
 - only WebSocket should be using it now in trunk
 - the plan is to refactor WebSocket to remove it

 I'm going to go back to what I have in git, rebase it to current trunk
 and see where we are. If the unit tests pass on the usual platforms I'd
 be tempted to commit it. WDYT?

 
 Ok.

What I have at the moment is at:
https://github.com/markt-asf/tomcat/tree/markt-trunk

I'm currently running the unit tests.

Mark




Re: Stabilizing the trunk (9.0.x) build

2015-02-26 Thread Rémy Maucherat
2015-02-26 11:42 GMT+01:00 Mark Thomas ma...@apache.org:

 What I have at the moment is at:
 https://github.com/markt-asf/tomcat/tree/markt-trunk

 I'm currently running the unit tests.

Looking good.

Rémy


Re: Stabilizing the trunk (9.0.x) build

2015-02-26 Thread Christopher Schultz
Mark,

On 2/23/15 4:16 AM, Mark Thomas wrote:
 Given that it is my changes that have triggered the problems I think I
 have a responsibility to fix them (and intend to do so over) but I'm not
 going to say no if anyone wants to pitch in. Therefore, I'm starting
 this thread so that we can co-ordinate work on fixing the various
 failures being reported.
 
 I'm going to start with why
 TestWsWebSocketContainer.testMaxMessageSize04() hangs on Windows.

I'd like to commit Ognjen's patch for
https://bz.apache.org/bugzilla/show_bug.cgi?id=55988 (patch is
https://bz.apache.org/bugzilla/attachment.cgi?id=32407&action=diff).

It's fairly innocuous, but since it will change the AbstractEndpoint
class and you guys are trying to track-down irritating issues in there,
would you prefer that I hold-off?

Thanks,
-chris





Re: Stabilizing the trunk (9.0.x) build

2015-02-25 Thread Rémy Maucherat
2015-02-24 16:33 GMT+01:00 Mark Thomas ma...@apache.org:

 On 24/02/2015 13:10, Rémy Maucherat wrote:
  I'm having issues with the write timeout tests in
  TestWsWebSocketContainer, which made me do some changes since there are
  still things I don't understand:

 These appear to be OK for me at the moment with NIO and NIO2 but the
 very nature of timing issues means that doesn't count for much. I am
 seeing failures or crashes with APR/native so there is still work to be
 done there.

  - In WsRemoteEndpointImplServer, onWritePossible appears to be able to be
  invoked concurrently (doWrite calls it directly and changes the
 buffers). I
  think it should be synced.

 Those calls should be nested. If you are seeing concurrent calls then
 there is probably still an issue around write registration.


I still think there is concurrency there, at least with the first write
notification (which is concurrent if the first read does write immediately,
just like our big failing test does). Without the read/write concurrency,
I think there wouldn't be any issue.

With the TestWebSocketFrameClient failure, the contending traces look like
(I used a semaphore to isolate them):
[junit] java.lang.Exception: Stack trace
[junit] at java.lang.Thread.dumpStack(Thread.java:1329)
[junit] at org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer.onWritePossible(WsRemoteEndpointImplServer.java:146)
[junit] at org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer.doWrite(WsRemoteEndpointImplServer.java:87)
[junit] at org.apache.tomcat.websocket.WsRemoteEndpointImplBase$OutputBufferSendHandler.write(WsRemoteEndpointImplBase.java:822)
[junit] at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.writeMessagePart(WsRemoteEndpointImplBase.java:447)
[junit] at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.startMessage(WsRemoteEndpointImplBase.java:338)
[junit] at org.apache.tomcat.websocket.WsRemoteEndpointImplBase$TextMessageSendHandler.write(WsRemoteEndpointImplBase.java:730)
[junit] at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.sendPartialString(WsRemoteEndpointImplBase.java:250)
[junit] at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.sendString(WsRemoteEndpointImplBase.java:193)
[junit] at org.apache.tomcat.websocket.WsRemoteEndpointBasic.sendText(WsRemoteEndpointBasic.java:37)
[junit] at org.apache.tomcat.websocket.TesterFirehoseServer$Endpoint.onMessage(TesterFirehoseServer.java:121)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit] at java.lang.reflect.Method.invoke(Method.java:483)
[junit] at org.apache.tomcat.websocket.pojo.PojoMessageHandlerWholeBase.onMessage(PojoMessageHandlerWholeBase.java:80)
[junit] at org.apache.tomcat.websocket.WsFrameBase.sendMessageText(WsFrameBase.java:393)
[junit] at org.apache.tomcat.websocket.WsFrameBase.processDataText(WsFrameBase.java:494)
[junit] at org.apache.tomcat.websocket.WsFrameBase.processData(WsFrameBase.java:289)
[junit] at org.apache.tomcat.websocket.WsFrameBase.processInputBuffer(WsFrameBase.java:130)
[junit] at org.apache.tomcat.websocket.server.WsFrameServer.onDataAvailable(WsFrameServer.java:56)
[junit] at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler$WsReadListener.onDataAvailable(WsHttpUpgradeHandler.java:207)
[junit] at org.apache.coyote.http11.upgrade.UpgradeServletInputStream.onDataAvailable(UpgradeServletInputStream.java:213)
[junit] at org.apache.coyote.http11.upgrade.UpgradeProcessor.upgradeDispatch(UpgradeProcessor.java:108)
[junit] at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:658)
[junit] at org.apache.coyote.http11.Http11Nio2Protocol$Http11ConnectionHandler.process(Http11Nio2Protocol.java:130)
[junit] at org.apache.tomcat.util.net.Nio2Endpoint$SocketProcessor.doRun(Nio2Endpoint.java:1694)
[junit] at org.apache.tomcat.util.net.Nio2Endpoint$SocketProcessor.run(Nio2Endpoint.java:1653)
[junit] at org.apache.tomcat.util.net.Nio2Endpoint.processSocket0(Nio2Endpoint.java:578)
[junit] at org.apache.tomcat.util.net.Nio2Endpoint.processSocket(Nio2Endpoint.java:563)
[junit] at org.apache.tomcat.util.net.Nio2Endpoint$Nio2SocketWrapper$3.completed(Nio2Endpoint.java:794)
[junit] at org.apache.tomcat.util.net.Nio2Endpoint$Nio2SocketWrapper$3.completed(Nio2Endpoint.java:775)
[junit] at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
[junit] at sun.nio.ch.Invoker$2.run(Invoker.java:218)
[junit] at sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112)
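
For readers who want to reproduce this kind of check: the following is a
minimal Java sketch, not taken from the Tomcat code base, of one way to tell
nested calls on the same thread (which Mark's quoted reply says should be the
normal case) apart from genuinely concurrent entry. The class name
ConcurrentEntryDetector and all of its methods are hypothetical. It tracks the
owning thread with an AtomicReference rather than the semaphore mentioned in
the message, so that re-entry by the same thread is not flagged.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical diagnostic helper; not part of Tomcat.
final class ConcurrentEntryDetector {
    private final AtomicReference<Thread> owner = new AtomicReference<>();
    private final AtomicInteger concurrentEntries = new AtomicInteger();

    // Runs body and prints a stack trace if a different thread is already inside.
    void run(Runnable body) {
        Thread me = Thread.currentThread();
        boolean outermost = owner.compareAndSet(null, me);
        if (!outermost && owner.get() != me) {
            // Not a nested call by the owning thread, so this entry overlaps.
            concurrentEntries.incrementAndGet();
            new Exception("Concurrent entry by " + me.getName()).printStackTrace();
        }
        try {
            body.run();
        } finally {
            if (outermost) {
                owner.set(null);
            }
        }
    }

    int concurrentEntries() {
        return concurrentEntries.get();
    }

    // Nested call on a single thread; returns the number of flagged entries.
    static int nestedDemo() {
        ConcurrentEntryDetector d = new ConcurrentEntryDetector();
        d.run(() -> d.run(() -> { }));
        return d.concurrentEntries();
    }

    // A second thread enters while the first is still inside.
    static int concurrentDemo() {
        ConcurrentEntryDetector d = new ConcurrentEntryDetector();
        CountDownLatch inside = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread first = new Thread(() -> d.run(() -> {
            inside.countDown();
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
        first.start();
        try {
            inside.await();      // the first thread is now inside run()
            d.run(() -> { });    // overlapping entry from the calling thread
            release.countDown();
            first.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return d.concurrentEntries();
    }

    public static void main(String[] args) {
        System.out.println("nested entries flagged: " + nestedDemo());
        System.out.println("concurrent entries flagged: " + concurrentDemo());
    }
}
```

Wrapping a method body such as onWritePossible in run(...) would then print a
trace only for the overlapping case. Whether that matches what was actually
observed here is not something the sketch can confirm.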

Re: Stabilizing the trunk (9.0.x) build

2015-02-25 Thread Mark Thomas
On 25/02/2015 14:31, Rémy Maucherat wrote:
 2015-02-24 16:33 GMT+01:00 Mark Thomas ma...@apache.org:
 
 On 24/02/2015 13:10, Rémy Maucherat wrote:
 I'm having issues with the write timeout tests in
 TestWsWebSocketContainer, which made me do some changes since there are
 still things I don't understand:

 These appear to be OK for me at the moment with NIO and NIO2 but the
 very nature of timing issues means that doesn't count for much. I am
 seeing failures or crashes with APR/native so there is still work to be
 done there.

 - In WsRemoteEndpointImplServer, onWritePossible appears to be able to be
 invoked concurrently (doWrite calls it directly and changes the
 buffers). I
 think it should be synced.

 Those calls should be nested. If you are seeing concurrent calls then
 there is probably still an issue around write registration.

 
 I still think there is concurrency there, at least with the first write
 notification (which is concurrent if the first read does write immediately,
 just like our big failing test does). Without the read/write concurrency,
 I think there wouldn't be any issue.
 
 With the TestWebSocketFrameClient failure, the contending traces look like
 (I used a semaphore to isolate them):
 [junit] java.lang.Exception: Stack trace
 [junit] at java.lang.Thread.dumpStack(Thread.java:1329)
 [junit] at org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer.onWritePossible(WsRemoteEndpointImplServer.java:146)
 [junit] at org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer.doWrite(WsRemoteEndpointImplServer.java:87)
 [junit] at org.apache.tomcat.websocket.WsRemoteEndpointImplBase$OutputBufferSendHandler.write(WsRemoteEndpointImplBase.java:822)
 [junit] at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.writeMessagePart(WsRemoteEndpointImplBase.java:447)
 [junit] at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.startMessage(WsRemoteEndpointImplBase.java:338)
 [junit] at org.apache.tomcat.websocket.WsRemoteEndpointImplBase$TextMessageSendHandler.write(WsRemoteEndpointImplBase.java:730)
 [junit] at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.sendPartialString(WsRemoteEndpointImplBase.java:250)
 [junit] at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.sendString(WsRemoteEndpointImplBase.java:193)
 [junit] at org.apache.tomcat.websocket.WsRemoteEndpointBasic.sendText(WsRemoteEndpointBasic.java:37)
 [junit] at org.apache.tomcat.websocket.TesterFirehoseServer$Endpoint.onMessage(TesterFirehoseServer.java:121)
 [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 [junit] at java.lang.reflect.Method.invoke(Method.java:483)
 [junit] at org.apache.tomcat.websocket.pojo.PojoMessageHandlerWholeBase.onMessage(PojoMessageHandlerWholeBase.java:80)
 [junit] at org.apache.tomcat.websocket.WsFrameBase.sendMessageText(WsFrameBase.java:393)
 [junit] at org.apache.tomcat.websocket.WsFrameBase.processDataText(WsFrameBase.java:494)
 [junit] at org.apache.tomcat.websocket.WsFrameBase.processData(WsFrameBase.java:289)
 [junit] at org.apache.tomcat.websocket.WsFrameBase.processInputBuffer(WsFrameBase.java:130)
 [junit] at org.apache.tomcat.websocket.server.WsFrameServer.onDataAvailable(WsFrameServer.java:56)
 [junit] at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler$WsReadListener.onDataAvailable(WsHttpUpgradeHandler.java:207)
 [junit] at org.apache.coyote.http11.upgrade.UpgradeServletInputStream.onDataAvailable(UpgradeServletInputStream.java:213)
 [junit] at org.apache.coyote.http11.upgrade.UpgradeProcessor.upgradeDispatch(UpgradeProcessor.java:108)
 [junit] at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:658)
 [junit] at org.apache.coyote.http11.Http11Nio2Protocol$Http11ConnectionHandler.process(Http11Nio2Protocol.java:130)
 [junit] at org.apache.tomcat.util.net.Nio2Endpoint$SocketProcessor.doRun(Nio2Endpoint.java:1694)
 [junit] at org.apache.tomcat.util.net.Nio2Endpoint$SocketProcessor.run(Nio2Endpoint.java:1653)
 [junit] at org.apache.tomcat.util.net.Nio2Endpoint.processSocket0(Nio2Endpoint.java:578)
 [junit] at org.apache.tomcat.util.net.Nio2Endpoint.processSocket(Nio2Endpoint.java:563)
 [junit] at org.apache.tomcat.util.net.Nio2Endpoint$Nio2SocketWrapper$3.completed(Nio2Endpoint.java:794)
 [junit] at org.apache.tomcat.util.net.Nio2Endpoint$Nio2SocketWrapper$3.completed(Nio2Endpoint.java:775)
 [junit] at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
 [junit] at sun.nio.ch.Invoker$2.run(Invoker.java:218)
  

Re: Stabilizing the trunk (9.0.x) build

2015-02-25 Thread Rémy Maucherat
2015-02-25 19:36 GMT+01:00 Mark Thomas ma...@apache.org:

 I was planning on waiting until the build was stable but given that:
 - read/write concurrency is at the root of a lot of these issues
 - only WebSocket should be using it now in trunk
 - the plan is to refactor WebSocket to remove it

 I'm going to go back to what I have in git, rebase it to current trunk
 and see where we are. If the unit tests pass on the usual platforms I'd
 be tempted to commit it. WDYT?


Ok.

Rémy


Re: Stabilizing the trunk (9.0.x) build

2015-02-24 Thread Mark Thomas
Progress is being made.

TestWsWebSocketContainer.testMaxMessageSize04() is fixed. I do want to
come back to exactly how/if flushing is performed on
ServletOutputStream.close() but I plan on parking that until the other
failures are fixed.

Next on my list is TestUpgrade.testMessagesBlocking(). I am seeing
failures on most runs on Linux and Windows (command line only - not
IDE). The symptom is that the connection to the client is closed before
the second message is received. I've tried - without success so far - to
reproduce this in a debugger.

I'll be working on this today.

On a related topic the Gump OpenSSL tests are still failing. They pass
when run directly from the command line on vmgump.a.o. I can't come up
with a better idea than adding some debugging to the tests.

Mark




Re: Stabilizing the trunk (9.0.x) build

2015-02-24 Thread Rainer Jung

Am 24.02.2015 um 10:01 schrieb Mark Thomas:


On a related topic the Gump OpenSSL tests are still failing. They pass
when run directly from the command line on vmgump.a.o. I can't come up
with a better idea than adding some debugging to the tests.


I installed OpenSSL master (current snapshot) locally and ran the
TestOpenSSLCipherConfigurationParser test against our trunk. I get
failures as well, although I can confirm that the correct OpenSSL
version 1.1.0-dev was used.


Looking at the simplest failure example, SSLv2: OpenSSL 1.1.0 no longer
supports SSLv2, so openssl ciphers -v SSLv2 returns an empty result
and that is what the test expects. OTOH in
TestOpenSSLCipherConfigurationParser there are about 6 ciphers which are
defined for SSLv2 and those show up in the failed tests (plus some of
their aliases).


Not sure how to handle OpenSSL version compatibility in the tests and in
the Tomcat runtime code. Which version of OpenSSL is
java/org/apache/tomcat/util/net/jsse/openssl/ supposed to reflect? Any
specific version, or any cipher existing in some OpenSSL version? That
code, I think, does not actually use OpenSSL and is only a translation
mechanism from OpenSSL syntax to JSSE syntax, correct?


The tests OTOH actually use OpenSSL and compare results, so they would
never be compatible with an extended cipher list. Maybe for testing we need
to mark the ciphers in the list that actually exist in the OpenSSL version
that's supposed to be used during the tests? I don't have a convincing
idea...


Regards,

Rainer
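
One possible reading of the "mark the ciphers that actually exist" idea,
sketched in Java under stated assumptions: take the text that
openssl ciphers -v prints (the cipher name is the first whitespace-separated
field on each line) and restrict the list of known ciphers to the names that
appear there. The class CipherAvailabilityFilter, its method names and the
sample data are hypothetical and are not how the Tomcat tests work. The
sketch takes the command output as a string rather than invoking OpenSSL.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical test helper; not part of Tomcat.
final class CipherAvailabilityFilter {

    // Extracts cipher names from "openssl ciphers -v" style output:
    // one cipher per line, name in the first whitespace-separated field.
    static Set<String> parseCipherNames(String opensslOutput) {
        Set<String> names = new LinkedHashSet<>();
        for (String line : opensslOutput.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed.split("\\s+")[0]);
            }
        }
        return names;
    }

    // Keeps only the known ciphers that the installed OpenSSL reports,
    // preserving the order of the known list.
    static Set<String> restrictToAvailable(List<String> known, Set<String> available) {
        Set<String> result = new LinkedHashSet<>(known);
        result.retainAll(available);
        return result;
    }

    public static void main(String[] args) {
        // Illustrative sample only; real output depends on the OpenSSL build.
        String sampleOutput =
                "ECDHE-RSA-AES256-GCM-SHA384 TLSv1.2 Kx=ECDH Au=RSA Enc=AESGCM(256) Mac=AEAD\n"
              + "AES128-SHA SSLv3 Kx=RSA Au=RSA Enc=AES(128) Mac=SHA1\n";
        List<String> known = Arrays.asList(
                "DES-CBC3-MD5", "ECDHE-RSA-AES256-GCM-SHA384", "AES128-SHA");
        Set<String> available = parseCipherNames(sampleOutput);
        System.out.println(restrictToAvailable(known, available));
    }
}
```

With a filter like this, an expectation for an SSLv2-only cipher would simply
drop out when the OpenSSL under test no longer reports it. It does not answer
the question of which OpenSSL version the translation code should reflect.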




Re: Stabilizing the trunk (9.0.x) build

2015-02-24 Thread Mark Thomas
On 24/02/2015 09:01, Mark Thomas wrote:
 Progress is being made.
 
 TestWsWebSocketContainer.testMaxMessageSize04() is fixed. I do want to
 come back to exactly how/if flushing is performed on
 ServletOutputStream.close() but I plan on parking that until the other
 failures are fixed.
 
 Next on my list is TestUpgrade.testMessagesBlocking(). I am seeing
 failures on most runs on Linux and Windows (command line only - not
 IDE). The symptom is that the connection to the client is closed before
 the second message is received. I've tried - without success so far - to
 reproduce this in a debugger.

I've tracked down and fixed one possible cause of this failure.
Unfortunately, the test still fails. It looks like the same problem
exists in NioSelectorPool. I'm investigating possible fixes.

Mark




Re: Stabilizing the trunk (9.0.x) build

2015-02-24 Thread Rémy Maucherat
2015-02-24 10:01 GMT+01:00 Mark Thomas ma...@apache.org:

 Progress is being made.

 TestWsWebSocketContainer.testMaxMessageSize04() is fixed. I do want to
 come back to exactly how/if flushing is performed on
 ServletOutputStream.close() but I plan on parking that until the other
 failures are fixed.

I'm having issues with the write timeout tests in
TestWsWebSocketContainer, which made me do some changes since there are
still things I don't understand:
- In WsRemoteEndpointImplServer, onWritePossible appears to be able to be
invoked concurrently (doWrite calls it directly and changes the buffers). I
think it should be synced.
- In Nio2Endpoint, the socket wrapper uses nestedWriteCompletionCount instead
of the inline flag that was used in 8. If the write completes inline, then
isReady should already be set back to true, and writing could continue. So the
change was IMO adding more write notifications, which could hide some
issues. I have tried changing that many times since the refactoring started,
but this is the first time I can do it without obviously breaking the
testsuite (where some of the non-blocking write tests would hang due to
missing write notifications).
- NPE guards in the NIO connector socket processor for concurrent closing
[NIO2 has them, somehow it wasn't needed earlier in NIO, which is also an
odd thing; I actually feel better having to add them].

So this could improve on some possible timing related problems. I'll keep
on investigating though before committing anything.

Rémy


Re: Stabilizing the trunk (9.0.x) build

2015-02-24 Thread Mark Thomas
On 24/02/2015 13:10, Rémy Maucherat wrote:
 I'm having issues with the write timeout tests in
 TestWsWebSocketContainer, which made me do some changes since there are
 still things I don't understand:

These appear to be OK for me at the moment with NIO and NIO2 but the
very nature of timing issues means that doesn't count for much. I am
seeing failures or crashes with APR/native so there is still work to be
done there.

 - In WsRemoteEndpointImplServer, onWritePossible appears to be able to be
 invoked concurrently (doWrite calls it directly and changes the buffers). I
 think it should be synced.

Those calls should be nested. If you are seeing concurrent calls then
there is probably still an issue around write registration.

 - In Nio2Endpoint, the socket wrapper uses nestedWriteCompletionCount instead
 of the inline flag that was used in 8. If the write completes inline, then
 isReady should already be set back to true, and writing could continue. So the
 change was IMO adding more write notifications, which could hide some
 issues. I have tried changing that many times since the refactoring started,
 but this is the first time I can do it without obviously breaking the
 testsuite (where some of the non-blocking write tests would hang due to
 missing write notifications).

This change was to prevent multiple write threads being triggered if
there were multiple levels of nesting with the write completion handler.
It was a fairly rare event but it did happen.

 - NPE guards in the NIO connector socket processor for concurrent closing
 [NIO2 has them, somehow it wasn't needed earlier in NIO, which is also an
 odd thing; I actually feel better having to add them].
 
 So this could improve on some possible timing related problems. I'll keep
 on investigating though before committing anything.

One thing to keep in mind that may simplify some of these issues is that
once WebSocket moves to using the Tomcat I/O layer directly the
requirement for one container thread reading and one container thread
writing concurrently will go away. A number of the concurrency issues we
have observed are triggered by these concurrent threads so switching
back to a single thread should help.

Mark
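
To illustrate the nesting problem discussed in this message: the following is
a minimal Java sketch, not the Nio2Endpoint code, of a per-thread nesting
counter for a completion handler. The class NestedCompletionGuard and its
methods are hypothetical. The sketch assumes a write may complete inline, in
which case the handler is re-entered on the same thread. Only the outermost
invocation then issues a notification, so several levels of nesting do not
each trigger one.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not Tomcat's implementation.
final class NestedCompletionGuard {
    // Per-thread depth of nested completion handler invocations.
    private static final ThreadLocal<AtomicInteger> DEPTH =
            ThreadLocal.withInitial(AtomicInteger::new);

    private final AtomicInteger notifications = new AtomicInteger();

    // Simulates a completion handler. While chunks remain, the next write is
    // assumed to complete inline and re-enter this handler on the same thread.
    void completed(int remainingChunks) {
        AtomicInteger depth = DEPTH.get();
        depth.incrementAndGet();
        try {
            if (remainingChunks > 0) {
                completed(remainingChunks - 1);
            }
        } finally {
            depth.decrementAndGet();
        }
        // Only the outermost invocation on this thread notifies.
        if (depth.get() == 0) {
            notifications.incrementAndGet();
        }
    }

    int notifications() {
        return notifications.get();
    }

    public static void main(String[] args) {
        NestedCompletionGuard guard = new NestedCompletionGuard();
        guard.completed(3); // four handler invocations, three of them nested
        System.out.println("notifications: " + guard.notifications());
    }
}
```

Without the depth check, each of the four invocations in the example would
issue its own notification, which is the kind of duplicate triggering
described above.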





Re: Stabilizing the trunk (9.0.x) build

2015-02-23 Thread Mark Thomas
On 23/02/2015 10:40, Rémy Maucherat wrote:
 2015-02-23 10:16 GMT+01:00 Mark Thomas ma...@apache.org:
 
 Given that it is my changes that have triggered the problems I think I
 have a responsibility to fix them (and intend to do so over) but I'm not
 going to say no if anyone wants to pitch in. Therefore, I'm starting
 this thread so that we can co-ordinate work on fixing the various
 failures being reported.

 I'm going to start with why
 TestWsWebSocketContainer.testMaxMessageSize04() hangs on Windows.

 I'll try to help and get up to speed with the changes.

Thanks. Much appreciated.

I've made progress in that the test now fails rather than hangs. I'm
waiting for the various CI systems to see if the issues I've fixed were
the only causes of the hangs or if there are others still to fix.

In the meantime, I'm going to look at fixing this particular test.

Mark




Re: Stabilizing the trunk (9.0.x) build

2015-02-23 Thread Rémy Maucherat
2015-02-23 10:16 GMT+01:00 Mark Thomas ma...@apache.org:

 Given that it is my changes that have triggered the problems I think I
 have a responsibility to fix them (and intend to do so over) but I'm not
 going to say no if anyone wants to pitch in. Therefore, I'm starting
 this thread so that we can co-ordinate work on fixing the various
 failures being reported.

 I'm going to start with why
 TestWsWebSocketContainer.testMaxMessageSize04() hangs on Windows.

I'll try to help and get up to speed with the changes.

Rémy