Possible Causes for "Connection reset by peer" when using NIO

Hubert, Eric Wed, 25 Jun 2008 09:12:31 -0700

Hi devs!

First of all, I'd like to apologize for posting a "user problem" to two
dev lists. I only did this because I don't have much background knowledge of
the NIO implementation, and I think a solid understanding of NIO is necessary
to help tackle our problem.

We are using the WSO2 ESB which is based on Apache Synapse, Apache Axis2 and
the HTTP Core NIO module. As the stacktrace only contains http-nio details, I
cc'ed the http components dev list. Hopefully someone can help out.

When sending about 3000 Hessian requests per hour from clients (Tomcat) through
the ESB (Synapse 1.2 running on JDK 1.5.15, Linux 2.6.23.1-amd64-75) to a BEA
WebLogic 8.1 server, we see about 1 to 10 exceptions of type
"java.io.IOException: Connection reset by peer" in the ESB log.

If I understand it correctly, the ESB then fails over to the next service node,
as we are using a load-balancing group. So the client is not affected, but the
endpoint with the failure is marked as inactive.

The problem is that I don't understand the cause of this exception. It occurs
during the read on a SocketChannel, so I think the server might be closing the
connection while the ESB is reading. Or is some kind of connection pool used
internally, in which a connection can end up in an abnormal state?
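
To make the first hypothesis concrete, here is a minimal stand-alone sketch of
how a peer that aborts a connection surfaces as an IOException in a
SocketChannel read. It is not taken from the ESB, Synapse or HttpCore code; the
class name ResetDemo and the method provokeReset are made up for illustration,
and the "server" is just a local socket that closes with a linger time of 0,
which is assumed here to make the TCP stack send an RST instead of a normal FIN.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

// Hypothetical stand-alone demo; not part of Synapse or HttpCore.
class ResetDemo {

    /** Returns the IOException raised by the read, or null if the read ended without one. */
    static IOException provokeReset() throws IOException, InterruptedException {
        final ServerSocket server = new ServerSocket(0); // any free local port
        Thread peer = new Thread(new Runnable() {
            public void run() {
                try {
                    Socket s = server.accept();
                    // Linger time 0: close() aborts the connection,
                    // which the TCP stack typically signals with an RST.
                    s.setSoLinger(true, 0);
                    s.close();
                } catch (IOException ignored) {
                    // not expected on loopback; the demo does not handle it
                }
            }
        });
        peer.start();

        SocketChannel ch = SocketChannel.open(
                new InetSocketAddress("127.0.0.1", server.getLocalPort()));
        try {
            peer.join(); // the peer has aborted the connection by now
            ch.read(ByteBuffer.allocate(1024)); // blocking read on the aborted connection
            return null;
        } catch (IOException e) {
            return e;
        } finally {
            ch.close();
            server.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(provokeReset());
    }
}
```

The exact exception text depends on the JDK and the operating system, so the
sketch only returns the exception instead of matching its message. It shows
only the "peer aborts while we read" case; it says nothing about whether a
pooled or kept-alive connection is involved in our setup.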

We have seen such exceptions before when we were using HTTP 1.1 in combination
with the BEA WebLogic server, very likely an issue with HTTP keep-alive
(persistent connections). So for any connection to a BEA service we use the
Synapse property mediator to switch the connection ESB <-> BEA to HTTP 1.0:
<syn:property name="FORCE_HTTP_1.0" value="true" scope="axis2" />

Since then we had not seen this exception. But now, after switching to another
environment, we see it again, though only for Hessian services.
I have no clue what else could cause this exception. How can we detect the
cause, and how can we narrow down the candidates if there are several? I don't
expect network outages to be the reason, as other (SOAP-based) services are
working pretty well.

The exact exception we are getting is:

java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
at sun.nio.ch.IOUtil.read(IOUtil.java:206)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:207)
at org.apache.http.impl.nio.reactor.SessionInputBufferImpl.fill(SessionInputBufferImpl.java:85)
at org.apache.http.impl.nio.codecs.AbstractMessageParser.fillBuffer(AbstractMessageParser.java:97)
at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:113)
at org.apache.http.impl.nio.DefaultClientIOEventDispatch.inputReady(DefaultClientIOEventDispatch.java:99)
at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:98)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:195)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:180)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:142)
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:70)
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:318)

This exception occurs consistently a few times per hour on every possible
combination of client node, ESB node and service endpoint node.

Any pointer or idea is greatly appreciated. Thanks a lot in advance!

Regards,
Eric
