The symptoms I get do seem to match what you describe. There are still two
problems with that, though, which I'd like to understand better.
1) Why don't I see this with the non-NIO transport? For example, I can run
the Synapse server samples either in the Synapse sample server, which uses
the NIO transport, or in a separate axis2-1.1.1 distro with the non-NIO
transport. When using JMeter against axis2-1.1.1 it works fine and I can
send tens of thousands of requests without any errors. What's different
here? The underlying TCP stack and config are the same, aren't they?
2) Synapse often hangs after the IO error and needs to be restarted. Is
there any way we can make it recover from this without requiring a restart?
By handling the exception differently or something?
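
One possible shape for "handling the exception differently" is to retry the
failing call instead of letting the error propagate. The sketch below is only
an illustration under assumptions: PipeRetry, IoAction and retry are
hypothetical names, not part of Synapse or the nhttp transport, and it assumes
Java 8 or later. It retries an I/O action such as Pipe.open() a few times,
doubling the delay between attempts.

```java
import java.io.IOException;
import java.nio.channels.Pipe;

// Hypothetical helper for illustration; not part of Synapse or the nhttp transport.
class PipeRetry {

    /** An I/O action that may fail, for example Pipe::open. */
    interface IoAction<T> {
        T run() throws IOException;
    }

    /**
     * Runs the action up to maxAttempts times. After each failed attempt
     * (except the last) it sleeps, doubling the delay every time. If every
     * attempt fails, the last IOException is rethrown.
     */
    static <T> T retry(IoAction<T> action, int maxAttempts, long initialDelayMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.run();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2;
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Try to open a pipe up to 3 times, starting with a 100 ms delay.
        Pipe pipe = retry(Pipe::open, 3, 100);
        try {
            System.out.println("pipe opened: " + pipe.source().isOpen());
        } finally {
            pipe.source().close();
            pipe.sink().close();
        }
    }
}
```

Sleeping inside an I/O reactor worker would stall every connection handled by
that worker, so a retry like this would probably belong off the reactor
thread. It also only rides out a temporary shortage of ports; it does not
remove the cause.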
...ant
On 3/24/07, Asankha C. Perera < [EMAIL PROTECTED]> wrote:
Ant
This is the same error seen by Indika on Windows, and I think my analysis
is correct. If you run the test for the first time, or a few minutes after
the previous run, you should be able to get to around 1000 iterations.
Once you start to hit this issue, even 200 iterations would give you the
error. At that point, netstat -na should show that most of the TCP ports
are in the TIME_WAIT state. It usually takes at least one minute for the OS
to clear a port. The tuning parameters I specified for Linux tell the OS to
use the full port range for applications and set the TCP FIN timeout to
30 secs, so that ports are cleared as quickly as possible. Without *any*
OS tuning, and on a Windows XP system, you will definitely encounter this
issue.
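
To put a number on this, the netstat -na output can be tallied by connection
state. The following is a rough sketch, not a tested tool: TcpStateCounter and
countStates are hypothetical names, and it assumes that TCP rows start with a
protocol column beginning with "tcp" (in any case) and end with the state
column, as in typical Windows and Linux netstat output.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; reads "netstat -na" output from stdin.
class TcpStateCounter {

    /** Counts TCP connection states found in the given netstat output lines. */
    static Map<String, Integer> countStates(List<String> netstatLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : netstatLines) {
            String[] cols = line.trim().split("\\s+");
            // Assumption: a TCP row has at least protocol, local address,
            // remote address and state, with the state in the last column.
            // Header rows and UDP rows do not start with "tcp" and are skipped.
            if (cols.length < 4 || !cols[0].regionMatches(true, 0, "tcp", 0, 3)) {
                continue;
            }
            counts.merge(cols[cols.length - 1], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        // Print one "STATE count" line per state, sorted by state name.
        countStates(lines).forEach((state, n) -> System.out.println(state + " " + n));
    }
}
```

Piping the output into the program (for example,
netstat -na | java TcpStateCounter) prints one line per state. A large
TIME_WAIT count while the test is failing would be consistent with the ports
being exhausted.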
asankha
ant elder wrote:
I've tried again with the latest Synapse and HTTP components code and
several JVMs. The results feel slightly different than before, but the end
result is always the root exception included below. Sometimes it doesn't
occur until around 1000 requests, but sometimes it happens after only a
few requests.
...ant
java.io.IOException: Unable to establish loopback connection
    at sun.nio.ch.PipeImpl$Initializer.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.nio.ch.PipeImpl.<init>(Unknown Source)
    at sun.nio.ch.SelectorProviderImpl.openPipe(Unknown Source)
    at java.nio.channels.Pipe.open(Unknown Source)
    at org.apache.axis2.transport.nhttp.ServerHandler.requestReceived(ServerHandler.java:108)
    at org.apache.axis2.transport.nhttp.LoggingNHttpServiceHandler.requestReceived(LoggingNHttpServiceHandler.java:83)
    at org.apache.http.impl.nio.DefaultNHttpServerConnection.consumeInput(DefaultNHttpServerConnection.java:96)
    at org.apache.axis2.transport.nhttp.PlainServerIOEventDispatch.inputReady(PlainServerIOEventDispatch.java:67)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:68)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:160)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:145)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:127)
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:153)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.net.BindException: Address already in use: connect
    at sun.nio.ch.Net.connect(Native Method)
    at sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
    at java.nio.channels.SocketChannel.open(Unknown Source)
On 3/23/07, Asankha C. Perera < [EMAIL PROTECTED]> wrote:
>
> Ant
>
> I am quite sure that the problem seen by Indika now is related to the
> ports being exhausted. See the following articles, especially the
> "MaxUserPort" and "TcpTimedWaitDelay" parameters that could be tweaked to
> be consistent with what I use before running a load test on Linux. I will
> ask Indika to check these on Monday, but you may try this in the meantime
> if you get a chance.
>
> http://www.microsoft.com/technet/network/deploy/depovg/tcpip2k.mspx
> http://www.microsoft.com/technet/community/columns/cableguy/cg1205.mspx
> http://www.psc.edu/networking/projects/tcptune/OStune/winxp/winxp_stepbystep.html
>
>
> asankha
>
> Asankha C. Perera wrote:
>
> Hi Ant
>
> I fixed this for Linux and JDK 1.5. I am confident of this fix, as I was
> able to first recreate the issue consistently and then see the fix in
> action using 5 concurrent users sending a total of 5000 messages, multiple
> times. However, Indika is still seeing a 'similar' issue on Windows using
> JDK 1.4. We will try to see if it's related to JDK 1.4 or Windows. If you
> get the latest nhttp code and build the nhttp JAR, you can verify this fix
> and let me know.
>
> I am listing some of the Linux commands that came in handy for the
> resolution, in case someone wants to check this.
>
> lsof -p 7426 => lists the open files for the pid given after the -p
> option
>
> ls -l /proc/9976/fd | wc -l => for each process the /proc filesystem
> lists the files used and thus you could count the open files with this
> command
>
> asankha
>
> Asankha C. Perera wrote:
>
> Ant / Oleg
>
> I can recreate this issue on both Windows and Linux and think it's caused
> by my code related to the use of Pipes. I am actively looking into this
> right now and will get back to you on what I find.
>
> asankha
>
> ant elder wrote:
>
> I've tried on several JDKs now and _always_ get similar intermittent I/O
> related errors. I can use JMeter directly against Axis2-1.1.1 without
> any problems at all, so this does look like some issue with the NIO
> transport. It would be really good to hear from other Windows users to see
> whether this is just my specific environment or a more general problem.
>
> To recreate:
>
> 1) build Synapse server sample by running 'ant' in the
> samples\axis2Server\src\SimpleStockQuoteService directory
> 2) start the sample service by running
> samples\axis2Server\axis2server.bat
> 3) get the Synapse config (either 8 or 501) from
> http://people.apache.org/~antelder/temp/, put it in
> repository\conf\sample and start Synapse: bin\synapse.bat -sample=8
> 4) get the JMeter config test1.jmx from
> http://people.apache.org/~antelder/temp/, start JMeter, then File -> Open
> and point to the test1.jmx file
> 5) JMeter Run -> Start, and after not too long I/O errors should appear in
> the Synapse console
>
> ...ant
>
> ---------- Forwarded message ----------
> From: Asankha C. Perera <[EMAIL PROTECTED]>
> Date: Mar 22, 2007 4:58 PM
> Subject: Re: [jira] Resolved: (HTTPCORE-60) Transport appears to be
> hanging because an unchecked exception caused the I/O dispatch thread to
> terminate
> To: HttpComponents Project < [email protected]>
>
> Oleg/Ant
>
> I am guessing this is something to do with Windows or the JDK you use.
> But I am unable to test this week, so I will do my best to try this
> sometime next week. As I said, on Linux I have run the system through
> thousands of messages and multiple concurrent threads and have fixed all
> the issues I came across.
>
> So Oleg, I do not see this as a blocker for the HttpCore release, but I
> will use your latest snapshots in Synapse to check whether it occurs
> again.
>
> thanks
> asankha
>
> Oleg Kalnichevski (JIRA) wrote:
>
> [ https://issues.apache.org/jira/browse/HTTPCORE-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Oleg Kalnichevski resolved HTTPCORE-60.
> ---------------------------------------
>
> Resolution: Fixed
>
> Anthony
> It turned out ClosedChannelException is a checked I/O exception, so it
> cannot kill the I/O dispatch thread. So, apparently, I was wrong in my
> initial assertion about the cause of the Synapse I/O transport lockup. I
> tweaked the HttpCore code a little and changed IOSessionImpl to catch all
> ClosedChannelExceptions thrown by the underlying byte channel, just in
> case.
>
> Please review the changes and let me know if it is okay to proceed with
> the release.
>
> Oleg
>
> Transport appears to be hanging because an unchecked exception caused the
> I/O dispatch thread to terminate
> -------------------------------------------------------------------------
>
> Key: HTTPCORE-60
> URL: https://issues.apache.org/jira/browse/HTTPCORE-60
> Project: HttpComponents Core
> Issue Type: Bug
> Affects Versions: 4.0-alpha4
> Reporter: ant elder
> Assigned To: Oleg Kalnichevski
> Fix For: 4.0-alpha4
>
> See discussion on synapse-dev mailing list:
> http://www.nabble.com/Intermittent-IO-Errors-using-Synapse-tf3439957.html
>
>
>
> The transport appears to be hanging because an unchecked exception
> caused the I/O dispatch thread to terminate. I believe there are several
> different types of problems (at least two) that we are seeing here.
>
> [I/O reactor worker thread 5] ERROR ServerHandler - I/O Error : null
> java.nio.channels.ClosedChannelException
>     at sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:112)
>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:139)
>
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> [EMAIL PROTECTED] For additional
> commands, e-mail: [EMAIL PROTECTED]