There was some sort of issue with the TCP connections between this server and its client (a different server, written in C), but I was never able to pin down exactly what was causing it. We did make some changes that we guessed might help, and since we haven't had problems since, hopefully we were right. (One caveat: we haven't been running this server in production recently, so it's also possible that we haven't actually fixed the problem.)

In any case, the way we "solved" this was to make the Linux box running this server much more aggressive about TCP keep-alive, so that it probes idle connections sooner and terminates the ones that are no longer alive. We did this by adding the following settings to the script that launches the server:

# Adjust TCP keep-alive settings:

# first keep-alive probe packet is sent after 5 minutes of inactivity
echo 300 > /proc/sys/net/ipv4/tcp_keepalive_time

# wait 1 minute between keep-alive probes when a probe gets no response
echo 60 > /proc/sys/net/ipv4/tcp_keepalive_intvl

# perform up to 10 keep-alive probes before terminating the connection
echo 10 > /proc/sys/net/ipv4/tcp_keepalive_probes

Note, by the way, that you need root privileges to do this.
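
For a rough sense of what these numbers mean together, here is a small sketch (the function name keepalive_detect_seconds is made up for illustration, and the formula is only an approximation). It estimates how long a dead, idle connection can linger before the kernel gives up on it: the idle time before the first probe, plus one interval per unanswered probe. Keep in mind that these kernel settings only apply to sockets that have SO_KEEPALIVE enabled.

```shell
#!/bin/sh
# Approximate time (in seconds) before an idle, dead connection is dropped.
# Hypothetical helper, not part of any tool:
#   $1 = tcp_keepalive_time, $2 = tcp_keepalive_intvl, $3 = tcp_keepalive_probes
keepalive_detect_seconds() {
    echo $(( $1 + $2 * $3 ))
}

# With the values from the launch script: 300 + 60 * 10
keepalive_detect_seconds 300 60 10   # prints 900

# Same estimate from the live kernel settings, if they are readable (Linux only)
d=/proc/sys/net/ipv4
if [ -r "$d/tcp_keepalive_time" ]; then
    keepalive_detect_seconds "$(cat "$d/tcp_keepalive_time")" \
                             "$(cat "$d/tcp_keepalive_intvl")" \
                             "$(cat "$d/tcp_keepalive_probes")"
fi
```

So with the settings above, a peer that silently disappears should be noticed after roughly 15 minutes of inactivity.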

HTH,

DR

On 08/11/2010 02:49 PM, Marcel Casado wrote:


Hi David,

Did you figure out what the problem was? We are running into MINA broken
pipes.

Thanks,

-Marcel


darose wrote:

Having a problem with our MINA server.  The server hung last night.  I
was able to telnet into it, but it never responded with a welcome prompt
and would not accept any of my commands.  Linux showed that it had 82 open
sockets to the client machines, all of them in SYN_RECV state.

Worse, the log was 4.7GB in size, full of broken pipe errors.  (3,366,455
occurrences of it!)  Stack trace is as follows:

java.io.IOException: Broken pipe
         at sun.nio.ch.FileDispatcher.write0(Native Method)
         at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
         at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:122)
         at sun.nio.ch.IOUtil.write(IOUtil.java:93)
         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:352)
         at org.apache.mina.transport.socket.nio.NioProcessor.write(NioProcessor.java:185)
         at org.apache.mina.transport.socket.nio.NioProcessor.write(NioProcessor.java:41)
         at org.apache.mina.core.polling.AbstractPollingIoProcessor.writeBuffer(AbstractPollingIoProcessor.java:776)
         at org.apache.mina.core.polling.AbstractPollingIoProcessor.flushNow(AbstractPollingIoProcessor.java:713)
         at org.apache.mina.core.polling.AbstractPollingIoProcessor.flush(AbstractPollingIoProcessor.java:648)
         at org.apache.mina.core.polling.AbstractPollingIoProcessor.access$500(AbstractPollingIoProcessor.java:56)
         at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:895)
         at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
         at java.lang.Thread.run(Thread.java:636)


Anyone have any ideas?

We're running MINA 2.0M6 on OpenJDK 1.6 on CentOS 5.4.

Thanks,

DR




