Hi
We have a Giraph program that works fine when the graph is small. However, it
fails with “Connection reset by peer” when the graph is big.
We added a log statement right before the sendMessageToAllEdges call to
capture the message size and the number of messages being sent. We noticed that
the exception is raised whenever a vertex attempts to send large messages to
many neighbors. For example, the log below shows that the vertex with 64 edges
is fine, but the vertex with 3853 edges triggers the exception. We cannot see
any other error in the log. It seems to be related to Netty communication.
Maybe the buffer size? Any advice is highly appreciated.
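To make the suspected fan-out concrete, here is a rough sketch of the arithmetic only, not our actual vertex code. The class name MessageVolume, the helper totalBytes, and the 100 KB per-message size are all hypothetical; only the two edge counts (64 and 3853) come from our log.

```java
public class MessageVolume {

    // Bytes one vertex hands to the network layer in a single superstep if it
    // sends the same message to every neighbor (one copy per edge).
    static long totalBytes(int edgeCount, long bytesPerMessage) {
        return (long) edgeCount * bytesPerMessage;
    }

    public static void main(String[] args) {
        // ASSUMPTION: 100 KB per message is a made-up figure for illustration;
        // substitute the message size captured by the log statement.
        long assumedBytesPerMessage = 100L * 1024;

        // Edge counts taken from the two vertices in the log.
        int[] edgeCounts = {64, 3853};

        for (int edges : edgeCounts) {
            double megabytes =
                totalBytes(edges, assumedBytesPerMessage) / (1024.0 * 1024.0);
            System.out.printf("%d edges -> about %.1f MB in one superstep%n",
                edges, megabytes);
        }
    }
}
```

The point is only that the outgoing volume grows linearly with the edge count, so the 3853-edge vertex produces roughly 60 times the traffic of the 64-edge vertex for the same message size.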
2014-04-10 09:55:36,070 INFO [compute-0] com.neimanmarcus.api.matching.giraph.cc.CCVertex3: !!!*** omx136986717936641d8a715150240372f9ef8da1a306 has 64 neighbors
2014-04-10 09:55:36,129 INFO [compute-0] com.neimanmarcus.api.matching.giraph.cc.CCVertex3: !!!*** omx135795296165908348809827498ffc192bedd9f136 has 3853 neighbors
2014-04-10 09:55:58,161 INFO [netty-server-exec-0] org.apache.giraph.comm.netty.handler.RequestDecoder: decode: Server window metrics MBytes/sec sent = 0.0004, MBytes/sec received = 12.7831, MBytesSent = 0.0134, MBytesReceived = 383.7369, ave sent req MBytes = 0, ave received req MBytes = 0.0481, secs waited = 30.018
2014-04-10 09:56:08,806 WARN [netty-server-exec-5] org.apache.giraph.comm.netty.handler.RequestServerHandler: exceptionCaught: Channel failed with remote address /10.241.17.33:43398
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:69)
    at sun.nio.ch.IOUtil.write(IOUtil.java:26)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
    at org.jboss.netty.channel.socket.nio.SocketSendBufferPool$UnpooledSendBuffer.transferTo(SocketSendBufferPool.java:198)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:468)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromTaskLoop(AbstractNioWorker.java:423)
    at org.jboss.netty.channel.socket.nio.AbstractNioChannel$WriteTask.run(AbstractNioChannel.java:364)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.processWriteTaskQueue(AbstractNioWorker.java:341)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:237)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:38)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
2014-04-10 09:56:09,040 WARN [netty-server-exec-6] org.apache.giraph.comm.netty.handler.RequestServerHandler: exceptionCaught: Channel failed with remote address /10.241.17.33:43398
java.nio.channels.ClosedChannelException
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.cleanUpWriteBuffer(AbstractNioWorker.java:673)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromUserCode(AbstractNioWorker.java:400)
    at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:120)