Hi Lukas

Thanks for the quick response. It seems I have found the problem.

On workers 2, 6, and 14, the logs show these errors:

raph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,606 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,606 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,606 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,606 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,606 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,606 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,606 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,606 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,606 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,606 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,606 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,606 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,606 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,607 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,607 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,607 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,607 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,607 ERROR [netty-client-worker-1]
org.apache.giraph.comm.netty.NettyClient: Request failed
java.nio.channels.ClosedChannelException
2015-05-22 05:20:57,607 WARN [netty-client-worker-1]
org.apache.giraph.comm.netty.handler.ResponseClientHandler:
exceptionCaught: Channel failed with remote address
bespin03c.umiacs.umd.edu/192.168.74.113:30005
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:192)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
        at io.netty.buffer.UnpooledUnsafeDirectByteBuf.setBytes(UnpooledUnsafeDirectByteBuf.java:446)
        at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:871)
        at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:208)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:118)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:485)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:452)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:346)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:745)


I checked the log on bespin03c.umiacs.umd.edu/192.168.74.113:30005, and it
shows:


2015-05-22 05:20:50,028 ERROR [main]
org.apache.giraph.graph.GraphMapper: Caught an unrecoverable exception
waitFor: ExecutionException occurred while waiting for
org.apache.giraph.utils.ProgressableUtils$FutureWaitable@7328027c
java.lang.IllegalStateException: waitFor: ExecutionException occurred
while waiting for
org.apache.giraph.utils.ProgressableUtils$FutureWaitable@7328027c
        at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
        at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
        at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
        at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
        at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
        at org.apache.giraph.graph.GraphTaskManager.processGraphPartitions(GraphTaskManager.java:756)
        at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:335)
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.util.concurrent.ExecutionException:
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:188)
        at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.getResult(ProgressableUtils.java:327)
        at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:187)
        ... 14 more
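
For what it's worth, the root cause in a task log like the one above can
be pulled out with a few lines of Python. This is only a rough sketch: the
function name root_causes and the sample text are illustrative (not part
of Giraph or Hadoop), and it assumes the log uses the usual Java
"Caused by:" stack trace layout, possibly wrapped onto a second line as
in this mail.

```python
import re

# Matches "Caused by: <exception>[: message]" lines of a Java stack trace.
CAUSED_BY = re.compile(r"^Caused by:\s*(.*)$")


def root_causes(log_text):
    """Return the text of each 'Caused by:' entry found in log_text."""
    lines = log_text.splitlines()
    causes = []
    for i, line in enumerate(lines):
        match = CAUSED_BY.match(line.strip())
        if not match:
            continue
        cause = match.group(1).strip()
        # The exception text may have been wrapped onto the next line;
        # join it unless that line is a stack frame or another entry.
        if i + 1 < len(lines):
            nxt = lines[i + 1].strip()
            is_frame = nxt == "at" or nxt.startswith(("at ", "..."))
            if nxt and not is_frame and not nxt.startswith("Caused by:"):
                cause = (cause + " " + nxt).strip()
        causes.append(cause)
    return causes


# Shortened sample taken from the trace in this mail.
sample = """\
java.lang.IllegalStateException: waitFor: ExecutionException occurred
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
Caused by: java.util.concurrent.ExecutionException:
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
"""

for cause in root_causes(sample):
    print(cause)
```

Run against the sample, this reports the ExecutionException wrapping the
OutOfMemoryError, which is the line that matters here.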



So can the problem only be solved by adding memory to the cluster if I
keep using the default hash partitioning?


Thanks


Hai




Hai Lan, PhD student
[email protected] <[email protected]>
Department of Geographical Science
University of Maryland, College Park
1104 LeFrak Hall
College Park, MD 20742, USA

On Fri, May 22, 2015 at 6:32 AM, Lukas Nalezenec <
[email protected]> wrote:

>  On 22.5.2015 12:25, Hai Lan wrote:
>
> Missing chosen workers [Worker(hostname=bespin05.umiacs.umd.edu, MRtaskID=2, 
> port=30002), Worker(hostname=bespin04d.umiacs.umd.edu, MRtaskID=6, 
> port=30006), Worker(hostname=bespin03a.umiacs.umd.edu, MRtaskID=14, 
> port=30014)] on superstep 0
>
>
> Hi,
> See in logs what happened on the missing workers.
> Lukas
>
