[
https://issues.apache.org/jira/browse/GIRAPH-601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624979#comment-13624979
]
Eli Reisman commented on GIRAPH-601:
------------------------------------
To clarify point one: YARN adds that "little extra," not you, so it's sort of a
grey area. Just keep in mind that if your cluster offers 10 GB of available
resources, running with -w 8 to leave a gig for the master and a gig for the app
master is not good enough. You need to leave some extra container resources
unused as "overhead" for YARN jobs, because they will each also use up a little
extra.
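
As a rough illustration of that arithmetic, here is a minimal Python sketch. The function name and the 256 MB per-container overhead are assumptions made up for the example; the real overhead depends on your YARN configuration and is not stated in this thread.

```python
def max_workers(cluster_mb, container_mb, overhead_mb, reserved_containers=2):
    """Estimate how many worker containers fit when each container pays overhead.

    reserved_containers covers the Giraph master and the YARN app master.
    overhead_mb is a hypothetical per-container extra, not a measured value.
    """
    per_container_mb = container_mb + overhead_mb
    total_containers = cluster_mb // per_container_mb
    return max(0, total_containers - reserved_containers)


# 10 GB cluster, 1 GB containers, no overhead: the naive estimate.
print(max_workers(10240, 1024, 0))    # 8
# Same cluster with an assumed 256 MB of overhead per container.
print(max_workers(10240, 1024, 256))  # 6
```

With any nonzero overhead the naive -w 8 no longer fits, which is the point above: pick a worker count below the simple "total minus two" figure.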
To clarify about yarn-site: there is more than one resource setting in
yarn-site.xml. Make sure they are all set the way you need, or bad things like
this happen with little error reporting.
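
One way to check those settings at a glance is a small Python sketch like the one below. The three property names are standard YARN memory settings, but whether they are the ones that matter for this issue is an assumption, and the sample XML values are invented for the example.

```python
import xml.etree.ElementTree as ET

# Memory-related properties commonly set in yarn-site.xml (assumed relevant here).
MEMORY_KEYS = (
    "yarn.nodemanager.resource.memory-mb",
    "yarn.scheduler.minimum-allocation-mb",
    "yarn.scheduler.maximum-allocation-mb",
)


def memory_settings(xml_text):
    """Map each key in MEMORY_KEYS to its configured value, or None if unset."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.findtext("value", default="").strip()
        found[name] = value
    return {key: found.get(key) for key in MEMORY_KEYS}


# Invented sample; in practice read the text of your own yarn-site.xml.
SAMPLE = """<configuration>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>10240</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
  </property>
</configuration>"""

for key, value in memory_settings(SAMPLE).items():
    print(key, "=", value if value is not None else "(not set, default applies)")
```

Any key reported as not set falls back to the YARN default, which may not match what you intended for the other two.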
Hope it's going well, good luck with this.
> Exception when running pagerank benchmark with 6 or more workers on a
> pseudodistributed setup: SendVertexRequest cannot be cast to MasterRequest
> ------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: GIRAPH-601
> URL: https://issues.apache.org/jira/browse/GIRAPH-601
> Project: Giraph
> Issue Type: Bug
> Reporter: Eugene Koontz
> Attachments: instrumentation.patch, print_addresses.patch
>
>
> Building Giraph with:
> {code}
> mvn -DskipTests -Phadoop_2.0.3 clean compile
> {code}
> Running pagerank like this:
> {code}
> $HADOOP_RUNTIME/bin/hadoop jar $JAR \
> org.apache.giraph.benchmark.PageRankBenchmark \
> -e 10 -s 10 -v -V 10 -w 6
> {code}
> I see this in
> /tmp/userlogs/application_1364578380737_0003/container_1364578380737_0003_01_000002/
> :
> {code}
> 2013-03-29 10:58:06,371 DEBUG [org.apache.giraph.master.MasterThread]
> org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Got finished
> worker list = [Eugenes-MacBook-Pro.local_1, Eugenes-MacBook-Pro.local_3],
> size = 2, worker list = [Worker(hostname=Eugenes-MacBook-Pro.local,
> MRtaskID=2, port=30002), Worker(hostname=Eugenes-MacBook-Pro.local,
> MRtaskID=1, port=30001), Worker(hostname=Eugenes-MacBook-Pro.local,
> MRtaskID=4, port=30004), Worker(hostname=Eugenes-MacBook-Pro.local,
> MRtaskID=3, port=30003), Worker(hostname=Eugenes-MacBook-Pro.local,
> MRtaskID=5, port=30005), Worker(hostname=Eugenes-MacBook-Pro.local,
> MRtaskID=0, port=30010)], size = 6 from
> /_hadoopBsp/job_1364578380737_0003/_vertexInputSplitDoneDir
> 2013-03-29 10:58:06,373 WARN [netty-server-exec-3]
> org.apache.giraph.comm.netty.handler.RequestServerHandler: exceptionCaught:
> Channel failed with remote address /172.16.175.1:56236
> java.lang.ClassCastException:
> org.apache.giraph.comm.requests.SendVertexRequest cannot be cast to
> org.apache.giraph.comm.requests.MasterRequest
> at
> org.apache.giraph.comm.netty.handler.MasterRequestServerHandler.processRequest(MasterRequestServerHandler.java:27)
> at
> org.apache.giraph.comm.netty.handler.RequestServerHandler.messageReceived(RequestServerHandler.java:106)
> at
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
> at
> org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:71)
> at
> org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:45)
> at
> org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:69)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> at java.lang.Thread.run(Thread.java:680)
> {code}