RE: cannot run Giraph trunk with Hadoop 2.0.0-alpha

David Garcia Mon, 20 Aug 2012 18:59:19 -0700

You can clear this error by recursively removing the _bsp folder from the 
ZooKeeper file system and then running the job again. You should probably 
remove the folder from HDFS too.
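
For the HDFS half of that cleanup, a minimal sketch follows. It assumes the 
job was submitted with a relative _bsp path, as in the FileNotFoundException 
quoted below, so the directory sits under the submitting user's HDFS home. 
The DRY_RUN switch and the helper names run and cleanup_bsp are illustrative 
only and are not part of Hadoop or Giraph. The sketch does not touch ZooKeeper 
itself, since the znode path to remove is not shown in this thread.

```shell
#!/bin/sh
# Sketch: clear Giraph's leftover _bsp directory from HDFS before re-running.
# DRY_RUN, run and cleanup_bsp are illustrative names (assumptions).
DRY_RUN="${DRY_RUN:-1}"   # default: print the command instead of running it

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# With no argument, the relative path _bsp is used; HDFS resolves it against
# the user's home directory. Pass an explicit path to override.
cleanup_bsp() {
  run hadoop fs -rm -r "${1:-_bsp}"
}

cleanup_bsp
```

Run it once as-is to see the command it would issue, then set DRY_RUN=0 to 
actually delete the directory.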

________________________________________
From: Johnny Zhang [[email protected]]
Sent: Monday, August 20, 2012 6:59 PM
To: [email protected]
Subject: Re: cannot run Giraph trunk with Hadoop 2.0.0-alpha

Sorry for the wide distribution. I checked further: the folder 
'_bsp/_defaultZkManagerDir/job_1344903945125_0032' exists, and it contains one 
subfolder, '_bsp/_defaultZkManagerDir/job_1344903945125_0032/_task', and 
another file, so HDFS file permissions should not be an issue. I am not sure 
why Giraph still complains that 
'_bsp/_defaultZkManagerDir/job_1344903945125_0032/_zkServer does not exist'.

Does ZooKeeper need further configuration? Or is there any other possible 
reason why the _zkServer folder cannot be created?

Thanks,
Johnny

On Mon, Aug 20, 2012 at 11:59 AM, Johnny Zhang 
<[email protected]<mailto:[email protected]>> wrote:
Alessandro:
Thanks for reminding me of that. Now I can run the PageRank example 
successfully, though I still get one ZooKeeper-server-related exception. Here 
is part of the log:

12/08/20 11:56:44 WARN mapreduce.Job: Error reading task output Server returned 
HTTP response code: 400 for URL: 
http://cs-10-20-76-76.cloud.cloudera.com:8080/tasklog?plaintext=true&attemptid=attempt_1344903945125_0032_m_000002_2&filter=stdout
12/08/20 11:56:44 WARN mapreduce.Job: Error reading task output Server returned 
HTTP response code: 400 for URL: 
http://cs-10-20-76-76.cloud.cloudera.com:8080/tasklog?plaintext=true&attemptid=attempt_1344903945125_0032_m_000002_2&filter=stderr
12/08/20 11:56:44 INFO mapreduce.Job: Task Id : 
attempt_1344903945125_0032_m_000001_2, Status : FAILED
Error: java.lang.RuntimeException: java.io.FileNotFoundException: File 
_bsp/_defaultZkManagerDir/job_1344903945125_0032/_zkServer does not exist.
at 
org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:749)
at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:320)
at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:570)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:725)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
Caused by: java.io.FileNotFoundException: File 
_bsp/_defaultZkManagerDir/job_1344903945125_0032/_zkServer does not exist.
at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:365)
at 
org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:708)
... 9 more

12/08/20 11:56:44 WARN mapreduce.Job: Error reading task output Server returned 
HTTP response code: 400 for URL: 
http://cs-10-20-76-76.cloud.cloudera.com:8080/tasklog?plaintext=true&attemptid=attempt_1344903945125_0032_m_000001_2&filter=stdout
12/08/20 11:56:44 WARN mapreduce.Job: Error reading task output Server returned 
HTTP response code: 400 for URL: 
http://cs-10-20-76-76.cloud.cloudera.com:8080/tasklog?plaintext=true&attemptid=attempt_1344903945125_0032_m_000001_2&filter=stderr
12/08/20 11:56:45 INFO mapreduce.Job: Job job_1344903945125_0032 failed with 
state FAILED due to:
12/08/20 11:56:45 INFO mapreduce.Job: Counters: 28
File System Counters
FILE: Number of bytes read=120
FILE: Number of bytes written=49450
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=44
HDFS: Number of bytes written=0
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
Job Counters
Failed map tasks=10
Launched map tasks=13
Other local map tasks=13
Total time spent by all maps in occupied slots (ms)=692328
Total time spent by all reduces in occupied slots (ms)=0
Map-Reduce Framework
Map input records=0
Map output records=0
Input split bytes=44
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=34
CPU time spent (ms)=450
Physical memory (bytes) snapshot=96169984
Virtual memory (bytes) snapshot=1599012864
Total committed heap usage (bytes)=76087296
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0

Thanks,
Johnny

On Mon, Aug 20, 2012 at 11:47 AM, Alessandro Presta 
<[email protected]<mailto:[email protected]>> wrote:
Looks like you compiled for hadoop 0.20.203, which had a different API (that's 
why we have to use Munge). Can you try recompiling with the hadoop_2.0.0 
profile?
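
The jar in the original command below carries the suffix for-hadoop-0.20.203.0, 
which is one way to spot this mismatch before submitting a job. A minimal 
sketch of that check follows, assuming the jar file name reflects the profile 
it was built with. check_jar_profile is a hypothetical helper that inspects 
only the file name, and the suggested mvn command uses Maven's standard -P 
option with the profile name mentioned above.

```shell
#!/bin/sh
# Sketch: flag a Giraph jar whose file name shows it was built for Hadoop 0.20.203.
# check_jar_profile is a hypothetical helper; it looks at the name only and
# does not open the jar.
check_jar_profile() {
  case "$1" in
    *for-hadoop-0.20.203*)
      echo "built for Hadoop 0.20.203; rebuild with: mvn -Phadoop_2.0.0 clean package"
      return 1
      ;;
    *)
      echo "no 0.20.203 marker in jar name"
      return 0
      ;;
  esac
}

# Jar name taken from the original command in this thread.
check_jar_profile "target/giraph-0.2-SNAPSHOT-for-hadoop-0.20.203.0-jar-with-dependencies.jar" || true
```

A clean result only means the name lacks the 0.20.203 marker; it does not 
prove the jar matches the cluster's Hadoop version.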

From: Johnny Zhang <[email protected]<mailto:[email protected]>>
Reply-To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Date: Monday, August 20, 2012 7:31 PM
To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Subject: cannot run Giraph trunk with Hadoop 2.0.0-alpha

Hi, all:
I am trying to run Giraph trunk with Hadoop 2.0.0-alpha.
I get the error below when I run a PageRank example job with 3 workers.

# hadoop jar 
target/giraph-0.2-SNAPSHOT-for-hadoop-0.20.203.0-jar-with-dependencies.jar 
org.apache.giraph.benchmark.PageRankBenchmark -e 1 -s 3 -v -V 50000000 -w 3
12/08/20 11:10:38 WARN mapred.JobConf: The variable mapred.child.ulimit is no 
longer used.
12/08/20 11:10:38 INFO benchmark.PageRankBenchmark: Using class 
org.apache.giraph.benchmark.PageRankBenchmark
12/08/20 11:10:38 WARN conf.Configuration: mapred.job.tracker is deprecated. 
Instead, use mapreduce.jobtracker.address
12/08/20 11:10:38 WARN conf.Configuration: mapred.job.map.memory.mb is 
deprecated. Instead, use mapreduce.map.memory.mb
12/08/20 11:10:38 WARN conf.Configuration: mapred.job.reduce.memory.mb is 
deprecated. Instead, use mapreduce.reduce.memory.mb
12/08/20 11:10:38 WARN conf.Configuration: 
mapred.map.tasks.speculative.execution is deprecated. Instead, use 
mapreduce.map.speculative
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found 
interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at 
org.apache.giraph.bsp.BspOutputFormat.checkOutputSpecs(BspOutputFormat.java:43)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:411)
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:326)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1244)
at org.apache.giraph.graph.GiraphJob.run(GiraphJob.java:714)
at org.apache.giraph.benchmark.PageRankBenchmark.run(PageRankBenchmark.java:150)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at 
org.apache.giraph.benchmark.PageRankBenchmark.main(PageRankBenchmark.java:164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

My $HADOOP_MAPRED_HOME and $JAVA_HOME are set up correctly. Could anyone tell 
me if I need to set up anything else? Thanks a lot.

Johnny
