Hello everyone.

This is my first message to an Apache mailing list, so please excuse (and correct) possibly incorrect usage or netiquette issues :)

I am a newbie to both Hadoop and Giraph (but not to Linux). After overcoming _several_ configuration-related hurdles, I have successfully built both (versions Hadoop 2.0.5-alpha and Giraph 1.1.0-SNAPSHOT, using the patch for issue #688), and seem to have a properly working HDFS/Hadoop installation in pseudo-distributed mode. I can run Hadoop code just fine.

However, Giraph fails during the computation (seemingly just before returning or writing the result – the time elapsed before the crash differs, depending on which example I try to run). See below for the error. I don't know whether it is a bug, or me doing something wrong.

I'm using the YARN-enabled version of Giraph (and, thus, an external ZooKeeper service), assuming that Giraph will completely move to YARN _eventually_. (Is that correct?)

Also, two other (somewhat unrelated) questions:
(1) When is the file conf/giraph-site.xml actually parsed/used? Is it read by Hadoop? Should it be copied somewhere? I tried setting the ZooKeeper host:port in that file (giraph.zkList), but it was ignored and I finally had to add the property in the command line shown below... Any relevant documentation? (In general, documentation for certain Hadoop features, such as the configuration files, seems to be lacking...)
(2) What's the correct process to submit a patch with a really simple typo correction in a string? (perhaps I should just contact the file author – it's nothing important)

I'll append the shell commands I used (after $) and the output at the end of this message.

Thank you in advance,
Nicholas


$ function giraphrunner(){ hadoop jar /tmp/software/giraph/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.0.5-alpha-jar-with-dependencies.jar org.apache.giraph.GiraphRunner -Dgiraph.zkList=localhost:2181 "$@"; }

$ time giraphrunner org.apache.giraph.examples.SimplePageRankComputation -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip /dir/tiny_graph.txt -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op /dir/simplepagerank -w 1
13/07/07 15:06:15 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your InputFormat does not require one.
13/07/07 15:06:15 INFO yarn.GiraphYarnClient: Final output path is: hdfs://localhost:9000/dir/simplepagerank
13/07/07 15:06:15 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
13/07/07 15:06:15 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
13/07/07 15:06:15 INFO yarn.GiraphYarnClient: Defaulting per-task heap size to 1024MB.
13/07/07 15:06:15 INFO yarn.GiraphYarnClient: Obtained new Application ID: application_1372875746593_0018
13/07/07 15:06:15 WARN conf.Configuration: mapred.job.id is deprecated. Instead, use mapreduce.job.id
13/07/07 15:06:15 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
13/07/07 15:06:15 INFO yarn.YarnUtils: Registered file in LocalResources: giraph-conf.xml
13/07/07 15:06:16 INFO yarn.GiraphYarnClient: ApplicationSumbissionContext for GiraphApplicationMaster launch container is populated.
13/07/07 15:06:16 INFO client.YarnClientImpl: Submitted application application_1372875746593_0018 to ResourceManager at localhost/127.0.0.1:8032
13/07/07 15:06:16 INFO yarn.GiraphYarnClient: GiraphApplicationMaster container request was submitted to ResourceManager for job: Giraph: org.apache.giraph.examples.SimplePageRankComputation
13/07/07 15:06:17 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimplePageRankComputation, Elapsed: 0.85 secs
13/07/07 15:06:17 INFO yarn.GiraphYarnClient: appattempt_1372875746593_0018_000001, State: ACCEPTED, Containers used: 1
13/07/07 15:06:18 ERROR yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimplePageRankComputation reports FAILED state, diagnostics show: Application application_1372875746593_0018 failed 1 times due to AM Container for appattempt_1372875746593_0018_000001 exited with exitCode: 1 due to:
.Failing this attempt.. Failing the application.
13/07/07 15:06:18 INFO yarn.GiraphYarnClient: Cleaning up HDFS distributed cache directory for Giraph job.
13/07/07 15:06:18 INFO yarn.GiraphYarnClient: Completed Giraph: org.apache.giraph.examples.SimplePageRankComputation: FAILED, total running time: 0 minutes, 1 seconds.

real    0m8.392s
user    0m8.825s
sys     0m1.492s

$ cat software/hadoop-2.0.5-alpha/logs/userlogs/application_1372875746593_0018/container_1372875746593_0018_01_000001/gam-stderr.log
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/giraph/yarn/GiraphApplicationMaster
Caused by: java.lang.ClassNotFoundException: org.apache.giraph.yarn.GiraphApplicationMaster
        at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
Could not find the main class: org.apache.giraph.yarn.GiraphApplicationMaster. Program will exit.
