Thank you very much for your response.

I had built Giraph with the following command:

mvn -e -Phadoop_yarn -Dhadoop.version=2.0.5-alpha -DskipTests clean install

The patch for issue 688 had already been applied manually, since it wasn't part of the Git repository yet. A few days later I fetched the latest trunk (which now includes the commit with the patch) and repeated the process, with the same result (I even deleted the contents of M2_REPO in between).

The order of the -P and -D flags shouldn't matter, right? (I remember reading https://issues.apache.org/jira/browse/GIRAPH-629, which places the -P flag first.)

Perhaps the problem lies in the build profiles omitting certain classes? Unfortunately, I don't know enough about Maven to examine this issue myself. It could still be a user error. Should I submit a bug report?
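
One way to test this hypothesis without knowing Maven is to look inside the built jar, since a jar is just a zip archive. Below is a minimal sketch in Python; the function name jar_contains_class is hypothetical, and the class name is the one reported missing in the gam-stderr.log output quoted further down. It only checks for the presence of the .class entry and says nothing about why the profile did or did not include it.

```python
import sys
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar (a zip archive) has an entry for class_name.

    class_name is a fully qualified Java class name, for example
    "org.apache.giraph.yarn.GiraphApplicationMaster".
    """
    # A class a.b.C is stored in the jar as the entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


if __name__ == "__main__" and len(sys.argv) == 2:
    # Class taken from the NoClassDefFoundError in the container log
    missing = "org.apache.giraph.yarn.GiraphApplicationMaster"
    found = jar_contains_class(sys.argv[1], missing)
    print(missing, "is present" if found else "is MISSING")
```

Saved under any file name and run with the path of the giraph-examples jar-with-dependencies as its only argument, this would show whether the yarn/ package classes made it into the build at all.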

Finally, advice regarding my previous question [see (1)] about conf/giraph-site.xml would be greatly appreciated.

Regards,
Nicholas

PS. For unrelated reasons, I'll have to re-install the whole software stack, so I can't try anything again (or submit a report) right away.


On 07/15/2013 09:02 PM, Eli Reisman wrote:
Hi Nicholas,

Looks like your build was not pulling in the Giraph yarn/ package classes; you probably built using something like:

mvn -Phadoop_2.0.5 clean install

try instead:

mvn -Dhadoop.version=2.0.5-alpha -Phadoop_yarn clean install

(assuming you have applied the patch for 2.0.5-alpha builds on Giraph, I think it's GIRAPH-688; otherwise, the hadoop_yarn profile only builds against 2.0.3-alpha Hadoop)




On Sun, Jul 7, 2013 at 7:29 AM, Nicholas Karkoulias <[email protected]> wrote:

    Hello everyone.

    This is my first message to an Apache mailing list, so please
    excuse (and correct) possibly incorrect usage or netiquette issues :)

    I am a newbie to both Hadoop and Giraph (but not to Linux). After
    overcoming _several_ configuration-related hurdles, I have
    successfully built both (versions Hadoop 2.0.5-alpha and Giraph
    1.1.0-SNAPSHOT, using the patch for issue #688), and seem to have
    a properly working HDFS/Hadoop installation in pseudo-distributed
    mode. I can run Hadoop code just fine.

    However, Giraph fails during the computation (seemingly just
    before returning or writing the result – the time elapsed before
    the crash differs, depending on which example I try to run). See
    below for the error. I don't know whether it is a bug, or me doing
    something wrong.

    I'm using the YARN-enabled version of Giraph (and, thus, an
    external ZooKeeper service), assuming that Giraph will completely
    move to YARN _eventually_. (Is that correct?)

    Also, two other (somewhat unrelated) questions:
    (1) When is the file conf/giraph-site.xml actually parsed/used? Is
    it read by Hadoop? Should it be copied somewhere? I tried setting
    the ZooKeeper host:port in that file (giraph.zkList), but it was
    ignored and I finally had to add the property in the command line
    shown below... Any relevant documentation? (In general,
    documentation for certain Hadoop features, such as the
    configuration files, seems to be lacking...)
    (2) What's the correct process to submit a patch with a really
    simple typo correction in a string? (perhaps I should just contact
    the file author – it's nothing important)

    I'll append the shell commands I used (after $) and the output at
    the end of this message.

    Thank you in advance,
    Nicholas


    $ function giraphrunner(){ hadoop jar \
        /tmp/software/giraph/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.0.5-alpha-jar-with-dependencies.jar \
        org.apache.giraph.GiraphRunner -Dgiraph.zkList=localhost:2181 "$@"; }

    $ time giraphrunner \
        org.apache.giraph.examples.SimplePageRankComputation \
        -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
        -vip /dir/tiny_graph.txt \
        -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
        -op /dir/simplepagerank -w 1
    13/07/07 15:06:15 INFO utils.ConfigurationUtils: No edge input
    format specified. Ensure your InputFormat does not require one.
    13/07/07 15:06:15 INFO yarn.GiraphYarnClient: Final output path
    is: hdfs://localhost:9000/dir/simplepagerank
    13/07/07 15:06:15 INFO service.AbstractService:
    Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
    13/07/07 15:06:15 INFO service.AbstractService:
    Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
    13/07/07 15:06:15 INFO yarn.GiraphYarnClient: Defaulting per-task
    heap size to 1024MB.
    13/07/07 15:06:15 INFO yarn.GiraphYarnClient: Obtained new
    Application ID: application_1372875746593_0018
    13/07/07 15:06:15 WARN conf.Configuration: mapred.job.id is
    deprecated. Instead, use mapreduce.job.id
    13/07/07 15:06:15 WARN conf.Configuration: mapred.output.dir is
    deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
    13/07/07 15:06:15 INFO yarn.YarnUtils: Registered file in
    LocalResources: giraph-conf.xml
    13/07/07 15:06:16 INFO yarn.GiraphYarnClient:
    ApplicationSumbissionContext for GiraphApplicationMaster launch
    container is populated.
    13/07/07 15:06:16 INFO client.YarnClientImpl: Submitted
    application application_1372875746593_0018 to ResourceManager at
    localhost/127.0.0.1:8032
    13/07/07 15:06:16 INFO yarn.GiraphYarnClient:
    GiraphApplicationMaster container request was submitted to
    ResourceManager for job: Giraph:
    org.apache.giraph.examples.SimplePageRankComputation
    13/07/07 15:06:17 INFO yarn.GiraphYarnClient: Giraph:
    org.apache.giraph.examples.SimplePageRankComputation, Elapsed:
    0.85 secs
    13/07/07 15:06:17 INFO yarn.GiraphYarnClient:
    appattempt_1372875746593_0018_000001, State: ACCEPTED, Containers
    used: 1
    13/07/07 15:06:18 ERROR yarn.GiraphYarnClient: Giraph:
    org.apache.giraph.examples.SimplePageRankComputation reports
    FAILED state, diagnostics show: Application
    application_1372875746593_0018 failed 1 times due to AM Container
    for appattempt_1372875746593_0018_000001 exited with  exitCode: 1
    due to:
    .Failing this attempt.. Failing the application.
    13/07/07 15:06:18 INFO yarn.GiraphYarnClient: Cleaning up HDFS
    distributed cache directory for Giraph job.
    13/07/07 15:06:18 INFO yarn.GiraphYarnClient: Completed Giraph:
    org.apache.giraph.examples.SimplePageRankComputation: FAILED,
    total running time: 0 minutes, 1 seconds.

    real    0m8.392s
    user    0m8.825s
    sys     0m1.492s

    $ cat software/hadoop-2.0.5-alpha/logs/userlogs/application_1372875746593_0018/container_1372875746593_0018_01_000001/gam-stderr.log
    Exception in thread "main" java.lang.NoClassDefFoundError:
    org/apache/giraph/yarn/GiraphApplicationMaster
    Caused by: java.lang.ClassNotFoundException:
    org.apache.giraph.yarn.GiraphApplicationMaster
            at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
            at java.security.AccessController.doPrivileged(Native Method)
            at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
            at
    sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
    Could not find the main class:
    org.apache.giraph.yarn.GiraphApplicationMaster. Program will exit.


