Eli:
Thanks. I was coming to that conclusion. The documentation is out of sync and very sketchy. I was basing this on the README file in the 0.2 snapshot (line 173). It should not make a difference whether the
cluster being submitted to is running locally or remotely.

What is really missing is some basic documentation on exactly how to run examples other than the PageRankBenchmark. Depending on priorities, I will hopefully get some time to mess around and
figure out one or two.

If anyone on the list has a small example program or test they are willing to share, I would be most
appreciative, as that would help my target users significantly.


On 2/26/2013 9:01 PM, Eli Reisman wrote:
Just to throw this out there: it has been noted (by me most recently) on the Giraph JIRA site that the tests aren't happy when you try to run them against a running cluster; they like to instantiate their own local resources (ZK, Hadoop single-node) for the tests. If your example jobs run with "hadoop jar" on the cluster, then that's what matters and you're all set.


On Mon, Feb 25, 2013 at 4:03 PM, David Boyd <[email protected]> wrote:

    Sandy:
       Yes.   Attached is the segment from the job tracker log file
    that shows the error and stack traces.

    The Maven Surefire report for the test shows an assertion failure
    on the following line from the test:

        assertTrue(job.run(true));

    Below is the surefire report stack trace:

    -------------------------------------------------------------------------------
    Test set: org.apache.giraph.io.TestJsonBase64Format
    -------------------------------------------------------------------------------
    Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 32.363 sec <<< FAILURE!
    testContinue(org.apache.giraph.io.TestJsonBase64Format) Time elapsed: 32.352 sec  <<< FAILURE!
    java.lang.AssertionError:
            at org.junit.Assert.fail(Assert.java:91)
            at org.junit.Assert.assertTrue(Assert.java:43)
            at org.junit.Assert.assertTrue(Assert.java:54)
            at org.apache.giraph.io.TestJsonBase64Format.testContinue(TestJsonBase64Format.java:74)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
            at java.lang.reflect.Method.invoke(Method.java:597)
            at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
            at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
            at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
            at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
            at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
            at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
            at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
            at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
            at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
            at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
            at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
            at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
            at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
            at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
            at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:59)
            at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:120)
            at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:103)
            at org.apache.maven.surefire.Surefire.run(Surefire.java:169)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
            at java.lang.reflect.Method.invoke(Method.java:597)
            at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:350)
            at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1021)


    On 2/25/2013 6:55 PM, Sandy Ryza wrote:
    Great to hear it helped.  Are you able to provide the full stack
    trace for that exception?

    thanks,
    Sandy


    On Mon, Feb 25, 2013 at 3:51 PM, David Boyd <[email protected]> wrote:

        Sandy:
           Thanks, that helps a great deal.  I am now at least getting
        to the point that the jobs show up in the job tracker.
        However, they all fail on initialization with the good old:

            java.io.FileNotFoundException: File /tmp/hadoop-mapred/mapred/staging/hdfs/.staging/job_201302211213_0055/job.jar does not exist

        This tells me that Maven is either not specifying that the
        giraph-core jar file should be used as the job jar, or I am
        missing something else in the setup.

        Attached is the job.xml file from one of the failed jobs, and
        below is the relevant profile out of my pom.xml.
        I did upgrade to CDH4.1.3 just to see if that would help.
        Also, I have been running all sorts of jobs (benchmarks and
        other tests) against this cluster for some time, so I know
        that the cluster works well.

        Again, any help is appreciated.

        Relevant section of pom.xml:
        <profile>
          <id>hadoop_cdh4.1.3mr1</id>
          <properties>
            <hadoopmr1.version>2.0.0-mr1-cdh4.1.3</hadoopmr1.version>
            <hadoop.version>2.0.0-cdh4.1.3</hadoop.version>
            <munge.symbols>HADOOP_1_SECURITY,HADOOP_1_SECRET_MANAGER</munge.symbols>
          </properties>
          <dependencies>
            <!-- sorted lexicographically -->
            <dependency>
              <groupId>commons-net</groupId>
              <artifactId>commons-net</artifactId>
            </dependency>
            <dependency>
              <groupId>org.apache.hadoop</groupId>
              <artifactId>hadoop-client</artifactId>
              <version>${hadoopmr1.version}</version>
              <scope>provided</scope>
            </dependency>
            <dependency>
              <groupId>org.apache.hadoop</groupId>
              <artifactId>hadoop-common</artifactId>
              <version>${hadoop.version}</version>
              <scope>provided</scope>
            </dependency>
            <dependency>
              <groupId>org.apache.hadoop</groupId>
              <artifactId>hadoop-hdfs</artifactId>
              <version>${hadoop.version}</version>
              <scope>provided</scope>
            </dependency>
            <dependency>
              <groupId>org.apache.hadoop</groupId>
              <artifactId>hadoop-test</artifactId>
              <version>${hadoopmr1.version}</version>
              <scope>provided</scope>
            </dependency>
          </dependencies>
        </profile>



        On 2/25/2013 12:47 PM, Sandy Ryza wrote:
        Hi David,

        Moving this to cdh-user, as it is CDH-specific.

        CDH4 comes with two versions of MapReduce, MR1 and MR2.  It
        sounds like you are building against MR2
        (http://blog.cloudera.com/blog/2012/10/mr2-and-yarn-briefly-explained/).
        Do you know whether your cluster runs MR2/YARN or MR1?  If
        it runs MR2, you can set mapreduce.framework.name to "yarn".
        If it runs MR1, you can build against the MR1 jar by setting the
        version of your hadoop-client to 2.0.0-mr1-cdh4.1.1
        (https://ccp.cloudera.com/display/CDH4DOC/Managing+Hadoop+API+Dependencies+in+CDH4).
        Does that help?

        -Sandy
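
        A minimal sketch of the MR2 option described in this message,
        assuming the cluster runs MR2/YARN and that the submitting client
        picks up its settings from mapred-site.xml (the file is an
        assumption; the thread does not say where the property was set).
        Only the property name and the value "yarn" come from the message.

        ```xml
        <!-- Assumed location: the client-side mapred-site.xml -->
        <configuration>
          <property>
            <!-- Selects the runtime used to execute MapReduce jobs -->
            <name>mapreduce.framework.name</name>
            <!-- "yarn" for an MR2/YARN cluster, as suggested above -->
            <value>yarn</value>
          </property>
        </configuration>
        ```

        For an MR1 cluster this property is not the fix; the message
        instead points to building against the MR1 hadoop-client version.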


        On Mon, Feb 25, 2013 at 8:26 AM, David Boyd <[email protected]> wrote:

            All:
               I am trying to get the Giraph 0.2 snapshot (pulled
            via Git on Friday) to build and run with CDH4.

            I modified the pom.xml to provide a profile for my
            specific version (4.1.1).
            The build works (mvn -Phadoop_cdh4.1.1 clean package test)
            and passes all the tests.

            If I try to do the next step and submit to my cluster
            with the command:

                mvn -Phadoop_cdh4.1.1 test -Dprop.mapred.job.tracker=10.1.94.53:8021 -Dgiraph.zkList=10.1.94.104:2181

            the JSON test in core fails.  If I move that test out
            of the way, a whole bunch of tests in examples
            fail.  They all fail with:

                java.io.IOException: Cannot initialize Cluster.
                Please check your configuration for
                mapreduce.framework.name and the correspond server
                addresses.


            I have tried passing mapreduce.framework.name as both
            local and classic.  I have also set those values in my
            mapreduce-site.xml.

            Interestingly I can run the pagerank benchmark in code
            with the command:

                hadoop jar ./giraph-core/target/giraph-0.2-SNAPSHOT-for-hadoop-2.0.0-cdh4.1.3-jar-with-dependencies.jar \
                    org.apache.giraph.benchmark.PageRankBenchmark \
                    -Dmapred.child.java.opts="-Xmx64g -Xms64g -XX:+UseConcMarkSweepGC -XX:-UseGCOverheadLimit" \
                    -Dgiraph.zkList=10.1.94.104:2181 \
                    -e 1 -s 3 -v -V 50000 -w 83
            And it completes just fine.

            I have searched high and low for documents and examples
            on how to run the example programs by means other
            than Maven but have not found anything.

            Any help or suggestions would be greatly appreciated.

            Thanks.



            --
            ========= mailto:[email protected] ============
            David W. Boyd
            Director, Engineering, Research and Development
            Data Tactics Corporation
            7901 Jones Branch, Suite 240
            Mclean, VA 22102
            office: +1-703-506-3735, ext 308
            fax: +1-703-506-6703
            cell: +1-703-402-7908
            ============== http://www.data-tactics.com/ ============

            The information contained in this message may be privileged
            and/or confidential and protected from disclosure.
            If the reader of this message is not the intended recipient
            or an employee or agent responsible for delivering this message
            to the intended recipient, you are hereby notified that any
            dissemination, distribution or copying of this communication
            is strictly prohibited.  If you have received this communication
            in error, please notify the sender immediately by replying to
            this message and deleting the material from any computer.





