Glad it worked out. You'll also want to use -Phadoop_yarn as your Maven profile if you want to run on a YARN cluster without MRv2. The hadoop_2 profile uses MRv2, and it doesn't build the YARN packages. This might be why you got a ClassNotFoundException for GiraphYarnTask.
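For what it's worth, a rebuild with that profile would presumably look like the sketch below. It reuses the flags and Hadoop version from the hadoop_2 build quoted further down in this thread; DRY_RUN is just a hypothetical helper variable for this sketch (not a Maven or Giraph option), and the combination has not been verified here against 2.5.1.

```shell
# Sketch: rebuild Giraph with the YARN profile instead of hadoop_2.
# Assumption: Hadoop 2.5.1, the version used elsewhere in this thread.
HADOOP_VERSION=2.5.1
GIRAPH_PROFILE=hadoop_yarn

# Same flags as the hadoop_2 build in this thread; only the profile differs.
BUILD_CMD="mvn clean install -DskipTests -P${GIRAPH_PROFILE} -Dhadoop.version=${HADOOP_VERSION}"

# DRY_RUN is a hypothetical convenience switch for this sketch:
# by default the command is only printed, not executed.
if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "$BUILD_CMD"
else
  $BUILD_CMD
fi
```

Run it with DRY_RUN=0 from the Giraph source root to actually start the build.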
The hadoop_yarn profile has not been maintained lately, and if the hadoop_2 profile is working for you, I'd stick with it for the time being. That said, I do think that when it was last updated it was verified to run on Hadoop 2.2.0 or higher.

On Tue, Dec 16, 2014 at 6:14 AM, Philipp Nolte <[email protected]> wrote:
>
> Wow. Thanks. That did it.
>
> I used the deprecated MR1 mapred.job.tracker property but should have used
> the MR2 mapreduce.jobtracker.address property.
>
> For anyone curious, this is my working command:
>
> hadoop jar giraphs-and-balloons-computation-0.0.1-for-hadoop-2.5.1-and-giraph-1.1.0-jar-with-dependencies.jar \
>   org.apache.giraph.GiraphRunner \
>   de.unipassau.fim.dimis.nolte.computation.FindSupertypeClustersComputation \
>   -eif de.unipassau.fim.dimis.nolte.io.NTriplesToDirectedGraphInputFormat \
>   -eip /user/hduser/input/equivalence.nt \
>   -vof de.unipassau.fim.dimis.nolte.io.SupertypeClustersVertexValueOutputFormat \
>   -op /user/hduser/output/equivalence \
>   -w 3 \
>   -ca mapreduce.jobtracker.address=master:5431
>
> Thanks a lot for your help!
>
> On 16.12.2014 at 13:00, Claudio Martella <[email protected]> wrote:
>
> Try specifying the jobtracker by hand, via mapreduce.jobtracker.address.
> If no jobtracker is defined, Giraph will try to run locally. It's a
> misconfiguration of your cluster.
>
> On Tue, Dec 16, 2014 at 8:19 AM, Philipp Nolte <[email protected]>
> wrote:
>>
>> Hello Roman, thanks for your answer.
>>
>> I built and installed Giraph 1.1.0 into my local repository using
>>
>> $ mvn clean install -DskipTests -Phadoop_2 -Dhadoop.version=2.5.1
>>
>> My jar containing my Giraph computation was built using the following
>> pom.xml and running
>>
>> pom.xml: https://gist.github.com/ptnplanet/2d1def1605adff37a622
>>
>> I am using a default Hadoop 2.5.1 configuration.
>>
>> In my yarn-site.xml I’ve only got the yarn.resourcemanager.hostname set
>> to "master". I can verify using the resourcemanager’s web interface that
>> all the nodemanagers are connected to the resourcemanager.
>>
>> My mapred-site.xml has mapreduce.framework.name set to "yarn".
>>
>> I’ve got a Zookeeper running on "master" and have set giraph.zkServerPort
>> to 2181 and giraph.zkList to "master:2181" in my giraph-site.xml.
>>
>> When not specifying the mapred.job.tracker property, I get a
>> "checkLocalJobRunnerConfiguration: When using LocalJobRunner, must have
>> only one worker since only 1 task at a time!" error.
>>
>> There must be some misconfiguration somewhere. Hope you can help!
>>
>> Thanks in advance!
>>
>> Philipp
>>
>>
>> On 16.12.2014 at 03:17, Roman Shaposhnik <[email protected]> wrote:
>>
>> On Sun, Dec 14, 2014 at 11:21 PM, Philipp Nolte <[email protected]>
>> wrote:
>>
>> Maybe it's just a configuration thing.
>>
>>
>> How did you build Giraph in the first place? Also, what's
>> your mapred-site.xml and yarn-site.xml in HADOOP_CONF_DIR?
>>
>> Finally, what version of Hadoop are you using? And from
>> what vendor?
>>
>> I’ve tried running in giraph.SplitMasterWorker mode and it seems like
>> hadoop is missing the worker nodes:
>>
>> Here is my command:
>> $ hadoop jar giraphs-and-balloons-computation-0.0.1-for-hadoop-2.5.1-and-giraph-1.1.0-RC1-jar-with-dependencies.jar
>>
>>
>> How did you build this JAR?
>>
>> -ca mapred.job.tracker=master:5431\
>>
>>
>> If your Giraph installation has access to a correctly
>> configured Hadoop client you really don't need this
>> line.
>>
>> Thanks,
>> Roman.
>
> --
> Claudio Martella
