Here is my work log with some steps I need to prep for building Giraph:

Requires Maven 3.x

mvn -version

Install JDK 1.7

http://www.if-not-true-then-false.com/2010/install-sun-oracle-java-jdk-jre-7-on-fedora-centos-red-hat-rhel/

## java ##
sudo alternatives --install /usr/bin/java java /usr/java/jdk1.7.0_51/jre/bin/java 200000
## javaws ##
sudo alternatives --install /usr/bin/javaws javaws /usr/java/jdk1.7.0_51/jre/bin/javaws 200000
## Java Browser (Mozilla) Plugin 32-bit ##
sudo alternatives --install /usr/lib/mozilla/plugins/libjavaplugin.so libjavaplugin.so /usr/java/jdk1.7.0_51/jre/lib/i386/libnpjp2.so 200000
## Java Browser (Mozilla) Plugin 64-bit ##
sudo alternatives --install /usr/lib64/mozilla/plugins/libjavaplugin.so libjavaplugin.so.x86_64 /usr/java/jdk1.7.0_51/jre/lib/amd64/libnpjp2.so 200000
## Install javac only if you installed JDK (Java Development Kit) package ##
sudo alternatives --install /usr/bin/javac javac /usr/java/jdk1.7.0_51/bin/javac 200000
sudo alternatives --install /usr/bin/jar jar /usr/java/jdk1.7.0_51/bin/jar 200000

Check JDK

export JAVA_HOME="/usr/java/jdk1.7.0_51"
java -version

Check out sources

git clone https://git-wip-us.apache.org/repos/asf/giraph.git

Apply the latest version of the unmerged DOCU patch

wget https://issues.apache.org/jira/secure/attachment/12630040/GIRAPH-849.v3.patch
git apply --stat GIRAPH-849.v3.patch
git apply --check GIRAPH-849.v3.patch

Build Giraph

mvn -Phadoop_2 -fae -DskipTests clean install
mvn -Phadoop_2 -DskipTests -Ddependency.locations.enabled=false site
mvn -Phadoop_2 -DskipTests site:stage

Do some cool work on doc and code ... ;-)

Grep for some code:

grep -r --include="*.java" WHAT WHERE

Create the patch and submit it to JIRA and to the Review Board

http://ariejan.net/2009/10/26/how-to-create-and-apply-a-patch-with-git/

git diff --no-prefix trunk > GIRAPH-{ISSUE_NUMBER}.patch

You can skip the yellow parts ... and maybe you need another profile, but I just use hadoop_2 right now.

Good luck!
MK


On Sat, Mar 1, 2014 at 5:57 PM, Jyoti Yadav <[email protected]> wrote:

> Hi Mirko..
>
> Thanks for your reply.. All MapReduce programs are running fine on this
> system.
> And it is yarn setup.
>
> Please guide me how to build giraph with this hadoop version..Should I
> need to install external zookeeper also.?
>
> Thanks in advance..
>
> Jyoti
>
>
> On Sat, Mar 1, 2014 at 6:31 PM, Mirko Kämpf <[email protected]> wrote:
>
>> Hello,
>>
>> if you build Giraph for hadoop 0.20.... the same jars will not work for
>> hadoop version 2.2.0.
>> Right now I build the profile -Phadoop_2 from the current 1.1 branch in
>> the git repo.
>>
>> How many nodes (physical servers or VMs) do you run on your 64 core
>> system?
>> What distro of Hadoop are you working with? And is it an MRv1 or MRv2 (YARN)
>> setup?
>>
>> Is your MapReduce system working properly ... can you run TeraSort for
>> example?
>>
>> Cheers,
>> Mirko
>>
>>
>>
>> On Sat, Mar 1, 2014 at 4:15 AM, Jyoti Yadav
>> <[email protected]> wrote:
>>
>>> Anyone please reply ..Is it portability problem??.. Does giraph has any
>>> issues with Hadoop 2.2.0??
>>>
>>> Do I need to build Giraph on the new system ??
>>>
>>> Thanks
>>>
>>>
>>>
>>> On Sat, Mar 1, 2014 at 2:28 PM, Jyoti Yadav
>>> <[email protected]> wrote:
>>>
>>>> Hi Sebastian..
>>>> Thanks for the links given for big graphs..
>>>>
>>>> Actually I want to tell you something about problem i am facing.
>>>>
>>>> Initially I was working with *hadoop 0.20.203* . I build Giraph
>>>> there.. it was running fine.
>>>>
>>>> Now to test very big graph related problem and to compare the
>>>> performance, I moved to new system which is of 64 cores and 512 GB memory
>>>> and 3 TB storage. Instead of building Giraph in the new system, I just
>>>> copied Giraph folder from my previous system to this new system. In this
>>>> new system *hadoop version 2.2.0*. I tried to execute
>>>> SimpleSourceShortestPath algo on sample data set. It is throwing following
>>>> exception.
>>>>
>>>> I gave following command to execute the job.
>>>>
>>>> hadoop jar
>>>> /home/abcd2014/giraph/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-0.20.203.0-jar-with-dependencies.jar
>>>> org.apache.giraph.GiraphRunner -Dgiraph.SplitMasterWorker=true
>>>> org.apache.giraph.examples.SimpleShortestPathsComputation -vif
>>>> org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
>>>> -vip /user/abcd2014/giraph_input/tiny_graph.txt -vof
>>>> org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op
>>>> /user/abcd2014/output2/shortestpaths -w 1
>>>>
>>>>
>>>>
>>>> 14/03/01 12:44:46 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your InputFormat does not require one.
>>>> 14/03/01 12:44:46 INFO utils.ConfigurationUtils: No edge output format specified. Ensure your OutputFormat does not require one.
>>>> 14/03/01 12:44:46 INFO Configuration.deprecation: mapreduce.job.counters.limit is deprecated. Instead, use mapreduce.job.counters.max
>>>> 14/03/01 12:44:46 INFO Configuration.deprecation: mapred.job.map.memory.mb is deprecated. Instead, use mapreduce.map.memory.mb
>>>> 14/03/01 12:44:46 INFO Configuration.deprecation: mapred.job.reduce.memory.mb is deprecated. Instead, use mapreduce.reduce.memory.mb
>>>> 14/03/01 12:44:46 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
>>>> 14/03/01 12:44:46 INFO Configuration.deprecation: mapreduce.user.classpath.first is deprecated. Instead, use mapreduce.job.user.classpath.first
>>>> 14/03/01 12:44:46 INFO Configuration.deprecation: mapred.map.max.attempts is deprecated. Instead, use mapreduce.map.maxattempts
>>>> 14/03/01 12:44:46 INFO job.GiraphJob: run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 0, old value = 4)
>>>> 14/03/01 12:44:46 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
>>>>
>>>> *Exception in thread "main" java.lang.IllegalArgumentException:
>>>> checkLocalJobRunnerConfiguration: When using LocalJobRunner, you cannot run
>>>> in split master / worker mode since there is only 1 task at a time!*
>>>> at org.apache.giraph.job.GiraphJob.checkLocalJobRunnerConfiguration(GiraphJob.java:165)
>>>> at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:233)
>>>> at org.apache.giraph.GiraphRunner.run(GiraphRunner.java:94)
>>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>>>> at org.apache.giraph.GiraphRunner.main(GiraphRunner.java:124)
>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>>>>
>>>>
>>>>
>>>> Would you suggest me something to fix this...If you need any details
>>>> further, please let me know...
>>>>
>>>> Thanks & Regards
>>>>
>>>> Jyoti
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, Mar 1, 2014 at 1:35 PM, Sebastian Schelter <[email protected]> wrote:
>>>>
>>>>> Hi Jyoti,
>>>>>
>>>>> You can find a couple of very large graphs in KONECT [1] and on the
>>>>> website of the laboratory for web algorithmics from the University of
>>>>> Milan [2]. You will probably have to convert them to an appropriate
>>>>> format for Giraph.
>>>>>
>>>>> Best,
>>>>> Sebastian
>>>>>
>>>>> [1] http://konect.uni-koblenz.de/
>>>>> [2] http://law.di.unimi.it/datasets.php
>>>>>
>>>>>
>>>>> On 03/01/2014 05:22 AM, Jyoti Yadav wrote:
>>>>>
>>>>>> Hi folks..
>>>>>>
>>>>>> I got new system which is of 64 cores and 512 GB memory and 3 TB
>>>>>> storage.I want to test the performance of Giraph on this system.
>>>>>> Would anyone provide me the link for very large graph so that I can
>>>>>> execute Single Source Shortest Path Example. For this algo to run
>>>>>> graph should be weighted graph. and to feed it into the Giraph -input
>>>>>> format is JsonLongDoubleFloatDouble
>>>>>>
>>>>>> Thanks in advance...
>>>>>> With Regards
>>>>>>
>>>>>> Jyoti
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>> --
>> Mirko Kämpf
>>
>> *Trainer* @ Cloudera
>>
>> tel: +49 *176 20 63 51 99*
>> skype: *kamir1604*
>> [email protected]
>>
>>
>

--
--
Mirko Kämpf

*Trainer* @ Cloudera

tel: +49 *176 20 63 51 99*
skype: *kamir1604*
[email protected]
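
A note on the point in the thread that jars built for hadoop 0.20 will not work on hadoop 2.2.0: the jar in the quoted command carries the version it was built for in its file name (for-hadoop-0.20.203.0). Below is a minimal sketch of a check based on that. It assumes every build names its jar with the pattern -for-hadoop-<version>-jar-with-dependencies.jar, which I have only seen in the command above, so treat it as an assumption. The two function names are made up for this example and are not part of Giraph or Hadoop.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Giraph or Hadoop.

# Print the Hadoop version encoded in a Giraph jar file name.
# Assumes the name ends in "-for-hadoop-<version>-jar-with-dependencies.jar";
# returns 1 if the name does not follow that pattern.
giraph_jar_hadoop_version() {
    name=$(basename "$1")
    case "$name" in
        *-for-hadoop-*-jar-with-dependencies.jar)
            version=${name##*-for-hadoop-}
            version=${version%-jar-with-dependencies.jar}
            printf '%s\n' "$version"
            ;;
        *)
            return 1
            ;;
    esac
}

# Succeed only if the version in the jar name equals the version given
# as the second argument. Returns 2 if the jar name cannot be parsed.
giraph_jar_matches_hadoop() {
    built_for=$(giraph_jar_hadoop_version "$1") || return 2
    [ "$built_for" = "$2" ]
}

# Example with the jar path from the quoted command and the cluster
# version mentioned in the thread (2.2.0). Only the name is inspected,
# so the file does not have to exist.
jar=/home/abcd2014/giraph/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-0.20.203.0-jar-with-dependencies.jar
if ! giraph_jar_matches_hadoop "$jar" 2.2.0; then
    echo "jar was built for Hadoop $(giraph_jar_hadoop_version "$jar"), rebuild for 2.2.0"
fi
```

The version to compare against can be read from the first line of the output of "hadoop version" on the cluster. The check only looks at the file name, so it says nothing about whether the jar was really built with the matching profile.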
