Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "GitAndHadoop" page has been changed by ToddLipcon: http://wiki.apache.org/hadoop/GitAndHadoop?action=diff&rev1=14&rev2=15 == Checking out the source == - The first step is to create your own Git repository from the Apache repositories. There are separate repositories for all the different Hadoop sub-projects; this page looks at the core filesystem and MapReduce engine. + The first step is to create your own Git repository from the Apache repository. The hadoop subprojects (common, HDFS, and MapReduce) live inside a combined repo called `hadoop-common.git`. - Make your base hadoop directory - {{{ - mkdir hadoop - }}} - Change into this directory - {{{ - cd hadoop - }}} - Create "clones" of the Apache Git repositories {{{ git clone git://git.apache.org/hadoop-common.git - git clone git://git.apache.org/hadoop-hdfs.git - git clone git://git.apache.org/hadoop-mapreduce.git }}} The total download is well over 100MB, so the initial checkout process works best when the network is fast. Once downloaded, Git works offline. + + == Grafts for complete project history == + + The Hadoop project has undergone some movement in where its component parts have been versioned. Because of that, commands like `git log --follow` need to have a little help. To graft the history back together into a coherent whole, insert the following contents into `hadoop-common/.git/info/grafts`: + + {{{ + 5128a9a453d64bfe1ed978cf9ffed27985eeef36 6c16dc8cf2b28818c852e95302920a278d07ad0c + 6a3ac690e493c7da45bbf2ae2054768c427fd0e1 6c16dc8cf2b28818c852e95302920a278d07ad0c + 546d96754ffee3142bcbbf4563c624c053d0ed0d 6c16dc8cf2b28818c852e95302920a278d07ad0c + }}} + + You can then use commands like `git blame --follow` with success. == Forking onto GitHub == @@ -73, +74 @@ Next, symlink this file to every Hadoop module. Now a change in the file gets picked up by all three. {{{ - pushd hadoop-common; ln -s ../build.properties build.properties; popd + pushd common; ln -s ../build.properties build.properties; popd - pushd hadoop-hdfs; ln -s ../build.properties build.properties; popd + pushd hdfs; ln -s ../build.properties build.properties; popd - pushd hadoop-mapreduce; ln -s ../build.properties build.properties; popd + pushd mapreduce; ln -s ../build.properties build.properties; popd }}} You are now all set up to build. === Build Hadoop === - 1. In {{{hadoop-common/}}} run {{{ant mvn-install}}} + 1. In {{{common/}}} run {{{ant mvn-install}}} - 1. In {{{hadoop-hdfs/}}} run {{{ant mvn-install}}} + 1. In {{{hdfs/}}} run {{{ant mvn-install}}} - 1. In {{{hadoop-mapreduce/}}} run {{{ant mvn-install}}} + 1. In {{{mapreduce/}}} run {{{ant mvn-install}}} This Ant target not only builds the JAR files, it copies it to the local {{{${user.home}/.m2}}} directory, where it will be picked up by the "internal" resolver. You can check that this is taking place by running {{{ant ivy-report}}} on a project and seeing where it gets its dependencies.

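+ Similarly, to check that the grafts described above took effect, run {{{git log --follow}}} on a long-lived file (a sketch: the path below is hypothetical; substitute any file that predates the repository merge):
+
+ {{{
+ # with the grafts in place, history should continue past the project split
+ cd hadoop-common
+ git log --oneline --follow -- src/java/org/apache/hadoop/fs/FileSystem.java
+ }}}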