Dear Wiki user,
You have subscribed to a wiki page or wiki category on Hadoop Wiki for change
notification.
The GitAndHadoop page has been changed by AkiraAjisaka:
https://wiki.apache.org/hadoop/GitAndHadoop?action=diff&rev1=17&rev2=18
Comment:
Use Git for the SCM system for Hadoop instead of SVN
A lot of people use Git with Hadoop because they have their own patches to
make to Hadoop, and Git helps them manage it.
* GitHub provides some good lessons on Git at [[http://learn.github.com]]
- * Apache serves up read-only Git versions of their source at
[[http://git.apache.org/]]. People cannot commit changes with Git; for that the
patches need to be applied to the SVN repositories
+ * Apache serves up read-only Git versions of their source at
[[http://git.apache.org/]]. Committers can commit changes to the writable Git
repository; see HowToCommitWithGit
This page tells you how to work with Git. See HowToContribute for
instructions on building and testing Hadoop.
<<TableOfContents(4)>>
+
== Key Git Concepts ==
The key concepts of Git.
@@ -23, +24 @@
You need a copy of git on your system. Some IDEs ship with Git support; this
page assumes you are using the command line.
- Clone a local Git repository from the Apache repository. The Hadoop
subprojects (common, HDFS, and MapReduce) live inside a combined repository
called `hadoop-common.git`.
+ Clone a local Git repository from the Apache repository. The Hadoop
subprojects (common, HDFS, and MapReduce) live inside a combined repository
called `hadoop.git`.
{{{
- git clone git://git.apache.org/hadoop-common.git
+ git clone git://git.apache.org/hadoop.git
}}}
The total download is well over 100MB, so the initial checkout process works
best when the network is fast. Once downloaded, Git works offline, though you
will need to perform your initial builds online so that the build tools (Maven,
Ivy, etc.) can download dependencies.
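If the size of that initial download is a concern, a shallow clone fetches only
the most recent history. This is a sketch, not part of the official
instructions: against Apache the command would be
`git clone --depth 1 git://git.apache.org/hadoop.git`, and here a throwaway
local repository stands in for the remote so the commands can be verified
offline.

```shell
#!/bin/sh
# Sketch: "git clone --depth 1" fetches only the newest commit, which
# shrinks the initial download considerably. A throwaway local repository
# stands in for the Apache remote so this runs offline.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q origin.git
cd origin.git
git -c user.name=demo -c user.email=demo@example.org \
    commit -q --allow-empty -m "first"
git -c user.name=demo -c user.email=demo@example.org \
    commit -q --allow-empty -m "second"
cd ..
git clone -q --depth 1 "file://$tmp/origin.git" shallow
cd shallow
git rev-list HEAD | wc -l        # the shallow clone carries a single commit
```

Note that a shallow clone is unsuitable for the history grafts described below,
which need the full history to be present.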
== Grafts for complete project history ==
- The Hadoop project has undergone some movement in where its component parts
have been versioned. Because of that, commands like `git log --follow` needs to
have a little help. To graft the history back together into a coherent whole,
insert the following contents into `hadoop-common/.git/info/grafts`:
+ The Hadoop project has undergone some movement in where its component parts
have been versioned. Because of that, commands like `git log --follow` need a
little help. To graft the history back together into a coherent whole,
insert the following contents into `hadoop/.git/info/grafts`:
{{{
5128a9a453d64bfe1ed978cf9ffed27985eeef36
6c16dc8cf2b28818c852e95302920a278d07ad0c
@@ -49, +50 @@
1. Create a GitHub login at http://github.com/ ; Add your public SSH keys
1. Go to http://github.com/apache and search for the Hadoop and other Apache
projects you want (avro is handy alongside the others)
- 1. For each project, fork in the githb UI. This gives you your own
repository URL which you can then clone locally with {{{git clone}}}
+ 1. For each project, fork in the GitHub UI. This gives you your own
repository URL which you can then clone locally with {{{git clone}}}
1. For each patch, branch.
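The last two steps above can be sketched as follows. This is illustrative
only: a local bare repository stands in for the GitHub fork so it runs
offline, the real clone URL would look like
{{{git@github.com:YOURNAME/hadoop.git}}}, and the issue id in the branch name
is made up.

```shell
#!/bin/sh
# Sketch: clone your fork, then create one branch per patch.
# A local bare repository stands in for the GitHub fork (offline demo);
# the branch name uses a made-up issue id for illustration.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q --bare fork.git                 # stand-in for your GitHub fork
git clone -q "$tmp/fork.git" hadoop
cd hadoop
git -c user.name=demo -c user.email=demo@example.org \
    commit -q --allow-empty -m "baseline"
git checkout -q -b HADOOP-1234-my-patch     # one branch per patch
git rev-parse --abbrev-ref HEAD             # confirms the active branch
```

Keeping each patch on its own branch makes it easy to rebase or regenerate a
single patch against trunk without disturbing the others.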
- At the time of writing (December 2009), GitHub was updating its copy of the
Apache repositories every hour. As the Apache repositories were updating every
15 minutes, provided these frequencies are retained, a GitHub-fork derived
version will be at worst 1 hour and 15 minutes behind the ASF's SVN repository.
If you are actively developing on Hadoop, especially committing code into the
SVN repository, that is too long -work off the Apache repositories instead.
+ At the time of writing (December 2009), GitHub was updating its copy of the
Apache repositories every hour. As the Apache repositories were updating every
15 minutes, provided these frequencies are retained, a GitHub-fork derived
version will be at worst 1 hour and 15 minutes behind the ASF's Git repository.
If you are actively developing on Hadoop, especially committing code into the
Git repository, that is too long; work off the Apache repositories instead.
1. Clone the read-only repository from GitHub (their recommendation) or from
Apache (the ASF's recommendation)
1. In that clone, rename the remote to apache: {{{git remote rename
origin apache}}}
1. Log in to [[http://github.com]]
1. Create a new repository (e.g. hadoop-fork)
1. In the existing clone, add the new repository:
- {{{git remote add -f github
g...@github.com:MYUSERNAMEHERE/hadoop-common.git}}}
+ {{{git remote add -f github git@github.com:MYUSERNAMEHERE/hadoop.git}}}
This gives you a local repository with two remote repositories: apache and
github. Apache has the trunk branch, which you can update whenever you want
to get the latest ASF version:
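One plausible run-through of this two-remote setup is sketched below. Local
bare repositories stand in for git.apache.org and the GitHub fork so the
commands can be exercised offline; all paths and names here are illustrative,
not the real remote URLs.

```shell
#!/bin/sh
# Sketch of the two-remote setup described above, with local bare
# repositories standing in for git.apache.org and your GitHub fork.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q --bare apache.git               # stand-in for the Apache repository
git clone -q "$tmp/apache.git" hadoop
cd hadoop
git -c user.name=demo -c user.email=demo@example.org \
    commit -q --allow-empty -m "trunk"
git push -q origin HEAD:trunk               # seed the stand-in trunk branch
git remote rename origin apache             # as in the steps above
git init -q --bare "$tmp/github.git"        # stand-in for your GitHub fork
git remote add -f github "$tmp/github.git"
git fetch -q apache                         # update from the ASF side
git push -q github HEAD:trunk               # publish your work to the fork
git remote | sort | tr '\n' ' '             # -> "apache github "
```

With this layout, fetching from apache refreshes trunk, while pushes go to
your own github remote, keeping the read-only upstream and your writable fork
cleanly separated.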
@@ -71, +72 @@
Your own branches can be merged with trunk, and pushed out to GitHub. To
generate