[Hadoop Wiki] Update of "HowToContribute" by AaronKimball

Apache Wiki Wed, 16 Dec 2009 15:53:08 -0800

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.


The "HowToContribute" page has been changed by AaronKimball.
http://wiki.apache.org/hadoop/HowToContribute?action=diff&rev1=39&rev2=40

--------------------------------------------------

  = How to Contribute to Hadoop Common =
- 
  This page describes the mechanics of ''how'' to contribute software to Hadoop 
Common.  For ideas about ''what'' you might contribute, please see the 
ProjectSuggestions page.
  
  === Getting the source code ===
- 
  First of all, you need the Hadoop source code. The official location for 
Hadoop is the Apache SVN repository; Git is also supported, and useful if you 
want to make lots of local changes and keep those changes under some form of 
private or public revision control.
  
  ==== SVN Access ====
- 
  Get the source code on your local drive using 
[[http://hadoop.apache.org/core/version_control.html|SVN]].  Most development 
is done on the "trunk":
  
  {{{
  svn checkout http://svn.apache.org/repos/asf/hadoop/common/trunk/ 
hadoop-common-trunk
  }}}
- 
- You may also want to develop against a specific release.  To do so, visit 
[[http://svn.apache.org/repos/asf/hadoop/common/tags/]] and find the release 
that you are interested in developing against.  To check out this release, run:
+ You may also want to develop against a specific release.  To do so, visit 
http://svn.apache.org/repos/asf/hadoop/common/tags/ and find the release that 
you are interested in developing against.  To check out this release, run:
  
  {{{
  svn checkout 
http://svn.apache.org/repos/asf/hadoop/common/tags/release-X.Y.Z/ 
hadoop-common-X.Y.Z
  }}}
- 
  If you prefer to use Eclipse for development, there are instructions for 
setting up SVN access from within Eclipse at EclipseEnvironment.
  
+ The Hadoop system is split into three separate projects: common, hdfs, and 
mapreduce. You'll also need to check out the other subprojects:
+ 
+ {{{
+ svn checkout http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/ 
hadoop-hdfs-trunk
+ svn checkout http://svn.apache.org/repos/asf/hadoop/mapreduce/trunk/ 
hadoop-mapred-trunk
+ }}}
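The three checkouts can also be scripted. What follows is a minimal sketch, not part of the official instructions: the function names {{{checkout_cmd}}} and {{{checkout_all}}} are hypothetical, while the URLs and directory names are the ones shown above.

```shell
# Print the svn checkout command for one project.
# $1: repository name (common, hdfs, mapreduce)
# $2: local directory infix (common, hdfs, mapred), as used above
checkout_cmd() {
  echo "svn checkout http://svn.apache.org/repos/asf/hadoop/$1/trunk/ hadoop-$2-trunk"
}

# Print the commands for all three projects without running them.
checkout_all() {
  checkout_cmd common common
  checkout_cmd hdfs hdfs
  checkout_cmd mapreduce mapred
}
```

Running {{{checkout_all}}} only prints the commands so they can be reviewed first; {{{checkout_all | sh}}} would execute them.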
  ==== Git Access ====
- 
- See [[GitAndHadoop]]
+ See GitAndHadoop
  
  === Making Changes ===
- 
  Before you start, send a message to the 
[[http://hadoop.apache.org/core/mailing_lists.html|Hadoop developer mailing 
list]], or file a bug report in [[Jira]].  Describe your proposed changes and 
check that they fit in with what others are doing and have planned for the 
project.  Be patient; it may take folks a while to understand your requirements.
  
  Modify the source code and add some (very) nice features using your favorite 
IDE.<<BR>>
  
  But take care with the following points:
+ 
   * All public classes and methods should have informative 
[[http://java.sun.com/j2se/javadoc/writingdoccomments/|Javadoc comments]].
    * Do not use @author tags.
   * Code should be formatted according to 
[[http://java.sun.com/docs/codeconv/|Sun's conventions]], with one exception:
@@ -51, +51 @@

    * You can run all the unit tests with the command {{{ant test}}}, or you can 
run a specific unit test with the command {{{ant -Dtestcase=<class name without 
package prefix> test}}} (for example {{{ant -Dtestcase=TestFileSystem test}}})
  
  ==== Using Ant ====
- 
  Hadoop is built by Ant, a Java building tool.  This section will eventually 
describe how Ant is used within Hadoop.  To start, simply read a good Ant 
tutorial.  The following is a good tutorial, though keep in mind that Hadoop 
isn't structured according to the ways outlined in the tutorial.  Use the 
tutorial to get a basic understanding of Ant but not to understand how Ant is used 
for Hadoop:
  
   * Good Ant tutorial: http://i-proving.ca/space/Technologies/Ant+Tutorial
  
- Although most Java IDEs ship with a version of Ant, having a command line 
version installed is invaluable. You can download a version from 
[[http://ant.apache.org/]].
+ Although most Java IDEs ship with a version of Ant, having a command line 
version installed is invaluable. You can download a version from 
http://ant.apache.org/.
  
- After installing Ant, you must make sure that its networking support is 
configured for any proxy you have. Without that the build will not work, as the 
Hadoop builds will not be able to download their dependencies using 
[[http://ant.apache.org/ivy/ | Ivy]].
+ After installing Ant, you must make sure that its networking support is 
configured for any proxy you have. Without that the build will not work, as the 
Hadoop builds will not be able to download their dependencies using 
[[http://ant.apache.org/ivy/|Ivy]].
  
  Tip: to see how Ant is set up, run
+ 
  {{{
  ant -diagnostics
  }}}
- 
  === Generating a patch ===
- 
  ==== Unit Tests ====
- 
  Please make sure that all unit tests succeed before constructing your patch 
and that no new javac compiler warnings are introduced by your patch.
  
  {{{
@@ -76, +73 @@

  > ant -Djavac.args="-Xlint -Xmaxwarns 1000" clean test tar
  }}}
  After a while, if you see
+ 
  {{{
  BUILD SUCCESSFUL
  }}}
  all is ok, but if you see
+ 
  {{{
  BUILD FAILED
  }}}
@@ -88, +87 @@

  Unit test development guidelines: HowToDevelopUnitTests
  
  ==== Javadoc ====
- 
  Please also check the javadoc.
  
  {{{
  > ant javadoc
  > firefox build/docs/api/index.html
  }}}
- 
  Examine all public classes you've changed to see that documentation is 
complete, informative, and properly formatted.  Your patch must not generate 
any javadoc warnings.
  
  ==== Creating a patch ====
  Check to see what files you have modified with:
+ 
  {{{
  svn stat
  }}}
- 
  Add any new files with:
+ 
  {{{
  svn add src/.../MyNewClass.java
  svn add src/.../TestMyNewClass.java
  }}}
- 
  In order to create a patch, type (from the base directory of hadoop):
  
  {{{
  svn diff > HADOOP-1234.patch
  }}}
- 
- This will report all modifications done on Hadoop sources on your local disk 
and save them into the ''HADOOP-1234.patch'' file.  Read the patch file.
+ This will report all modifications done on Hadoop sources on your local disk 
and save them into the ''HADOOP-1234.patch'' file.  Read the patch file. Make 
sure it includes ONLY the modifications required to fix a single issue.
- Make sure it includes ONLY the modifications required to fix a single issue.
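One quick way to review the scope of a patch is to list the files it touches. The sketch below assumes the patch was produced by {{{svn diff}}}, which starts each file's section with an {{{Index: <path>}}} line; the function name {{{patch_files}}} is hypothetical.

```shell
# List the files touched by a patch produced by `svn diff`.
# $1: path to the patch file
patch_files() {
  # Keep only the "Index: <path>" header lines and strip the prefix.
  sed -n 's/^Index: //p' "$1"
}
```

For example, {{{patch_files HADOOP-1234.patch}}} prints one path per line; any file unrelated to the issue is a hint that the patch contains more than one change.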
  
  Please do not:
+ 
   * reformat code unrelated to the bug being fixed: formatting changes should 
be separate patches/commits.
   * comment out code that is now obsolete: just remove it.
   * insert comments around each change, marking the change: folks can use 
subversion to figure out what's changed and by whom.
   * make things public which are not required by end users.
  
  Please do:
+ 
   * try to adhere to the coding style of files you edit;
   * comment code whose function or rationale is not obvious;
   * update documentation (e.g., ''package.html'' files, this wiki, etc.)
  
  If you need to rename files in your patch:
+ 
   1. Write a shell script that uses 'svn mv' to rename the original files.
   1. Edit files as needed (e.g., to change package names).
   1. Create a patch file with 'svn diff --no-diff-deleted --notice-ancestry'.
   1. Submit both the shell script and the patch file.
+ 
  This way other developers can preview your change by running the script and 
then applying the patch.
  
  ==== Testing your patch ====
- 
  Before submitting your patch, you are encouraged to run the same tools that 
the automated Hudson patch test system will run on your patch.  This enables 
you to fix problems with your patch before you submit it.  The {{{test-patch}}} 
Ant target will run your patch through the same checks that Hudson currently 
does ''except'' for executing the core and contrib unit tests.
  
  To use this target, you must run it from a clean workspace (i.e., {{{svn stat}}} 
shows no modifications or additions).  From your clean workspace, run:
@@ -154, +152 @@

    -Dpatch.cmd=/path/to/patch \ (optional)
    test-patch
  }}}
- 
  At the end, you should get a message on your console that is similar to the 
comment added to Jira by Hudson's automated patch test system.  The scratch 
directory (which defaults to the value of {{{${user.home}/tmp}}}) will contain 
some output files that will be useful in determining what issues were found in 
the patch.
  
  Some things to note:
+ 
   * the optional cmd parameters will default to the ones in your {{{PATH}}} 
environment variable
   * the {{{grep}}} command must support the -o flag (GNU does)
   * the {{{patch}}} command must support the -E flag
   * you may need to explicitly set ANT_HOME.  Running {{{ant -diagnostics}}} 
will tell you the default value on your system.
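Some of the notes above can be checked before running {{{test-patch}}}. This is a rough sketch with hypothetical function names ({{{have_cmd}}}, {{{grep_supports_o}}}, {{{preflight}}}); it only checks that the tools are on {{{PATH}}} and that {{{grep}}} accepts -o, and it does not verify that {{{patch}}} supports -E.

```shell
# Succeed if the named command can be found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Succeed if grep supports -o (print only the matching part of a line).
grep_supports_o() {
  [ "$(printf 'abc\n' | grep -o 'b' 2>/dev/null)" = "b" ]
}

# Report anything that looks missing; prints nothing when all checks pass.
preflight() {
  for c in ant grep patch; do
    have_cmd "$c" || echo "missing on PATH: $c"
  done
  grep_supports_o || echo "grep does not support -o"
}
```

Calling {{{preflight}}} prints one line per problem found.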
  
  ==== Applying a patch ====
- 
  To apply a patch that you either generated or found in JIRA, run:
+ 
  {{{
  patch -p0 < cool_patch.patch
  }}}
  If you just want to check whether the patch applies, you can run patch with 
the --dry-run option:
+ 
  {{{
  patch -p0 --dry-run < cool_patch.patch
  }}}
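The two commands above can be combined so that a patch is applied only when the dry run succeeds. This is a small sketch; the function name {{{safe_apply}}} is hypothetical, and it assumes it is run from the base directory of the checkout.

```shell
# Apply a patch only if a dry run succeeds first.
# $1: path to the patch file
safe_apply() {
  if patch -p0 --dry-run < "$1" >/dev/null; then
    # The dry run reported no problems, so apply the patch for real.
    patch -p0 < "$1"
  else
    echo "patch does not apply cleanly: $1" >&2
    return 1
  fi
}
```

For example, {{{safe_apply cool_patch.patch}}} leaves the working copy untouched when the dry run fails.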
- 
  If you are an Eclipse user, you can apply a patch as follows: 1. Right-click 
the project name in Package Explorer, 2. Choose Team -> Apply Patch.
  
+ ==== Changes that span projects ====
+ You may find that you need to modify both the common project and MapReduce or 
HDFS. Or perhaps you have changed something in common, and need to verify that 
these changes do not break the existing unit tests for HDFS and MapReduce. 
Hadoop's build system integrates with a local maven repository to support 
cross-project development. Use this general workflow for your development:
+ 
+  * Make your changes in common
+  * Run any unit tests there (e.g. 'ant test')
+  * ''Publish'' your new common jar to your local mvn repository:<<BR>>
+  {{{
+ common$ ant clean jar mvn-install
+ }}}
+  * Switch to the dependent project and make any changes there (e.g., that 
rely on a new API you introduced in common).
+  * When you are ready, recompile and test this -- using the local mvn 
repository instead of the public Hadoop repository:<<BR>>
+  {{{
+ mapred$ ant veryclean test -Dresolvers=internal
+ }}}
+ 
+  . The 'veryclean' target will clear the Ivy cache used by any previous 
builds and force the build to query the upstream repository. Setting 
-Dresolvers=internal forces Hadoop to check your local build before going 
outside.
+ 
+  * Finally, create separate patches for your common and hdfs/mapred changes, 
and file them as separate JIRA issues associated with the appropriate projects.
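The build steps of this workflow can be chained in a small helper. This is only a sketch under stated assumptions: the function name {{{cross_project_test}}} is hypothetical, it covers the Ant invocations listed above (not the editing steps), and it stops at the first failure.

```shell
# Run the cross-project build steps described above.
# $1: common checkout directory
# $2: dependent project checkout directory (hdfs or mapreduce)
cross_project_test() {
  # Test common, then publish its jar to the local mvn repository.
  (cd "$1" && ant test && ant clean jar mvn-install) || return 1
  # Rebuild and test the dependent project against the local repository.
  (cd "$2" && ant veryclean test -Dresolvers=internal)
}
```

With the checkout names used earlier on this page, a call might look like {{{cross_project_test hadoop-common-trunk hadoop-mapred-trunk}}}.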
+ 
  === Contributing your work ===
- 
- Finally, patches should be ''attached'' to an issue report in 
[[http://issues.apache.org/jira/browse/HADOOP|Jira]] via the '''Attach File''' 
link on the issue's Jira. Please add a comment that asks for a code review 
following our [[CodeReviewChecklist| code review checklist]]. Please note that 
the attachment should be granted license to ASF for inclusion in ASF works (as 
per the [[http://www.apache.org/licenses/LICENSE-2.0|Apache License]] §5).
+ Finally, patches should be ''attached'' to an issue report in 
[[http://issues.apache.org/jira/browse/HADOOP|Jira]] via the '''Attach File''' 
link on the issue's Jira. Please add a comment that asks for a code review 
following our [[CodeReviewChecklist|code review checklist]]. Please note that 
the attachment should be granted license to ASF for inclusion in ASF works (as 
per the [[http://www.apache.org/licenses/LICENSE-2.0|Apache License]] §5).
  
  When you believe that your patch is ready to be committed, select the 
'''Submit Patch''' link on the issue's Jira.  Submitted patches will be 
automatically tested against "trunk" by 
[[http://hudson.zones.apache.org/hudson/view/Hadoop/|Hudson]], the project's 
continuous integration engine.  Upon test completion, Hudson will add a success 
("+1") message or failure ("-1") to your issue report in Jira.  If your issue 
contains multiple patch versions, Hudson tests the last patch uploaded.
  
@@ -197, +213 @@

  Should your patch receive a "-1" from the Hudson testing, select the 
'''Resume Progress''' on the issue's Jira, upload a new patch with necessary 
fixes, and then select the '''Submit Patch''' link again.
  
  Committers: for non-trivial changes, it is best to get another committer to 
review your patches before commit.  Use the '''Submit Patch''' link like other 
contributors, and then wait for a "+1" from another committer before 
committing.  Please also try to frequently review things in the patch queues:
+ 
   * [[https://issues.apache.org/jira/secure/IssueNavigator.jspa?mode=hide&requestId=12311124|Hadoop Common Review Queue]]
   * [[https://issues.apache.org/jira/secure/IssueNavigator.jspa?mode=hide&requestId=12313301|Hadoop HDFS Review Queue]]
   * [[https://issues.apache.org/jira/secure/IssueNavigator.jspa?mode=hide&requestId=12313302|Hadoop MapReduce Review Queue]]
  
  == Jira Guidelines ==
- 
  Please comment on issues in Jira, making your concerns known.  Please also 
vote for issues that are a high priority for you.
  
  Please refrain from editing descriptions and comments if possible, as edits 
spam the mailing list and clutter Jira's "All" display, which is otherwise very 
useful.  Instead, preview descriptions and comments using the preview button 
(on the right) before posting them.  Keep descriptions brief and save more 
elaborate proposals for comments, since descriptions are included in Jira's 
automatically sent messages.  If you change your mind, note this in a new 
comment, rather than editing an older comment.  The issue should preserve this 
history of the discussion.
  
  == Stay involved ==
- 
  Contributors should join the 
[[http://hadoop.apache.org/core/mailing_lists.html|Hadoop mailing lists]].  In 
particular, the commit list (to see changes as they are made), the dev list (to 
join discussions of changes) and the user list (to help others).
  
  == See Also ==
- 
   * [[http://www.apache.org/dev/contributors.html|Apache contributor 
documentation]]
   * [[http://www.apache.org/foundation/voting.html|Apache voting 
documentation]]
