[Hadoop Wiki] Update of "HowToCommit" by MartonElek
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HowToCommit" page has been changed by MartonElek:
https://wiki.apache.org/hadoop/HowToCommit?action=diff&rev1=42&rev2=43

Comment:
fix man doc generation (from ant to mvn)

  The end user documentation is maintained in the main repository (hadoop.git) and the results are committed to the hadoop-site repository during each release. The website itself is managed in the hadoop-site.git repository (both the source and the rendered form).
- To commit end-user documentation changes to trunk or a branch, ask the user to submit only changes made to the *.xml files in {{{src/docs}}}. Apply that patch, run {{{ant docs}}} to generate the html, and then commit. End-user documentation is only published to the web when releases are made, as described in HowToRelease.
+ To commit end-user documentation, create a patch as usual that modifies the content of the src/site directory of any Hadoop project (e.g. ./hadoop-common-project/hadoop-auth/src/site). You can regenerate the docs with {{{mvn site}}}. End-user documentation is only published to the web when releases are made, as described in HowToRelease.
- To commit changes to the website and re-publish them: {{{
+ To commit changes to the website and re-publish them:
+ {{{
  git clone https://gitbox.apache.org/repos/asf/hadoop-site.git -b asf-site
  #edit site under ./src
  hugo

@@ -101, +102 @@

  The commit will be reflected on the Apache Hadoop site automatically.
- Note: you can check the rendering locally with 'hugo serve && firefox http://localhost:1313'
+ Note: you can check the rendering locally with {{{hugo serve && firefox http://localhost:1313}}}

  == Patches that break HDFS, YARN and MapReduce ==

---
To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "HowToCommit" by MartonElek
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HowToCommit" page has been changed by MartonElek:
https://wiki.apache.org/hadoop/HowToCommit?action=diff&rev1=41&rev2=42

Comment:
Update site generation

   * [[http://www.apache.org/dev/new-committers-guide.html|Apache New Committer Guide]]
   * [[http://www.apache.org/dev/committers.html|Apache Committer FAQ]]

- The first act of a new core committer is typically to add their name to the [[http://hadoop.apache.org/common/credits.html|credits]] page. This requires changing the XML source in http://svn.apache.org/repos/asf/hadoop/common/site/main/author/src/documentation/content/xdocs/who.xml. Once done, update the Hadoop website as described [[#Documentation|here]].
+ The first act of a new core committer is typically to add their name to the [[http://hadoop.apache.org/common/credits.html|credits]] page. This requires changing the site source in https://github.com/apache/hadoop-site/blob/asf-site/src/who.md. Once done, update the Hadoop website as described [[#Documentation|here]] (TL;DR: don't forget to regenerate the site with hugo, and commit the generated results, too).

  == Review ==

@@ -79, +79 @@

  <> Committing Documentation
- Hadoop's official documentation is authored using [[http://forrest.apache.org/|Forrest]]. To commit documentation changes you must have Apache Forrest installed, and set the forrest directory in your {{{$FORREST_HOME}}}. Note that the current version ([[http://archive.apache.org/dist/forrest/0.9/apache-forrest-0.9.tar.gz|0.9]]) works properly with Java 8. Documentation is of two types:
+ Hadoop's official documentation is authored using [[https://gohugo.io/|Hugo]]. To commit documentation changes you must have Hugo installed (a single binary is available for all platforms, and it is part of the usual package repositories: brew/pacman/yum...). Documentation is of two types:
+
  1. End-user documentation, versioned with releases; and,
- 1. The website. This is maintained separately in subversion, republished as it is changed.
+ 1. The website.
+
+ The end user documentation is maintained in the main repository (hadoop.git) and the results are committed to the hadoop-site repository during each release. The website itself is managed in the hadoop-site.git repository (both the source and the rendered form).

  To commit end-user documentation changes to trunk or a branch, ask the user to submit only changes made to the *.xml files in {{{src/docs}}}. Apply that patch, run {{{ant docs}}} to generate the html, and then commit. End-user documentation is only published to the web when releases are made, as described in HowToRelease.

  To commit changes to the website and re-publish them:
  {{{
- svn co https://svn.apache.org/repos/asf/hadoop/common/site
- cd site/main
- $FORREST_HOME/tools/ant/bin/ant -Dforrest.home=$FORREST_HOME # Newer versions of Ant do not work. Use the Ant bundled with Forrest.
- firefox publish/index.html # preview the changes
- svn stat # check for new pages
- svn add # add any new pages
- svn commit
+ git clone https://gitbox.apache.org/repos/asf/hadoop-site.git -b asf-site
+ #edit site under ./src
+ hugo
+ # add both the ./src and ./content directories (source and rendered version)
+ git add .
+ git commit
+ git push
  }}}

  The commit will be reflected on the Apache Hadoop site automatically.
+
+ Note: you can check the rendering locally with 'hugo serve && firefox http://localhost:1313'

  == Patches that break HDFS, YARN and MapReduce ==
[Hadoop Wiki] Update of "HowToRelease" by MartonElek
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HowToRelease" page has been changed by MartonElek:
https://wiki.apache.org/hadoop/HowToRelease?action=diff&rev1=100&rev2=101

Comment:
HADOOP-15205, dist profile is required to upload sources to the maven repo

  1. Push branch-X.Y.Z and the newly created tag to the remote repo.
  1. Deploy the maven artifacts, on your personal computer. Please be sure you have completed the prerequisite step of preparing the {{{settings.xml}}} file before the deployment. You might want to do this in private and clear your history file, as your gpg-passphrase is in clear text.
  {{{
- mvn deploy -Psign -DskipTests -DskipShade
+ mvn deploy -Psign,dist -DskipTests -DskipShade
  }}}
  1. Copy release files to a public place and ensure they are readable. Note that {{{home.apache.org}}} only supports SFTP, so this may be easier with a graphical SFTP client like Nautilus, Konqueror, etc.
  {{{
[Hadoop Wiki] Update of "HadoopJavaVersions" by AkiraAjisaka
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HadoopJavaVersions" page has been changed by AkiraAjisaka:
https://wiki.apache.org/hadoop/HadoopJavaVersions?action=diff&rev1=33&rev2=34

+ Moved to Confluence Wiki: https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Java+Versions
+
+ The following contents are deprecated.
+
  = Hadoop Java Versions =
  Version 2.7 and later of Apache Hadoop requires Java 7. It is built and tested on both OpenJDK and Oracle (HotSpot)'s JDK/JRE.
[Hadoop Wiki] Update of "HowToCommit" by AkiraAjisaka
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HowToCommit" page has been changed by AkiraAjisaka:
https://wiki.apache.org/hadoop/HowToCommit?action=diff&rev1=40&rev2=41

Comment:
Git repository is moved, changing the URL

  == Commit individual patches ==
- Hadoop uses git for the main source. The writable repo is at https://git-wip-us.apache.org/repos/asf/hadoop.git
+ Hadoop uses git for the main source. The writable repo is at https://gitbox.apache.org/repos/asf/hadoop.git

  Initial setup

  We try to keep our history linear and avoid merge commits. To this end, we highly recommend using git pull --rebase. In general, it is a good practice to have this ''always'' turned on. If you haven't done so already, you should probably run the following:
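The recommended command is cut off at the end of this message. As a hedged sketch (the exact command is an assumption about the page's intent), repo-local settings like these would make `git pull` rebase by default, keeping history linear as recommended:

```shell
# Hedged sketch: the exact recommended command is truncated in the mail above,
# so these settings are an assumption about its intent: make `git pull` rebase
# instead of merge, so pulled history stays linear.
git init -q hadoop-scratch                                   # stand-in for your hadoop.git clone
git -C hadoop-scratch config pull.rebase true                # rebase on every `git pull`
git -C hadoop-scratch config branch.autosetuprebase always   # new branches inherit the behaviour
git -C hadoop-scratch config pull.rebase                     # prints: true
```

Setting these per-repository (rather than with `--global`) keeps the behaviour scoped to the Hadoop clone.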
[Hadoop Wiki] Update of "GitAndHadoop" by AkiraAjisaka
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "GitAndHadoop" page has been changed by AkiraAjisaka:
https://wiki.apache.org/hadoop/GitAndHadoop?action=diff&rev1=26&rev2=27

Comment:
Fix typo

  Content moved to https://cwiki.apache.org/confluence/display/HADOOP/Git+And+Hadoop
- Please email common-...@hadoop.apache.org for cwiki access. 
+ Please email common-...@hadoop.apache.org for cwiki access.
[Hadoop Wiki] Update of "HowToContribute" by AkiraAjisaka
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HowToContribute" page has been changed by AkiraAjisaka:
https://wiki.apache.org/hadoop/HowToContribute?action=diff&rev1=119&rev2=120

Comment:
Fix url

  = How to Contribute to Hadoop =
- Content moved to Confluence - https://cwiki.apache.org/confluence/display/HADOOP/HowToContribute
+ Content moved to Confluence - https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute

  Email common-...@hadoop.apache.org if you need write access to the cwiki.
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Books" page has been changed by Packt Publishing:
https://wiki.apache.org/hadoop/Books?action=diff&rev1=54&rev2=55

  # Please don't have tracking URLs. We'll only cut them.
  }}}

+ === Hands-On Big Data Processing with Hadoop 3 (Video) ===
+
+ '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/hands-big-data-processing-hadoop-3-video|Hands-On Big Data Processing with Hadoop 3 (Video)]]
+
+ '''Author:''' Sudhanshu Saxena
+
+ '''Publisher:''' Packt
+
+ '''Date of Publishing:''' October 2018
+
+ Perform real-time data analytics, stream and batch processing on your application using Hadoop
+
  === Modern Big Data Processing with Hadoop ===

  '''Name:''' [[https://www.amazon.com/dp/B0787KY8RH/|Modern Big Data Processing with Hadoop]]
[Hadoop Wiki] Update of "HowToRelease" by AkiraAjisaka
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HowToRelease" page has been changed by AkiraAjisaka:
https://wiki.apache.org/hadoop/HowToRelease?action=diff&rev1=99&rev2=100

  {{{
  svn ci -m "Publishing the bits for release ${version}"
  }}}
+ 1. Usually the binary tarball is larger than 300MB, so it cannot be directly uploaded to the distribution directory. Use the dev directory (https://dist.apache.org/repos/dist/dev/hadoop/) first, and then move it to the distribution directory with {{{svn move}}}.
  1. Update upstream branches to make them aware of this new release:
  1. Copy and commit the CHANGES.md and RELEASENOTES.md:
  {{{
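The {{{svn move}}} step above is a server-side move, so the large tarball is not re-transferred. This sketch only assembles the command for review and prints it; the version number and the target subdirectory are illustrative assumptions, so nothing is actually moved when you run it:

```shell
# Hedged sketch: assemble (but do not run) the server-side `svn move` that
# promotes the binary tarball from the dev tree to the release tree.
# VERSION and the release subdirectory are illustrative assumptions.
VERSION=3.1.1
DEV="https://dist.apache.org/repos/dist/dev/hadoop"
REL="https://dist.apache.org/repos/dist/release/hadoop/common"
CMD="svn move -m \"Publishing the bits for release ${VERSION}\" ${DEV}/hadoop-${VERSION} ${REL}/hadoop-${VERSION}"
echo "$CMD"   # review, then run it yourself; the move happens on the server, no re-upload
```

Because both arguments are repository URLs, Subversion performs the move as a single server-side commit.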
[Hadoop Wiki] Update of "HowToRelease" by MartonElek
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HowToRelease" page has been changed by MartonElek:
https://wiki.apache.org/hadoop/HowToRelease?action=diff&rev1=98&rev2=99

  mvn versions:set -DnewVersion=X.Y.Z
  }}}
- Note: Please also update the hadoop.version property in the root pom.xml (see HADOOP-15369)
+ Note: Please also update the hadoop.version property in the root pom.xml and hadoop.assemblies.version in hadoop-project/pom.xml (see HADOOP-15369)
+
+ {{{
+ mvn versions:set-property -Dproperty=hadoop.version -DnewVersion=X.Y.Z
+ mvn versions:set-property -Dproperty=hadoop.assemblies.version -DnewVersion=X.Y.Z
+ }}}

  Now, for any branches in {trunk, branch-X, branch-X.Y, branch-X.Y.Z} that have changed, push them to the remote repo taking care of any conflicts.
[Hadoop Wiki] Update of "HowToRelease" by MartonElek
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HowToRelease" page has been changed by MartonElek:
https://wiki.apache.org/hadoop/HowToRelease?action=diff&rev1=97&rev2=98

Comment:
Reminder to change hadoop.version

  mvn versions:set -DnewVersion=X.Y.Z
  }}}
+ Note: Please also update the hadoop.version property in the root pom.xml (see HADOOP-15369)
+
  Now, for any branches in {trunk, branch-X, branch-X.Y, branch-X.Y.Z} that have changed, push them to the remote repo taking care of any conflicts.

  {{{
  git push
  }}}
[Hadoop Wiki] Update of "HowToRelease" by MartonElek
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HowToRelease" page has been changed by MartonElek:
https://wiki.apache.org/hadoop/HowToRelease?action=diff&rev1=96&rev2=97

Comment:
Update site generation part.

  1. effect the release of artifacts by selecting the staged repository and then clicking {{{Release}}}
  1. If there were multiple RCs, simply drop the staging repositories corresponding to failed RCs.
  1. Wait 24 hours for release to propagate to mirrors.
- 1. Edit the website.
+ 1. Edit the website (generic docs about the new website generation can be found [[https://cwiki.apache.org/confluence/display/HADOOP/How+to+generate+and+push+ASF+web+site+after+HADOOP-14163|here]])
   1. Checkout the website if you haven't already
   {{{
-  svn co https://svn.apache.org/repos/asf/hadoop/common/site/main hadoop-common-site
+  git clone https://gitbox.apache.org/repos/asf/hadoop-site.git -b asf-site
   }}}
-  1. Update the documentation links in {{{author/src/documentation/content/xdocs/site.xml}}}.
-  1. Update the release news in {{{author/src/documentation/content/xdocs/releases.xml}}}.
-  1. Update the news on the home page {{{author/src/documentation/content/xdocs/index.xml}}}.
+  1. [[https://gohugo.io/getting-started/installing/|Install hugo]] if you haven't already (TL;DR: apt-get install / pacman -S / brew install hugo)
+  1. Create the new release announcement
+  {{{
+ cat << EOF > src/release/${VERSION}.md
+ ---
+ title: Release ${VERSION} available
+ date: 201X-XX-XX
+ linked: true
+ ---
+
+ This is the first stable release of Apache Hadoop TODO line. It contains TODO bug fixes, improvements and enhancements since TODO.
+
+ Users are encouraged to read the [overview of major changes][1] since TODO.
+ For details of 435 bug fixes, improvements, and other enhancements since the previous TODO release,
+ please check the [release notes][2] and [changelog][3], which detail the changes since TODO.
+
+ [1]: /docs/r${VERSION}/index.html
+ [2]: http://hadoop.apache.org/docs/r${VERSION}/hadoop-project-dist/hadoop-common/release/${VERSION}/RELEASENOTES.${VERSION}.html
+ [3]: http://hadoop.apache.org/docs/r${VERSION}/hadoop-project-dist/hadoop-common/release/${VERSION}/CHANGES.${VERSION}.html
+ EOF
+ }}}
+  1. Note: update all the TODOs and the date. '''Don't use a date in the future''', as it won't be rendered.
+  1. Remove the {{{linked: true}}} line from the previous release file, e.g. from src/release/3.0.0.md. Docs/downloads of the releases with {{{linked: true}}} will be linked from the menu.
-  1. Copy the new release docs to svn and update the {{{docs/current}}} link, by doing the following:
+  1. Add the docs and update the {{{content/docs/current}}} link, by doing the following:
   {{{
-  cd publish/docs
+  cd content/docs
   tar xvf /path/to/hadoop-${version}-site.tar.gz
   # Update current2, current, stable and stable2 as needed.
   # For example
@@ -191, +226 @@
   ln -s current2 current
   }}}
   1. Similarly update the symlinks for stable if need be.
-  1. Add the documentation changes.
+  1. Check the rendering of the new site: {{{hugo serve && firefox http://localhost:1313}}}
+  1. Regenerate the site, review it, then commit it per the instructions in HowToCommit. (The generated HTML files should also be committed; both src and the rendered site are in the same repo.)
   {{{
+ hugo
+ git add .
+ git commit
+ git push
-  svn add publish/docs/r${version}
-  }}}
-  1. Regenerate the site, review it, then commit it per the instructions in HowToCommit.
-  {{{
-  svn commit -m "Updated site for release X.Y.Z."
   }}}
  1. Send announcements to the user and developer lists once the site changes are visible.
  1. --(In JIRA, close issues resolved in the release. Disable mail notifications for this bulk change.)-- Recommend '''not''' closing, since it prevents JIRAs from being edited and makes it more difficult to track backports.
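The announcement step above can be exercised locally before touching the real site repo. In this sketch the version, date, and body text are placeholders (assumptions for illustration, not real release data):

```shell
# Hedged sketch of the release-announcement step above; VERSION, the date and
# the body text are illustrative placeholders, not real release data.
VERSION=3.1.1
mkdir -p src/release
cat << EOF > "src/release/${VERSION}.md"
---
title: Release ${VERSION} available
date: 2018-01-01
linked: true
---

Placeholder body: fill in the TODOs from the wiki template above.

[1]: /docs/r${VERSION}/index.html
EOF
grep "^title:" "src/release/${VERSION}.md"   # prints: title: Release 3.1.1 available
```

Because the heredoc delimiter is unquoted, `${VERSION}` is expanded when the file is written, which is exactly how the wiki template injects the version into the front matter.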
[Hadoop Wiki] Update of "ContributorsGroup" by SteveLoughran
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "ContributorsGroup" page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/ContributorsGroup?action=diff&rev1=118&rev2=119

Comment:
remove "Packt Publishing" because of random Python book spam

   * OtisGospodnetic
   * OwenOMalley
   * Pacoffre
-  * Packt Publishing
   * PatrickHunt
   * PatrickKling
   * Paul Broenen
[Hadoop Wiki] Update of "Packt Publishing" by SteveLoughran
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Packt Publishing" page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/Packt%20Publishing

Comment:
tell packt publishing they've been locked out

New page:
Hi, if this is your account, I've locked you out from editing for a while, because that last book about Python and tk was clearly not Hadoop related.

I'll re-enable you in a month or two, or you can work out my email address and we can discuss what is acceptable.

thanks.

SteveLoughran
[Hadoop Wiki] Update of "Books" by SteveLoughran
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Books" page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/Books?action=diff&rev1=53&rev2=54

Comment:
cut a random python book out and about to lock down packt for a bit as punishment

  {{{#!wiki comment/dotted
  Attention people adding new entries.
+ # Only reference books about Hadoop and related programs, not random PHP stuff.
  # Please include publishing date and version of Hadoop the book is relevant to.
  # Please write this in a neutral voice, not "this book will help you", as that implies that the ASF has opinions on the matter. Someone will just edit the claims out.
@@ -15, +16 @@
  # Please don't have tracking URLs. We'll only cut them.
  }}}

- === Python GUI programming with Tkinter ===
-
- '''Name:''' [[https://www.amazon.com/dp/1788835883/|Python GUI Programming with Tkinter]]
-
- '''Author:''' Alan D. Moore
-
- '''Publisher:''' Packt
-
- '''Date of Publishing:''' May 2018
-
- Find out how to create visually stunning and feature-rich applications by empowering Python's built-in Tkinter GUI toolkit
-
  === Modern Big Data Processing with Hadoop ===

  '''Name:''' [[https://www.amazon.com/dp/B0787KY8RH/|Modern Big Data Processing with Hadoop]]
[Hadoop Wiki] Update of "GithubIntegration" by ArpitAgarwal
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "GithubIntegration" page has been changed by ArpitAgarwal:
https://wiki.apache.org/hadoop/GithubIntegration?action=diff&rev1=2&rev2=3

Comment:
Removing content and leaving link to cwiki where new content resides.

- = Github Setup and Pull Requests (PRs) =
+ Content moved to https://cwiki.apache.org/confluence/display/HADOOP/GitHub+Integration
- There are several ways to set up Git for committers and contributors. Contributors can safely set up Git any way they choose, but committers should take extra care since they can push new commits to the trunk at Apache, and various policies there make backing out mistakes problematic. To keep the commit history clean, take note of the use of `--squash` below when merging into `apache/trunk`.
+ Please email common-...@hadoop.apache.org for cwiki access.
- == Git setup for Committers ==
-
- This describes setup for one local repo and two remotes. It allows you to push the code on your machine to either your Github repo or to git-wip-us.apache.org. You will want to fork github's apache/hadoop to your own account on github; this will enable Pull Requests of your own. Cloning this fork locally will set up "origin" to point to your remote fork on github as the default remote. So if you perform `git push origin trunk` it will go to github.
-
- To attach to the apache git repo do the following:
-
- {{{
- git remote add apache https://git-wip-us.apache.org/repos/asf/hadoop.git
- }}}
-
- To check your remote setup:
-
- {{{
- git remote -v
- }}}
-
- you should see something like this:
-
- {{{
- origin    https://github.com/your-github-id/hadoop.git (fetch)
- origin    https://github.com/your-github-id/hadoop.git (push)
- apache    https://git-wip-us.apache.org/repos/asf/hadoop.git (fetch)
- apache    https://git-wip-us.apache.org/repos/asf/hadoop.git (push)
- }}}
-
- Now if you want to experiment with a branch, everything, by default, points to your github account because `origin` is the default remote. You can work as normal using only github until you are ready to merge with the apache remote. Some conventions will integrate with Apache Jira ticket numbers.
-
- {{{
- git checkout -b feature/hadoop-<jira> # typically <jira> is a Jira ticket number
- #do some work on the branch
- git commit -a -m "doing some work"
- git push origin feature/hadoop-<jira> # notice pushing to **origin** not **apache**
- }}}
-
- Once you are ready to commit to the apache remote, you can merge and push them directly, or better yet create a PR.
-
- We recommend creating new branches under `feature/` to help group ongoing work, especially now that, as of November 2015, forced updates are disabled on ASF branches. We hope to reinstate that ability on feature branches to aid development.
-
- == How to create a PR (committers) ==
-
- Push your branch to Github:
-
- {{{
- git checkout feature/hadoop-<jira>
- git rebase apache/trunk # to make it apply to the current trunk
- git push origin feature/hadoop-<jira>
- }}}
-
- 1. Go to your `feature/hadoop-<jira>` branch on Github. Since you forked it from Github's `apache/hadoop`, it will default any PR to go to `apache/trunk`.
- 1. Click the green "Compare, review, and create pull request" button.
- 1. You can edit the to and from for the PR if it isn't correct. The "base fork" should be `apache/hadoop` unless you are collaborating separately with one of the committers on the list. The "base" will be trunk. Don't submit a PR to one of the other branches unless you know what you are doing. The "head fork" will be your forked repo and the "compare" will be your `feature/hadoop-<jira>` branch.
- 1. Click the "Create pull request" button and name the request "HADOOP-<jira>", all caps. This will connect the comments of the PR to the mailing list and Jira comments.
-
- From now on the PR lives on github's `apache/hadoop` repository. You use the commenting UI there.
-
- If you are looking for a review or sharing with someone else, say so in the comments, but don't worry about automated merging of your PR; you will have to do that later. The PR is tied to your branch, so you can respond to comments, make fixes, and commit them from your local repo. They will appear on the PR page and be mirrored to Jira and the mailing list. When you are satisfied and want to push it to Apache's remote repo, proceed with Merging a PR.
-
- == How to create a PR (contributors) ==
-
- Create pull requests: [[https://help.github.com/articles/creating-a-pull-request|GitHub PR docs]].
-
- Pull requests are made to the apache/hadoop repository on Github. In the Github UI you should pick the trunk branch to target the PR, as described for committers. This will be reviewed and commented on, so the merge is not automatic. This can be used for discussing a contribution in progress.
-
- == Merging a PR (yours or contributors) ==
-
- Start with reading -
[Hadoop Wiki] Update of "GitAndHadoop" by ArpitAgarwal
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "GitAndHadoop" page has been changed by ArpitAgarwal:
https://wiki.apache.org/hadoop/GitAndHadoop?action=diff&rev1=25&rev2=26

Comment:
Removing content and leaving link to cwiki where new content resides.

- = Git And Hadoop =
+ Content moved to https://cwiki.apache.org/confluence/display/HADOOP/Git+And+Hadoop
- A lot of people use Git with Hadoop because they have their own patches to make to Hadoop, and Git helps them manage it.
+ Please email common-...@hadoop.apache.org for cwiki access.
-
-  * GitHub provide some good lessons on git at [[http://learn.github.com]]
-  * Apache serves up read-only Git versions of their source at [[http://git.apache.org/]]. Committers can commit changes to the writable Git repository. See HowToCommit
-
- This page tells you how to work with Git. See HowToContribute for instructions on building and testing Hadoop.
- <>
-
- == Key Git Concepts ==
- The key concepts of Git:
-
-  * Git doesn't store changes, it snapshots the entire source tree. Good for fast switch and rollback, bad for binaries. (As an enhancement, if a file hasn't changed, it doesn't re-replicate it.)
-  * Git stores all "events" as SHA1-checksummed objects; you have deltas, tags and commits, where a commit describes the status of items in the tree.
-  * Git is very branch-centric; you work in your own branch off local or central repositories.
-  * You had better enjoy merging.
-
- == Checking out the source ==
-
- You need a copy of git on your system. Some IDEs ship with Git support; this page assumes you are using the command line.
-
- Clone a local Git repository from the Apache repository. The Hadoop subprojects (common, HDFS, and MapReduce) live inside a combined repository called `hadoop.git`.
-
- {{{
- git clone git://git.apache.org/hadoop.git
- }}}
-
- '''Committers:''' for read/write access use
- {{{
- https://git-wip-us.apache.org/repos/asf/hadoop.git
- }}}
-
- The total download is a few hundred MB, so the initial checkout process works best when the network is fast. Once downloaded, Git works offline, though you will need to perform your initial builds online so that the build tools can download dependencies.
-
- == Grafts for complete project history ==
-
- The Hadoop project has undergone some movement in where its component parts have been versioned. Because of that, commands like `git log --follow` need a little help. To graft the history back together into a coherent whole, insert the following contents into `hadoop/.git/info/grafts`:
-
- {{{
- # Project split
- 5128a9a453d64bfe1ed978cf9ffed27985eeef36 6c16dc8cf2b28818c852e95302920a278d07ad0c
- 6a3ac690e493c7da45bbf2ae2054768c427fd0e1 6c16dc8cf2b28818c852e95302920a278d07ad0c
- 546d96754ffee3142bcbbf4563c624c053d0ed0d 6c16dc8cf2b28818c852e95302920a278d07ad0c
- # Project un-split in new writable git repo
- a196766ea07775f18ded69bd9e8d239f8cfd3ccc 928d485e2743115fe37f9d123ce9a635c5afb91a
- cd66945f62635f589ff93468e94c0039684a8b6d 77f628ff5925c25ba2ee4ce14590789eb2e7b85b
- }}}
-
- You can then use commands like `git blame --follow` with success.
-
- == Forking onto GitHub ==
-
- You can create your own fork of the ASF project. This is required if you want to contribute patches by submitting pull requests. However, you can choose to skip this step and attach patch files directly on Apache Jiras.
-
-  1. Create a GitHub login at http://github.com/ ; add your public SSH keys
-  1. Go to https://github.com/apache/hadoop/
-  1. Click fork in the github UI. This gives you your own repository URL.
-  1. In the existing clone, add the new repository:
-  {{{git remote add -f github g...@github.com:MYUSERNAMEHERE/hadoop.git}}}
-
- This gives you a local repository with two remote repositories: {{{origin}}} and {{{github}}}. {{{origin}}} has the Apache branches, which you can update whenever you want to get the latest ASF version:
-
- {{{
- git checkout -b trunk origin/trunk
- git pull origin
- }}}
-
- Your own branches can be merged with trunk, and pushed out to GitHub. To generate patches for attaching to Apache JIRAs, check everything in to your specific branch, merge that with (a recently pulled) trunk, then diff the two:
- {{{ git diff trunk > ../hadoop-patches/HADOOP-XYX.patch }}}
-
- == Branching ==
-
- Git makes it easy to branch. The recommended process for working with Apache projects is: one branch per JIRA issue. That makes it easy to isolate development and track the development of each change. It does mean if you have your own branch that you release, one that merges in more than one issue, you have to invest some effort in merging everything in. Try not to make changes in different branches that are hard to merge, and learn your way round the git rebase command to handle changes across branches. Better yet: do not use rebase once you have created a chain of
[Hadoop Wiki] Update of "HowToContribute" by ArpitAgarwal
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToContribute" page has been changed by ArpitAgarwal: https://wiki.apache.org/hadoop/HowToContribute?action=diff=118=119 Comment: Removing content and leaving link to cwiki where new content resides. = How to Contribute to Hadoop = - This page describes the mechanics of ''how'' to contribute software to Apache Hadoop. For ideas about ''what'' you might contribute, please see the ProjectSuggestions page. + Content moved to Confluence - https://cwiki.apache.org/confluence/display/HADOOP/HowToContribute - <> + Email common-...@hadoop.apache.org if you need write access to the cwiki. - == Dev Environment Setup == - Here are some things you will need to build and test Hadoop. Be prepared to invest some time to set up a working Hadoop dev environment. Try getting the project to build and test locally first before you start writing code. - - === Get the source code === - First of all, you need the Hadoop source code. The official location for Hadoop is the Apache Git repository. See GitAndHadoop - - === Read BUILDING.txt === - Once you have the source code, we strongly recommend reading BUILDING.txt located in the root of the source tree. It has up to date information on how to build Hadoop on various platforms along with some workarounds for platform-specific quirks. The latest [[https://git-wip-us.apache.org/repos/asf?p=hadoop.git;a=blob;f=BUILDING.txt|BUILDING.txt]] for the current trunk can also be viewed on the web. - - - === Integrated Development Environment (IDE) === - You are free to use whatever IDE you prefer or your favorite text editor. Note that: - * Building and testing is often done on the command line or at least via the Maven support in the IDEs. - * Set up the IDE to follow the source layout rules of the project. 
- * Disable any added-value "reformat" and "strip trailing spaces" features, as they can create extra noise when reviewing patches. - - === Build Tools === - * A Java Development Kit. The Hadoop developers recommend [[http://java.com/|Oracle Java 8]]. You may also use [[http://openjdk.java.net/|OpenJDK]]. - * Google Protocol Buffers. Check out the ProtocolBuffers guide for help installing protobuf. - * [[http://maven.apache.org/|Apache Maven]] version 3 or later (for Hadoop 0.23+) - * The Java API javadocs. - Ensure these are installed by executing {{{mvn}}}, {{{git}}} and {{{javac}}} respectively. - - As the Hadoop builds use the external Maven repository to download artifacts, Maven needs to be set up with the proxy settings needed to make external HTTP requests. The first build of every Hadoop project needs internet connectivity to download Maven dependencies. - 1. Be online for that first build, on a good network - 1. To set the Maven proxy settings, see http://maven.apache.org/guides/mini/guide-proxies.html - 1. Because Maven doesn't pass proxy settings down to the Ant tasks it runs ([[https://issues.apache.org/jira/browse/HDFS-2381|HDFS-2381]]), some parts of the Hadoop build may fail. The fix for this is to pass down the Ant proxy settings in the build. Unix: {{{mvn $ANT_OPTS}}}; Windows: {{{mvn %ANT_OPTS%}}}. - 1. Tomcat is always downloaded, even when building offline. Setting {{{-Dtomcat.download.url}}} to a local copy and {{{-Dtomcat.version}}} to the version pointed to by the URL will avoid that download. - - - === Native libraries === - On Linux, you need the tools to create the native libraries: LZO headers, zlib headers, gcc, OpenSSL headers, cmake, protobuf dev tools, libtool, and the GNU autotools (automake, autoconf, etc).
- - For RHEL (and hence also CentOS): - {{{ - yum -y install lzo-devel zlib-devel gcc gcc-c++ autoconf automake libtool openssl-devel fuse-devel cmake - }}} - - For Debian and Ubuntu: - {{{ - apt-get -y install maven build-essential autoconf automake libtool cmake zlib1g-dev pkg-config libssl-dev libfuse-dev - }}} - - Native libraries are mandatory for Windows. For instructions see Hadoop2OnWindows. - - === Hardware Setup === - * Lots of RAM, especially if you are using a modern IDE. ECC RAM is recommended in large-RAM systems. - * Disk Space. Always handy. - * Network Connectivity. Hadoop tests are not guaranteed to all work if a machine does not have a network connection -and especially if it does not know its own name. - * Keep your computer's clock up to date via an NTP server, and set up the time zone correctly. This is good for avoiding change-log confusion. - - == Making Changes == - Before you start, send a message to the [[http://hadoop.apache.org/core/mailing_lists.html|Hadoop developer mailing list]], or file a bug report in [[Jira]]. Describe your proposed changes and check that they fit in with what others are doing and have planned for the project. Be patient, it may take folks a while to understand your requirements. If you want to
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Books" page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff=52=53 == Hadoop Videos == + === Hands-On Big Data Analysis with Hadoop 3 (Video) === + + '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/hands-big-data-analysis-hadoop-3-video|Hands-On Big Data Analysis with Hadoop 3 (Video)]] + + '''Author:''' Tomasz Lelek + + '''Publisher:''' Packt + + '''Date of Publishing:''' August 2018 + + Perform real-time data analytics with Hadoop + + === Hands-On Beginner’s Guide on Big Data and Hadoop 3 (Video) === '''Name:''' [[https://www.packtpub.com/application-development/hands-beginner%E2%80%99s-guide-big-data-and-hadoop-3-video|Hands-On Beginner’s Guide on Big Data and Hadoop 3 (Video)]] - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Books" page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff=51=52 == Hadoop Videos == + === Hands-On Beginner’s Guide on Big Data and Hadoop 3 (Video) === + + '''Name:''' [[https://www.packtpub.com/application-development/hands-beginner%E2%80%99s-guide-big-data-and-hadoop-3-video|Hands-On Beginner’s Guide on Big Data and Hadoop 3 (Video)]] + + '''Author:''' Milind Jagre + + '''Publisher:''' Packt + + '''Date of Publishing:''' July 2018 + + Effectively store, manage, and analyze large Datasets with HDFS, SQOOP, YARN, and MapReduce + + === Hadoop Administration and Cluster Management (Video) === '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/hadoop-administration-and-cluster-management-video|Hadoop Administration and Cluster Management (Video)]] - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "AmazonS3" by SteveLoughran
The "AmazonS3" page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/AmazonS3?action=diff=23=24 Comment: purge down to the minimum, point people at troubleshooting, tell them not to mix JARs. = S3 Support in Apache Hadoop = + Apache Hadoop ships with a connector to S3 called "S3A", with the URL prefix "s3a:"; its previous connectors "s3" and "s3n" are deprecated and/or deleted from recent Hadoop versions. - [[http://aws.amazon.com/s3|Amazon S3]] (Simple Storage Service) is a data storage service. You are billed - monthly for storage and data transfer. Transfer between S3 and [[AmazonEC2]] instances in the same geographical location are free. Most importantly, the data is preserved when a transient Hadoop cluster is shut down - This makes use of S3 common in Hadoop clusters on EC2. It is also used sometimes for backing up remote cluster. - - Hadoop provides multiple filesystem clients for reading and writing to and from Amazon S3 or compatible service. + 1. Consult the [[http://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html|Latest Hadoop documentation]] for the specifics on using the S3A connector. + 1. For Hadoop 2.x releases, the latest [[https://github.com/apache/hadoop/blob/branch-2/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md|troubleshooting documentation]]. + 1. For Hadoop 3.x releases, the latest [[https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md|troubleshooting documentation]]. - === Recommended: S3A (URI scheme: s3a://) - Hadoop 2.7+ === + == S3 Support in Amazon EMR == + Amazon's EMR Service is based upon Apache Hadoop, but contains modifications and their own closed-source S3 client.
Consult [[http://docs.aws.amazon.com/ElasticMapReduce/latest/ManagementGuide/emr-plan-file-systems.html|Amazon's documentation on this]]. + Only Amazon can provide support and/or field bug reports related to their S3 support. - '''S3A is the recommended S3 Client for Hadoop 2.7 and later''' - - A successor to the S3 Native, s3n:// filesystem, the S3a: system uses Amazon's libraries to interact with S3. This allows S3a to support larger files (no more 5GB limit), higher performance operations and more. The filesystem is intended to be a replacement for/successor to S3 Native: all objects accessible from s3n:// URLs should also be accessible from s3a simply by replacing the URL schema. - - S3A has been usable in production since Hadoop 2.7, and is undergoing active maintenance for enhanced security, scalability and performance. - - History - - 1. Hadoop 2.6: Initial Implementation: [[https://issues.apache.org/jira/browse/HADOOP-10400|HADOOP-10400]] - 2. Hadoop 2.7: Production Ready: [[https://issues.apache.org/jira/browse/HADOOP-11571|HADOOP-11571]] - 3. Hadoop 2.8: Performance, robustness and security [[https://issues.apache.org/jira/browse/HADOOP-11694|HADOOP-11694]] - 4. Hadoop 2.9: Even more features: [[https://issues.apache.org/jira/browse/HADOOP-13204|HADOOP-13204]] - - July 2016: For details of ongoing work on S3a, consult [[www.slideshare.net/HadoopSummit/hadoop-cloud-storage-object-store-integration-in-production|Hadoop & Cloud Storage: Object Store Integration in Production]] - - '''important:''' S3A requires the exact version of the amazon-aws-sdk against which Hadoop was built (and is bundled with). If you try to upgrade the library by dropping in a later version, things will break. - === Unmainteained: S3N FileSystem (URI scheme: s3n://) === + == Important: Classpath setup == + 1. The S3A connector is implemented in the hadoop-aws JAR. If it is not on the classpath: stack trace. + 1. 
Do not attempt to mix a "hadoop-aws" version with other hadoop artifacts from different versions. They must be from exactly the same release. Otherwise: stack trace. + 1. The S3A connector depends on AWS SDK JARs. If they are not on the classpath: stack trace. + 1. Do not attempt to use an Amazon S3 SDK JAR different from the one which the hadoop version was built with. Otherwise: stack trace highly likely. + 1. The normative list of dependencies of a specific version of the hadoop-aws JAR is stored in Maven, which can be viewed on [[http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws|mvnrepository]]. - '''S3N is the S3 Client for Hadoop 2.6 and earlier. From Hadoop 2.7+, switch to s3a''' - - A native filesystem for reading and writing regular files on S3. With this filesystem you can access files on S3 that were written with other tools. Conversely, other tools can access files written using Hadoop. The S3N code is stable and widely used, but is not adding any new features (which is why it remains stable). - - S3N requires a compatible version of the jets3t
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Books" page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff=50=51 == Hadoop Videos == + === Hadoop Administration and Cluster Management (Video) === + + '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/hadoop-administration-and-cluster-management-video|Hadoop Administration and Cluster Management (Video)]] + + '''Author:''' Gurmukh Singh + + '''Publisher:''' Packt + + '''Date of Publishing:''' May 2018 + + Planning, deploying, managing, monitoring and performance-tuning your Hadoop cluster with Apache Hadoop + + === Solving 10 Hadoop'able Problems (Video) === '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/solving-10-hadoopable-problems-video|Solving 10 Hadoop'able Problems (Video)]] @@ -487, +500 @@ Need solutions to your big data problems? Here are 10 real-world projects demonstrating problems solved using Hadoop === Learn By Example: Hadoop, MapReduce for Big Data problems (Video) === - + '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/learn-example-hadoop-mapreduce-big-data-problems-video|Learn By Example: Hadoop, MapReduce for Big Data problems (Video)]] '''Author:''' Loonycorn - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Books" page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff=49=50 # Please don't have tracking URLs. We'll only cut them. }}} + === Python GUI programming with Tkinter === + + '''Name:''' [[https://www.amazon.com/dp/1788835883/|Python GUI Programming with Tkinter]] + + '''Author:''' Alan D. Moore + + '''Publisher:''' Packt + + '''Date of Publishing:''' May 2018 + + Find out how to create visually stunning and feature-rich applications by empowering Python's built-in TKinter GUI toolkit + === Modern Big Data Processing with Hadoop === '''Name:''' [[https://www.amazon.com/dp/B0787KY8RH/|Modern Big Data Processing with Hadoop]] @@ -24, +36 @@ '''Publisher:''' Packt '''Date of Publishing:''' March 2018 + + A comprehensive guide to design, build and execute effective Big Data strategies using Hadoop === Deep Learning with Hadoop === - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "HowToReleasePreDSBCR" by KonstantinShvachko
The "HowToReleasePreDSBCR" page has been changed by KonstantinShvachko: https://wiki.apache.org/hadoop/HowToReleasePreDSBCR?action=diff=89=90 Comment: Change the deploy command so that it uploads source and javadoc artifacts to Nexus. 1. --(Use [[https://builds.apache.org/job/HADOOP2_Release_Artifacts_Builder|this Jenkins job]] to create the final release files)-- Create final release files {{{ - mvn clean package -Psrc -Pdist -Pnative -Dtar -DskipTests + mvn clean deploy -Psign,src,dist,native -Dtar -DskipTests - mvn deploy -Psign -DskipTests mvn site site:stage -DskipTests }}} + 1. Make sure that on [[https://repository.apache.org|Nexus]] all artifacts have corresponding sources and javadoc jars. 1. Copy release files to the distribution directory 1. Check out the corresponding svn repo if need be {{{
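The sources/javadoc check described in the update above can be rehearsed locally before inspecting Nexus. A sketch only — the helper name and jar names are made up, and the directory of staged jars is an assumption:

```shell
# Hypothetical helper: verify each main jar in a directory has the
# -sources.jar and -javadoc.jar companions that Nexus expects.
check_companions() {
  dir=$1; missing=0
  for jar in "$dir"/*.jar; do
    # Skip the companion jars themselves (and test jars)
    case "$jar" in *-sources.jar|*-javadoc.jar|*-tests.jar) continue ;; esac
    base=${jar%.jar}
    for c in "${base}-sources.jar" "${base}-javadoc.jar"; do
      [ -f "$c" ] || { echo "MISSING: $c"; missing=1; }
    done
  done
  return $missing
}

# Self-contained demo on a throwaway directory (artifact names illustrative):
demo=$(mktemp -d)
touch "$demo/hadoop-common-X.Y.Z.jar" \
      "$demo/hadoop-common-X.Y.Z-sources.jar" \
      "$demo/hadoop-common-X.Y.Z-javadoc.jar"
check_companions "$demo" && echo "all companions present"
```

Pointing the helper at the real build output would flag any artifact missing its companions before the staging repository is closed.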
[Hadoop Wiki] Update of "Books" by Packt Publishing
The "Books" page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff=48=49 === Modern Big Data Processing with Hadoop === - '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/modern-big-data-processing-hadoop|Modern Big Data Processing with Hadoop]] + '''Name:''' [[https://www.amazon.com/dp/B0787KY8RH/|Modern Big Data Processing with Hadoop]] '''Author:''' V. Naresh Kumar, Prashant Shindgikar
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Books" page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff=47=48 # Please don't have tracking URLs. We'll only cut them. }}} + === Modern Big Data Processing with Hadoop === + + '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/modern-big-data-processing-hadoop|Modern Big Data Processing with Hadoop]] + + '''Author:''' V. Naresh Kumar, Prashant Shindgikar + + '''Publisher:''' Packt + + '''Date of Publishing:''' March 2018 === Deep Learning with Hadoop === - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "PoweredBy" by XingWang
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "PoweredBy" page has been changed by XingWang: https://wiki.apache.org/hadoop/PoweredBy?action=diff=444=445 Comment: added Moesif.com. * ''Automatic PDF creation & IR '' * ''2 node cluster (Windows Vista/CYGWIN, & CentOS) for developing MapReduce programs. '' + * ''[[https://www.moesif.com/|Moesif API Insights]] '' + * ''We use Hadoop for ETL and processing time series event data for alerts/notifications along with visualizations for frontend.'' + * ''2 master nodes and 6 data nodes running on Azure using HDInsight'' + * ''[[http://www.mylife.com/|MyLife]] '' * ''18 node cluster (Quad-Core AMD Opteron 2347, 1TB/node storage) '' * ''Powers data for search and aggregation '' - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Books" page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff=46=47 == Hadoop Videos == + + === Solving 10 Hadoop'able Problems (Video) === + '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/solving-10-hadoopable-problems-video|Solving 10 Hadoop'able Problems (Video)]] + + '''Author:''' Tomasz Lelek + + '''Publisher:''' Packt + + '''Date of Publishing:''' February 2018 + + Need solutions to your big data problems? Here are 10 real-world projects demonstrating problems solved using Hadoop + === Learn By Example: Hadoop, MapReduce for Big Data problems (Video) === - + '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/learn-example-hadoop-mapreduce-big-data-problems-video|Learn By Example: Hadoop, MapReduce for Big Data problems (Video)]] '''Author:''' Loonycorn - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Books" page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff=45=46 == Hadoop Videos == + === Learn By Example: Hadoop, MapReduce for Big Data problems (Video) === + + '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/learn-example-hadoop-mapreduce-big-data-problems-video|Learn By Example: Hadoop, MapReduce for Big Data problems (Video)]] + + '''Author:''' Loonycorn + + '''Publisher:''' Packt + + '''Date of Publishing:''' Jan 2018 + + A hands-on workout in Hadoop, MapReduce and the art of thinking "parallel" === The Ultimate Hands-on Hadoop (Video) === - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "Books" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Books" page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/Books?action=diff=44=45 Comment: revert. That's Hbase, not hadoop. And I'm thinking we should cut all videos out from here == Hadoop Videos == - === Learn by Example : HBase - The Hadoop Database (Video) === - '''Name:''' [[https://www.packtpub.com/application-development/learn-example-hbase-hadoop-database-video|Learn by Example : HBase - The Hadoop Database (Video)]] - - '''Author:''' Loonycorn - - '''Publisher:''' Packt - - '''Date of Publishing:''' December 2017 - - 25 solved examples to get you up to speed with HBase === The Ultimate Hands-on Hadoop (Video) === - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Books" page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff=43=44 == Hadoop Videos == + === Learn by Example : HBase - The Hadoop Database (Video) === + + '''Name:''' [[https://www.packtpub.com/application-development/learn-example-hbase-hadoop-database-video|Learn by Example : HBase - The Hadoop Database (Video)]] + + '''Author:''' Loonycorn + + '''Publisher:''' Packt + + '''Date of Publishing:''' December 2017 + + 25 solved examples to get you up to speed with HBase + === The Ultimate Hands-on Hadoop (Video) === '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/ultimate-hands-hadoop-video | The Ultimate Hands-on Hadoop (Video)]] - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "HowToReleasePreDSBCR" by KonstantinShvachko
The "HowToReleasePreDSBCR" page has been changed by KonstantinShvachko: https://wiki.apache.org/hadoop/HowToReleasePreDSBCR?action=diff=88=89 1. Update the symlinks to current2 and stable2. The release directory usually contains just two releases, the most recent from two branches. 1. Commit the changes (it requires a PMC privilege) {{{ + svn add hadoop-${version} svn ci -m "Publishing the bits for release ${version}" }}} 1. In [[https://repository.apache.org|Nexus]]
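The symlink step above can be sketched as follows; the checkout path and version are placeholders, and the demo runs in a throwaway directory rather than the real dist/release checkout:

```shell
# Sketch only: refresh the current2/stable2 symlinks so they point at the
# newly added release directory before the `svn ci`.
set -e
dist=$(mktemp -d)          # stand-in for the dist/release/hadoop/common checkout
version=X.Y.Z.demo         # hypothetical release version
mkdir "$dist/hadoop-$version"
ln -sfn "hadoop-$version" "$dist/current2"
ln -sfn "hadoop-$version" "$dist/stable2"
readlink "$dist/stable2"   # prints hadoop-X.Y.Z.demo
```

In the real checkout, the `svn add` and `svn ci` from the update above would follow, so the new directory and retargeted links are committed together.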
[Hadoop Wiki] Update of "HowToReleasePreDSBCR" by KonstantinShvachko
The "HowToReleasePreDSBCR" page has been changed by KonstantinShvachko: https://wiki.apache.org/hadoop/HowToReleasePreDSBCR?action=diff=87=88 git tag -s rel/release-X.Y.Z -m "Hadoop X.Y.Z release" git push origin rel/release-X.Y.Z }}} - 1. Use [[https://builds.apache.org/job/HADOOP2_Release_Artifacts_Builder|this Jenkins job]] to create the final release files + 1. --(Use [[https://builds.apache.org/job/HADOOP2_Release_Artifacts_Builder|this Jenkins job]] to create the final release files)-- + Create final release files + {{{ + mvn clean package -Psrc -Pdist -Pnative -Dtar -DskipTests + mvn deploy -Psign -DskipTests + mvn site site:stage -DskipTests + }}} 1. Copy release files to the distribution directory 1. Check out the corresponding svn repo if need be {{{
[Hadoop Wiki] Update of "HowToCommit" by EricYang
The "HowToCommit" page has been changed by EricYang: https://wiki.apache.org/hadoop/HowToCommit?action=diff=39=40 1. Set the assignee if it is not set. If you cannot set the contributor to the assignee, you need to add the contributor into the Contributors role in the project. Please see [[#Roles|Adding Contributors role]] for the details. This How-to-commit [[http://www.youtube.com/watch?v=txW3m7qWdzw=youtu.be|video]] has guidance on the commit process, albeit using svn. Most of the process is still the same, except that we now use git instead. + + Merging a feature branch + When merging a feature branch to trunk, use the no-fast-forward option ({{{--no-ff}}}) so the merge is recorded as a single commit that summarizes the feature branch. The detailed commit history remains on the feature branch. + {{{
+ # Start a new feature
+ git checkout -b new-feature trunk
+ # Edit some files
+ git add <changed-files>
+ git commit -m "Start a feature"
+ # Edit some files
+ git add <changed-files>
+ git commit -m "Finish a feature"
+ # Merge in the new-feature branch
+ git checkout trunk
+ git merge --no-ff new-feature
+ }}} <> Committing Documentation
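The effect of {{{--no-ff}}} can be seen in a throwaway repository (a self-contained sketch; branch and commit names are illustrative, and git must be on the PATH):

```shell
# Demo in a temporary repo: --no-ff records the feature as one merge commit
# on trunk, while the detailed history stays on the feature branch.
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email dev@example.com
git config user.name Dev
git checkout -qb trunk
echo base > f; git add f; git commit -qm "base"
git checkout -qb new-feature
echo one >> f; git commit -qam "Start a feature"
echo two >> f; git commit -qam "Finish a feature"
git checkout -q trunk
git merge --no-ff -m "Merge branch 'new-feature'" new-feature
# The first-parent view of trunk shows only the base commit plus one
# merge commit, even though the branch contained two commits:
git log --first-parent --oneline
```

Without {{{--no-ff}}}, git would fast-forward trunk here (trunk is an ancestor of new-feature) and the two feature commits would land directly on trunk's history.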
[Hadoop Wiki] Update of "HowToReleasePreDSBCR" by KonstantinShvachko
The "HowToReleasePreDSBCR" page has been changed by KonstantinShvachko: https://wiki.apache.org/hadoop/HowToReleasePreDSBCR?action=diff=86=87 }}} + 1. Verify that CHANGES.txt reflects all relevant commits since the previous release. Add and commit missing ones to CHANGES.txt. = Branching = When releasing Hadoop X.Y.Z, the following branching changes are required. Note that a release can match more than one of the following if-conditions. For a major release, one needs to make the changes for minor and point releases as well. Similarly, a new minor release is also a new point release.
[Hadoop Wiki] Update of "PoweredBy" by DavidTing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "PoweredBy" page has been changed by DavidTing: https://wiki.apache.org/hadoop/PoweredBy?action=diff=443=444 * ''Data mining '' * ''Machine learning '' + * ''[[https://fquotes.com/|FQuotes]] '' + * ''We use Hadoop for analyzing quotes, quote authors and quote topics. '' + * ''[[http://freestylers.jp/|Freestylers]] - Image retrieval engine '' * ''We, the Japanese company Freestylers, use Hadoop to build the image processing environment for image-based product recommendation system mainly on Amazon EC2, from April 2009. '' * ''Our Apache Hadoop environment produces the original database for fast access from our web application. '' - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "Ozone" by ArpitAgarwal
The "Ozone" page has been changed by ArpitAgarwal: https://wiki.apache.org/hadoop/Ozone?action=diff=5=6 - The ''Ozone Quick Start Guide'' has moved to the [[https://cwiki.apache.org/confluence/display/HADOOP/Ozone|Apache Confluence wiki]]. + The [[https://cwiki.apache.org/confluence/display/HADOOP/Ozone|Ozone Quick Start Guide]] has moved to the ''Apache Confluence wiki''.
[Hadoop Wiki] Update of "Ozone" by ArpitAgarwal
The "Ozone" page has been changed by ArpitAgarwal: https://wiki.apache.org/hadoop/Ozone?action=diff=4=5 - <> + The ''Ozone Quick Start Guide'' has moved to the [[https://cwiki.apache.org/confluence/display/HADOOP/Ozone|Apache Confluence wiki]]. - = Introduction = - Ozone is an Object Store for Hadoop that is currently under development. See the Ozone Apache Jira [[https://issues.apache.org/jira/browse/HDFS-7240|HDFS-7240]] for more details. Ozone is currently in a prototype phase. - - This wiki page is intended as a guide for Ozone contributors. - - = Compiling Ozone = - Set up your development environment if you haven't done so already ([[https://wiki.apache.org/hadoop/HowToContribute|Instructions here]]). Switch to the HDFS-7240 branch, apply the in-progress patch for [[https://issues.apache.org/jira/browse/HDFS-10363|HDFS-10363]] and build a Hadoop distribution as usual. - - = Configuration = - Create a new ozone-site.xml file in your Hadoop configuration directory and add the following settings for a bare minimal configuration. - - {{{
- <configuration>
-   <property>
-     <name>ozone.enabled</name>
-     <value>true</value>
-   </property>
-   <property>
-     <name>ozone.handler.type</name>
-     <value>local</value>
-   </property>
-   <property>
-     <name>ozone.scm.client.address</name>
-     <value>127.0.0.1:9860</value>
-   </property>
- </configuration>
- }}} - - The default client port is 9860 and the default service port is 9861. These ports are used by clients and DataNodes respectively to connect to the StorageContainerManager service. - - These port numbers can be changed with the `ozone.scm.client.address` and `ozone.scm.datanode.address` settings respectively. - - = Starting Services = - Format the HDFS NameNode and start the NameNode and DataNode services as usual. Then stop the NameNode and start the Ozone StorageContainerManager using the shell command - {{{ - $ hdfs --daemon start scm - }}} - - The requirement to first start then stop the NameNode will be fixed soon.
- - = Performing Ozone REST operations = - [[https://issues.apache.org/jira/secure/attachment/12799549/ozone_user_v0.pdf|Ozone REST API specification]]
[Hadoop Wiki] Update of "HowToRelease" by AndrewWang
The "HowToRelease" page has been changed by AndrewWang: https://wiki.apache.org/hadoop/HowToRelease?action=diff=95=96 svn commit -m "Updated site for release X.Y.Z." }}} 1. Send announcements to the user and developer lists once the site changes are visible. - 1. In JIRA, close issues resolved in the release. Disable mail notifications for this bulk change. + 1. --(In JIRA, close issues resolved in the release. Disable mail notifications for this bulk change.)-- Recommend '''not''' closing, since it prevents JIRAs from being edited and makes it more difficult to track backports. = See Also = * [[http://www.apache.org/dev/release.html|Apache Releases FAQ]]
[Hadoop Wiki] Update of "HowToRelease" by AndrewWang
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToRelease" page has been changed by AndrewWang: https://wiki.apache.org/hadoop/HowToRelease?action=diff=94=95 Comment: More detailed instructions on how to bulk update JIRA versions = Preparation = 1. If you have not already done so, [[http://www.apache.org/dev/release-signing.html#keys-policy|append your code signing key]] to the [[https://dist.apache.org/repos/dist/release/hadoop/common/KEYS|KEYS]] file. Once you commit your changes, they will automatically be propagated to the website. Also [[http://www.apache.org/dev/release-signing.html#keys-policy|upload your key to a public key server]] if you haven't. End users use the KEYS file (along with the [[http://www.apache.org/dev/release-signing.html#web-of-trust|web of trust]]) to validate that releases were done by an Apache committer. For more details on signing releases, see [[http://www.apache.org/dev/release-signing.html|Signing Releases]] and [[http://www.apache.org/dev/mirror-step-by-step.html?Step-By-Step|Step-By-Step Guide to Mirroring Releases]]. - 1. Bulk update JIRA to unassign from this release all issues that are open non-blockers + 1. Bulk update JIRA to unassign from this release all issues that are open non-blockers. This is involved since you can only bulk change issues within the same project, so minimally requires four bulk changes for each of HADOOP, HDFS, MAPREDUCE, and YARN. Editing the "Target Version/s" field is also a blind write, so you need to be careful not to lose any other fix versions that are set. For updating 3.0.0-beta1 to 3.0.0, the process looked like this: + 1. Start with this query: + {{{ + project in (HADOOP, HDFS, YARN, MAPREDUCE) AND "Target Version/s" = 3.0.0-beta1 and statusCategory != Done + }}} + 1. Filter this list down until it's only issues with a Target Version of just "3.0.0-beta1". 
My query ended up looking like: + {{{ + project in (HADOOP, HDFS, YARN, MAPREDUCE) AND "Target Version/s" = 3.0.0-beta1 and "Target Version/s" not in (2.9.0, 2.8.3, 2.8.2) AND statusCategory != Done + }}} + 1. Do the bulk update for each project individually to set the target version to 3.0.0. + 1. Check the query for the next most common set of target versions and again filter it down: + {{{ + project in (HADOOP, HDFS, YARN, MAPREDUCE) AND "Target Version/s" = 3.0.0-beta1 and "Target Version/s" = 2.9.0 and statusCategory != Done + project in (HADOOP, HDFS, YARN, MAPREDUCE) AND "Target Version/s" = 3.0.0-beta1 and "Target Version/s" = 2.9.0 and "Target Version/s" not in (2.8.2, 2.8.3) and statusCategory != Done + }}} + 1. Do the bulk update for each project individually to set the target version field to (3.0.0, 2.9.0). + 1. Return to the original query. If there aren't too many, update the remaining straggler issues by hand (faster than doing the bulk edits): + {{{ + project in (HADOOP, HDFS, YARN, MAPREDUCE) AND "Target Version/s" = 3.0.0-beta1 and statusCategory != Done + }}} + 1. Send follow-up notification to the developer list that this was done. 1. To deploy artifacts to the Apache Maven repository, create {{{~/.m2/settings.xml}}}: {{{
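The {{{~/.m2/settings.xml}}} content is cut off above. As an illustrative sketch only — the {{{<id>}}} values here are assumptions and must match the ids in the distributionManagement section of the Hadoop POMs (the Apache parent POM conventionally uses {{{apache.releases.https}}} and {{{apache.snapshots.https}}}), and the credentials are placeholders:

```xml
<settings>
  <servers>
    <!-- Placeholders: use your ASF LDAP credentials; consider an encrypted
         password via Maven's master-password mechanism instead of clear text. -->
    <server>
      <id>apache.releases.https</id>
      <username>YOUR-ASF-ID</username>
      <password>YOUR-ASF-PASSWORD</password>
    </server>
    <server>
      <id>apache.snapshots.https</id>
      <username>YOUR-ASF-ID</username>
      <password>YOUR-ASF-PASSWORD</password>
    </server>
  </servers>
</settings>
```

Maven matches each {{{<server>}}} entry to a repository by id at deploy time, so a mismatched id silently falls back to anonymous (and rejected) uploads.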
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Books" page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff=42=43 Comment: New video course added == Hadoop Videos == + === The Ultimate Hands-on Hadoop (Video) === + '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/ultimate-hands-hadoop-video | The Ultimate Hands-on Hadoop (Video)]] + '''Author:''' Frank Kane + + '''Publisher:''' Packt + + '''Date of Publishing:''' June 2017 + + Design distributed systems that manage Big Data using Hadoop and related technologies. + === Getting Started with Hadoop 2.x (Video) === '''Name:''' [[https://www.packtpub.com/networking-and-servers/getting-started-hadoop-2x-video|Getting Started with Hadoop 2.x (Video)]] - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "HowToRelease" by AndrewWang
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToRelease" page has been changed by AndrewWang: https://wiki.apache.org/hadoop/HowToRelease?action=diff=93=94 Comment: Fix skipShade profile for deploy step 1. Push branch-X.Y.Z and the newly created tag to the remote repo. 1. Deploy the maven artifacts, on your personal computer. Please be sure you have completed the prerequisite step of preparing the {{{settings.xml}}} file before the deployment. You might want to do this in private and clear your history file as your gpg-passphrase is in clear text. {{{ - mvn deploy -Psign -DskipTests -DskipShading + mvn deploy -Psign -DskipTests -DskipShade }}} 1. Copy release files to a public place and ensure they are readable. Note that {{{home.apache.org}}} only supports SFTP, so this may be easier with a graphical SFTP client like Nautilus, Konqueror, etc. {{{
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Books" page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff=41=42 Comment: Added new video '''Publisher:''' Manning '''Date of Publishing (est.):''' October 2015 - - + + == Hadoop Videos == + === Getting Started with Hadoop 2.x (Video) === + + '''Name:''' [[https://www.packtpub.com/networking-and-servers/getting-started-hadoop-2x-video|Getting Started with Hadoop 2.x (Video)]] + + '''Author:''' A K M Zahiduzzaman + + '''Publisher:''' Packt + + '''Date of Publishing:''' April 30, 2017 + + Build a strong foundation by exploring Hadoop ecosystem with real-world examples. + === Taming Big Data with MapReduce and Hadoop - Hands On! (Video) === '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/taming-big-data-mapreduce-and-hadoop-hands-video|Taming Big Data with MapReduce and Hadoop - Hands On! (Video)]] @@ -463, +475 @@ '''Date of Publishing:''' September 12, 2016 Master the art of processing Big Data using Hadoop and MapReduce with the help of real-world examples. - + + Hadoop in Action introduces the subject and shows how to write programs in the MapReduce style. It starts with a few easy examples and then moves quickly to show Hadoop use in more complex data analysis tasks. Included are best practices and design patterns of MapReduce programming. - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "S3ABadRequest" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "S3ABadRequest" page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/S3ABadRequest Comment: Initial set of Bad Request causes New page: a5a4867f3b HADOOP-14120 = Troubleshooting S3A Bad Request Errors = The S3A client can see the error message "Bad Request" for many reasons —it is the standard response from Amazon S3 if it could not satisfy the request *for any reason*. The main issues are covered in the [[http://hadoop.apache.org/docs/current//hadoop-aws/tools/hadoop-aws/index.html#Troubleshooting_S3A|Troubleshooting S3A]] section of the hadoop-aws module's documentation. == Common Causes of Bad Request Error Messages == === Credentials === * Your credentials are wrong. * Somehow the credentials have not been set properly before the S3A Filesystem instance was created. As a single instance per bucket is created per-JVM, the first configuration used to connect to a bucket is the one used thereafter. * You've been trying to set the credentials in the URI, but got the URL-escaping wrong. Stop trying to do that, it's a security disaster. Embrace per-bucket configuration. * You are trying to use per-bucket configuration for the credentials, but got the bucket name wrong there. * You are using session credentials, and the session has expired. === Endpoints === * You are trying to use a V4 auth endpoint without declaring the endpoint of that region in the {{{fs.s3a.endpoint}}}. * You are trying to use a V3 auth endpoint but have set up S3 to use an explicit V4 auth endpoint. As they do not redirect to the central endpoint, you must declare the relevant endpoint explicitly. * You are trying to use a private S3 service but have forgotten to set the {{{fs.s3a.endpoint}}}; AWS is rejecting your private login. 
* You are trying to talk to a private S3 service but somehow it is talking to an HTTP page rather than an implementation of the S3 REST API. === Encryption === * You are trying to use SSE-C with a key that cannot decrypt the remote data. * You are trying to work with a bucket which is configured to require encryption, but the client doesn't use it. === Classpath === * A version of Joda-time incompatible with the JVM is on the classpath. It must be version 2.9.1 or later. === System === * The client machine doesn't know when it is. Check the clock and the timezone settings. * Your DNS setup is returning the wrong IP address for the endpoint. * Your network is a mess. As you can see, there is a wide variety of possible causes, spread across: credential setup, endpoint configuration, system configuration and other aspects of the S3A client. We are hampered in helping diagnose this by the need to keep those credentials secret. == Logging at lower levels == The AWS SDK and the Apache HTTP components can be configured to log at more detail, as can S3A itself. {{{ log4j.logger.org.apache.hadoop.fs.s3a=DEBUG log4j.logger.com.amazonaws.request=DEBUG log4j.logger.org.apache.http=DEBUG log4j.logger.org.apache.http.wire=ERROR }}} Be aware that logging HTTP headers may leak sensitive AWS account information, so the output should not be shared. - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
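The per-bucket configuration recommended in the credentials section above is expressed with {{{fs.s3a.bucket.<bucketname>.*}}} properties (available in recent Hadoop versions) in {{{core-site.xml}}}. A sketch with placeholder values — the bucket name, keys and endpoint below are made up:

```xml
<!-- Per-bucket settings: apply only to s3a://my-bucket/ -->
<property>
  <name>fs.s3a.bucket.my-bucket.access.key</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3a.bucket.my-bucket.secret.key</name>
  <value>YOUR_SECRET_KEY</value>
</property>
<!-- Declaring the region endpoint avoids the V4-auth failures above -->
<property>
  <name>fs.s3a.bucket.my-bucket.endpoint</name>
  <value>s3.eu-central-1.amazonaws.com</value>
</property>
```

Keeping credentials here, rather than URL-escaped into the URI, also avoids the security problem called out above.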
[Hadoop Wiki] Update of "PoweredBy" by RemySaissy
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "PoweredBy" page has been changed by RemySaissy: https://wiki.apache.org/hadoop/PoweredBy?action=diff=440=441 Comment: Criteo company description updated. * ''[[http://criteo.com|Criteo]] - Criteo is a global leader in online performance advertising '' * ''[[http://labs.criteo.com/blog|Criteo R]] uses Hadoop as a consolidated platform for storage, analytics and back-end processing, including Machine Learning algorithms '' - * ''We currently have a dedicated cluster of 1117 nodes, 39PB storage, 75TB RAM, 22000 cores running full steam 24/7, and growing by the day '' - * ''Each node has 24 HT cores, 96GB RAM, 42TB HDD '' - * ''Hardware and platform management is done through [[http://www.getchef.com/|Chef]], we run YARN '' - * ''We run a mix of ad-hoc Hive queries for BI, [[http://www.cascading.org/|Cascading]] jobs, raw mapreduce jobs, and streaming [[http://www.mono-project.com/|Mono]] jobs, as well as some Pig '' - * ''To be delivered in Q2 2015 a second cluster of 600 nodes, each 48HT cores, 256GB RAM, 96TB HDD '' + * ''We have 5 clusters in total, 2 of which are production, each with a corresponding pre-production and an experimental one '' + * ''More than 47,896 cores in ~2,560 machines running Hadoop (> 4,300 machines by the end of 2017) '' + * ''Our main cluster: 1,353 machines (24 cores w 15*6TB disk & 256GB RAM) '' +* ''Growth to ~3,000 machines by the end of 2017 '' + * ''We run a mix of '' +* ''Ad-hoc Hive queries for BI '' +* ''Cascading/Scalding jobs '' +* ''Mapreduce jobs '' +* ''Spark jobs '' +* ''Streaming Mono jobs '' * ''[[http://www.crs4.it|CRS4]] '' * ''Hadoop deployed dynamically on subsets of a 400-node cluster '' - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "Misty" by SteveLoughran
Dear wiki user, You have subscribed to a wiki page "Hadoop Wiki" for change notification. The page "Misty" has been deleted by SteveLoughran: https://wiki.apache.org/hadoop/Misty?action=diff=1=2 Comment: junk user page - ##master-page:HomepageTemplate - #format wiki - #language en - == @``ME@ == - Email: <> - ## You can even more obfuscate your email address by adding more uppercase letters followed by a leading and trailing blank. - - ... - - - CategoryHomepage - - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Books" page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff=40=41 Comment: Added new book }}} + === Deep Learning with Hadoop === + + '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/deep-learning-hadoop|Deep Learning with Hadoop]] + + '''Author:''' Dipayan Dev + + '''Publisher:''' Packt + + '''Date of Publishing:''' February 2017 + + Build, implement and scale distributed deep learning models for large-scale datasets. + === Hadoop Blueprints === '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/hadoop-blueprints|Hadoop Blueprints]] - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "HowToSetupYourDevelopmentEnvironment" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToSetupYourDevelopmentEnvironment" page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/HowToSetupYourDevelopmentEnvironment?action=diff=34=35 Comment: add the details on OSX install, especially protoc setup now that homebrew 1.x doesn't support protobuf 2.5 This page describes how to get your environment setup and is IDE agnostic. = Requirements = - * Java 6 or 7 - * Maven + * Java 7 or 8 (Branch 2) or Java 8 (trunk) + * Maven 3.3 or later * Your favorite IDE + * Protobuf 2.5.0 = Setup Your Development Environment in Linux = - The instructions below talk about how to get an environment setup using the command line to build, control source, and test. These instructions are therefore IDE independent. Take a look at EclipseEnvironment for instructions on how to configure Eclipse to build, control source, and test. If you prefer ItelliJ IDEA, then take a look [[HadoopUnderIDEA| here]] + The instructions below talk about how to get an environment setup using the command line to build, control source, and test. These instructions are therefore IDE independent. Take a look at EclipseEnvironment for instructions on how to configure Eclipse to build, control source, and test. If you prefer IntelliJ IDEA, then take a look [[HadoopUnderIDEA| here]] - * Choose a good place to put your code. You will eventually use your source code to run Hadoop, so choose wisely. For example ~/code/hadoop. + * Choose a good place to put your code. You will eventually use your source code to run Hadoop, so choose wisely. For example {{{~/code/hadoop}}}. - * Get the source. This is documented in HowToContribute. Put the source in ~/code/hadoop (or whatever you chose) so that you have ~/code/hadoop/hadoop-common + * Get the source. This is documented in HowToContribute. 
Put the source in {{{~/code/hadoop}}} (or whatever you chose) so that you have {{{~/code/hadoop/hadoop-common}}} - * cd into ''hadoop-common'', or whatever you named the directory + * cd into {{{hadoop-common}}}, or whatever you named the directory - * attempt to run ''mvn install'' + * attempt to run {{{mvn install}}} . To build without tests: {{{mvn install -DskipTests}}} * If you get any strange errors (other than JUnit test failures and errors), then consult the ''Build Errors'' section below. * follow GettingStartedWithHadoop to learn how to run Hadoop. * If you run into any problems, refer to the ''Runtime Errors'' below, along with the troubleshooting document here: TroubleShooting + + = Setup Your Development Environment in OSX = + + + The Linux instructions match, except that: + + XCode is needed for the command line compiler and other tools. + + + protobuf 2.5.0 needs to be built by hand, as macports and homebrew no longer ship that version. + + Follow the building-from-source instructions in [[http://sleepythread.blogspot.co.uk/2013/11/installing-protoc-25x-compiler-google.html|Installing protoc 2.5.x compiler on mac]] ''but change the URL for the protobuf archive to [[https://github.com/google/protobuf/releases/download/v2.5.0/protobuf-2.5.0.tar.gz]]''. + + To verify that protobuf is correctly installed, the command {{{protoc --version}}} must print out the string {{{libprotoc 2.5.0}}}. + = Run HDFS in pseudo-distributed mode from the dev tree =
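The {{{protoc --version}}} check above is easy to script as part of an environment sanity check; a small sketch (the helper names are illustrative, not an existing tool):

```python
import re
import subprocess

def protoc_version(output):
    """Parse the version out of `protoc --version` output,
    which looks like 'libprotoc 2.5.0'."""
    m = re.match(r"libprotoc\s+(\d+(?:\.\d+)*)", output.strip())
    return m.group(1) if m else None

def check_protoc(required="2.5.0"):
    """Return True if the protoc on PATH is exactly the required version."""
    out = subprocess.run(["protoc", "--version"],
                         capture_output=True, text=True).stdout
    return protoc_version(out) == required

print(protoc_version("libprotoc 2.5.0"))  # -> 2.5.0
```

An exact-match check is deliberate here: the build needs protobuf 2.5.0 specifically, so "2.5.0 or later" would pass versions that break the build.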
[Hadoop Wiki] Update of "Ozone" by ArpitAgarwal
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Ozone" page has been changed by ArpitAgarwal: https://wiki.apache.org/hadoop/Ozone?action=diff=3=4 The requirement to first start then stop the NameNode will be fixed soon. = Performing Ozone REST operations = - Content arriving soon. + [[https://issues.apache.org/jira/secure/attachment/12799549/ozone_user_v0.pdf|Ozone REST API specification]]
[Hadoop Wiki] Update of "HowToRelease" by AndrewWang
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToRelease" page has been changed by AndrewWang: https://wiki.apache.org/hadoop/HowToRelease?action=diff=92=93 1. Update the news on the home page {{{author/src/documentation/content/xdocs/index.xml}}}. 1. Copy the new release docs to svn and update the {{{docs/current}}} link, by doing the following: {{{ - tar xvf /www/www.apache.org/dist/hadoop/core/hadoop-${version}/hadoop-${version}.tar.gz - cp -rp hadoop-${version}/share/doc/hadoop publish/docs/r${version} - rm -r hadoop-${version} cd publish/docs + tar xvf /path/to/hadoop-${version}-site.tar.gz # Update current2, current, stable and stable2 as needed. # For example rm current2 current - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "PoweredBy" by DavidTing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "PoweredBy" page has been changed by DavidTing: https://wiki.apache.org/hadoop/PoweredBy?action=diff=439=440 * ''Each (commodity) node has 8 cores and 12 TB of storage. '' * ''We are heavy users of both streaming as well as the Java APIs. We have built a higher level data warehousing framework using these features called Hive (see the http://hadoop.apache.org/hive/). We have also developed a FUSE implementation over HDFS. '' - * ''[[http://www.follownews.com/|FollowNews]] '' + * ''[[https://www.follownews.com/|FollowNews]] '' * ''We use Hadoop for storing logs, news analysis, tag analysis. '' * ''[[http://www.foxaudiencenetwork.com|FOX Audience Network]] '' @@ -437, +437 @@ * ''Apache Hive, Apache Avro, Apache Kafka, and other bits and pieces... '' * ''We use these things for discovering People You May Know and [[http://www.linkedin.com/careerexplorer/dashboard|other]] [[http://inmaps.linkedinlabs.com/|fun]] [[http://www.linkedin.com/skills/|facts]]. '' + * ''[[https://www.livebet.com|LiveBet]] '' + * ''We use Hadoop for storing logs, odds analysis, markets analysis. '' + * ''[[http://www.lookery.com|Lookery]] '' * ''We use Hadoop to process clickstream and demographic data in order to create web analytic reports. '' * ''Our cluster runs across Amazon's EC2 infrastructure and makes use of the streaming module to use Python for most operations. '' - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "HowToRelease" by AndrewWang
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToRelease" page has been changed by AndrewWang: https://wiki.apache.org/hadoop/HowToRelease?action=diff=91=92 1. Push branch-X.Y.Z and the newly created tag to the remote repo. 1. Deploy the maven artifacts, on your personal computer. Please be sure you have completed the prerequisite step of preparing the {{{settings.xml}}} file before the deployment. You might want to do this in private and clear your history file as your gpg-passphrase is in clear text. {{{ - mvn deploy -DskipTests + mvn deploy -Psign -DskipTests -DskipShading }}} 1. Copy release files to a public place and ensure they are readable. Note that {{{home.apache.org}}} only supports SFTP, so this may be easier with a graphical SFTP client like Nautilus, Konqueror, etc. {{{ - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "HowToRelease" by AndrewWang
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToRelease" page has been changed by AndrewWang: https://wiki.apache.org/hadoop/HowToRelease?action=diff=90=91 }}} 1. While it should fail {{{create-release}}} if there are issues, doublecheck the rat log to find and fix any potential licensing issues. {{{ - grep 'Rat check' target/artifacts/mvn_apache_rat.log + grep 'Rat check' patchprocess/mvn_apache_rat.log }}} 1. Check that release files look ok - e.g. install it somewhere fresh and run examples from tutorial, do a fresh build, read the release notes looking for WARNINGs, etc. 1. Set environment variable version for later steps. {{{export version=X.Y.Z-RCN}}} - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "ConnectionRefused" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "ConnectionRefused" page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/ConnectionRefused?action=diff=16=17 Comment: fix name 1. If you are using a Hadoop-based product from a third party, -please use the support channels provided by the vendor. 1. Please do not file bug reports related to your problem, as they will be closed as [[http://wiki.apache.org/hadoop/InvalidJiraIssues|Invalid]] - See also [[http://serverfault.com/questions/725262/what-causes-the-connection-refused-message|Stack Overflow]] + See also [[http://serverfault.com/questions/725262/what-causes-the-connection-refused-message|Server Fault]] None of these are Hadoop problems, they are hadoop, host, network and firewall configuration issues. As it is your cluster, [[YourNetworkYourProblem|only you can find out and track down the problem.]]
[Hadoop Wiki] Update of "ConnectionRefused" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "ConnectionRefused" page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/ConnectionRefused?action=diff=15=16 Comment: ref to stack overflow 1. If you are using a Hadoop-based product from a third party, -please use the support channels provided by the vendor. 1. Please do not file bug reports related to your problem, as they will be closed as [[http://wiki.apache.org/hadoop/InvalidJiraIssues|Invalid]] - None of these are Hadoop problems, they are host, network and firewall configuration issues. As it is your cluster, [[YourNetworkYourProblem|only you can find out and track down the problem.]] + See also [[http://serverfault.com/questions/725262/what-causes-the-connection-refused-message|Stack Overflow]] + None of these are Hadoop problems, they are hadoop, host, network and firewall configuration issues. As it is your cluster, [[YourNetworkYourProblem|only you can find out and track down the problem.]] + - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "ConnectionRefused" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "ConnectionRefused" page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/ConnectionRefused?action=diff=13=14 Comment: link to ambari port list If the application or cluster is not working, and this message appears in the log, then it is more serious. + The exception text declares both the hostname and the port to which the connection failed. The port can be used to identify the service. For example, port 9000 is the HDFS port. Consult the [[https://ambari.apache.org/1.2.5/installing-hadoop-using-ambari/content/reference_chap2.html|Ambari port reference]], and/or those of the supplier of your Hadoop management tools. + 1. Check that the hostname the client is using is correct. If it's in a Hadoop configuration option, examine it carefully and try doing a ping by hand. 1. Check that the IP address the client is trying to talk to for that hostname is correct. + 1. Make sure the destination address in the exception isn't 0.0.0.0 -this means that you haven't actually configured the client with the real address for that service, and instead it is picking up the server-side property telling it to listen on every port for connections. - 1. Make sure the destination address in the exception isn't 0.0.0.0 -this means that you haven't actually configured the client with the real address for that. - service, and instead it is picking up the server-side property telling it to listen on every port for connections. 1. If the error message says the remote service is on "127.0.0.1" or "localhost" that means the configuration file is telling the client that the service is on the local server. If your client is trying to talk to a remote system, then your configuration is broken. 1. Check that there isn't an entry for your hostname mapped to 127.0.0.1 or 127.0.1.1 in /etc/hosts (Ubuntu is notorious for this). 1. 
Check that the port the client is trying to talk to matches the port on which the server is offering the service.
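The checklist in the two ConnectionRefused edits above (name resolution, wildcard and loopback addresses, then the port itself) can be strung together in a tiny client-side probe. This is a sketch, not a Hadoop tool — the host and port are whatever your cluster configuration and the exception text say:

```python
import socket

def diagnose(host, port):
    """Walk the basic ConnectionRefused checklist for one host:port."""
    try:
        ip = socket.gethostbyname(host)  # does the hostname resolve at all?
    except socket.gaierror:
        return "DNS: cannot resolve %s" % host
    if ip == "0.0.0.0":
        # Client has picked up a server-side wildcard bind address by mistake
        return "Config error: %s resolves to 0.0.0.0" % host
    if ip.startswith("127."):
        # Only fine if the service really is on this machine (/etc/hosts trap)
        print("Note: %s resolves to loopback (%s)" % (host, ip))
    try:
        with socket.create_connection((host, port), timeout=5):
            return "OK: %s:%d is accepting connections" % (host, port)
    except OSError as e:
        return "Connect failed for %s:%d: %s" % (host, port, e)
```

Run it with the hostname and port from the exception text; a "Connect failed" on a loopback address usually points at the /etc/hosts problem called out above.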
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Books" page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff=39=40 Comment: URL change Unlock the power of your data with Hadoop 2.X ecosystem and its data warehousing techniques across large data sets. === Hadoop Explained (Free eBook Download) === - '''Name:''' [[https://www.packtpub.com/packt/free-ebook/hadoop-explained|Hadoop Explained]] + '''Name:''' [[https://www.packtpub.com/packt/free-ebook/hadoop-explained-2|Hadoop Explained]] '''Author:''' Aravind Shenoy - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "GitAndHadoop" by ArpitAgarwal
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "GitAndHadoop" page has been changed by ArpitAgarwal: https://wiki.apache.org/hadoop/GitAndHadoop?action=diff=24=25 Comment: Remove some obsolete instructions == Forking onto GitHub == - You can create your own fork of the ASF project, put in branches and stuff as you desire. GitHub prefer you to explicitly fork their copies of Hadoop. + You can create your own fork of the ASF project. This is required if you want to contribute patches by submitting pull requests. However you can choose to skip this step and attach patch files directly on Apache Jiras. 1. Create a GitHub login at http://github.com/ ; Add your public SSH keys + 1. Go to https://github.com/apache/hadoop/ + 1. Click fork in the github UI. This gives you your own repository URL. - 1. Go to http://github.com/apache and search for the Hadoop and other Apache projects you want (avro is handy alongside the others) - 1. For each project, fork in the github UI. This gives you your own repository URL which you can then clone locally with {{{git clone}}} - 1. For each patch, branch. - - At the time of writing (December 2009), GitHub was updating its copy of the Apache repositories every hour. As the Apache repositories were updating every 15 minutes, provided these frequencies are retained, a GitHub-fork derived version will be at worst 1 hour and 15 minutes behind the ASF's Git repository. If you are actively developing on Hadoop, especially committing code into the Git repository, that is too long -work off the Apache repositories instead. - - 1. Clone the read-only repository from Github (their recommendation) or from Apache (the ASF's recommendation) - 1. in that clone, rename that repository "apache": {{{git remote rename origin apache}}} - 1. Log in to [http://github.com] - 1. Create a new repository (e.g hadoop-fork) - 1. In the existing clone, add the new repository : + 1. 
In the existing clone, add the new repository: {{{git remote add -f github g...@github.com:MYUSERNAMEHERE/hadoop.git}}} - This gives you a local repository with two remote repositories: "apache" and "github". Apache has the trunk branch, which you can update whenever you want to get the latest ASF version: + This gives you a local repository with two remote repositories: {{{origin}}} and {{{github}}}. {{{origin}}} has the Apache branches, which you can update whenever you want to get the latest ASF version: {{{ - git checkout trunk - git pull apache + git checkout -b trunk origin/trunk + git pull origin }}} - Your own branches can be merged with trunk, and pushed out to git hub. To generate patches for submitting as JIRA patches, check everything in to your specific branch, merge that with (a recently pulled) trunk, then diff the two: + Your own branches can be merged with trunk, and pushed out to GitHub. To generate patches for attaching to Apache JIRAs, check everything in to your specific branch, merge that with (a recently pulled) trunk, then diff the two: - {{{ git diff --no-prefix trunk > ../hadoop-patches/HADOOP-XYX.patch }}} + {{{ git diff trunk > ../hadoop-patches/HADOOP-XYX.patch }}} - - If you are working deep in the code it's not only convenient to have a directory full of patches to the JIRA issues, it's convenient to have that directory a git repository that is pushed to a remote server, such as [[https://github.com/steveloughran/hadoop-patches|this example]]. Why? It helps you move patches from machine to machine without having to do all the updating and merging. From a pure-git perspective this is wrong: it loses history, but for a mixed workflow it doesn't matter so much. == Branching == - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "GitAndHadoop" by ArpitAgarwal
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "GitAndHadoop" page has been changed by ArpitAgarwal: https://wiki.apache.org/hadoop/GitAndHadoop?action=diff=23=24 Comment: Remove obsolete svn-bridge migration info. }}} You can then use commands like `git blame --follow` with success. - - == Migrating private branches to the new git commit history == - - The migration from svn to git changed the commit ids for anyone tracking the history of the project via the svn to git bridge. This means that private forks/branches will not rebase to the new versions. Follow the MigratingPrivateGitBranches instructions. - == Forking onto GitHub ==
[Hadoop Wiki] Update of "HowToContribute" by AkiraAjisaka
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToContribute" page has been changed by AkiraAjisaka: https://wiki.apache.org/hadoop/HowToContribute?action=diff=117=118 Comment: Update Java version from 7 to 8. * Disable any added value "reformat" and "strip trailing spaces" features as it can create extra noise when reviewing patches. === Build Tools === - * A Java Development Kit. The Hadoop developers recommend [[http://java.com/|Oracle Java 7]]. You may also use [[http://openjdk.java.net/|OpenJDK]]. + * A Java Development Kit. The Hadoop developers recommend [[http://java.com/|Oracle Java 8]]. You may also use [[http://openjdk.java.net/|OpenJDK]]. * Google Protocol Buffers. Check out the ProtocolBuffers guide for help installing protobuf. * [[http://maven.apache.org/|Apache Maven]] version 3 or later (for Hadoop 0.23+) * The Java API javadocs. - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "HowToRelease" by SangjinLee
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToRelease" page has been changed by SangjinLee: https://wiki.apache.org/hadoop/HowToRelease?action=diff=89=90 ## page was copied from HowToReleasePostMavenization ''This page is prepared for Hadoop Core committers. You need committer rights to create a new Hadoop Core release.'' - These instructions have been updated to use dev-support/bin/create-release. Earlier versions of this document are at HowToReleaseWithSvnAndAnt and HowToReleasePostMavenization and [[HowToReleasePreDSBCR]] + These instructions have been updated to use dev-support/bin/create-release. Earlier versions of this document are at HowToReleaseWithSvnAndAnt and HowToReleasePostMavenization and [[HowToReleasePreDSBCR]]. For releasing from the 2.6.x or the 2.7.x line, you'll need to consult [[HowToReleasePreDSBCR]] to find applicable steps. <> - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-commits-h...@hadoop.apache.org
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Books" page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff=38=39 Comment: Book links added }}} + === Hadoop Blueprints === + + '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/hadoop-blueprints|Hadoop Blueprints]] + + '''Authors:''' Anurag Shrivastava, Tanmay Deshpande + + '''Publisher:''' Packt + + '''Date of Publishing:''' September 2016 + + Use Hadoop to solve business problems by learning from a rich set of real-life case studies. + === Hadoop: Data Processing and Modelling === '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/hadoop-data-processing-and-modelling|Hadoop: Data Processing and Modelling]] @@ -423, +435 @@ '''Date of Publishing (est.):''' October 2015 + + + == Hadoop Videos == + + + === Taming Big Data with MapReduce and Hadoop - Hands On! (Video) === + + '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/taming-big-data-mapreduce-and-hadoop-hands-video|Taming Big Data with MapReduce and Hadoop - Hands On! (Video)]] + + '''Author:''' Frank Kane + + '''Publisher:''' Packt + + '''Date of Publishing:''' September 12, 2016 + + Master the art of processing Big Data using Hadoop and MapReduce with the help of real-world examples. + + Hadoop in Action introduces the subject and shows how to write programs in the MapReduce style. It starts with a few easy examples and then moves quickly to show Hadoop use in more complex data analysis tasks. Included are best practices and design patterns of MapReduce programming.
[Hadoop Wiki] Update of "AmazonS3" by YongjunZhang
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "AmazonS3" page has been changed by YongjunZhang: https://wiki.apache.org/hadoop/AmazonS3?action=diff=21=22 === Unmaintained: S3N FileSystem (URI scheme: s3n://) === - '''S3A is the S3 Client for Hadoop 2.6 and earlier. From Hadoop 2.7+, switch to s3a''' + '''S3N is the S3 Client for Hadoop 2.6 and earlier. From Hadoop 2.7+, switch to s3a''' A native filesystem for reading and writing regular files on S3. The advantage of this filesystem is that you can access files on S3 that were written with other tools. Conversely, other tools can access files written using Hadoop. The S3N code is stable and widely used, but is not adding any new features (which is why it remains stable).
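Since the S3N-to-S3A switch described above is purely a URL-scheme change, it can be sketched in shell; the bucket and object path below are hypothetical examples, not from the wiki page:

```shell
# s3n:// to s3a:// is a pure scheme swap; the object path is unchanged.
# Bucket and key below are hypothetical examples.
old_url="s3n://my-bucket/data/input.csv"
new_url="s3a://${old_url#s3n://}"   # strip the s3n:// prefix, re-add s3a://
echo "$new_url"
```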
[Hadoop Wiki] Trivial Update of "HowToContribute" by QwertyManiac
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToContribute" page has been changed by QwertyManiac: https://wiki.apache.org/hadoop/HowToContribute?action=diff=116=117 Comment: Add gcc-c++ to RHEL instructions For RHEL (and hence also CentOS): {{{ - yum -y install lzo-devel zlib-devel gcc autoconf automake libtool openssl-devel fuse-devel cmake + yum -y install lzo-devel zlib-devel gcc gcc-c++ autoconf automake libtool openssl-devel fuse-devel cmake }}} For Debian and Ubuntu:
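As a quick local sanity check of the diff above, the difference between the old and new RHEL package lists can be computed in plain POSIX shell; this is only an illustrative sketch, not part of the documented build setup:

```shell
# Old and new yum package lists copied from the diff above;
# compute which packages were added.
old="lzo-devel zlib-devel gcc autoconf automake libtool openssl-devel fuse-devel cmake"
new="lzo-devel zlib-devel gcc gcc-c++ autoconf automake libtool openssl-devel fuse-devel cmake"
added=""
for p in $new; do
  case " $old " in
    *" $p "*) ;;                # already present in the old list
    *) added="$added $p" ;;     # newly added package
  esac
done
echo "added:$added"
```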
[Hadoop Wiki] Trivial Update of "HowToContribute" by QwertyManiac
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToContribute" page has been changed by QwertyManiac: https://wiki.apache.org/hadoop/HowToContribute?action=diff=116=117 Comment: Add missing cmake to RHEL instructions
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Books" page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff=37=38 Comment: Book added }}} + === Hadoop: Data Processing and Modelling === + + '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/hadoop-data-processing-and-modelling|Hadoop: Data Processing and Modelling]] + + '''Authors:''' Garry Turkington, Tanmay Deshpande, Sandeep Karanth + + '''Publisher:''' Packt + + '''Date of Publishing:''' August 2016 + + Unlock the power of your data with Hadoop 2.X ecosystem and its data warehousing techniques across large data sets. + === Hadoop Explained (Free eBook Download) === '''Name:''' [[https://www.packtpub.com/packt/free-ebook/hadoop-explained|Hadoop Explained]]
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Books" page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff=36=37 Comment: Added a free eBook }}} + === Hadoop Explained (Free eBook Download) === + '''Name:''' [[https://www.packtpub.com/packt/free-ebook/hadoop-explained|Hadoop Explained]] + + '''Author:''' Aravind Shenoy + + '''Publisher:''' Packt Publishing + + Learn how MapReduce organizes and processes large sets of data and discover the advantages of Hadoop - from scalability to security, see how Hadoop handles huge amounts of data with care + === Hadoop Real-World Solutions Cookbook- Second Edition === '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/hadoop-real-world-solutions-cookbook-second-edition|Hadoop Real-World Solutions Cookbook- Second Edition]]
[Hadoop Wiki] Update of "HowToRelease" by AndrewWang
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToRelease" page has been changed by AndrewWang: https://wiki.apache.org/hadoop/HowToRelease?action=diff=88=89 Comment: Update website build instructions to point to HowToCommit instead {{{ svn add publish/docs/r${version} }}} - 1. Regenerate the site, review it, then commit it. + 1. Regenerate the site, review it, then commit it per the instructions in HowToCommit. {{{ - ant -Dforrest.home=$FORREST_HOME -Djava5.home=/usr/local/jdk1.5 - firefox publish/index.html + + svn commit -m "Updated site for release X.Y.Z." }}} 1. Send announcements to the user and developer lists once the site changes are visible.
[Hadoop Wiki] Update of "HowToCommit" by AndrewWang
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToCommit" page has been changed by AndrewWang: https://wiki.apache.org/hadoop/HowToCommit?action=diff=37=38 Comment: update fix version instructions 1. Cherry-pick the changes to other appropriate branches via {{{git cherry-pick -x }}}. The -x option records the source commit, and reuses the original commit message. Resolve any conflicts. 1. If the conflicts are major, it is preferable to produce a new patch for that branch, review it separately and commit it. When committing an edited patch to other branches, please follow the same steps and make sure to include the JIRA number and description of changes in the commit message. 1. When backporting to branch-2.7 or older branches, we need to update CHANGES.txt. - 1. Resolve the issue as fixed, thanking the contributor. Always set the "Fix Version" at this point, but please only set a single fix version, the earliest release in which the change will appear. '''Special case'''- when committing to a ''non-mainline'' branch (such as branch-0.22 or branch-0.23 ATM), please set fix-version to either 2.x.x or 3.x.x appropriately too. + 1. Resolve the issue as fixed, thanking the contributor. Follow the rules specified at [[https://hadoop.apache.org/versioning.html|Apache Hadoop Release Versioning]] for how to set fix versions appropriately, it's important for tracking purposes with concurrent release lines. 1. Set the assignee if it is not set. If you cannot set the contributor to the assignee, you need to add the contributor into Contributors role in the project. Please see [[#Roles|Adding Contributors role]] for the detail. This How-to-commit [[http://www.youtube.com/watch?v=txW3m7qWdzw=youtu.be|video]] has guidance on the commit process, albeit using svn. Most of the process is still the same, except that we now use git instead. 
[Hadoop Wiki] Update of "Roadmap" by AndrewWang
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Roadmap" page has been changed by AndrewWang: https://wiki.apache.org/hadoop/Roadmap?action=diff=61=62 For more details on how releases are created, see HowToRelease. == Hadoop 3.x Releases == + === Planned for hadoop-3.0.0 === + * HADOOP + * Classpath isolation on by default [[https://issues.apache.org/jira/browse/HADOOP-11656|HADOOP-11656]] + * HDFS + * YARN + * MAPREDUCE + + - === hadoop-3.0 === + === hadoop-3.0.0-alpha1 === * HADOOP * Move to JDK8+ - * Classpath isolation on by default [[https://issues.apache.org/jira/browse/HADOOP-11656|HADOOP-11656]] * Shell script rewrite [[https://issues.apache.org/jira/browse/HADOOP-9902|HADOOP-9902]] * Move default ports out of ephemeral range [[https://issues.apache.org/jira/browse/HDFS-9427|HDFS-9427]] * HDFS * Removal of hftp in favor of webhdfs [[https://issues.apache.org/jira/browse/HDFS-5570|HDFS-5570]] * Support for more than two standby NameNodes [[https://issues.apache.org/jira/browse/HDFS-6440|HDFS-6440]] * Support for Erasure Codes in HDFS [[https://issues.apache.org/jira/browse/HDFS-7285|HDFS-7285]] + * Intra-datanode balancer [[https://issues.apache.org/jira/browse/HDFS-1312|HDFS-1312]] * YARN + * YARN Timeline Service v.2 [[https://issues.apache.org/jira/browse/YARN-2928|YARN-2928]] * MAPREDUCE * Derive heap size or mapreduce.*.memory.mb automatically [[https://issues.apache.org/jira/browse/MAPREDUCE-5785|MAPREDUCE-5785]]
[Hadoop Wiki] Trivial Update of "HowToRelease" by VinodKumarVavilapalli
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToRelease" page has been changed by VinodKumarVavilapalli: https://wiki.apache.org/hadoop/HowToRelease?action=diff=87=88 1. Check if the release year for Web UI footer is updated (the property {{{}}} in {{{hadoop-project/pom.xml}}}). If not, create a JIRA to update the property value to the right year, and propagate the fix from trunk to all necessary branches. Considering the voting time needed before publishing, it's better to use the year of (current time + voting time) here, to be consistent with the publishing time. 1. In JIRA, ensure that only issues in the "Fixed" state have a "Fix Version" set to release X.Y.Z. - 1. In JIRA, "release" the version, setting the date to the expected end-of-vote date. Visit the "Administer Project" page, then the "Manage versions" page. You need to have the "Admin" role in HADOOP, HDFS, MAPREDUCE, and YARN. This ensures that the release notes and changes file have the correct date to match the actual release date. 1. Verify that $HOME/.gpg defaults to the key listed in the KEYS file. 1. For the Apache release, use a Docker- and Internet-capable machine to build the release candidate with {{{create-release}}}. Unless the {{{--logdir}}} is given, logs will be in the {{{patchprocess/}}} directory. Artifacts will be in the target/artifacts NOTE: This will take quite a while, since it will download and build the entire source tree, including documentation and native components, from scratch to avoid maven repository caching issues hiding issues with the source release. {{{ @@ -117, +116 @@ = Publishing = In 5 days if [[http://hadoop.apache.org/bylaws#Decision+Making|the release vote passes]], the release may be published. - + 1. In JIRA, "release" the version, setting the date to the end-of-vote date. Visit the "Administer Project" page, then the "Manage versions" page.
You need to have the "Admin" role in HADOOP, HDFS, MAPREDUCE, and YARN. 1. Set environment variable version for later steps. {{{export version=X.Y.Z}}} 1. Tag the release. Do it from the release branch and push the created tag to the remote repository: {{{
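The naming conventions behind the tagging steps above can be sketched in shell; the version value is the same X.Y.Z placeholder the page uses, and the {{{release-}}} tag prefix matches the {{{git tag -s release-$version}}} command shown elsewhere on the page:

```shell
# Placeholder version, as in `export version=X.Y.Z` above.
version="X.Y.Z"
tag="release-${version}"            # tag name format used by the git tag step
tarball="hadoop-${version}.tar.gz"  # release artifact name format
echo "$tag $tarball"
```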
[Hadoop Wiki] Update of "HowToRelease" by AndrewWang
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToRelease" page has been changed by AndrewWang: https://wiki.apache.org/hadoop/HowToRelease?action=diff=86=87 Comment: update instructions for uploading to home.apache.org {{{ mvn deploy -DskipTests }}} - 1. Copy release files to a public place and ensure they are readable. + 1. Copy release files to a public place and ensure they are readable. Note that {{{home.apache.org}}} only supports SFTP, so this may be easier with a graphical SFTP client like Nautilus, Konqueror, etc. {{{ - ssh home.apache.org mkdir public_html/hadoop-${version} - scp -p hadoop-${version}*.tar.gz* home.apache.org:public_html/hadoop-${version} - ssh home.apache.org chmod -R a+r public_html/hadoop-${version} + sftp home.apache.org + > cd public_html + > mkdir hadoop-${version} + > put -r /home/hadoop/hadoop-${version} + + > bye }}} 1. Log into [[https://repository.apache.org|Nexus]], select "{{{Staging}}} Repositories" from the left navigation pane, select the check-box against the specific hadoop repository, and {{{close}}} the release. 1. Call a release vote on common-dev at hadoop.apache.org. It's usually a good idea to start the release vote on Monday so that people will have a chance to verify the release candidate during the week. [[https://www.mail-archive.com/common-dev@hadoop.apache.org/msg13339.html|Example]]
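Before uploading release files, it is worth verifying their checksums locally. The sketch below uses a stand-in file rather than a real release artifact, and the {{{.sha512}}} file naming is an assumption, not something specified by the page:

```shell
# Create a stand-in "tarball" so the sketch is self-contained.
printf 'demo contents' > hadoop-X.Y.Z.tar.gz
# Record its SHA-512 checksum, then verify the file against it.
sha512sum hadoop-X.Y.Z.tar.gz > hadoop-X.Y.Z.tar.gz.sha512
sha512sum -c hadoop-X.Y.Z.tar.gz.sha512
```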
[Hadoop Wiki] Update of "AmazonS3" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "AmazonS3" page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/AmazonS3?action=diff=20=21 Comment: lots more on S3a and why to use it, warnings of state of s3n and deprecation of s3 = S3 Support in Apache Hadoop = [[http://aws.amazon.com/s3|Amazon S3]] (Simple Storage Service) is a data storage service. You are billed - monthly for storage and data transfer. Transfer between S3 and [[AmazonEC2]] instances in the same geographical location are free. This makes use of - S3 attractive for Hadoop users who run clusters on EC2. + monthly for storage and data transfer. Transfer between S3 and [[AmazonEC2]] instances in the same geographical location are free. Most importantly, the data is preserved when a transient Hadoop cluster is shut down. + + This makes the use of S3 common in Hadoop clusters on EC2. It is also used sometimes for backing up remote clusters. Hadoop provides multiple filesystem clients for reading and writing to and from Amazon S3 or a compatible service. - === S3 Native FileSystem (URI scheme: s3n) === - A native filesystem for reading and writing regular files on S3. The advantage of this filesystem is that you can access files on S3 that were written with other tools. Conversely, other tools can access files written using Hadoop. The S3N code is stable and widely used, but is not adding any new features (which is why it remains stable). S3N requires a suitable version of the jets3t JAR on the classpath. + === Recommended: S3A (URI scheme: s3a://) - Hadoop 2.7+ === - === S3A (URI scheme: s3a) === + '''S3A is the recommended S3 Client for Hadoop 2.7 and later''' A successor to the S3 Native, s3n:// filesystem, the S3a: system uses Amazon's libraries to interact with S3. This allows S3a to support larger files (no more 5GB limit), higher performance operations and more.
The filesystem is intended to be a replacement for/successor to S3 Native: all objects accessible from s3n:// URLs should also be accessible from s3a simply by replacing the URL schema. - S3A has been considered usable in production since Hadoop 2.7, and is undergoing active maintenance for enhanced security, scalability and performance. + S3A has been usable in production since Hadoop 2.7, and is undergoing active maintenance for enhanced security, scalability and performance. - '''important:''' S3A requires the exact version of the amazon-aws-sdk against which Hadoop was built (and is bundled with). History - === S3 Block FileSystem (URI scheme: s3) === + 1. Hadoop 2.6: Initial Implementation: [[https://issues.apache.org/jira/browse/HADOOP-10400|HADOOP-10400]] + 2. Hadoop 2.7: Production Ready: [[https://issues.apache.org/jira/browse/HADOOP-11571|HADOOP-11571]] + 3. Hadoop 2.8: Performance, robustness and security [[https://issues.apache.org/jira/browse/HADOOP-11694|HADOOP-11694]] + 4. Hadoop 2.9: Even more features: [[https://issues.apache.org/jira/browse/HADOOP-13204|HADOOP-13204]] + July 2016: For details of ongoing work on S3a, consult [[www.slideshare.net/HadoopSummit/hadoop-cloud-storage-object-store-integration-in-production|Hadoop & Cloud Storage: Object Store Integration in Production]] + + '''important:''' S3A requires the exact version of the amazon-aws-sdk against which Hadoop was built (and is bundled with). If you try to upgrade the library by dropping in a later version, things will break. + + + === Unmaintained: S3N FileSystem (URI scheme: s3n://) === + + '''S3A is the S3 Client for Hadoop 2.6 and earlier. From Hadoop 2.7+, switch to s3a''' + + A native filesystem for reading and writing regular files on S3. The advantage of this filesystem is that you can access files on S3 that were written with other tools. Conversely, other tools can access files written using Hadoop.
The S3N code is stable and widely used, but is not adding any new features (which is why it remains stable). + + S3N requires a compatible version of the jets3t JAR on the classpath. + + Since Hadoop 2.6, all work on S3 integration has been with S3A. S3N is not maintained except for security risks —this helps guarantee security. Most bug reports against S3N will be closed as WONTFIX and the text "use S3A". Please switch to S3A if you can -and do try it before filing bug reports against S3N. + + + === (Deprecated) S3 Block FileSystem (URI scheme: s3://) === + + '''S3 is deprecated and will be removed from Hadoop 2.3''' + - '''important:''' this section covers the s3:// filesystem support inside Apache Hadoop. The one in Amazon EMR is different —see the details at the bottom of this page. + '''important:''' this section covers the s3:// filesystem support from the Apache Software Foundation. The one in Amazon EMR is different —see the details at the bottom of this page. A block-based filesystem backed by S3. Files are stored
[Hadoop Wiki] Update of "LibHDFS" by AkiraAjisaka
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "LibHDFS" page has been changed by AkiraAjisaka: https://wiki.apache.org/hadoop/LibHDFS?action=diff=12=13 Comment: Fix broken link to libhdfs test cases <> = Examples = - The [[http://svn.apache.org/viewvc/hadoop/core/trunk/src/c++/libhdfs/hdfs_test.c|test cases]] for libhdfs provide some good examples on how to use libhdfs. + The [[https://git-wip-us.apache.org/repos/asf?p=hadoop.git;a=tree;f=hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests|test cases]] for libhdfs provide some good examples on how to use libhdfs. <>
[Hadoop Wiki] Update of "HowToCommit" by AkiraAjisaka
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToCommit" page has been changed by AkiraAjisaka: https://wiki.apache.org/hadoop/HowToCommit?action=diff=36=37 Comment: Committer need to update CHANGES.txt when backporting to branch-2.7 or older branches. 1. '''Push changes to remote repo:''' Build and run a test to ensure it is all still kosher. Push the changes to the remote (main) repo using {{{git push }}}. 1. '''Backporting to other branches:''' If the changes were to trunk, we might want to apply them to other appropriate branches. 1. Cherry-pick the changes to other appropriate branches via {{{git cherry-pick -x }}}. The -x option records the source commit, and reuses the original commit message. Resolve any conflicts. - 1. If the conflicts are major, it is preferable to produce a new patch for that branch, review it separately and commit it. When committing an edited patch to other branches, please follow the same steps and make sure to include the JIRA number and description of changes in the commit message. + 1. If the conflicts are major, it is preferable to produce a new patch for that branch, review it separately and commit it. When committing an edited patch to other branches, please follow the same steps and make sure to include the JIRA number and description of changes in the commit message. + 1. When backporting to branch-2.7 or older branches, we need to update CHANGES.txt. 1. Resolve the issue as fixed, thanking the contributor. Always set the "Fix Version" at this point, but please only set a single fix version, the earliest release in which the change will appear. '''Special case'''- when committing to a ''non-mainline'' branch (such as branch-0.22 or branch-0.23 ATM), please set fix-version to either 2.x.x or 3.x.x appropriately too. 1. Set the assignee if it is not set. 
If you cannot set the contributor to the assignee, you need to add the contributor into Contributors role in the project. Please see [[#Roles|Adding Contributors role]] for the detail.
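The effect of the {{{-x}}} flag described in the backporting steps above can be demonstrated in a throwaway repository; the JIRA number, commit messages, and branch name below are hypothetical:

```shell
# Build a throwaway repo with a trunk-style commit and a backport branch.
git init -q backport-demo && cd backport-demo
git config user.email demo@example.com
git config user.name "Demo User"
git commit -q --allow-empty -m "base"
git branch branch-X                          # hypothetical release branch
git commit -q --allow-empty -m "HADOOP-9999. Example fix"
fix=$(git rev-parse HEAD)
# Backport with -x: the source commit hash is recorded in the message.
git checkout -q branch-X
git cherry-pick -x --allow-empty "$fix" >/dev/null
git log -1 --format=%B
```

The final log output keeps the original commit message and appends a "(cherry picked from commit ...)" line, which is what makes backports traceable to their trunk commits.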
[Hadoop Wiki] Update of "HowToReleasePreDSBCR" by SomeOtherAccount
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToReleasePreDSBCR" page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/HowToReleasePreDSBCR?action=diff=83=84 ## page was renamed from HowToReleasePostMavenizationWithGit ## page was copied from HowToReleasePostMavenization ''This page is prepared for Hadoop Core committers. You need committer rights to create a new Hadoop Core release.'' + + + '''WARNING: These instructions use the ASF Jenkins servers to build a release artifact. This is against the ASF release policies!''' The current version of this page is available at HowToRelease
[Hadoop Wiki] Trivial Update of "HowToRelease" by SomeOtherAccount
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToRelease" page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/HowToRelease?action=diff=85=86 1. In JIRA, ensure that only issues in the "Fixed" state have a "Fix Version" set to release X.Y.Z. 1. In JIRA, "release" the version, setting the date to the expected end-of-vote date. Visit the "Administer Project" page, then the "Manage versions" page. You need to have the "Admin" role in HADOOP, HDFS, MAPREDUCE, and YARN. This ensures that the release notes and changes file have the correct date to match the actual release date. 1. Verify that $HOME/.gpg defaults to the key listed in the KEYS file. - 1. On a Docker- and Internet- capable machine, build the release candidate with {{{create-release}}}. Unless the {{{--logdir}}} is given, logs will be in the {{{patchprocess/}}} directory. Artifacts will be in the target/artifacts NOTE: This will take quite a while, since it will download and build the entire source tree, including documentation and native components, from scratch to avoid maven repository caching issues hiding issues with the source release. + 1. For the Apache release, use a Docker- and Internet-capable machine to build the release candidate with {{{create-release}}}. Unless the {{{--logdir}}} is given, logs will be in the {{{patchprocess/}}} directory. Artifacts will be in the target/artifacts NOTE: This will take quite a while, since it will download and build the entire source tree, including documentation and native components, from scratch to avoid maven repository caching issues hiding issues with the source release. {{{ dev-support/bin/create-release --asfrelease --docker --dockercache }}}
[Hadoop Wiki] Trivial Update of "HowToRelease" by SomeOtherAccount
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToRelease" page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/HowToRelease?action=diff=84=85 cp target/artifacts/RELEASENOTES.md hadoop-common-project/hadoop-common/src/site/markdown/release/${version}/RELEASENOTES.${version}.md cp target/artifacts/CHANGES.md hadoop-common-project/hadoop-common/src/site/markdown/release/${version}/CHANGES.${version}.md }}} - 1. Update {{{hadoop-project-dist/pom.xml}}} to point to this new stable version of the API and commit the change. + 1. Copy the jdiff xml files for this version to their appropriate directory. + {{{ + cp hadoop-hdfs-project/hadoop-hdfs/target/site/jdiff/xml/Apache_Hadoop_HDFS_${version}.xml hadoop-hdfs-project/hadoop-hdfs/dev-support/jdiff + }}} + 1. Update {{{hadoop-project-dist/pom.xml}}} {{{ X.Y.Z }}}
[Hadoop Wiki] Trivial Update of "HowToRelease" by SomeOtherAccount
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToRelease" page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/HowToRelease?action=diff=83=84 ## page was copied from HowToReleasePostMavenization ''This page is prepared for Hadoop Core committers. You need committer rights to create a new Hadoop Core release.'' - These instructions have been updated to use dev-support/bin/create-release. Earlier versions of this document are at HowToReleaseWithSvnAndAnt and HowToReleasePostMavenization and HowToReleasePreDSBCR + These instructions have been updated to use dev-support/bin/create-release. Earlier versions of this document are at HowToReleaseWithSvnAndAnt and HowToReleasePostMavenization and [[HowToReleasePreDSBCR]] <> - '''READ ALL OF THESE INSTRUCTIONS THOROUGHLY BEFORE PROCEEDING! + '''READ ALL OF THESE INSTRUCTIONS THOROUGHLY BEFORE PROCEEDING! ''' - ''' = Preparation = 1. If you have not already done so, [[http://www.apache.org/dev/release-signing.html#keys-policy|append your code signing key]] to the [[https://dist.apache.org/repos/dist/release/hadoop/common/KEYS|KEYS]] file. Once you commit your changes, they will automatically be propagated to the website. Also [[http://www.apache.org/dev/release-signing.html#keys-policy|upload your key to a public key server]] if you haven't. End users use the KEYS file (along with the [[http://www.apache.org/dev/release-signing.html#web-of-trust|web of trust]]) to validate that releases were done by an Apache committer. For more details on signing releases, see [[http://www.apache.org/dev/release-signing.html|Signing Releases]] and [[http://www.apache.org/dev/mirror-step-by-step.html?Step-By-Step|Step-By-Step Guide to Mirroring Releases]]. 
@@ -71, +70 @@ mvn versions:set -DnewVersion=X.Y.Z }}} - Now, for any branches in {trunk, branch-X, branch-X.Y, branch-X.Y.Z} that have changed, push them to the remote repo taking care of any conflicts. {{{ @@ -87, +85 @@ 1. On a Docker- and Internet- capable machine, build the release candidate with {{{create-release}}}. Unless the {{{--logdir}}} is given, logs will be in the {{{patchprocess/}}} directory. Artifacts will be in the target/artifacts NOTE: This will take quite a while, since it will download and build the entire source tree, including documentation and native components, from scratch to avoid maven repository caching issues hiding issues with the source release. {{{ dev-support/bin/create-release --asfrelease --docker --dockercache - }}} + }}} 1. While it should fail {{{create-release}}} if there are issues, doublecheck the rat log to find and fix any potential licensing issues. {{{ grep 'Rat check' target/artifacts/mvn_apache_rat.log - }}} + }}} 1. Check that release files look ok - e.g. install it somewhere fresh and run examples from tutorial, do a fresh build, read the release notes looking for WARNINGs, etc. 1. Set environment variable version for later steps. {{{export version=X.Y.Z-RCN}}} 1. Tag the release candidate: {{{ git tag -s release-$version -m "Release candidate - $version" - }}} + }}} 1. Push branch-X.Y.Z and the newly created tag to the remote repo. 1. Deploy the maven artifacts, on your personal computer. Please be sure you have completed the prerequisite step of preparing the {{{settings.xml}}} file before the deployment. You might want to do this in private and clear your history file as your gpg-passphrase is in clear text. {{{ @@ -135, +133 @@ svn ci -m "Publishing the bits for release ${version}" }}} 1. Update upstream branches to make them aware of this new release: -1. Copy and commit the CHANGES.md and RELEASENOTES.md: + 1. 
Copy and commit the CHANGES.md and RELEASENOTES.md: -{{{ + {{{ cp target/artifacts/RELEASENOTES.md hadoop-common-project/hadoop-common/src/site/markdown/release/${version}/RELEASENOTES.${version}.md cp target/artifacts/CHANGES.md hadoop-common-project/hadoop-common/src/site/markdown/release/${version}/CHANGES.${version}.md -}}} + }}} -1. Update {{{hadoop-project-dist/pom.xml}}} to point to this new stable version of the API and commit the change. + 1. Update {{{hadoop-project-dist/pom.xml}}} to point to this new stable version of the API and commit the change. -{{{ + {{{ X.Y.Z -}}} + }}} 1. In [[https://repository.apache.org|Nexus]] 1. effect the release of artifacts by selecting the staged repository and then clicking {{{Release}}} 1. If there were multiple RCs, simply drop the staging repositories corresponding to failed RCs.
[Hadoop Wiki] Update of "HowToRelease" by SomeOtherAccount
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToRelease" page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/HowToRelease?action=diff=83=84 Comment: Rewrite based upon the new dev-support/bin/create-release script ## page was copied from HowToReleasePostMavenization ''This page is prepared for Hadoop Core committers. You need committer rights to create a new Hadoop Core release.'' - These instructions have been updated to use dev-support/bin/create-release. Earlier versions of this document are at HowToReleaseWithSvnAndAnt and HowToReleasePostMavenization and HowToReleasePreDSBCR + These instructions have been updated to use dev-support/bin/create-release. Earlier versions of this document are at HowToReleaseWithSvnAndAnt and HowToReleasePostMavenization and [[HowToReleasePreDSBCR]] <> - '''READ ALL OF THESE INSTRUCTIONS THOROUGHLY BEFORE PROCEEDING! + '''READ ALL OF THESE INSTRUCTIONS THOROUGHLY BEFORE PROCEEDING! ''' - ''' = Preparation = 1. If you have not already done so, [[http://www.apache.org/dev/release-signing.html#keys-policy|append your code signing key]] to the [[https://dist.apache.org/repos/dist/release/hadoop/common/KEYS|KEYS]] file. Once you commit your changes, they will automatically be propagated to the website. Also [[http://www.apache.org/dev/release-signing.html#keys-policy|upload your key to a public key server]] if you haven't. End users use the KEYS file (along with the [[http://www.apache.org/dev/release-signing.html#web-of-trust|web of trust]]) to validate that releases were done by an Apache committer. For more details on signing releases, see [[http://www.apache.org/dev/release-signing.html|Signing Releases]] and [[http://www.apache.org/dev/mirror-step-by-step.html?Step-By-Step|Step-By-Step Guide to Mirroring Releases]]. 
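The KEYS step above can be sketched as a pair of gpg commands. The key ID here is a placeholder, and the exact flags follow the common Apache convention for building KEYS entries rather than anything this page mandates:

```shell
# Append a code-signing key to the KEYS file (sketch; the key ID is a
# placeholder for your own key).
append_key() {
  # KEYS-file convention: a human-readable listing, then the
  # ASCII-armored public key.
  gpg --list-sigs "$1" && gpg --armor --export "$1"
}
# Usage (commented out so the sketch has no side effects):
#   append_key "0xDEADBEEF" >> KEYS && svn commit -m "Add key" KEYS
```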
@@ -71, +70 @@ mvn versions:set -DnewVersion=X.Y.Z }}} - Now, for any branches in {trunk, branch-X, branch-X.Y, branch-X.Y.Z} that have changed, push them to the remote repo taking care of any conflicts. {{{ @@ -87, +85 @@ 1. On a Docker- and Internet- capable machine, build the release candidate with {{{create-release}}}. Unless the {{{--logdir}}} is given, logs will be in the {{{patchprocess/}}} directory. Artifacts will be in the target/artifacts NOTE: This will take quite a while, since it will download and build the entire source tree, including documentation and native components, from scratch to avoid maven repository caching issues hiding issues with the source release. {{{ dev-support/bin/create-release --asfrelease --docker --dockercache - }}} + }}} 1. While it should fail {{{create-release}}} if there are issues, doublecheck the rat log to find and fix any potential licensing issues. {{{ grep 'Rat check' target/artifacts/mvn_apache_rat.log - }}} + }}} 1. Check that release files look ok - e.g. install it somewhere fresh and run examples from tutorial, do a fresh build, read the release notes looking for WARNINGs, etc. 1. Set environment variable version for later steps. {{{export version=X.Y.Z-RCN}}} 1. Tag the release candidate: {{{ git tag -s release-$version -m "Release candidate - $version" - }}} + }}} 1. Push branch-X.Y.Z and the newly created tag to the remote repo. 1. Deploy the maven artifacts, on your personal computer. Please be sure you have completed the prerequisite step of preparing the {{{settings.xml}}} file before the deployment. You might want to do this in private and clear your history file as your gpg-passphrase is in clear text. {{{ @@ -135, +133 @@ svn ci -m "Publishing the bits for release ${version}" }}} 1. Update upstream branches to make them aware of this new release: -1. Copy and commit the CHANGES.md and RELEASENOTES.md: + 1. 
Copy and commit the CHANGES.md and RELEASENOTES.md: -{{{ + {{{ cp target/artifacts/RELEASENOTES.md hadoop-common-project/hadoop-common/src/site/markdown/release/${version}/RELEASENOTES.${version}.md cp target/artifacts/CHANGES.md hadoop-common-project/hadoop-common/src/site/markdown/release/${version}/CHANGES.${version}.md -}}} + }}} -1. Update {{{hadoop-project-dist/pom.xml}}} to point to this new stable version of the API and commit the change. + 1. Update {{{hadoop-project-dist/pom.xml}}} to point to this new stable version of the API and commit the change. -{{{ + {{{ X.Y.Z -}}} + }}} 1. In [[https://repository.apache.org|Nexus]] 1. effect the release of artifacts by selecting the staged repository and then clicking {{{Release}}} 1. If there were multiple RCs, simply drop the staging repositories corresponding to failed RCs. - To unsubscribe, e-mail: common-commits-unsubscr...@hadoop.apache.org For additional commands, e-mail:
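The tag-and-push steps above follow a fixed naming convention (`release-$version` tags, `branch-X.Y.Z` branches). A small sketch, where the version value is only an example and the git commands are commented out so nothing is pushed:

```shell
# Sketch of the RC tagging steps above; 3.1.0-RC0 is an example value
# of the X.Y.Z-RCN convention.
export version=3.1.0-RC0
rc_tag() { printf 'release-%s\n' "$1"; }     # tag name used by the steps
# git tag -s "$(rc_tag "$version")" -m "Release candidate - $version"
# git push origin "branch-${version%-RC*}" "$(rc_tag "$version")"
echo "signed tag would be: $(rc_tag "$version")"
```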
[Hadoop Wiki] Update of "UnixShellScriptProgrammingGuide" by SomeOtherAccount
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "UnixShellScriptProgrammingGuide" page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/UnixShellScriptProgrammingGuide?action=diff=20=21 Comment: More dynamic subcommands updates ## page was renamed from ShellScriptProgrammingGuide = Introduction = - With [[https://issues.apache.org/jira/browse/HADOOP-9902|HADOOP-9902]], the shell script code base has been refactored, with common functions and utilities put into a shell library (hadoop-functions.sh). Here are some tips and tricks to get the most out of using this functionality: = The Skeleton = - All properly built shell scripts contain the following sections: 1. `hadoop_usage` function that contains an alphabetized list of subcommands and their description. This is used when the user directly asks for help, a command line syntax error, etc. - 2. `HADOOP_LIBEXEC_DIR` configured. This should be the location of where `hadoop-functions.sh`, `hadoop-config.sh`, etc, are located. + 1. `HADOOP_LIBEXEC_DIR` configured. This should be the location of where `hadoop-functions.sh`, `hadoop-config.sh`, etc, are located. - 3. `HADOOP_NEW_CONFIG=true`. This tells the rest of the system that the code being executed is aware that it is using the new shell API and it will call the routines it needs to call on its own. If this isn't set, then several default actions that were done in Hadoop 2.x and earlier are executed and several key parts of the functionality are lost. + 1. `HADOOP_NEW_CONFIG=true`. This tells the rest of the system that the code being executed is aware that it is using the new shell API and it will call the routines it needs to call on its own. If this isn't set, then several default actions that were done in Hadoop 2.x and earlier are executed and several key parts of the functionality are lost. - 4. `$HADOOP_LIBEXEC_DIR/abc-config.sh` is executed, where abc is the subproject. 
HDFS scripts should call `hdfs-config.sh`. MAPRED scripts should call `mapred-config.sh`. YARN scripts should call `yarn-config.sh`. Everything else should call `hadoop-config.sh`. This does a lot of standard initialization, processes standard options, etc. This is also what provides override capabilities for subproject specific environment variables. For example, the system will normally ignore `yarn-env.sh`, but `yarn-config.sh` will activate those settings. + 1. `$HADOOP_LIBEXEC_DIR/abc-config.sh` is executed, where abc is the subproject. HDFS scripts should call `hdfs-config.sh`. MAPRED scripts should call `mapred-config.sh`. YARN scripts should call `yarn-config.sh`. Everything else should call `hadoop-config.sh`. This does a lot of standard initialization, processes standard options, etc. This is also what provides override capabilities for subproject specific environment variables. For example, the system will normally ignore `yarn-env.sh`, but `yarn-config.sh` will activate those settings. - 5. At this point, this is where the majority of your code goes. Programs should process the rest of the arguments and do whatever their script is supposed to do. + 1. At this point, this is where the majority of your code goes. Programs should process the rest of the arguments and do whatever their script is supposed to do. - 6. Before executing a Java program (preferably via hadoop_java_exec) or giving user output, call `hadoop_finalize`. This finishes up the configuration details: adds the user class path, fixes up any missing Java properties, configures library paths, etc. + 1. Before executing a Java program (preferably via hadoop_java_exec) or giving user output, call `hadoop_finalize`. This finishes up the configuration details: adds the user class path, fixes up any missing Java properties, configures library paths, etc. - 7. Either an `exit` or an `exec`. 
This should return 0 for success and 1 or higher for failure. - = Adding a Subcommand to an Existing Script = + = Adding a Subcommand to an Existing Script (NOT hadoop-tools-based) = - In order to add a new subcommand, there are two things that need to be done: 1. Add a line to that script's `hadoop_usage` function that lists the name of the subcommand and what it does. This should be alphabetized. - 2. Add an additional entry in the case conditional. Depending upon what is being added, several things may need to be done: + 1. Add an additional entry in the case conditional. Depending upon what is being added, several things may need to be done: + a. Set the `HADOOP_CLASSNAME` to the Java method. b. Add $HADOOP_CLIENT_OPTS to $HADOOP_OPTS (or, for YARN apps, $YARN_CLIENT_OPTS to $YARN_OPTS) if this is an interactive application or for some other reason
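The skeleton steps above can be sketched as one minimal script. Everything named `myapp` is hypothetical; the `hadoop_*` helpers and `*-config.sh` files are the ones the guide names, and the guarded sourcing is only a device to keep the sketch readable and runnable outside a Hadoop install:

```shell
#!/usr/bin/env bash
# Minimal sketch of the skeleton above for a hypothetical "myapp" script.

hadoop_usage() {                       # 1. alphabetized subcommand list
  echo "Usage: myapp COMMAND"
  echo "  version     print the version"
}

# 2. where hadoop-functions.sh, hadoop-config.sh, etc. live
HADOOP_LIBEXEC_DIR="${HADOOP_LIBEXEC_DIR:-/usr/lib/hadoop/libexec}"
# 3. declare that this script knows about the new shell API
HADOOP_NEW_CONFIG=true
# 4. standard init; also activates subproject *-env.sh overrides
if [ -r "${HADOOP_LIBEXEC_DIR}/hadoop-config.sh" ]; then
  . "${HADOOP_LIBEXEC_DIR}/hadoop-config.sh"
fi

# 5. process the remaining arguments
case "${1:-}" in
  version) HADOOP_CLASSNAME=org.apache.hadoop.util.VersionInfo ;;
  *)       hadoop_usage ;;
esac

# 6./7. a real script would finish with:
#   hadoop_finalize
#   hadoop_java_exec myapp "${HADOOP_CLASSNAME}" "$@"   # exit 0 = success
echo "would launch: ${HADOOP_CLASSNAME:-<none>}"
```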
[Hadoop Wiki] Update of "HowToCommit" by AkiraAjisaka
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToCommit" page has been changed by AkiraAjisaka: https://wiki.apache.org/hadoop/HowToCommit?action=diff=34=35 Comment: Fix how to commit changes to the website 1. End-user documentation, versioned with releases; and, 1. The website. This is maintained separately in subversion, republished as it is changed. - To commit end-user documentation changes to trunk or a branch, ask the user to submit only changes made to the *.xml files in {{{src/docs}}}. Apply that patch, run {{{ant docs}}} to generate the html, and then commit. End-user documentation is only published to the web when releases are made, as described in HowToRelease. + To commit end-user documentation changes to trunk or a branch, ask the user to submit only changes made to the *.xml files in {{{src/docs}}}. Apply that patch, run {{{ant docs}}} to generate the html, and then commit. End-user documentation is only published to the web when releases are made, as described in HowToRelease. To commit changes to the website and re-publish them: {{{ svn co https://svn.apache.org/repos/asf/hadoop/common/site @@ -75, +75 @@ svn stat # check for new pages svn add # add any new pages svn commit - ssh people.apache.org - cd /www/hadoop.apache.org/common - svn up }}} + The commit will be reflected on Apache Hadoop site automatically. - Changes to website (''via svn up'') might take up to an hour to be reflected on Apache Hadoop site. - == Patches that break HDFS, YARN and MapReduce ==
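Gathered into one sequence, the publish workflow from the diff above looks like this. The commit message is a placeholder and the svn commands are commented out; the sketch only defines the repository URL:

```shell
# Website publish flow per the diff above; per that diff the live site
# updates automatically after commit (no ssh step needed any more).
SITE_REPO="https://svn.apache.org/repos/asf/hadoop/common/site"
# svn co "$SITE_REPO" && cd site
# ...edit pages...
# svn stat                      # check for new pages
# svn add <new pages>           # add any new pages
# svn commit -m "Update site"   # placeholder message
echo "site repo: $SITE_REPO"
```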
[Hadoop Wiki] Trivial Update of "Ozone" by ArpitAgarwal
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Ozone" page has been changed by ArpitAgarwal: https://wiki.apache.org/hadoop/Ozone?action=diff=2=3 + <> + = Introduction = Ozone is an Object Store for Hadoop that is currently under development. See the Ozone Apache Jira [[https://issues.apache.org/jira/browse/HDFS-7240|HDFS-7240]] for more details. Ozone is currently in a prototype phase.
[Hadoop Wiki] Trivial Update of "Ozone" by ArpitAgarwal
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Ozone" page has been changed by ArpitAgarwal: https://wiki.apache.org/hadoop/Ozone?action=diff=1=2 This wiki page is intended as a guide for Ozone contributors. = Compiling Ozone = - Setup your development environment if you haven't done so already ([[https://wiki.apache.org/hadoop/HowToContribute|Instructions here]]). Switch to the HDFS-7240 branch and build a Hadoop distribution as usual. + Setup your development environment if you haven't done so already ([[https://wiki.apache.org/hadoop/HowToContribute|Instructions here]]). Switch to the HDFS-7240 branch, apply the in-progress patch for [[https://issues.apache.org/jira/browse/HDFS-10363|HDFS-10363]] and build a Hadoop distribution as usual. = Configuration = Create a new ozone-site.xml file in your Hadoop configuration directory and add the following settings for a bare minimal configuration.
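The build steps above can be sketched as follows. The branch and JIRA numbers come from the page; the patch file name and the mvn flags are typical Hadoop-build usage, not taken from the page, so treat them as assumptions:

```shell
# Ozone build sketch (build commands commented out; patch file name and
# mvn flags are assumptions, not from the page).
# git checkout HDFS-7240
# git apply HDFS-10363.patch           # hypothetical local copy of the patch
# mvn package -Pdist -DskipTests -Dtar
ozone_conf_dir() { echo "${HADOOP_CONF_DIR:-/etc/hadoop/conf}"; }
echo "put ozone-site.xml in: $(ozone_conf_dir)"
```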
[Hadoop Wiki] Update of "Defining Hadoop" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Defining Hadoop" page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/Defining%20Hadoop?action=diff=17=18 Comment: review and update, prefix Hadoop with Apache in more places Derivative works may choose to declare that they are ''Powered by Apache Hadoop''. Please see our [[http://www.apache.org/foundation/marks/faq/#poweredby|FAQ entry on Powered By naming styles]]. - There have been cases in the past where this policy has been unclear, and some products were named like ''XYZ distribution of Hadoop''. Such existing vendors of derivative works have been required to change their product names to become compliant with the current Apache Trademark Policy - most are in the process of doing so. No other supplier of derivative works of Apache Hadoop may describe their products in such a way. - == Domain Names == The use of the name ''Hadoop'' in domain names is covered by the [[http://www.apache.org/foundation/marks/domains.html| Apache Third Party Domain Name Branding Policy]]. @@ -45, +43 @@ * The definition of the signatures of the Hadoop interfaces and classes is the Apache Source tree, under revision control. * The definition of semantics of the Hadoop interfaces and classes is the Apache Source tree, including its test classes. - * The verification that the actual semantics of an Apache Hadoop release is compatible with the expected semantics is that the test suites in the Apache codebase pass, and that Hadoop users within the open source community have tested the release running at production scale in their datacentres. + * The verification that the actual semantics of an Apache Hadoop release is compatible with the expected semantics is that the test suites in the Apache codebase pass, and that Hadoop users within the open source community have tested the release running at production scale in their datacenters. 
* Bug reports can highlight incompatibility with expectations of community users, and once incorporated into tests form part of the compatibility testing. * Beta testing of forthcoming releases of Apache Hadoop is of great value in finding unexpected problems, and so not only benefits the product, it benefits the beta testers, who can be more confident that their code will work in the final release. * The Hadoop source tree has annotations to mark any interface as Public or Private, and Stable vs Unstable, independently of the Java public/private annotations. @@ -84, +82 @@ "Automotive Hadoop" is a trademark of Joe's Automotive." - Bad: Unless this is for a wrench or other product completely unrelated to computer software, this is a clear infringement on Apache's Hadoop registered mark. + This is a clear infringement on Apache's Hadoop registered mark, a mark held in many countries. === INAPPROPRIATE: Camshaft: it's a Hadoop for the Automotive industry === - It's good that Joe has created his own product name and brand, but saying "a Hadoop" is trouble. If it does contain Hadoop-related artifacts, then it breaks the trademark rules. If it doesn't contain ASF code, then it falls foul of the Generic Trademark problem: the ASF don't want their products to be generified, and will send a note reminding Joe of their rights and obligations. + It's good that Joe has created his own product name and brand, but saying "a Hadoop" is trouble. If it does contain Apache Hadoop-related artifacts, then it breaks the trademark rules. If it doesn't contain ASF code, then it falls foul of the Generic Trademark problem: the ASF don't want their products to be generified, and will send a note reminding Joe of their rights and obligations. === APPROPRIATE: Camshaft: Joe's datamining solution for the Automotive industry === @@ -96, +94 @@ Good: it defines a new product "Camshaft", and opts to use the Apache Hadoop brand to emphasize its heritage. The marketing text sells the product. 
- === APPROPRIATE: Automotive Joe's "Hadoop for Automotive Engineers" === + === APPROPRIATE: Automotive Joe's "Apache Hadoop for Automotive Engineers" === "Continuing Automotive Joe's best selling series, including the popular titles "Spark Gap tuning" and "Datacenter fabric: architecture and implementation", the book "Hadoop for Automotive Engineers" explains Apache Hadoop in an easy and practical way. As with the rest of the series, the cover is designed to be easy to wipe oil off. " - Good: provided it credits Apache properly inside, this appears to be a good book title. Furthermore, because it's the "Automotive Joe" book series, and not "Automotive Joe's Hadoop" series, the series doesn't infringe anything. Please see our [[http://www.apache.org/foundation/marks/faq/#booktitle|FAQ entry on using Apache marks in book titles]]. + Good: provided it credits Apache properly, this
[Hadoop Wiki] Update of "AmazonS3" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "AmazonS3" page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/AmazonS3?action=diff=19=20 Comment: update s3a docs, callout AWS, change heading levels + = S3 Support in Apache Hadoop = + [[http://aws.amazon.com/s3|Amazon S3]] (Simple Storage Service) is a data storage service. You are billed - monthly for storage and data transfer. Transfer between S3 and [[AmazonEC2]] is free. This makes use of + monthly for storage and data transfer. Transfer between S3 and [[AmazonEC2]] instances in the same geographical location are free. This makes use of S3 attractive for Hadoop users who run clusters on EC2. Hadoop provides multiple filesystem clients for reading and writing to and from Amazon S3 or compatible service. - S3 Native FileSystem (URI scheme: s3n):: + === S3 Native FileSystem (URI scheme: s3n) === - A native filesystem for reading and writing regular files on S3. The advantage of this filesystem is that you can access files on S3 that were written with other tools. Conversely, other tools can access files written using Hadoop. The disadvantage is the 5GB limit on file size imposed by S3. + A native filesystem for reading and writing regular files on S3. The advantage of this filesystem is that you can access files on S3 that were written with other tools. Conversely, other tools can access files written using Hadoop. The S3N code is stable and widely used, but is not adding any new features (which is why it remains stable). S3N requires a suitable version of the jets3t JAR on the classpath. - S3A (URI scheme: s3a):: - A successor to the S3 Native, s3n fs, the S3a: system uses Amazon's libraries to interact with S3. This allows S3a to support larger files (no more 5GB limit), higher performance operations and more. 
The filesystem is intended to be a replacement for/successor to S3 Native: all objects accessible from s3n:// URLs should also be accessible from s3a simply by replacing the URL schema. + === S3A (URI scheme: s3a) === + + A successor to the S3 Native, s3n:// filesystem, the S3a: system uses Amazon's libraries to interact with S3. This allows S3a to support larger files (no more 5GB limit), higher performance operations and more. The filesystem is intended to be a replacement for/successor to S3 Native: all objects accessible from s3n:// URLs should also be accessible from s3a simply by replacing the URL schema. + + S3A has been considered usable in production since Hadoop 2.7, and is undergoing active maintenance for enhanced security, scalability and performance. + + '''important:''' S3A requires the exact version of the amazon-aws-sdk against which Hadoop was built (and is bundled with). + - S3 Block FileSystem (URI scheme: s3):: + === S3 Block FileSystem (URI scheme: s3) === + + '''important:''' this section covers the s3:// filesystem support inside Apache Hadoop. The one in Amazon EMR is different —see the details at the bottom of this page. + - A block-based filesystem backed by S3. Files are stored as blocks, just like they are in HDFS. This permits efficient implementation of renames. This filesystem requires you to dedicate a bucket for the filesystem - you should not use an existing bucket containing files, or write other files to the same bucket. The files stored by this filesystem can be larger than 5GB, but they are not interoperable with other S3 tools. + A block-based filesystem backed by S3. Files are stored as blocks, just like they are in HDFS. This permits efficient implementation of renames. This filesystem requires you to dedicate a bucket for the filesystem - you should not use an existing bucket containing files, or write other files to the same bucket. 
The files stored by this filesystem can be larger than 5GB, but they are not interoperable with other S3 tools. Nobody is/should be uploading data to S3 via this scheme any more; it will eventually be removed from Hadoop entirely. Consider it (as of May 2016) deprecated. + S3 can be used as a convenient repository for data input to and output for analytics applications using either S3 filesystem. Data in S3 outlasts Hadoop clusters on EC2, so they should be where persistent data must be kept. Note that by using S3 as an input you lose the data locality optimization, which may be significant. The general best practice is to copy in data using `distcp` at the start of a workflow, then copy it out at the end, using the transient HDFS in between. - = History = + == History == * The S3 block filesystem was introduced in Hadoop 0.10.0 ([[http://issues.apache.org/jira/browse/HADOOP-574|HADOOP-574]]). * The S3 native filesystem was introduced in Hadoop 0.18.0 ([[http://issues.apache.org/jira/browse/HADOOP-930|HADOOP-930]]) and rename support was added in Hadoop 0.19.0
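The copy-in/copy-out pattern described above can be sketched with `distcp`. Bucket and path names are placeholders, and the credential property names are the usual s3a ones from general Hadoop usage rather than from this page:

```shell
# distcp-based copy-in / copy-out around a job (bucket and paths are
# placeholders; credentials normally live in core-site.xml as
# fs.s3a.access.key / fs.s3a.secret.key).
# hadoop distcp s3a://my-bucket/input  hdfs:///tmp/job/input
# ...run the analytics job against hdfs:///tmp/job...
# hadoop distcp hdfs:///tmp/job/output s3a://my-bucket/output
s3a_url() { printf 's3a://%s/%s\n' "$1" "$2"; }   # URL form used above
echo "example input URL: $(s3a_url my-bucket input)"
```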
[Hadoop Wiki] Trivial Update of "PoweredBy" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "PoweredBy" page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/PoweredBy?action=diff=438=439 Comment: add a title with term "Apache Hadoop"; use "commercial support" as linktext for distributions and commercial support + = Powered by Apache Hadoop = + - This page documents an alphabetical list of institutions that are using Hadoop for educational or production uses. Companies that offer services on or based around Hadoop are listed in [[Distributions and Commercial Support]]. Please include details about your cluster hardware and size. Entries without this may be mistaken for spam references and deleted.'' '' + This page documents an alphabetical list of institutions that are using Apache Hadoop for educational or production uses. Companies that offer services on or based around Hadoop are listed in [[Distributions and Commercial Support|Commercial Support]]. Please include details about your cluster hardware and size. Entries without this may be mistaken for spam references and deleted.'' '' To add entries you need write permission to the wiki, which you can get by subscribing to the common-...@hadoop.apache.org mailing list and asking for permissions on the wiki account username you've registered yourself as. If you are using Apache Hadoop in production you ought to consider getting involved in the development process anyway, by filing bugs, testing beta releases, reviewing the code and turning your notes into shared documentation. Your participation in this process will ensure your needs get met. @@ -70, +72 @@ * ''[[http://atxcursions.com/|ATXcursions]] '' * ''Two applications that are side products/projects of a local tour company: 1. Sentiment analysis of review websites and social media data. Targeting the tourism industry. 2. 
Marketing tool that analyzes the most valuable/useful reviewers from sites like Tripadvisor and Yelp as well as social media. Lets marketers and business owners find community members most relevant to their businesses. '' - * ''Using Apache Hadoop, HDFS, Hive, and HBase.'' + * ''Using Apache Hadoop, HDFS, Hive, and HBase.'' * ''3 node cluster, 4 cores, 4GB RAM.'' @@ -88, +90 @@ * ''35 Node Cluster '' * ''We have been running our cluster with no downtime for over 2 ½ years and have successfully handled over 75 Million files on a 64 GB Namenode with 50 TB cluster storage. '' * ''We are heavy MapReduce and Apache HBase users and use Apache Hadoop with Apache HBase for semi-supervised Machine Learning, AI R, Image Processing & Analysis, and Apache Lucene index sharding using katta. '' - + * ''[[http://www.beebler.com|Beebler]] '' * ''14 node cluster (each node has: 2 dual core CPUs, 2TB storage, 8GB RAM) '' * ''We use Apache Hadoop for matching dating profiles '' @@ -421, +423 @@ * ''[[http://www.legolas-media.com|Legolas Media]] '' * ''[[http://www.linkedin.com|LinkedIn]] '' - * ''We have multiple grids divided up based upon purpose. + * ''We have multiple grids divided up based upon purpose. * ''Hardware: '' * ''~800 Westmere-based HP SL 170x, with 2x4 cores, 24GB RAM, 6x2TB SATA '' * ''~1900 Westmere-based SuperMicro X8DTT-H, with 2x6 cores, 24GB RAM, 6x2TB SATA ''
[Hadoop Wiki] Update of "UnknownHost" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "UnknownHost" page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/UnknownHost?action=diff=9=10 Comment: mention unknown localhost a. The hostname in the configuration files (such as {{{core-site.xml}}}) is misspelled. 1. The hostname in the configuration files (such as {{{core-site.xml}}}) is confused with the hostname of another service. For example, you are using the hostname of the YARN Resource Manager in the {{{fs.defaultFS}}} configuration option to define the namenode. 1. A worker node thinks it has a given name which it reports to the NameNode and JobTracker, but that isn't the name that the network team gave it, so it isn't resolvable. + 1. If it is happening in service startup, it means the hostname of that service (HDFS, YARN, etc) cannot be found in {{{/etc/hosts}}}; the service will fail to start as it cannot determine which network card/address to use. 1. The calling machine is on a different subnet from the target machine, and short names are being used instead of fully qualified domain names (FQDNs). 1. You are running in a cloud infrastructure and the destination machine is no longer there. It may have been deleted from the DNS records, or, due to some race condition, something is trying to talk to a host that hasn't been created yet.
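The failure modes listed above mostly reduce to name resolution. A few quick checks, using standard tools only (the FQDN in the comment is a placeholder):

```shell
# Quick resolution checks for the UnknownHost causes listed above.
host=$(hostname)                       # what this node thinks it is called
getent hosts "$host"   >/dev/null || echo "WARN: '$host' does not resolve (check /etc/hosts and DNS)"
getent hosts localhost >/dev/null || echo "WARN: 'localhost' missing from /etc/hosts"
# For the cross-subnet case, prefer FQDNs in core-site.xml, e.g.
#   hdfs://namenode.example.com:8020   (placeholder hostname)
```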
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "Books" page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff=35=36 }}} + === Hadoop Real-World Solutions Cookbook- Second Edition === + '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/hadoop-real-world-solutions-cookbook-second-edition|Hadoop Real-World Solutions Cookbook- Second Edition]] + + '''Author:''' Tanmay Deshpande + + '''Publisher:''' Packt Publishing + + '''Date of Publishing:''' March 2016 + + The book covers recipes that are based on the latest versions of Apache Hadoop 2.X, YARN, Hive, Pig, Sqoop, Flume, Apache Spark, Mahout etc. + === Hadoop Security: Protecting Your Big Data Platform === '''Name:''' [[https://www.gitbook.com/book/steveloughran/kerberos_and_hadoop/details|Hadoop Security: Protecting Your Big Data Platform]]
[Hadoop Wiki] Update of "ZooKeeper/HowToContribute" by PatrickHunt
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "ZooKeeper/HowToContribute" page has been changed by PatrickHunt: https://wiki.apache.org/hadoop/ZooKeeper/HowToContribute?action=diff=10=11 + = This page is deprecated - please see our new home at https://cwiki.apache.org/confluence/display/ZOOKEEPER = + = How to Contribute to ZooKeeper = This page describes the mechanics of ''how'' to contribute software to ZooKeeper. For ideas about ''what'' you might contribute, please see the [[ZooKeeper/ProjectSuggestions| ProjectSuggestions page]].
[Hadoop Wiki] Update of "ZooKeeper" by PatrickHunt
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "ZooKeeper" page has been changed by PatrickHunt: https://wiki.apache.org/hadoop/ZooKeeper?action=diff=29=30 + = This page is deprecated - please see our new home at https://cwiki.apache.org/confluence/display/ZOOKEEPER = + + == General Information == ZooKeeper: Because coordinating distributed systems is a Zoo
[Hadoop Wiki] Update of "HowToRelease" by XiaoChen
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification. The "HowToRelease" page has been changed by XiaoChen: https://wiki.apache.org/hadoop/HowToRelease?action=diff=81=82 Comment: Add 1 step at the beginning of 'Creating the release candidate', according to HADOOP-12768. = Creating the release candidate (X.Y.Z-RC) = These steps need to be performed to create the ''N''th RC for X.Y.Z, where ''N'' starts from 0. + 1. Check if the release year for Web UI footer is updated (the property {{{}}} in {{{hadoop-project/pom.xml}}}). If not, create a jira to update the property value to the right year, and propagate the fix from trunk to all necessary branches. Considering the voting time needed before publishing, it's better to use the year of (current time + voting time) here, to be consistent with the publishing time. 1. Run mvn rat-check and fix any errors {{{ mvn apache-rat:check
[Hadoop Wiki] Update of "SocketException" by SteveLoughran
The "SocketException" page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/SocketException?action=diff=2=3

Comment:
java.net.SocketException: Permission denied

  Remember: These are [[YourNetworkYourProblem|your network configuration problems]]. Only you can fix them.
+
+ == Permission denied ==
+
+ This can arise if the service is configured to listen on a port numbered less than 1024, but is not running as a user with the appropriate permissions.
+
+ {{{
+ 2016-03-22 15:26:18,905 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
+ java.net.SocketException: Permission denied
+         at sun.nio.ch.Net.bind0(Native Method)
+         at sun.nio.ch.Net.bind(Net.java:433)
+         at sun.nio.ch.Net.bind(Net.java:425)
+         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
+         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
+         at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
+         at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:522)
+         at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1196)
+         at io.netty.channel.ChannelHandlerInvokerUtil.invokeBindNow(ChannelHandlerInvokerUtil.java:108)
+         at io.netty.channel.DefaultChannelHandlerInvoker.invokeBind(DefaultChannelHandlerInvoker.java:214)
+         at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:208)
+         at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:1003)
+         at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:216)
+         at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:357)
+         at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:322)
+         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:356)
+         at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:703)
+         at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
+         at java.lang.Thread.run(Thread.java:745)
+ 2016-03-22 15:26:18,907 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
+ 2016-03-22 15:26:18,908 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
+ /
+ }}}
+
+ Fixes: either run the service (here, the Datanode) as a user with permissions, or change the service configuration to use a higher numbered port.
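The rule behind this error can be sketched in shell; the helper name and port numbers below are mine, not from the page. On Linux, ports below 1024 are privileged by default: binding them needs root or the CAP_NET_BIND_SERVICE capability (the threshold is tunable via net.ipv4.ip_unprivileged_port_start).

```shell
#!/bin/sh
# Hypothetical helper: does binding this port need elevated privileges?
# Assumes the default Linux privileged range of ports 1..1023.
needs_privilege() {
    [ "$1" -lt 1024 ]
}

if needs_privilege 1004; then   # e.g. a service configured onto a low port
    echo "low port: run as root, grant CAP_NET_BIND_SERVICE, or move the port"
fi
needs_privilege 9866 || echo "high port: any user may bind"
```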
[Hadoop Wiki] Update of "ConnectionRefused" by SteveLoughran
The "ConnectionRefused" page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/ConnectionRefused?action=diff=12=13

Comment:
subdomains

  If the application or cluster is not working, and this message appears in the log, then it is more serious.
- 1. Check the hostname the client using is correct. If it's in a Hadoop configuration option: examine it carefully, try doing an ping by hand
+ 1. Check that the hostname the client is using is correct. If it's in a Hadoop configuration option, examine it carefully and try doing a ping by hand.
  1. Check that the IP address the client is trying to talk to for the hostname is correct.
- 1. Make sure the destination address in the exception isn't 0.0.0.0 -this means that you haven't actually configured the client with the real address for that
+ 1. Make sure the destination address in the exception isn't 0.0.0.0; this means that you haven't actually configured the client with the real address for that service, and instead it is picking up the server-side property telling it to listen on every port for connections.
  1. If the error message says the remote service is on "127.0.0.1" or "localhost" that means the configuration file is telling the client that the service is on the local server. If your client is trying to talk to a remote system, then your configuration is broken.
- 1. Check that there isn't an entry for your hostname mapped to 127.0.0.1 or 127.0.1.1 in /etc/hosts (Ubuntu is notorious for this)
+ 1. Check that there isn't an entry for your hostname mapped to 127.0.0.1 or 127.0.1.1 in /etc/hosts (Ubuntu is notorious for this).
  1. Check that the port the client is trying to talk to matches the one the server is offering a service on.
  1. On the server, try a {{{telnet localhost }}} to see if the port is open there.
  1. On the client, try a {{{telnet }}} to see if the port is accessible remotely.
  1. Try connecting to the server/port from a different machine, to see if it is just the single client misbehaving.
+ 1. If your client and the server are in different subdomains, it may be that the configuration of the service is only publishing the basic hostname, rather than the Fully Qualified Domain Name. The client in the different subdomain can then unintentionally attempt to connect to a host in the local subdomain, and fail.
  1. If you are using a Hadoop-based product from a third party, please use the support channels provided by the vendor.
  1. Please do not file bug reports related to your problem, as they will be closed as [[http://wiki.apache.org/hadoop/InvalidJiraIssues|Invalid]]
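The /etc/hosts check from that list is easy to script. This is a sketch only: the function name, hostnames, and hosts-file contents are fabricated for the demo.

```shell
#!/bin/sh
# Hypothetical check: warn if a hostname is mapped to a loopback address
# (127.0.0.1 or the Ubuntu 127.0.1.1 trap) in a hosts file.
check_hosts_mapping() {
    host="$1"; file="$2"
    if grep "^127\.0\.[01]\.1[[:space:]]" "$file" | grep -qw "$host"; then
        echo "WARNING: $host is mapped to loopback in $file"
    else
        echo "OK: $host has no loopback mapping in $file"
    fi
}

# Demo against a fabricated hosts file:
hosts=$(mktemp)
printf '127.0.0.1 localhost\n127.0.1.1 node1.example.com node1\n' > "$hosts"
check_hosts_mapping node1 "$hosts"
rm -f "$hosts"
```

In real use you would call it as `check_hosts_mapping "$(hostname)" /etc/hosts`.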
[Hadoop Wiki] Update of "HowToReleasePre2.8" by AndrewWang
The "HowToReleasePre2.8" page has been changed by AndrewWang:
https://wiki.apache.org/hadoop/HowToReleasePre2.8?action=diff=81=82

## page was renamed from HowToReleasePostMavenizationWithGit
## page was copied from HowToReleasePostMavenization

  ''This page is prepared for Hadoop Core committers. You need committer rights to create a new Hadoop Core release.''
+
+ The current version of this page is available at HowToRelease
  These instructions have been updated for Hadoop 2.5.1 and later releases to reflect the changes to version-control (git), build-scripts and mavenization.
[Hadoop Wiki] Update of "HowToRelease" by AndrewWang
The "HowToRelease" page has been changed by AndrewWang:
https://wiki.apache.org/hadoop/HowToRelease?action=diff=80=81

Comment:
Remove manual CHANGES.txt related steps

  ## page was copied from HowToReleasePostMavenization
  ''This page is prepared for Hadoop Core committers. You need committer rights to create a new Hadoop Core release.''
- These instructions have been updated for Hadoop 2.5.1 and later releases to reflect the changes to version-control (git), build-scripts and mavenization.
+ These instructions have been updated for Hadoop 2.8.0 and later releases to reflect the changes to version-control (git), build-scripts and mavenization.
- Earlier versions of this document are at HowToReleaseWithSvnAndAnt and HowToReleasePostMavenization
+ Earlier versions of this document are at HowToReleaseWithSvnAndAnt, HowToReleasePostMavenization and HowToReleasePre2.8

<>

@@ -32, +32 @@

  = Branching =
  When releasing Hadoop X.Y.Z, the following branching changes are required. Note that a release can match more than one of the following if-conditions. For a major release, one needs to make the changes for minor and point releases as well. Similarly, a new minor release is also a new point release.
- 1. Add the release X.Y.Z to CHANGES.txt files if it doesn't already exist (leave the date as unreleased for now). Commit these changes to any '''live''' upstream branch. For example, if you are handling 2.6.2, commit the changes to trunk, branch-2, branch-2.6, and branch-2.7 (provided branch-2.7 is an active branch).
- {{{
- git commit -a -m "Adding release X.Y.Z to CHANGES.txt"
- }}}
  1. If this is a new major release (i.e., Y = 0 and Z = 0)
   1. Create a new branch (branch-X) for all releases in this major release.
   1. Update the version on trunk to (X+1).0.0-SNAPSHOT

@@ -100, +96 @@

  mvn apache-rat:check
  }}}
  1. Set environment variable version for later steps. {{{export version=X.Y.Z-RCN}}}
- 1. Set the release date for X.Y.Z to the current date in each CHANGES.txt file in branch-X.Y.Z and commit the changes.
- {{{
- git commit -a -m "Set the release date for $version"
- }}}
  1. Tag the release candidate:
  {{{
  git tag -s release-$version -m "Release candidate - $version"
  }}}

@@ -139, +131 @@

  = Publishing =
  In 5 days if [[http://hadoop.apache.org/bylaws#Decision+Making|the release vote passes]], the release may be published.
- 1. Update the release date in CHANGES.txt to the final release vote passage date, reflecting the one in branch-X.Y.Z, on all live upstream branches (e.g., trunk, branch-X, branch-X.Y). Commit and push those changes.
- {{{
- git commit -a -m "Set the release date for X.Y.Z"
- }}}
  1. Tag the release. Do it from the release branch and push the created tag to the remote repository:
  {{{
  git tag -s rel/release-X.Y.Z -m "Hadoop X.Y.Z release"
  git push origin rel/release-X.Y.Z
  }}}
- 1. Use [[https://builds.apache.org/job/HADOOP2_Release_Artifacts_Builder|this Jenkins job]] to create the final release files
  1. Copy release files to the distribution directory
   1. Check out the corresponding svn repo if need be
   {{{
   svn co https://dist.apache.org/repos/dist/release/hadoop/common/ hadoop-dist
   }}}
-  1. Generate new .mds files referring to the final release tarballs and not the RCs
   1. Copy the release files to hadoop-dist/hadoop-${version}
   1. Update the symlinks to current2 and stable2. The release directory usually contains just two releases, the most recent from two branches.
   1. Commit the changes (it requires a PMC privilege)
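The "generate new digest files referring to the final release tarballs and not the RCs" step can be sketched as below. This uses sha512sum rather than the page's older .mds format, and the tarball is a fabricated stand-in, not a real release artifact.

```shell
#!/bin/sh
# Sketch: write a digest file that records the FINAL tarball name,
# so it verifies after download. File names are fabricated for the demo.
set -e
dist=$(mktemp -d)
tarball="hadoop-X.Y.Z.tar.gz"
printf 'fake release payload\n' > "$dist/$tarball"

# cd first so the digest file contains the bare file name, not a local path:
( cd "$dist" && sha512sum "$tarball" > "$tarball.sha512" )   # macOS: shasum -a 512

cat "$dist/$tarball.sha512"
```

A downloader can then check the artifact with `sha512sum -c hadoop-X.Y.Z.tar.gz.sha512` from the same directory.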
[Hadoop Wiki] Update of "HowToCommit" by AndrewWang
The "HowToCommit" page has been changed by AndrewWang:
https://wiki.apache.org/hadoop/HowToCommit?action=diff=33=34

Comment:
remove CHANGES.txt step, now autogenerated

  Committing a patch
  When you commit a patch, please follow these steps:
- 1. '''CHANGES.txt:''' Add an entry in CHANGES.txt, at the end of the appropriate section. This should include the JIRA issue ID, and the name of the contributor. Attribution in CHANGES.txt should fall under the earliest release that is receiving the patch, and it should be consistent across all live branches. If the patch is targeted to 2.8.0, then its CHANGES.txt entry would go in the 2.8.0 section on trunk and branch-2. If the patch is targeted to 2.7.2, then its CHANGES.txt entry would go in the 2.7.2 section on trunk, branch-2 and branch-2.7. When backporting a patch that was previously committed for a later branch, please update its CHANGES.txt entry on all branches for accuracy. Suppose a patch initially targets 2.8.0, but then later becomes a candidate for 2.7.2. On the initial commit, it would have been listed under the 2.8.0 section on trunk and branch-2. After the decision to backport to 2.7.2, go back and update CHANGES.txt on all branches to match reality, moving it to the 2.7.2 section on trunk, branch-2 and branch-2.7.
  1. '''Commit locally:''' Commit the change locally to the appropriate branch (should be ''trunk'' if it is not a feature branch) using {{{git commit -a -m }}}. The commit message should include the JIRA issue id, along with a short description of the change and the name of the contributor if it is not you. ''Note:'' Be sure to get the issue id right, as this causes JIRA to link to the change in git (use the issue's "All" tab to see these). Verify all the changes are included in the commit using {{{git status}}}. If there are any remaining changes (previously missed files), please commit them and squash these commits into one using {{{git rebase -i}}}.
  1. '''Pull latest changes from remote repo:''' Pull in the latest changes from the remote branch using {{{git pull --rebase}}} (--rebase is not required if you have set up git pull to always --rebase). Verify this didn't cause any merge commits using {{{git log [--pretty=oneline]}}}
  1. '''Push changes to remote repo:''' Build and run a test to ensure it is all still kosher. Push the changes to the remote (main) repo using {{{git push }}}.
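The commit/rebase/push steps above can be exercised end-to-end against a throwaway local bare repository, which makes the flow safe to run anywhere. The repository paths, branch name, JIRA id, and author below are all fabricated for the demo.

```shell
#!/bin/sh
# Sketch of the commit workflow: commit locally, pull --rebase, push.
set -e
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git clone -q "$work/remote.git" "$work/clone" 2>/dev/null
cd "$work/clone"
git config user.email dev@example.com
git config user.name "Demo Dev"

# Seed the remote so there is history to rebase against:
echo base > file.txt
git add file.txt
git commit -q -m "HADOOP-99998. Base commit (fabricated)"
git push -q origin HEAD:master

# 1. Commit locally, JIRA issue id first in the message:
echo change >> file.txt
git commit -aq -m "HADOOP-99999. Example change (contributed by Demo Dev)"

# 2. Pull the latest remote changes with --rebase (avoids merge commits):
git pull -q --rebase origin master

# 3. Push to the remote repo:
git push -q origin HEAD:master

# The remote now has the change at its tip:
git --git-dir="$work/remote.git" log --format=%s -n 1
```

The same shape works against the real gitbox remote; only the clone URL and branch differ.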
[Hadoop Wiki] Update of "Roadmap" by AndrewWang
The "Roadmap" page has been changed by AndrewWang:
https://wiki.apache.org/hadoop/Roadmap?action=diff=60=61

   * Move default ports out of ephemeral range [[https://issues.apache.org/jira/browse/HDFS-9427|HDFS-9427]]
  * HDFS
   * Removal of hftp in favor of webhdfs [[https://issues.apache.org/jira/browse/HDFS-5570|HDFS-5570]]
+  * Support for more than two standby NameNodes [[https://issues.apache.org/jira/browse/HDFS-6440|HDFS-6440]]
+  * Support for Erasure Codes in HDFS [[https://issues.apache.org/jira/browse/HDFS-7285|HDFS-7285]]
  * YARN
  * MAPREDUCE
   * Derive heap size or mapreduce.*.memory.mb automatically [[https://issues.apache.org/jira/browse/MAPREDUCE-5785|MAPREDUCE-5785]]

@@ -65, +67 @@

  === hadoop-2.9 ===
  * HADOOP
  * HDFS
-  * Support for Erasure Codes in HDFS [[https://issues.apache.org/jira/browse/HDFS-7285|HDFS-7285]]
  * YARN
  * MAPREDUCE
[Hadoop Wiki] Update of "Roadmap" by AndrewWang
The "Roadmap" page has been changed by AndrewWang:
https://wiki.apache.org/hadoop/Roadmap?action=diff=59=60

  * Move to JDK8+
  * Classpath isolation on by default [[https://issues.apache.org/jira/browse/HADOOP-11656|HADOOP-11656]]
  * Shell script rewrite [[https://issues.apache.org/jira/browse/HADOOP-9902|HADOOP-9902]]
+ * Move default ports out of ephemeral range [[https://issues.apache.org/jira/browse/HDFS-9427|HDFS-9427]]
  * HDFS
   * Removal of hftp in favor of webhdfs [[https://issues.apache.org/jira/browse/HDFS-5570|HDFS-5570]]
  * YARN