[Hadoop Wiki] Update of "HBase/Articles" by Misty
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HBase/Articles" page has been changed by Misty:
https://wiki.apache.org/hadoop/HBase/Articles?action=diff&rev1=15&rev2=16

+ The HBase Wiki is in the process of being decommissioned. The info that used to be on this page has moved to https://hbase.apache.org/book.html#other.info. Please update your bookmarks.
- * [[http://olex.openlogic.com/wazi/2011/tips-on-loading-and-real-time-searching-of-big-data-sets/|Tips on Loading and Real-Time Searching of Big Data Sets]] by Rod Cope, May 17, 2011
- * [[http://www.cloudera.com/blog/2010/07/whats-new-in-cdh3-b2-hbase/|What’s New in CDH3b2: HBase]] by Todd Lipcon, July 9, 2010
- * [[http://www.cslab.ntua.gr/~ikons/distributed_indexing_of_webscale_datasets_for_the_cloud_mdac_2010_cr.pdf|Distributed Indexing of Web Scale Datasets for the Cloud]] by Ioannis Konstantinou, Evangelos Angelou, Dimitrios Tsoumakos and Nectarios Koziris, April 2010
- * [[http://data-tactics.com/techtips/cloud_data_structure_diagramming.pdf|Cloud Data Structure Diagramming Techniques and Design Patterns]] Posted by Rhonda Fetters, Nov 2009 (last updated March 2010)
- * [[http://www.ibm.com/developerworks/opensource/library/os-hbase/|Finding the way through the semantic Web with HBase]] by Gabriel Mateescu, September 15, 2009
- * [[http://blog.ibd.com/scalable-deployment/experience-installing-hbase-0-20-0-cluster-on-ubuntu-9-04-and-ec2/|Experience installing Hbase 0.20.0 Cluster on Ubuntu 9.04 and EC2]] Posted by Robert J. Berger, Sep 8, 2009
- * [[http://airodig.com/?p=56|HBase to the rescue]] Feydr on hooking up thrift and hbase, August 2009
- * [[http://www.cringely.com/2009/05/the-sequel-dilemma/|The Sequel Dilemma]] I, Cringely, May 2009
- * [[http://jimbojw.com/wiki/index.php?title=Understanding_Hbase_and_BigTable|Understanding Hbase and BigTable]] by Jim R. Wilson, May 19, 2008
- * [[http://www.infoq.com/news/2008/04/hbase-interview|HBase Leads Discuss Hadoop, BigTable and Distributed Databases]] Posted by Scott Delap, Apr 28, 2008, at Infoq
[Hadoop Wiki] Update of "Books" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Books" page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/Books?action=diff&rev1=26&rev2=27

Comment: rm all packt publishing trackback args in their book URLs, cut the flume and hive books.

== Hadoop Books ==
- These books are listed in order of publication, most recent first. The Apache Software Foundation does not endorse any specific book. The links to Amazon are affiliated with the specific author. That said, we also encourage you to support your local bookshops, by buying the book from any local outlet, especially independent ones.

== Books in Print ==
- Here are the books that are currently in print -in order of publishing-, along with the Hadoop version they were written against. One problem anyone writing a book will encounter is that Hadoop is a very fast-moving target, and things can change fast. Usually this is for the better: when a book says "Hadoop can't", it really means "the version of Hadoop we worked with couldn't", and the situation may have improved since then. If you have any query about Hadoop, don't be afraid to ask on the relevant user mailing lists.

- {{{#!wiki comment/dotted Attention people adding new entries.

@@ -15, +12 @@
 # Please write this in a neutral voice, not "this book will help you", as that implies that the ASF has opinions on the matter. Someone will just edit the claims out.
 # Please do not go overboard in exaggerating the outcome of reading a book, "readers of this book will become experts in advanced production-scale Hadoop MapReduce jobs". Such claims will be edited out.
+ # Please don't have tracking URLs. We'll only cut them.
}}}

+ === YARN Essentials ===
+ '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/yarn-essentials/|YARN Essentials]]
+ '''Authors:''' Amol Fasale, Nirmal Kumar

- === Apache Flume: Distributed Log Collection for Hadoop - Second Edition ===
- '''Name:''' [[https://www.packtpub.com/application-development/apache-flume-distributed-log-collection-hadoop-second-edition/?utm_source=Pgwiki.apache.org&utm_medium=pod&utm_campaign=1784392170|Apache Flume: Distributed Log Collection for Hadoop - Second Edition]]
- '''Author:''' Steve Hoffman

'''Publisher:''' Packt Publishing

'''Date of Publishing:''' February, 2015

- Apache Flume: Distributed Log Collection for Hadoop - Second Edition is for Hadoop programmers who want to learn about Flume to be able to move datasets into Hadoop in a timely and replicable manner.
+ YARN Essentials is for developers with little knowledge of Hadoop 1.x who want to start afresh with YARN.

- === YARN Essentials ===
+ === Learning YARN ===
+ '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/learning-yarn/|Learning YARN]]
- '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/yarn-essentials/?utm_source=PGwiki.apache.org&utm_medium=pod&utm_campaign=1784391735|YARN Essentials]]
+ '''Authors:''' Akhil Arora, Shrey Mehrotra
- '''Authors:''' Amol Fasale, Nirmal Kumar
+ '''Publisher:''' Packt Publishing
+ '''Date of Publishing:''' August, 2015
+ Learning YARN is intended for those who want to understand what YARN is and how to efficiently use it for the resource management of large clusters.

+ === Big Data Forensics: Learning Hadoop Investigations ===
+ '''Name:''' [[https://www.packtpub.com/networking-and-servers/big-data-forensics-learning-hadoop-investigations/|Big Data Forensics: Learning Hadoop Investigations]]
+ '''Author:''' Joe Sremack
+ '''Publisher:''' Packt Publishing
+ '''Date of Publishing:''' August, 2015
+ Big Data Forensics: Learning Hadoop Investigations will guide statisticians and forensic analysts with basic knowledge of digital forensics to conduct Hadoop forensic investigations.

+ === Learning Hadoop 2 ===
+ '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/learning-hadoop/|Learning Hadoop 2]]
+ '''Authors:''' Garry Turkington, Gabriele Modena

'''Publisher:''' Packt Publishing

'''Date of Publishing:''' February, 2015

- YARN Essentials is for developers with little knowledge of Hadoop 1.x who want to start afresh with YARN.
+ Learning Hadoop 2 is an introductory guide to building data-processing applications with the wide variety of tools supported by Hadoop 2.

- === Learning YARN ===
+ === Hadoop MapReduce v2 Cookbook - Second Edition ===
+ '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/hadoop-mapreduce-v2-cookbook-second-edition/|Hadoop MapReduce v2 Cookbook - Second Edition]]
+ '''Authors:''' Thilina Gunarathne
- '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/learning-yarn/?utm_source=PGwiki.apache.org&utm_medium=pod&utm_campaign=1784393967|Learning YARN]]
- '''Authors:''' Akhil Arora, Shrey
[Hadoop Wiki] Trivial Update of "QwertyManiac/BuildingHadoopTrunk" by QwertyManiac
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "QwertyManiac/BuildingHadoopTrunk" page has been changed by QwertyManiac:
https://wiki.apache.org/hadoop/QwertyManiac/BuildingHadoopTrunk?action=diff&rev1=17&rev2=18

Comment: Update repo paths

If you are planning to develop a new thing for Apache Hadoop, {{{trunk}}} is what you need to familiarize yourself with.
 1. Checkout the sources (Use any method below):
-  * Using GitHub mirror: {{{git clone git@github.com:apache/hadoop-common.git hadoop}}}
+  * Using GitHub mirror: {{{git clone git@github.com:apache/hadoop.git hadoop}}}
-  * Using Apache Git mirror: {{{git clone git://git.apache.org/hadoop-common.git hadoop}}}
+  * Using Apache Git mirror: {{{git clone git://git.apache.org/hadoop.git hadoop}}}
   * Using the Subversion repo: {{{svn checkout http://svn.apache.org/repos/asf/hadoop/common/trunk hadoop}}}
 2. Download and install Google Protobuf 2.5 (higher may not work) in your OS/Distribution.

@@ -54, +54 @@

This is similar to building trunk, but check out the "'''branch-2'''" branch before you run the commands.
 1. Checkout the sources (Use any method below):
-  * Using GitHub mirror: {{{git clone git@github.com:apache/hadoop-common.git hadoop}}}
+  * Using GitHub mirror: {{{git clone git@github.com:apache/hadoop.git hadoop}}}
   * Checkout the branch-2 branch once this is done: {{{cd hadoop; git checkout branch-2}}}
-  * Using Apache Git mirror: {{{git clone git://git.apache.org/hadoop-common.git hadoop}}}
+  * Using Apache Git mirror: {{{git clone git://git.apache.org/hadoop.git hadoop}}}
   * Checkout the branch-2 branch once this is done: {{{cd hadoop; git checkout branch-2}}}
   * Using the Subversion repo: {{{svn checkout http://svn.apache.org/repos/asf/hadoop/common/branches/branch-2 hadoop}}}

@@ -71, +71 @@

This is similar to building trunk, but check out the "'''branch-0.23'''" branch before you run the commands.
 1. Checkout the sources (Use any method below):
-  * Using GitHub mirror: {{{git clone git@github.com:apache/hadoop-common.git hadoop}}}
+  * Using GitHub mirror: {{{git clone git@github.com:apache/hadoop.git hadoop}}}
   * Checkout the branch-0.23 branch once this is done: {{{cd hadoop; git checkout branch-0.23}}}
-  * Using Apache Git mirror: {{{git clone git://git.apache.org/hadoop-common.git hadoop}}}
+  * Using Apache Git mirror: {{{git clone git://git.apache.org/hadoop.git hadoop}}}
   * Checkout the branch-0.23 branch once this is done: {{{cd hadoop; git checkout branch-0.23}}}
   * Using the Subversion repo: {{{svn checkout http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.23 hadoop}}}

@@ -88, +88 @@

0.22 and below used [[http://ant.apache.org|Apache Ant]] as the build tool. You need the latest '''Apache Ant''' installed and the 'ant' executable available on your PATH before continuing.
 1. Checkout the sources (Use any method below):
-  * Using GitHub mirror: {{{git clone git@github.com:apache/hadoop-common.git hadoop}}}.
+  * Using GitHub mirror: {{{git clone git@github.com:apache/hadoop.git hadoop}}}.
   * Check out the branch-0.22 branch once this is done: {{{cd hadoop; git checkout branch-0.22}}}
-  * Using Apache Git mirror: {{{git clone git://git.apache.org/hadoop-common.git hadoop}}}
+  * Using Apache Git mirror: {{{git clone git://git.apache.org/hadoop.git hadoop}}}
   * Check out the branch-0.22 branch once this is done: {{{cd hadoop; git checkout branch-0.22}}}
   * Using the Subversion repo: {{{svn checkout http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.22 hadoop}}}

@@ -116, +116 @@

This is similar to building branch-0.22, but there is just one project directory to worry about.
 1. Checkout the sources (Use any method below):
-  * Using GitHub mirror: {{{git clone git@github.com:apache/hadoop-common.git hadoop}}}.
+  * Using GitHub mirror: {{{git clone git@github.com:apache/hadoop.git hadoop}}}.
   * Check out the branch-1 branch once this is done: {{{cd hadoop; git checkout branch-1}}}
-  * Using Apache Git mirror: {{{git clone git://git.apache.org/hadoop-common.git hadoop}}}
+  * Using Apache Git mirror: {{{git clone git://git.apache.org/hadoop.git hadoop}}}
   * Check out the branch-1 branch once this is done: {{{cd hadoop; git checkout branch-1}}}
   * Using the Subversion repo: {{{svn checkout http://svn.apache.org/repos/asf/hadoop/common/branches/branch-1 hadoop}}}
[Hadoop Wiki] Update of "UnsetHostnameOrPort" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "UnsetHostnameOrPort" page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/UnsetHostnameOrPort

Comment: Create wiki entry on the all-zeros address, for matching JIRA

New page:

= Unset Hostname Or Port =

It is an error for a Hadoop client to try to connect to a service (including a web server/web service) with a network address of all zeros, "0.0.0.0", or a network port of "0".

Why not? Because it's meaningless on a client. The address "0.0.0.0" means, on a server, "start your server on all network interfaces you have". On a client, ''it tells you nothing about where the host is''. The client cannot talk to a service at 0.0.0.0 because there is no information as to where the service is running. Similarly, a port of "0" tells a server application to "find a free port". The server can find a free port, but the client cannot know what it is.

Usually, problems with 0.0.0.0 addresses and 0 ports surface when a client application has been given the -site configuration of a service, the one the service uses to start up, which has a 0.0.0.0 address. You cannot use the same configuration file for the client and the server in this situation. The client needs to know a real hostname or IP address of the server which is hosting the service. Fix your client configuration and try again.

This is not a Hadoop problem; it is an application configuration issue. As it is your cluster, [[YourNetworkYourProblem|only you can find out and track down the problem]]. Please do not file bug reports related to your problem, as they will be closed as [[http://wiki.apache.org/hadoop/InvalidJiraIssues|Invalid]].
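The asymmetry described above can be seen in a few lines of generic Python (not Hadoop code; the standard socket module is used only to illustrate the point):

```python
import socket

# A *server* may legitimately bind to ("0.0.0.0", 0):
# listen on all interfaces, and let the kernel pick a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("0.0.0.0", 0))
server.listen(1)

# The server can discover the port it was actually given ...
host, port = server.getsockname()
assert port != 0  # the kernel replaced 0 with a real port

# ... but a client cannot derive that port, or a reachable host,
# from the server's "0.0.0.0:0" configuration. It must be told the
# real hostname and the real port:
client = socket.create_connection(("127.0.0.1", port))
client.close()
server.close()
```

This is exactly why a client handed the server's -site configuration, with its 0.0.0.0 bind address and possibly a 0 port, has nothing to connect to.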
[Hadoop Wiki] Trivial Update of "WindowsProblems" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "WindowsProblems" page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/WindowsProblems?action=diff&rev1=4&rev2=5

Comment: clarify the install process better

You can fix this problem in two ways
 1. Install a full native windows Hadoop version. The ASF does not currently (September 2015) release such a version; releases are available externally.
- 1. Get the `WINUTILS.EXE` binary from a Hadoop redistribution. There is a repository of this for some Hadoop versions [[https://github.com/steveloughran/winutils|on github]].
+ 1. Or: get the `WINUTILS.EXE` binary from a Hadoop redistribution. There is a repository of this for some Hadoop versions [[https://github.com/steveloughran/winutils|on github]].
+
+ Then
 1. Set the environment variable `%HADOOP_HOME%` to point to the directory above the `BIN` dir containing `WINUTILS.EXE`.
 1. Or: run the Java process with the system property `hadoop.home.dir` set to the home directory.
[Hadoop Wiki] Update of "WindowsProblems" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "WindowsProblems" page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/WindowsProblems?action=diff&rev1=1&rev2=2

Comment: wiki page on windows problems (esp. WINUTILS.EXE)

 1. Install a full native windows Hadoop version. The ASF does not currently (September 2015) release such a version; releases are available externally.
 1. Get the `WINUTILS.EXE` binary from a Hadoop redistribution. There is a copy of this for Hadoop 2.6 [[on github|https://github.com/steveloughran/clusterconfigs/tree/master/clusters/morzine/hadoop_home]].
 1. Set the environment variable `%HADOOP_HOME%` to point to the directory above the `BIN` dir containing `WINUTILS.EXE`.
+ 1. Or: run the Java process with the system property `hadoop.home.dir` set to the home directory.
[Hadoop Wiki] Update of "WindowsProblems" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "WindowsProblems" page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/WindowsProblems

Comment: wiki page on windows problems (esp. WINUTILS.EXE)

New page:

= Problems running Hadoop on Windows =

Hadoop requires native libraries on Windows to work properly; that includes access to the {{{file://}}} filesystem, where Hadoop uses some Windows APIs to implement posix-like file access permissions. This is implemented in `HADOOP.DLL` and `WINUTILS.EXE`.

In particular, `%HADOOP_HOME%\BIN\WINUTILS.EXE` must be locatable. If it is not, Hadoop or an application built on top of Hadoop will fail.

== How to fix a missing WINUTILS.EXE ==

You can fix this problem in two ways
 1. Install a full native windows Hadoop version. The ASF does not currently (September 2015) release such a version; releases are available externally.
 1. Get the `WINUTILS.EXE` binary from a Hadoop redistribution. There is a copy of this for Hadoop 2.6 [[on github|https://github.com/steveloughran/clusterconfigs/tree/master/clusters/morzine/hadoop_home]].
 1. Set the environment variable `%HADOOP_HOME%` to point to the directory above the `BIN` dir containing `WINUTILS.EXE`.
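The `%HADOOP_HOME%` rule above can be sketched in a few lines of generic Python. The `find_winutils` helper is illustrative only, not part of Hadoop (which performs the equivalent lookup in Java using `HADOOP_HOME` or the `hadoop.home.dir` system property):

```python
import os

def find_winutils(hadoop_home=None):
    """Locate winutils.exe as described above: the home directory is
    the directory *above* the bin dir that contains winutils.exe.
    Illustrative helper only, not a Hadoop API."""
    home = hadoop_home or os.environ.get("HADOOP_HOME")
    if not home:
        raise RuntimeError("HADOOP_HOME is not set and no home dir was given")
    candidate = os.path.join(home, "bin", "winutils.exe")
    if not os.path.isfile(candidate):
        raise RuntimeError("winutils.exe not found at " + candidate)
    return candidate
```

If this lookup fails inside Hadoop, you get the "missing WINUTILS.EXE" failures this page describes; the fix is one of the two options above.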
[Hadoop Wiki] Update of "WindowsProblems" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "WindowsProblems" page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/WindowsProblems?action=diff&rev1=2&rev2=3

Comment: link to new winutils github project

You can fix this problem in two ways
 1. Install a full native windows Hadoop version. The ASF does not currently (September 2015) release such a version; releases are available externally.
- 1. Get the `WINUTILS.EXE` binary from a Hadoop redistribution. There is a copy of this for Hadoop 2.6 [[on github|https://github.com/steveloughran/clusterconfigs/tree/master/clusters/morzine/hadoop_home]].
+ 1. Get the `WINUTILS.EXE` binary from a Hadoop redistribution. There is a repository of this for some Hadoop versions [[on github|https://github.com/steveloughran/winutils]].
 1. Set the environment variable `%HADOOP_HOME%` to point to the directory above the `BIN` dir containing `WINUTILS.EXE`.
 1. Or: run the Java process with the system property `hadoop.home.dir` set to the home directory.
[Hadoop Wiki] Update of "WindowsProblems" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "WindowsProblems" page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/WindowsProblems?action=diff&rev1=3&rev2=4

Comment: fix link up

You can fix this problem in two ways
 1. Install a full native windows Hadoop version. The ASF does not currently (September 2015) release such a version; releases are available externally.
- 1. Get the `WINUTILS.EXE` binary from a Hadoop redistribution. There is a repository of this for some Hadoop versions [[on github|https://github.com/steveloughran/winutils]].
+ 1. Get the `WINUTILS.EXE` binary from a Hadoop redistribution. There is a repository of this for some Hadoop versions [[https://github.com/steveloughran/winutils|on github]].
 1. Set the environment variable `%HADOOP_HOME%` to point to the directory above the `BIN` dir containing `WINUTILS.EXE`.
 1. Or: run the Java process with the system property `hadoop.home.dir` set to the home directory.
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Books" page has been changed by Packt Publishing:
https://wiki.apache.org/hadoop/Books?action=diff&rev1=25&rev2=26

 # Please do not go overboard in exaggerating the outcome of reading a book, "readers of this book will become experts in advanced production-scale Hadoop MapReduce jobs". Such claims will be edited out.
}}}

+ === Apache Flume: Distributed Log Collection for Hadoop - Second Edition ===
+ '''Name:''' [[https://www.packtpub.com/application-development/apache-flume-distributed-log-collection-hadoop-second-edition/?utm_source=Pgwiki.apache.org&utm_medium=pod&utm_campaign=1784392170|Apache Flume: Distributed Log Collection for Hadoop - Second Edition]]
+ '''Author:''' Steve Hoffman
+ '''Publisher:''' Packt Publishing
+ '''Date of Publishing:''' February, 2015
+ Apache Flume: Distributed Log Collection for Hadoop - Second Edition is for Hadoop programmers who want to learn about Flume to be able to move datasets into Hadoop in a timely and replicable manner.

+ === YARN Essentials ===
+ '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/yarn-essentials/?utm_source=PGwiki.apache.org&utm_medium=pod&utm_campaign=1784391735|YARN Essentials]]
+ '''Authors:''' Amol Fasale, Nirmal Kumar
+ '''Publisher:''' Packt Publishing
+ '''Date of Publishing:''' February, 2015
+ YARN Essentials is for developers with little knowledge of Hadoop 1.x who want to start afresh with YARN.

+ === Learning YARN ===
+ '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/learning-yarn/?utm_source=PGwiki.apache.org&utm_medium=pod&utm_campaign=1784393967|Learning YARN]]
+ '''Authors:''' Akhil Arora, Shrey Mehrotra
+ '''Publisher:''' Packt Publishing
+ '''Date of Publishing:''' August, 2015
+ Learning YARN is intended for those who want to understand what YARN is and how to efficiently use it for the resource management of large clusters.

+ === Apache Hive Essentials ===
+ '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/apache-hive-essentials/?utm_source=PGwiki.apache.org&utm_medium=pod&utm_campaign=1783558571|Apache Hive Essentials]]
+ '''Author:''' Dayong Du
+ '''Publisher:''' Packt Publishing
+ '''Date of Publishing:''' February, 2015
+ Apache Hive Essentials is for data analysts and developers who want to use Hive to explore and analyze data in Hadoop.

+ === Big Data Forensics: Learning Hadoop Investigations ===
+ '''Name:''' [[https://www.packtpub.com/networking-and-servers/big-data-forensics-learning-hadoop-investigations/?utm_source=PGwiki.apache.org&utm_medium=pod&utm_campaign=1785288105|Big Data Forensics: Learning Hadoop Investigations]]
+ '''Author:''' Joe Sremack
+ '''Publisher:''' Packt Publishing
+ '''Date of Publishing:''' August, 2015
+ Big Data Forensics: Learning Hadoop Investigations will guide statisticians and forensic analysts with basic knowledge of digital forensics to conduct Hadoop forensic investigations.

=== Learning Hadoop 2 ===
'''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/learning-hadoop/?utm_source=POD&utm_medium=referral&utm_campaign=1783285516|Learning Hadoop 2]]
[Hadoop Wiki] Trivial Update of "Release-2.6.1-Working-Notes" by VinodKumarVavilapalli
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Release-2.6.1-Working-Notes" page has been changed by VinodKumarVavilapalli:
https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diff&rev1=23&rev2=24

Comment: Added YARN-2766

||802676e1be350785d8c0ad35f6676eeb85b2467b ||YARN-3526. ApplicationMaster tracking URL is incorrectly redirected on a QJM cluster. Cont ||Applied locally || || yes ||
||536b9ee6d6e5b8430fda23cbdcfd859c299fa8ad ||HDFS-8404. Pending block replication can get stuck using older genstamp. Contributed by Na || || ||minor issues ||
||d3193fd1d7395bf3e7c8dfa70d1aec08b0f147e6 ||Move YARN-2918 from 2.8.0 to 2.7.1 (cherry picked from commit 03f897fd1a3779251023bae3582 ||Applied locally || || yes ||
+ ||3648cb57c9f018a3a339c26f5a0ca2779485521a ||YARN-2766. ApplicationHistoryManager is expected to return a sorted list of apps/attempts/containers || || Added as a dependency for YARN-3700 ||
||839f81a6326b2f8b3d5183178382c1551b0bc259 ||YARN-3700. Made generic history service load a number of latest applications according to ||Applied locally || || No, merge conflicts in WebServices.java, AppsBlock.java. Had to rewrite the documentation in apt ||
||25db34127811fbadb9a698fa3a76e24d426fb0f6 ||HDFS-8431. hdfs crypto class not found in Windows. Contributed by Anu Engineer. (cherry p || || ||minor issues ||
||33648268ce0f79bf51facafa3d151612e3d00ddb ||HADOOP-11934. Use of JavaKeyStoreProvider in LdapGroupsMapping causes infinite loop. Contr ||Skipped || ||Dropped. Cannot apply to branch-2.6 due to java.nio package. ||
[Hadoop Wiki] Update of "Release-2.6.1-Working-Notes" by VinodKumarVavilapalli
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Release-2.6.1-Working-Notes" page has been changed by VinodKumarVavilapalli:
https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diff&rev1=22&rev2=23

Comment: Updating progress for HADOOP-11802.

||756c2542930756fef1cbff82056b418070f8d55f ||MAPREDUCE-6238. MR2 can't run local jobs with -libjars command options which is a regressi ||Applied locally ||Yes, added MAPREDUCE-6267 as a dependency ||
||1544c63602089b690e850e0e30af4589513a2371 ||HADOOP-11812. Implement listLocatedStatus for ViewFileSystem to speed up split calculation ||Applied locally || ||Yes ||
||a6a5d1d6b5ee76c829ba7b54a4ad619f7b986681 ||HADOOP-11730. Regression: s3n read failure recovery broken. (Takenori Sato via stevel) ||Applied locally || ||Yes ||
- ||788b76761d5dfadf688406d50169e95401fe5d33 ||HADOOP-11802. DomainSocketWatcher thread terminates sometimes after there is an I/O error ||Skipped ||Yes ||Dropped. We will need to commit HDFS-7915 first. If we commit HDFS-7915, we need to commit HDFS-8070 as well. ||
+ ||788b76761d5dfadf688406d50169e95401fe5d33 ||HADOOP-11802. DomainSocketWatcher thread terminates sometimes after there is an I/O error ||Included ||Yes ||Included with the dependencies. We will need to commit HDFS-7915 first. If we commit HDFS-7915, we need to commit HDFS-8070 as well. ||
||a7696b3fbfacd98a892bbb3678663658c7b9d2bd ||YARN-3024. LocalizerRunner should give DIE action when all resources are localized. Contributed by Chengbing Liu ||Applied locally ||ADDED as a dependency for YARN-3464. Merge conflict ||
||4045c41afe440b773d006e962bf8a5eae3fdc284 ||YARN-3464. Race condition in LocalizerRunner kills localizer before localizing all resourc ||Applied locally || ||No, merge conflict in 3 files ||
||32dc13d907a416049bdb7deff429725bd6dbcb49 ||MAPREDUCE-6324. Fixed MapReduce uber jobs to not fail the udpate of AM-RM tokens when they ||Applied locally || ||Yes ||
[Hadoop Wiki] Trivial Update of "SocketPathSecurity" by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "SocketPathSecurity" page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/SocketPathSecurity?action=diff&rev1=2&rev2=3

Comment: formatting

- = Socket Path Security =
+ = Socket Path Security =

On Linux and potentially other Unix platforms, Apache Hadoop can support higher performance access to HDFS data via [[https://en.wikipedia.org/wiki/Unix_domain_socket|Unix domain sockets]].

@@ -23, +23 @@

They can be addressed through tightening the permissions and changing user and group details. The exceptions should provide enough information to help you get started here.

- Finally, these are not problems in the Hadoop code, they are related to the configuration of your servers. Filing bugs about these exceptions is likely to result in them being closed as [[Invalid|InvalidJiraIssues]]
+ Finally, these are not problems in the Hadoop code, they are related to the configuration of your servers. Filing bugs about these exceptions is likely to result in them being closed as InvalidJiraIssues
[Hadoop Wiki] Update of SocketPathSecurity by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification.

The SocketPathSecurity page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/SocketPathSecurity

Comment: New page to go with HADOOP-12344: validateSocketPathSecurity0 message could be better

New page:

= Socket Path Security =

On Linux and potentially other Unix platforms, Apache Hadoop can support higher performance access to HDFS data via [[https://en.wikipedia.org/wiki/Unix_domain_socket|Unix domain sockets]]. These objects live in the unix filesystem and, when opened by both the datanode and a local process (such as HBase), allow the local process to

 1. Bypass the TCP stack for less communications overhead.
 1. Share file descriptors so that read operations may actually be done in the local process.

To ensure data security and integrity, Hadoop will not use these sockets if the filesystem permissions of the domain socket are inadequate. If you were referred to this page by an exception in the Hadoop logs, then Hadoop considers the configuration of the domain socket insecure. The requirements are

 1. Nobody malicious can overwrite the entry with their own socket: the entire path to the socket must not contain any world-writeable directory.
 1. No entry in the path is group writeable, except in the special case that the owner is root (and of course the group must be one containing only trusted accounts).
 1. Every entry in the path is owned by either root or the effective user trying to work with the socket.

All these requirements are checked, and attempts to use Domain Sockets will fail if they are unmet. They can be addressed through tightening the permissions and changing user and group details. The exceptions should provide enough information to help you get started here.

Finally, these are not problems in the Hadoop code; they are related to the configuration of your servers. Filing bugs about these exceptions is likely to result in them being closed as [[InvalidJiraIssues|Invalid]].
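The three path requirements above can be sketched as a standalone check. This is illustrative Python only, not Hadoop's actual implementation (the real check is the native validateSocketPathSecurity0 routine mentioned in the page comment):

```python
import os
import stat

def socket_path_problems(path, effective_uid=None):
    """Check every component of `path` against the rules above.
    Illustrative sketch; Hadoop performs the real check natively."""
    if effective_uid is None:
        effective_uid = os.geteuid()
    # Collect the socket and every ancestor directory up to "/".
    comp = os.path.abspath(path)
    components = [comp]
    while os.path.dirname(comp) != comp:
        comp = os.path.dirname(comp)
        components.append(comp)
    problems = []
    for c in reversed(components):
        st = os.stat(c)
        if st.st_mode & stat.S_IWOTH:
            problems.append(c + " is world-writeable")
        if st.st_mode & stat.S_IWGRP and st.st_uid != 0:
            problems.append(c + " is group-writeable and not owned by root")
        if st.st_uid not in (0, effective_uid):
            problems.append(c + " is owned by neither root nor the effective user")
    return problems
```

Running such a check against your configured socket path (for example with `dfs.domain.socket.path` pointing somewhere under a world-writeable directory like `/tmp`) shows exactly which component of the path needs its permissions or ownership tightened.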
[Hadoop Wiki] Trivial Update of Release-2.6.1-Working-Notes by VinodKumarVavilapalli
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification.

The Release-2.6.1-Working-Notes page has been changed by VinodKumarVavilapalli:
https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diff&rev1=21&rev2=22

Comment: Updating progress on local cherry-picks

||dda1fc169db2e69964cca746be4ff8965eb8b56f ||HDFS-7531. Improve the concurrent access on FsVolumeList (Lei Xu via Colin P. McCabe) (che || || ||minor issues ||
||173664d70f0ed3b1852b6703d32e796778fb1c78 ||YARN-2964. RM prematurely cancels tokens for jobs that submit jobs (oozie). Contributed by ||Applied locally || ||Yes ||
||bcaf15e2fa94db929b8cd11ed7c07085161bf950 ||HDFS-7443. Datanode upgrade to BLOCKID_BASED_LAYOUT fails if duplicate block files are pre || || ||remove; already committed ||
+ ||89ef49fb0814baea3640798fdf66d2ae3a550896 ||YARN-1984. LeveldbTimelineStore does not handle db exceptions properly. Contributed by Varun Saxena ||Applied locally || ||Yes. ADDED as dep for YARN-2952 ||
- ||9180d11b3bbb2a49127d5d25f53b38c5113bf7ea ||YARN-2952. Fixed incorrect version check in StateStore. Contributed by Rohith Sharmaks (ch ||Applied locally || ||No, minor merge conflict ||
+ ||9180d11b3bbb2a49127d5d25f53b38c5113bf7ea ||YARN-2952. Fixed incorrect version check in StateStore. Contributed by Rohith Sharmaks (ch ||Applied locally || ||Yes ||
||8b398a66ca3728f47363fc8b2fcf7e556e6bbf5a ||YARN-2340. Fixed NPE when queue is stopped during RM restart. Contributed by Rohith Sharma ||Applied locally || ||Yes ||
||ca0349b87ab1b2d0d2b9dc93de7806d26713165c ||YARN-2992. ZKRMStateStore crashes due to session expiry. Contributed by Karthik Kambatla( ||Applied locally || ||Yes ||
||c116743bdda2b1792bf872020a5e2b14d772ac60 ||YARN-2922. ConcurrentModificationException in CapacityScheduler's LeafQueue. Contributed b ||Applied locally || ||Yes ||
[Hadoop Wiki] Update of Release-2.6.1-Working-Notes by VinodKumarVavilapalli
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification.

The Release-2.6.1-Working-Notes page has been changed by VinodKumarVavilapalli:
https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diff&rev1=20&rev2=21

Comment:
Updating progress on local cherry-picks

  ||38b031d6bab8527698bd186887d301bd6a63cf01 ||HDFS-8127. NameNode Failover during HA upgrade can cause DataNode to finalize upgrade. Con || || ||minor issues ||
  ||e7cbecddc3e7ca5386c71aa4deb67f133611415c ||YARN-3493. RM fails to come up with error Failed to load/recover state when mem settings ||Applied locally || ||No, merge conflict in RMAppManager.java, SchedulerUtils.java, TestSchedulerUtils.java ||
  ||3316cd4357ff6ccc4c76584813092adb1c2b4d43 ||YARN-3487. CapacityScheduler scheduler lock obtained unnecessarily when calling getQueue ( ||Applied locally || ||Yes ||
+ ||f4d6c5e337e76dc408c9c8f19e306c3f4ba80d8e|| MAPREDUCE-6267. Refactor JobSubmitter#copyAndConfigureFiles into it's own class. (Chris Trezzo via kasha) || Applied locally || ADDED as a dependency for MAPREDUCE-6238. Two conflicts in Job.java and JobSumitter.java ||
+ ||756c2542930756fef1cbff82056b418070f8d55f ||MAPREDUCE-6238. MR2 can't run local jobs with -libjars command options which is a regressi ||Applied locally ||Yes, added MAPREDUCE-6267 as a dependency ||
- 
- || f4d6c5e337e76dc408c9c8f19e306c3f4ba80d8e|| MAPREDUCE-6267. Refactor JobSubmitter#copyAndConfigureFiles into it's own class. (Chris Trezzo via kasha) || Applied locally || ADDED as a dependency for MAPREDUCE-6238. Two conflicts in Job.java and JobSumitter.java ||
  ||756c2542930756fef1cbff82056b418070f8d55f ||MAPREDUCE-6238. MR2 can't run local jobs with -libjars command options which is a regressi ||Applied locally ||Yes, added MAPREDUCE-6267 as a dependency ||
  ||1544c63602089b690e850e0e30af4589513a2371 ||HADOOP-11812. Implement listLocatedStatus for ViewFileSystem to speed up split calculation ||Applied locally || ||Yes ||
  ||a6a5d1d6b5ee76c829ba7b54a4ad619f7b986681 ||HADOOP-11730. Regression: s3n read failure recovery broken. (Takenori Sato via stevel) ||Applied locally || ||Yes ||
  ||788b76761d5dfadf688406d50169e95401fe5d33 ||HADOOP-11802. DomainSocketWatcher thread terminates sometimes after there is an I/O error ||Skipped ||Yes ||Dropped. We will need to commit HDFS-7915 first. If we commit HDFS-7915, we need to commit HDFS-8070 as well. ||

@@ -112, +111 @@

  ||4045c41afe440b773d006e962bf8a5eae3fdc284 ||YARN-3464. Race condition in LocalizerRunner kills localizer before localizing all resourc ||Applied locally || ||No, merge conflict in 3 files ||
  ||32dc13d907a416049bdb7deff429725bd6dbcb49 ||MAPREDUCE-6324. Fixed MapReduce uber jobs to not fail the udpate of AM-RM tokens when they ||Applied locally || ||Yes ||
  ||58970d69de8a1662e4548cd6d4ca460dd70562f8 ||HADOOP-11491. HarFs incorrectly declared as requiring an authority. (Brahma Reddy Battula ||Applied locally || ||Yes ||
- ||87c2d915f1cc799cb4020c945c04d3ecb82ee963 ||MAPREDUCE-5649. Reduce cannot use more than 2G memory for the final merge. Contributed by || ||
+ ||87c2d915f1cc799cb4020c945c04d3ecb82ee963 ||MAPREDUCE-5649. Reduce cannot use more than 2G memory for the final merge. Contributed by ||Applied locally || ||Yes ||
  ||e68e8b3b5cff85bfd8bb5b00b9033f63577856d6 ||HDFS-8219. setStoragePolicy with folder behavior is different after cluster restart. (sure || || ||minor issues ||
  ||4e1f2eb3955a97a70cf127dc97ae49201a90f5e0 ||HDFS-7980. Incremental BlockReport will dramatically slow down namenode startup. Contribu || || ||non-trivial changes (dropped TestBlockManager.java changes) ||
  ||d817fbb34d6e34991c6e512c20d71387750a98f4 ||YARN-2918. RM should not fail on startup if queue's configured labels do not exist in clus || ||
  ||802a5775f3522c57c60ae29ecb9533dbbfecfe76 ||HDFS-7894. Rolling upgrade readiness is not updated in jmx until query command is issued. || || ||yes ||
  ||f264a5aeede7e144af11f5357c7f901993de8e12 ||HDFS-8245. Standby namenode doesn't process DELETED_BLOCK if the addblock request is in ed || || ||minor issues ||
- ||fb5b0ebb459cc8812084090a7ce7ac29e2ad147c ||MAPREDUCE-6361. NPE issue in shuffle caused by concurrent issue between copySucceeded() in || ||
+ ||fb5b0ebb459cc8812084090a7ce7ac29e2ad147c ||MAPREDUCE-6361. NPE issue in shuffle caused by concurrent issue between copySucceeded() in ||Applied locally || || No, merge conflict in TestShuffleScheduler ||
- ||a81ad814610936a02e55964fbe08f7b33fe29b23 ||YARN-3641. NodeManager: stopRecoveryStore() shouldn't be skipped when exceptions happen in || ||
+ ||a81ad814610936a02e55964fbe08f7b33fe29b23 ||YARN-3641. NodeManager: stopRecoveryStore() shouldn't be skipped when exceptions happen in ||Applied locally || || yes ||
- ||802676e1be350785d8c0ad35f6676eeb85b2467b ||YARN-3526. ApplicationMaster tracking URL is incorrectly redirected on a
[Hadoop Wiki] Update of Release-2.6.1-Working-Notes by VinodKumarVavilapalli
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification.

The Release-2.6.1-Working-Notes page has been changed by VinodKumarVavilapalli:
https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diff&rev1=18&rev2=19

Comment:
Updating progress on local cherry-picks

  ||0d62e948877e5d50f1b6fbe735a94ac6da5ff472 ||YARN-3242. Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for ||Applied locally || ||Yes ||
  ||b569c3ab1cb7e328dde822f6b2405d24b9560e3a ||HADOOP-11674. oneByteBuf in CryptoInputStream and CryptoOutputStream should be non static. ||Applied locally || ||Yes ||
  ||a1963968d2a9589fcefaab0d63feeb68c07f4d06 || YARN-3230. Clarify application states on the web UI. (Jian He via wangda) || Applied locally || ADDED. Dependency for YARN-1809. ||
+ ||6660c2f83b855535217582326746dc76d53fdf61 ||YARN-3249. Add a 'kill application' button to Resource Manager's Web UI. Contributed by Ryu Kobayashi. || Applied locally || ADDED. Dependency for YARN-1809. TODO: Feature ||
+ ||a5f3fb4dc14503bf7c454a48cf954fb0d6710de2 ||YARN-1809. Synchronize RM and TimeLineServer Web-UIs. Contributed by Zhijie Shen and Xuan ||Applied locally || No, merge conflicts in 5 files. Pulled in YARN-3230 as a dependency. ||
  ||994dadb9ba0a3b87b6548e6e0801eadd26554d55 ||HDFS-7885. Datanode should not trust the generation stamp provided by client. Contributed || || || yes ||
  ||56c2050ab7c04e9741bcba9504b71e5a54d09eea ||YARN-3227. Timeline renew delegation token fails when RM user's TGT is expired. Contribute ||Applied locally || ||Yes ||

@@ -101, +103 @@

  ||38b031d6bab8527698bd186887d301bd6a63cf01 ||HDFS-8127. NameNode Failover during HA upgrade can cause DataNode to finalize upgrade. Con || || || minor issues ||
  ||e7cbecddc3e7ca5386c71aa4deb67f133611415c ||YARN-3493. RM fails to come up with error Failed to load/recover state when mem settings || Applied locally || ||No, merge conflict in RMAppManager.java, SchedulerUtils.java, TestSchedulerUtils.java ||
  ||3316cd4357ff6ccc4c76584813092adb1c2b4d43 ||YARN-3487. CapacityScheduler scheduler lock obtained unnecessarily when calling getQueue ( ||Applied locally || ||Yes ||
- || f4d6c5e337e76dc408c9c8f19e306c3f4ba80d8e|| MAPREDUCE-6267. Refactor JobSubmitter#copyAndConfigureFiles into it's own class. (Chris Trezzo via kasha) || Applied locally || ADDED. as a dependency for MAPREDUCE-6238. Two conflicts in Job.java and JobSumitter.java ||
+ || f4d6c5e337e76dc408c9c8f19e306c3f4ba80d8e|| MAPREDUCE-6267. Refactor JobSubmitter#copyAndConfigureFiles into it's own class. (Chris Trezzo via kasha) || Applied locally || ADDED as a dependency for MAPREDUCE-6238. Two conflicts in Job.java and JobSumitter.java ||
  ||756c2542930756fef1cbff82056b418070f8d55f ||MAPREDUCE-6238. MR2 can't run local jobs with -libjars command options which is a regressi ||Applied locally ||Yes, added MAPREDUCE-6267 as a dependency||
  ||1544c63602089b690e850e0e30af4589513a2371 ||HADOOP-11812. Implement listLocatedStatus for ViewFileSystem to speed up split calculation ||Applied locally || ||Yes ||
  ||a6a5d1d6b5ee76c829ba7b54a4ad619f7b986681 ||HADOOP-11730. Regression: s3n read failure recovery broken. (Takenori Sato via stevel) ||Applied locally || ||Yes ||
  ||788b76761d5dfadf688406d50169e95401fe5d33 ||HADOOP-11802. DomainSocketWatcher thread terminates sometimes after there is an I/O error ||Skipped ||Yes ||Dropped. We will need to commit HDFS-7915 first. If we commit HDFS-7915, we need to commit HDFS-8070 as well. ||
+ ||a7696b3fbfacd98a892bbb3678663658c7b9d2bd || YARN-3024. LocalizerRunner should give DIE action when all resources are localized. Contributed by Chengbing Liu || Applied locally || ADDED as a dependency for YARN-3464. Merge conflict ||
  ||4045c41afe440b773d006e962bf8a5eae3fdc284 ||YARN-3464. Race condition in LocalizerRunner kills localizer before localizing all resourc ||Applied locally || ||No, merge conflict in 3 files ||
  ||32dc13d907a416049bdb7deff429725bd6dbcb49 ||MAPREDUCE-6324. Fixed MapReduce uber jobs to not fail the udpate of AM-RM tokens when they ||Applied locally || ||Yes ||
  ||58970d69de8a1662e4548cd6d4ca460dd70562f8 ||HADOOP-11491. HarFs incorrectly declared as requiring an authority. (Brahma Reddy Battula ||Applied locally || ||Yes ||
[Hadoop Wiki] Update of ContributorsGroup by SteveLoughran
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification.

The ContributorsGroup page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/ContributorsGroup?action=diff&rev1=111&rev2=112

Comment:
+ VarunSaxena

  * udanax
  * Uma Maheswara Rao G
  * Vaibhav Puranik
+ * VarunSaxena
  * VijendarGanta
  * VladKorolev
  * VinodKumarVavilapalli
[Hadoop Wiki] Update of Release-2.6.1-Working-Notes by AkiraAjisaka
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification.

The Release-2.6.1-Working-Notes page has been changed by AkiraAjisaka:
https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diff&rev1=14&rev2=15

Comment:
Reflected Sangjin's updates.

  Ordered list of commits to cherrypick

  ||SHA1||JIRA || Status || New patch on JIRA || Applies cleanly ||
+ ||5f3d967aaefa0b20ef1586b4048b8fa5345d2618 ||HDFS-7278. Add a command that allows sysadmins to manually trigger full block reports from || || || ADDED!!! minor issues ||
- ||946463efefec9031cacb21d5a5367acd150ef904 ||HDFS-7213. processIncrementalBlockReport performance degradation. Contributed by Eric Payn || ||
+ ||946463efefec9031cacb21d5a5367acd150ef904 ||HDFS-7213. processIncrementalBlockReport performance degradation. Contributed by Eric Payn || || || yes ||
- ||842a54a5f66e76eb79321b66cc3b8820fe66c5cd ||HDFS-7235. DataNode#transferBlock should report blocks that don't exist using reportBadBlo || ||
+ ||842a54a5f66e76eb79321b66cc3b8820fe66c5cd ||HDFS-7235. DataNode#transferBlock should report blocks that don't exist using reportBadBlo || || || yes ||
- ||8bfef590295372a48bd447b1462048008810ee17 ||HDFS-7263. Snapshot read can reveal future bytes for appended files. Contributed by Tao Lu || ||
+ ||8bfef590295372a48bd447b1462048008810ee17 ||HDFS-7263. Snapshot read can reveal future bytes for appended files. Contributed by Tao Lu || || || yes ||
- ||ec2621e907742aad0264c5f533783f0f18565880 ||HDFS-7035. Make adding a new data directory to the DataNode an atomic operation and improv || ||
+ ||ec2621e907742aad0264c5f533783f0f18565880 ||HDFS-7035. Make adding a new data directory to the DataNode an atomic operation and improv || || || minor issues ||
  ||9e63cb4492896ffb78c84e27f263a61ca12148c8 ||HADOOP-10786. Fix UGI#reloginFromKeytab on Java 8. ||Applied locally || ||Yes ||
  ||beb184ac580b0d89351a3f3a7201da34a26db1c1 ||YARN-2856. Fixed RMAppImpl to handle ATTEMPT_KILLED event at ACCEPTED state on app recover ||Applied locally || ||Yes ||
  ||ad140d1fc831735fb9335e27b38d2fc040847af1 ||YARN-2816. NM fail to start with NPE during container recovery. Contributed by Zhihai Xu( ||Applied locally || ||Yes ||
  ||242fd0e39ad1c5d51719cd0f6c197166066e3288 ||YARN-2414. RM web UI: app page will crash if app is failed before any attempt has been cre ||Applied locally || ||Yes ||
- ||2e15754a92c6589308ccbbb646166353cc2f2456 ||HDFS-7225. Remove stale block invalidation work when DN re-registers with different UUID. ||
+ ||2e15754a92c6589308ccbbb646166353cc2f2456 ||HDFS-7225. Remove stale block invalidation work when DN re-registers with different UUID. || || yes ||
  ||db31ef7e7f55436bbf88c6d93e2273c4463ca9f0 ||YARN-2865. Fixed RM to always create a new RMContext when transtions from StandBy to Activ ||Applied locally || ||Yes ||
- ||8d8eb8dcec94e92d94eedef883cdece8ba333087 ||HDFS-7425. NameNode block deletion logging uses incorrect appender. Contributed by Chris N || ||
+ ||8d8eb8dcec94e92d94eedef883cdece8ba333087 ||HDFS-7425. NameNode block deletion logging uses incorrect appender. Contributed by Chris N || || || remove; already committed ||
- ||946df98dce18975e37a6a14744ca7a5429f019ce ||HDFS-4882. Prevent the Namenode's LeaseManager from looping forever in checkLeases (Ravi P || ||
+ ||946df98dce18975e37a6a14744ca7a5429f019ce ||HDFS-4882. Prevent the Namenode's LeaseManager from looping forever in checkLeases (Ravi P || || || remove; already committed ||
  ||ae35b0e14d3438237f4b5d3b5d5268d45e549846 ||YARN-2906. CapacitySchedulerPage shows HTML tags for a queue's Active Users. Contributed b ||Applied locally || ||Yes ||
  ||f6d1bf5ed1cf647d82e676df15587de42b1faa42 ||HADOOP-11333. Fix deadlock in DomainSocketWatcher when the notification pipe is full (zhao ||Applied locally || ||Yes ||
  ||38ea1419f60d2b8176dba4931748f1f0e52ca84e ||YARN-2905. AggregatedLogsBlock page can infinitely loop if the aggregated log file is corr ||Applied locally || ||Yes ||
  ||d21ef79707a0f32939d9a5af4fed2d9f5fe6f2ec ||YARN-2890. MiniYARNCluster should start the timeline server based on the configuration. Co ||Applied locally || ||Yes ||
- ||06552a15d5172a2b0ad3d61aa7f9a849857385aa ||HDFS-7446. HDFS inotify should have the ability to determine what txid it has read up to ( || ||
+ ||06552a15d5172a2b0ad3d61aa7f9a849857385aa ||HDFS-7446. HDFS inotify should have the ability to determine what txid it has read up to ( || || || minor issues ||
  ||d6f3d4893d750f19dd8c539fe28eecfab2a54576 ||YARN-2894. Fixed a bug regarding application view acl when RM fails over. Contributed by R ||Applied locally || || No, minor import issues ||
  ||25be97808b99148412c0efd4d87fc750db4d6607 ||YARN-2874. Dead lock in DelegationTokenRenewer which blocks RM to execute any further apps ||Applied locally || ||Yes ||
  ||dabdd2d746d1e1194c124c5c7fe73fcc025e78d2
[Hadoop Wiki] Update of Release-2.6.1-Working-Notes by AkiraAjisaka
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification.

The Release-2.6.1-Working-Notes page has been changed by AkiraAjisaka:
https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diff&rev1=15&rev2=16

  ||842a54a5f66e76eb79321b66cc3b8820fe66c5cd ||HDFS-7235. DataNode#transferBlock should report blocks that don't exist using reportBadBlo || || || yes ||
  ||8bfef590295372a48bd447b1462048008810ee17 ||HDFS-7263. Snapshot read can reveal future bytes for appended files. Contributed by Tao Lu || || || yes ||
  ||ec2621e907742aad0264c5f533783f0f18565880 ||HDFS-7035. Make adding a new data directory to the DataNode an atomic operation and improv || || || minor issues ||
- ||9e63cb4492896ffb78c84e27f263a61ca12148c8 ||HADOOP-10786. Fix UGI#reloginFromKeytab on Java 8. ||Applied locally || ||Yes ||
+ ||9e63cb4492896ffb78c84e27f263a61ca12148c8 ||HADOOP-10786. Fix UGI#reloginFromKeytab on Java 8. || || ||remove; already committed ||
  ||beb184ac580b0d89351a3f3a7201da34a26db1c1 ||YARN-2856. Fixed RMAppImpl to handle ATTEMPT_KILLED event at ACCEPTED state on app recover ||Applied locally || ||Yes ||
  ||ad140d1fc831735fb9335e27b38d2fc040847af1 ||YARN-2816. NM fail to start with NPE during container recovery. Contributed by Zhihai Xu( ||Applied locally || ||Yes ||
  ||242fd0e39ad1c5d51719cd0f6c197166066e3288 ||YARN-2414. RM web UI: app page will crash if app is failed before any attempt has been cre ||Applied locally || ||Yes ||

@@ -17, +17 @@

  ||8d8eb8dcec94e92d94eedef883cdece8ba333087 ||HDFS-7425. NameNode block deletion logging uses incorrect appender. Contributed by Chris N || || || remove; already committed ||
  ||946df98dce18975e37a6a14744ca7a5429f019ce ||HDFS-4882. Prevent the Namenode's LeaseManager from looping forever in checkLeases (Ravi P || || || remove; already committed ||
  ||ae35b0e14d3438237f4b5d3b5d5268d45e549846 ||YARN-2906. CapacitySchedulerPage shows HTML tags for a queue's Active Users. Contributed b ||Applied locally || ||Yes ||
- ||f6d1bf5ed1cf647d82e676df15587de42b1faa42 ||HADOOP-11333. Fix deadlock in DomainSocketWatcher when the notification pipe is full (zhao ||Applied locally || ||Yes ||
+ ||f6d1bf5ed1cf647d82e676df15587de42b1faa42 ||HADOOP-11333. Fix deadlock in DomainSocketWatcher when the notification pipe is full (zhao || || ||remove; already committed ||
  ||38ea1419f60d2b8176dba4931748f1f0e52ca84e ||YARN-2905. AggregatedLogsBlock page can infinitely loop if the aggregated log file is corr ||Applied locally || ||Yes ||
  ||d21ef79707a0f32939d9a5af4fed2d9f5fe6f2ec ||YARN-2890. MiniYARNCluster should start the timeline server based on the configuration. Co ||Applied locally || ||Yes ||
  ||06552a15d5172a2b0ad3d61aa7f9a849857385aa ||HDFS-7446. HDFS inotify should have the ability to determine what txid it has read up to ( || || || minor issues ||

@@ -129, +129 @@

  ||752caa95a40d899e1bf98bc907e91aec2bb57073 ||YARN-3585. NodeManager cannot exit on SHUTDOWN event triggered and NM recovery is enabled. || ||
  ||78d626fa892415023827e35ad549636e2a83275d ||YARN-3733. Fix DominantRC#compare() does not work as expected if cluster resource is empty || ||
  ||344b7509153cdd993218cd5104c7e5c07cd35d3c ||Add missing test file of YARN-3733 (cherry picked from commit 405bbcf68c32d8fd8a83e46e686 || ||
- ||80697e4f324948ec32b4cad3faccba55287be652 ||HADOOP-7139. Allow appending to existing SequenceFiles (Contributed by kanaka kumar avvaru ||Applied locally || ||Yes ||
+ ||80697e4f324948ec32b4cad3faccba55287be652 ||HADOOP-7139. Allow appending to existing SequenceFiles (Contributed by kanaka kumar avvaru || || ||remove; already committed ||
  ||cbd11681ce8a51d187d91748b67a708681e599de ||HDFS-8480. Fix performance and timeout issues in HDFS-7929 by using hard-links to preserve || || || non-trivial changes (dropped TestDFSUpgrade.java changes) ||
  ||15b1800b1289d239cbebc5cfd66cfe156d45a2d3 ||YARN-3832. Resource Localization fails on a cluster due to existing cache directories. Con || ||
  ||55427fb66c6d52ce98b4d68a29b592a734014c28 ||HADOOP-8151. Error handling in snappy decompressor throws invalid exceptions. Contributed ||Applied locally || ||Yes ||
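Several rows in this revision flip to "remove; already committed", meaning the fix was found to be on the branch already. A hedged sketch of one way such rows could be detected, assuming (as these tables suggest) that commit subjects start with the JIRA key; the repo and message below are illustrative stand-ins:

```shell
#!/bin/sh
# Build a throwaway repo whose history already contains a commit for a
# JIRA key, then search the branch history for that key before picking.
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.email rm@example.com
git config user.name "Release Manager"
echo a > f.txt
git add f.txt
git commit -qm "HDFS-7425. NameNode block deletion logging uses incorrect appender."
# If the key already appears in the subject of a commit on this branch,
# mark the tracking row "remove; already committed" instead of
# cherry-picking the commit again.
if git log --oneline --grep='HDFS-7425' | grep -q 'HDFS-7425'; then
  echo "remove; already committed"
fi
```

`git log --grep` matches commit messages, so this only works when subjects carry the JIRA key consistently; picks whose subjects were rewritten would still need a manual check.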
[Hadoop Wiki] Update of Release-2.6.1-Working-Notes by VinodKumarVavilapalli
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification.

The Release-2.6.1-Working-Notes page has been changed by VinodKumarVavilapalli:
https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diff&rev1=17&rev2=18

Comment:
Updating progress on local cherry-picks

  ||6090f51725e2b44d794433ed72a1901fae2ba7e3 ||HDFS-7871. NameNodeEditLogRoller can keep printing 'Swallowing exception' message. Contrib || || || yes ||
  ||888a44563819ba910dc3cc10d10ee0fb8f05db61 ||YARN-3222. Fixed NPE on RMNodeImpl#ReconnectNodeTransition when a node is reconnected with ||Applied locally || ||Yes ||
  ||721d7b574126c4070322f70ec5b49a7b8558a4c7 ||YARN-3231. FairScheduler: Changing queueMaxRunningApps interferes with pending jobs. (Siqi ||Applied locally || ||Yes ||
- ||0d62e948877e5d50f1b6fbe735a94ac6da5ff472 ||YARN-3242. Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for || ||
+ ||0d62e948877e5d50f1b6fbe735a94ac6da5ff472 ||YARN-3242. Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for ||Applied locally || ||Yes ||
  ||b569c3ab1cb7e328dde822f6b2405d24b9560e3a ||HADOOP-11674. oneByteBuf in CryptoInputStream and CryptoOutputStream should be non static. ||Applied locally || ||Yes ||
+ ||a1963968d2a9589fcefaab0d63feeb68c07f4d06 || YARN-3230. Clarify application states on the web UI. (Jian He via wangda) || Applied locally || ADDED. Dependency for YARN-1809. ||
- ||a5f3fb4dc14503bf7c454a48cf954fb0d6710de2 ||YARN-1809. Synchronize RM and TimeLineServer Web-UIs. Contributed by Zhijie Shen and Xuan || ||
+ ||a5f3fb4dc14503bf7c454a48cf954fb0d6710de2 ||YARN-1809. Synchronize RM and TimeLineServer Web-UIs. Contributed by Zhijie Shen and Xuan ||Applied locally || No, merge conflicts in 5 files. Pulled in YARN-3230 as a dependency. ||
  ||994dadb9ba0a3b87b6548e6e0801eadd26554d55 ||HDFS-7885. Datanode should not trust the generation stamp provided by client. Contributed || || || yes ||
- ||56c2050ab7c04e9741bcba9504b71e5a54d09eea ||YARN-3227. Timeline renew delegation token fails when RM user's TGT is expired. Contribute || ||
+ ||56c2050ab7c04e9741bcba9504b71e5a54d09eea ||YARN-3227. Timeline renew delegation token fails when RM user's TGT is expired. Contribute ||Applied locally || ||Yes ||
- ||a94d23762e2cf4211fe84661eb67504c7072db49 ||YARN-3287. Made TimelineClient put methods do as the correct login context. Contributed by || ||
+ ||a94d23762e2cf4211fe84661eb67504c7072db49 ||YARN-3287. Made TimelineClient put methods do as the correct login context. Contributed by ||Applied locally || ||No, one conflict in TimelineClientImpl.java ||
  ||eefca23e8c5e474de1e25bf2ec8a5b266bbe8cfe ||HDFS-7830. DataNode does not release the volume lock when adding a volume fails. (Lei Xu v || || || non-trivial changes ||
  ||813c93cb250d6d556604fe98845b979970bd5e18 ||HADOOP-11710. Make CryptoOutputStream behave like DFSOutputStream wrt synchronization. (Se ||Applied locally || ||Yes ||
- ||44aedad5ddc8069a6dba3eaf66ed54d612b21208 ||YARN-3267. Timelineserver applies the ACL rules after applying the limit on the number of || ||
+ ||44aedad5ddc8069a6dba3eaf66ed54d612b21208 ||YARN-3267. Timelineserver applies the ACL rules after applying the limit on the number of ||Applied locally || No, one conflict in LeveldbTimelineStore.java ||
  ||af80a98ace50934284dde417efc802bb094c8b4e ||HDFS-7926. NameNode implementation of ClientProtocol.truncate(..) is not idempotent. Contr || || || remove; does not apply to 2.6, truncate is a 2.7 feature ||
  ||5a5b2446485531f12d37f3d4ca791672b9921872 ||HDFS-7587. Edit log corruption can happen if append fails with a quota violation. Contribu || || yes || used the 2.6 patch ||
  ||219eb22c1571f76df32967a930049d983cbf5024 ||HDFS-7929. inotify unable fetch pre-upgrade edit log segments once upgrade starts (Zhe Zha || || || non-trivial changes ||
  ||90164ffd84f6ef56e9f8f99dcc7424a8d115dbae ||HDFS-7930. commitBlockSynchronization() does not remove locations. (yliu) || || yes || used the 2.6 patch ||
- ||8e142d27cbddfa1a1c83c5f8752bd14ac0a13612 ||YARN-3369. Missing NullPointer check in AppSchedulingInfo causes RM to die. (Brahma Reddy || ||
+ ||8e142d27cbddfa1a1c83c5f8752bd14ac0a13612 ||YARN-3369. Missing NullPointer check in AppSchedulingInfo causes RM to die. (Brahma Reddy ||Applied locally || ||Yes ||
- ||cbdcdfad6de81e17fb586bc2a53b37da43defd79 ||YARN-3393. Getting application(s) goes wrong when app finishes before starting the attempt || ||
+ ||cbdcdfad6de81e17fb586bc2a53b37da43defd79 ||YARN-3393. Getting application(s) goes wrong when app finishes before starting the attempt ||Applied locally || ||Yes ||
  ||fe693b72dec703ecbf4ab3919d61d06ea8735a9e ||HDFS-7884. Fix NullPointerException in BlockSender when the generation stamp provided by t || || || yes ||
  ||2f46ee50bd4efc82ba3d30bd36f7637ea9d9714e ||HDFS-7960. The full
[Hadoop Wiki] Update of Release-2.6.1-Working-Notes by AkiraAjisaka
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification.

The Release-2.6.1-Working-Notes page has been changed by AkiraAjisaka:
https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diff&rev1=3&rev2=4

  ||8d8eb8dcec94e92d94eedef883cdece8ba333087 ||HDFS-7425. NameNode block deletion logging uses incorrect appender. Contributed by Chris N || ||
  ||946df98dce18975e37a6a14744ca7a5429f019ce ||HDFS-4882. Prevent the Namenode's LeaseManager from looping forever in checkLeases (Ravi P || ||
  ||ae35b0e14d3438237f4b5d3b5d5268d45e549846 ||YARN-2906. CapacitySchedulerPage shows HTML tags for a queue's Active Users. Contributed b || ||
- ||f6d1bf5ed1cf647d82e676df15587de42b1faa42 ||HADOOP-11333. Fix deadlock in DomainSocketWatcher when the notification pipe is full (zhao || ||
+ ||f6d1bf5ed1cf647d82e676df15587de42b1faa42 ||HADOOP-11333. Fix deadlock in DomainSocketWatcher when the notification pipe is full (zhao || || ||Yes ||
  ||38ea1419f60d2b8176dba4931748f1f0e52ca84e ||YARN-2905. AggregatedLogsBlock page can infinitely loop if the aggregated log file is corr || ||
  ||d21ef79707a0f32939d9a5af4fed2d9f5fe6f2ec ||YARN-2890. MiniYARNCluster should start the timeline server based on the configuration. Co || ||
  ||06552a15d5172a2b0ad3d61aa7f9a849857385aa ||HDFS-7446. HDFS inotify should have the ability to determine what txid it has read up to ( || ||
  ||d6f3d4893d750f19dd8c539fe28eecfab2a54576 ||YARN-2894. Fixed a bug regarding application view acl when RM fails over. Contributed by R || ||
  ||25be97808b99148412c0efd4d87fc750db4d6607 ||YARN-2874. Dead lock in DelegationTokenRenewer which blocks RM to execute any further apps || ||
- ||dabdd2d746d1e1194c124c5c7fe73fcc025e78d2 ||HADOOP-11343. Overflow is not properly handled in caclulating final iv for AES CTR. Contri || ||
+ ||dabdd2d746d1e1194c124c5c7fe73fcc025e78d2 ||HADOOP-11343. Overflow is not properly handled in caclulating final iv for AES CTR. Contri || || ||Yes ||
  ||deaa172e7a2ab09656cc9eb431a3e68a73e0bd96 ||HADOOP-11368. Fix SSLFactory truststore reloader thread leak in KMSClientProvider. Contrib || ||
  ||a037d6030b5ae9422fdb265f5e4880d515be9e37 ||HDFS-7489. Incorrect locking in FsVolumeList#checkDirs can hang datanodes (Noah Lorang via || ||
  ||1986ea8dd223267ced3e3aef69980b46e2fef740 ||YARN-2910. FSLeafQueue can throw ConcurrentModificationException. (Wilfred Spiegelenburg v || ||
[Hadoop Wiki] Update of Release-2.6.1-Working-Notes by VinodKumarVavilapalli
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification.

The Release-2.6.1-Working-Notes page has been changed by VinodKumarVavilapalli:
https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diff&rev1=2&rev2=3

Comment:
Added a Header

  Ordered list of commits to cherrypick

+ ||SHA1||JIRA || Status || New patch on JIRA || Applies cleanly ||
  ||946463efefec9031cacb21d5a5367acd150ef904 ||HDFS-7213. processIncrementalBlockReport performance degradation. Contributed by Eric Payn || ||
  ||842a54a5f66e76eb79321b66cc3b8820fe66c5cd ||HDFS-7235. DataNode#transferBlock should report blocks that don't exist using reportBadBlo || ||
  ||8bfef590295372a48bd447b1462048008810ee17 ||HDFS-7263. Snapshot read can reveal future bytes for appended files. Contributed by Tao Lu || ||
[Hadoop Wiki] Update of Release-2.6.1-Working-Notes by AkiraAjisaka
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification.

The Release-2.6.1-Working-Notes page has been changed by AkiraAjisaka:
https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diff&rev1=5&rev2=6

  ||43b3b43cea1f620ce66521bcc1c4b6aec264aa9a ||MAPREDUCE-6230. Fixed RMContainerAllocator to update the new AMRMToken service name proper || ||
  ||c428d303f67bef3a7df12153947c6b0199a0938b ||HDFS-7707. Edit log corruption due to delayed block removal again. Contributed by Yongjun || ||
  ||5807afed0a4f08b6b2acd88f424bace506e03707 ||HDFS-7733. NFS: readdir/readdirplus return null directory attribute on failure. (Contribut || ||
- ||3d36d4737c160d7dc8829e9dd6b801ef6726c0c0 ||HADOOP-11506. Configuration variable expansion regex expensive for long values. (Gera Sheg || ||
+ ||3d36d4737c160d7dc8829e9dd6b801ef6726c0c0 ||HADOOP-11506. Configuration variable expansion regex expensive for long values. (Gera Sheg ||Applied locally || ||Yes ||
  ||b1aad1d941c5855e5dd5f5e819e9218d8fc266ee ||MAPREDUCE-6237. Multiple mappers with DBInputFormat don't work because of reusing conectio || ||
  ||61466809552f96a83aa19446d4d59cecd0d2cad5 ||YARN-3094. Reset timer for liveness monitors after RM recovery. Contributed by Jun Gong (c || ||
  ||a1bf7aecf7d018c5305fa3bd7a9e3ef9af3155c1 ||HDFS-7714. Simultaneous restart of HA NameNodes and DataNode can cause DataNode to registe || ||
  ||fd75b8c9cadd069673afc80a0fc5661d779897bd ||YARN-2246. Made the proxy tracking URL always be http(s)://proxy addr:port/proxy/appId t || ||
  ||ba18adbb27c37a8fa92223a412ce65eaa462d18b ||YARN-3207. Secondary filter matches entites which do not have the key being filtered for. || ||
- ||6c01e586198a3c3ebaa7561778c124ae62553246 ||HADOOP-11295. RPC Server Reader thread can't shutdown if RPCCallQueue is full. Contributed || ||
+ ||6c01e586198a3c3ebaa7561778c124ae62553246 ||HADOOP-11295. RPC Server Reader thread can't shutdown if RPCCallQueue is full. Contributed ||Applied locally ||Yes ||No ||
  ||b9157f92fc3e008e4f3029f8feeaf6acb52eb76f ||HDFS-7788. Post-2.6 namenode may not start up with an image containing inodes created with || ||
  ||187e081d5a8afe1ddfe5d7b5e7de7a94512aa53e ||HADOOP-11604. Prevent ConcurrentModificationException while closing domain sockets during || ||
  ||fefeba4ac8bed44ce2dd0d3c4f0a99953ff8d4df ||YARN-3238. Connection timeouts to nodemanagers are retried at multiple levels. Contributed || ||
[Hadoop Wiki] Update of Release-2.6.1-Working-Notes by AkiraAjisaka
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification.

The Release-2.6.1-Working-Notes page has been changed by AkiraAjisaka:
https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diff&rev1=4&rev2=5

  ||8d8eb8dcec94e92d94eedef883cdece8ba333087 ||HDFS-7425. NameNode block deletion logging uses incorrect appender. Contributed by Chris N || ||
  ||946df98dce18975e37a6a14744ca7a5429f019ce ||HDFS-4882. Prevent the Namenode's LeaseManager from looping forever in checkLeases (Ravi P || ||
  ||ae35b0e14d3438237f4b5d3b5d5268d45e549846 ||YARN-2906. CapacitySchedulerPage shows HTML tags for a queue's Active Users. Contributed b || ||
- ||f6d1bf5ed1cf647d82e676df15587de42b1faa42 ||HADOOP-11333. Fix deadlock in DomainSocketWatcher when the notification pipe is full (zhao || || ||Yes ||
+ ||f6d1bf5ed1cf647d82e676df15587de42b1faa42 ||HADOOP-11333. Fix deadlock in DomainSocketWatcher when the notification pipe is full (zhao ||Applied locally || ||Yes ||
  ||38ea1419f60d2b8176dba4931748f1f0e52ca84e ||YARN-2905. AggregatedLogsBlock page can infinitely loop if the aggregated log file is corr || ||
  ||d21ef79707a0f32939d9a5af4fed2d9f5fe6f2ec ||YARN-2890. MiniYARNCluster should start the timeline server based on the configuration. Co || ||
  ||06552a15d5172a2b0ad3d61aa7f9a849857385aa ||HDFS-7446. HDFS inotify should have the ability to determine what txid it has read up to ( || ||
  ||d6f3d4893d750f19dd8c539fe28eecfab2a54576 ||YARN-2894. Fixed a bug regarding application view acl when RM fails over. Contributed by R || ||
  ||25be97808b99148412c0efd4d87fc750db4d6607 ||YARN-2874. Dead lock in DelegationTokenRenewer which blocks RM to execute any further apps || ||
- ||dabdd2d746d1e1194c124c5c7fe73fcc025e78d2 ||HADOOP-11343. Overflow is not properly handled in caclulating final iv for AES CTR. Contri || || ||Yes ||
+ ||dabdd2d746d1e1194c124c5c7fe73fcc025e78d2 ||HADOOP-11343. Overflow is not properly handled in caclulating final iv for AES CTR. Contri ||Applied locally || ||Yes ||
- ||deaa172e7a2ab09656cc9eb431a3e68a73e0bd96 ||HADOOP-11368. Fix SSLFactory truststore reloader thread leak in KMSClientProvider. Contrib || ||
+ ||deaa172e7a2ab09656cc9eb431a3e68a73e0bd96 ||HADOOP-11368. Fix SSLFactory truststore reloader thread leak in KMSClientProvider. Contrib ||Applied locally || ||Yes ||
  ||a037d6030b5ae9422fdb265f5e4880d515be9e37 ||HDFS-7489. Incorrect locking in FsVolumeList#checkDirs can hang datanodes (Noah Lorang via || ||
  ||1986ea8dd223267ced3e3aef69980b46e2fef740 ||YARN-2910. FSLeafQueue can throw ConcurrentModificationException. (Wilfred Spiegelenburg v || ||
  ||e4f9ddfdbcdab1eacf568e8689448ffc10bbc2aa ||HDFS-7503. Namenode restart after large deletions can cause slow processReport (Arpit Agar || ||
  ||41f0d20fcb4fec5b932b8947a44f93345205222c ||YARN-2917. Fixed potential deadlock when system.exit is called in AsyncDispatcher. Contrib || ||
- ||b521d91c0f5b6d114630b6727f6a01db56dba4f1 ||HADOOP-11238. Update the NameNode's Group Cache in the background when possible (Chris Li || ||
+ ||b521d91c0f5b6d114630b6727f6a01db56dba4f1 ||HADOOP-11238. Update the NameNode's Group Cache in the background when possible (Chris Li ||Applied locally || ||Yes ||
  ||7c7bccfc31ede6f4afc043d81c8204046148c02e ||MAPREDUCE-6166. Reducers do not validate checksum of map outputs when fetching directly to || ||
  ||2d832ad2eb87e0ce7c50899c54d05f612666518a ||YARN-2964. FSLeafQueue#assignContainer - document the reason for using both write and read || ||
  ||dda1fc169db2e69964cca746be4ff8965eb8b56f ||HDFS-7531. Improve the concurrent access on FsVolumeList (Lei Xu via Colin P. McCabe) (che || ||

@@ -42, +42 @@

  ||4b589e7cfa27bd042e228bbbcf1c3b75b2aeaa57 ||HDFS-7182. JMX metrics aren't accessible when NN is busy. Contributed by Ming Ma. || ||
  ||75e4e55e12b2faa521af7c23fddcba06a9ce661d ||HDFS-7596. NameNode should prune dead storages from storageMap. Contributed by Arpit Agarw || ||
  ||33534a0c9aef5024aa6f340e7ee24930c8fa8ed5 ||HDFS-7533. Datanode sometimes does not shutdown on receiving upgrade shutdown command. Con || ||
- ||2e4df8710435c8362506fe944a935e74ad5919c0 ||HADOOP-11350. The size of header buffer of HttpServer is too small when HTTPS is enabled. || ||
+ ||2e4df8710435c8362506fe944a935e74ad5919c0 ||HADOOP-11350. The size of header buffer of HttpServer is too small when HTTPS is enabled. ||Applied locally || ||Yes ||
  ||1d9d166c0beb56aa45e65f779044905acff25d88 ||HDFS-7575. Upgrade should generate a unique storage ID for each volume. (Contributed by Ar || ||
- ||7b69719455a1a374c9417417ef0c8d7ba6bf593f ||HADOOP-11482. Use correct UGI when KMSClientProvider is called by a proxy user. Contribute || ||
+ ||7b69719455a1a374c9417417ef0c8d7ba6bf593f ||HADOOP-11482. Use correct UGI when KMSClientProvider is called by a proxy user. Contribute ||Applied locally || ||Yes ||
[Hadoop Wiki] Update of "Release-2.6.1-Working-Notes" by AkiraAjisaka
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The Release-2.6.1-Working-Notes page has been changed by AkiraAjisaka: https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diffrev1=6rev2=7 ||ba18adbb27c37a8fa92223a412ce65eaa462d18b ||YARN-3207. Secondary filter matches entites which do not have the key being filtered for. || || ||6c01e586198a3c3ebaa7561778c124ae62553246 ||HADOOP-11295. RPC Server Reader thread can't shutdown if RPCCallQueue is full. Contributed ||Applied locally ||Yes ||No || ||b9157f92fc3e008e4f3029f8feeaf6acb52eb76f ||HDFS-7788. Post-2.6 namenode may not start up with an image containing inodes created with || || - ||187e081d5a8afe1ddfe5d7b5e7de7a94512aa53e ||HADOOP-11604. Prevent ConcurrentModificationException while closing domain sockets during || || + ||187e081d5a8afe1ddfe5d7b5e7de7a94512aa53e ||HADOOP-11604. Prevent ConcurrentModificationException while closing domain sockets during ||Applied locally || ||Yes || ||fefeba4ac8bed44ce2dd0d3c4f0a99953ff8d4df ||YARN-3238. Connection timeouts to nodemanagers are retried at multiple levels. Contributed || || ||657a6e389b3f6eae43efb11deb6253c3b1255a51 ||HDFS-7009. Active NN and standby NN have different live nodes. Contributed by Ming Ma. (c || || ||8346427929d2b6f2fd3fa228d77f3cf596ef0306 ||HDFS-7831. Fix the starting index and end condition of the loop in FileDiffList.findEarlie || || @@ -71, +71 @@ ||888a44563819ba910dc3cc10d10ee0fb8f05db61 ||YARN-3222. Fixed NPE on RMNodeImpl#ReconnectNodeTransition when a node is reconnected with || || ||721d7b574126c4070322f70ec5b49a7b8558a4c7 ||YARN-3231. FairScheduler: Changing queueMaxRunningApps interferes with pending jobs. (Siqi || || ||0d62e948877e5d50f1b6fbe735a94ac6da5ff472 ||YARN-3242. Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for || || - ||b569c3ab1cb7e328dde822f6b2405d24b9560e3a ||HADOOP-11674. 
oneByteBuf in CryptoInputStream and CryptoOutputStream should be non static. || || + ||b569c3ab1cb7e328dde822f6b2405d24b9560e3a ||HADOOP-11674. oneByteBuf in CryptoInputStream and CryptoOutputStream should be non static. ||Applied locally || ||Yes || ||a5f3fb4dc14503bf7c454a48cf954fb0d6710de2 ||YARN-1809. Synchronize RM and TimeLineServer Web-UIs. Contributed by Zhijie Shen and Xuan || || ||994dadb9ba0a3b87b6548e6e0801eadd26554d55 ||HDFS-7885. Datanode should not trust the generation stamp provided by client. Contributed || || ||56c2050ab7c04e9741bcba9504b71e5a54d09eea ||YARN-3227. Timeline renew delegation token fails when RM user's TGT is expired. Contribute || || ||a94d23762e2cf4211fe84661eb67504c7072db49 ||YARN-3287. Made TimelineClient put methods do as the correct login context. Contributed by || || ||eefca23e8c5e474de1e25bf2ec8a5b266bbe8cfe ||HDFS-7830. DataNode does not release the volume lock when adding a volume fails. (Lei Xu v || || - ||813c93cb250d6d556604fe98845b979970bd5e18 ||HADOOP-11710. Make CryptoOutputStream behave like DFSOutputStream wrt synchronization. (Se || || + ||813c93cb250d6d556604fe98845b979970bd5e18 ||HADOOP-11710. Make CryptoOutputStream behave like DFSOutputStream wrt synchronization. (Se ||Applied locally || ||Yes || ||44aedad5ddc8069a6dba3eaf66ed54d612b21208 ||YARN-3267. Timelineserver applies the ACL rules after applying the limit on the number of || || ||af80a98ace50934284dde417efc802bb094c8b4e ||HDFS-7926. NameNode implementation of ClientProtocol.truncate(..) is not idempotent. Contr || || ||5a5b2446485531f12d37f3d4ca791672b9921872 ||HDFS-7587. Edit log corruption can happen if append fails with a quota violation. Contribu || || @@ -98, +98 @@ ||e7cbecddc3e7ca5386c71aa4deb67f133611415c ||YARN-3493. RM fails to come up with error Failed to load/recover state when mem settings || || ||3316cd4357ff6ccc4c76584813092adb1c2b4d43 ||YARN-3487. 
CapacityScheduler scheduler lock obtained unnecessarily when calling getQueue ( || || ||756c2542930756fef1cbff82056b418070f8d55f ||MAPREDUCE-6238. MR2 can't run local jobs with -libjars command options which is a regressi || || - ||1544c63602089b690e850e0e30af4589513a2371 ||HADOOP-11812. Implement listLocatedStatus for ViewFileSystem to speed up split calculation || || + ||1544c63602089b690e850e0e30af4589513a2371 ||HADOOP-11812. Implement listLocatedStatus for ViewFileSystem to speed up split calculation ||Applied locally || ||Yes || - ||a6a5d1d6b5ee76c829ba7b54a4ad619f7b986681 ||HADOOP-11730. Regression: s3n read failure recovery broken. (Takenori Sato via stevel) || || + ||a6a5d1d6b5ee76c829ba7b54a4ad619f7b986681 ||HADOOP-11730. Regression: s3n read failure recovery broken. (Takenori Sato via stevel) ||Applied locally || ||Yes || ||788b76761d5dfadf688406d50169e95401fe5d33 ||HADOOP-11802. DomainSocketWatcher thread terminates sometimes after there is an I/O error || ||
[Hadoop Wiki] Update of "Release-2.6.1-Working-Notes" by VinodKumarVavilapalli
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The Release-2.6.1-Working-Notes page has been changed by VinodKumarVavilapalli: https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diffrev1=9rev2=10 Comment: Updating progress on local cherry-picks ||173664d70f0ed3b1852b6703d32e796778fb1c78 ||YARN-2964. RM prematurely cancels tokens for jobs that submit jobs (oozie). Contributed by ||Applied locally || ||Yes || ||bcaf15e2fa94db929b8cd11ed7c07085161bf950 ||HDFS-7443. Datanode upgrade to BLOCKID_BASED_LAYOUT fails if duplicate block files are pre || || ||9180d11b3bbb2a49127d5d25f53b38c5113bf7ea ||YARN-2952. Fixed incorrect version check in StateStore. Contributed by Rohith Sharmaks (ch ||Applied locally || ||No, minor merge conflict || - ||8b398a66ca3728f47363fc8b2fcf7e556e6bbf5a ||YARN-2340. Fixed NPE when queue is stopped during RM restart. Contributed by Rohith Sharma || || + ||8b398a66ca3728f47363fc8b2fcf7e556e6bbf5a ||YARN-2340. Fixed NPE when queue is stopped during RM restart. Contributed by Rohith Sharma ||Applied locally || ||Yes || - ||ca0349b87ab1b2d0d2b9dc93de7806d26713165c ||YARN-2992. ZKRMStateStore crashes due to session expiry. Contributed by Karthik Kambatla( || || + ||ca0349b87ab1b2d0d2b9dc93de7806d26713165c ||YARN-2992. ZKRMStateStore crashes due to session expiry. Contributed by Karthik Kambatla( ||Applied locally || ||Yes || - ||c116743bdda2b1792bf872020a5e2b14d772ac60 ||YARN-2922. ConcurrentModificationException in CapacityScheduler's LeafQueue. Contributed b || || + ||c116743bdda2b1792bf872020a5e2b14d772ac60 ||YARN-2922. ConcurrentModificationException in CapacityScheduler's LeafQueue. Contributed b ||Applied locally || ||Yes || - ||e7e6173049adca2a2ae0e1231adcaca8168bec27 ||YARN-2997. Fixed NodeStatusUpdater to not send alreay-sent completed container statuses on || || + ||e7e6173049adca2a2ae0e1231adcaca8168bec27 ||YARN-2997. 
Fixed NodeStatusUpdater to not send alreay-sent completed container statuses on ||Applied locally || ||Yes || ||f0acb7c2a284db61640efee15a1648c6c26d24f5 ||HDFS-7579. Improve log reporting during block report rpc failure. Contributed by Charles L || || ||4b589e7cfa27bd042e228bbbcf1c3b75b2aeaa57 ||HDFS-7182. JMX metrics aren't accessible when NN is busy. Contributed by Ming Ma. || || ||75e4e55e12b2faa521af7c23fddcba06a9ce661d ||HDFS-7596. NameNode should prune dead storages from storageMap. Contributed by Arpit Agarw || || @@ -46, +46 @@ ||1d9d166c0beb56aa45e65f779044905acff25d88 ||HDFS-7575. Upgrade should generate a unique storage ID for each volume. (Contributed by Ar || || ||7b69719455a1a374c9417417ef0c8d7ba6bf593f ||HADOOP-11482. Use correct UGI when KMSClientProvider is called by a proxy user. Contribute ||Applied locally || ||Yes || ||24f0d56afb853a371f905c4569d82d09d89cf13e ||HDFS-7676. Fix TestFileTruncate to avoid bug of HDFS-7611. Contributed by Konstantin Shvac || || - ||8100c8a68c32978a177af9a3e6639f6de533886d ||YARN-3011. Possible IllegalArgumentException in ResourceLocalizationService might lead NM || || + ||8100c8a68c32978a177af9a3e6639f6de533886d ||YARN-3011. Possible IllegalArgumentException in ResourceLocalizationService might lead NM ||Applied locally || ||No, merge conflict in ResourceLocalizationService || - ||12522fd9cbd8da8c040a5b7bb71fcdaa256daf89 ||YARN-3103. AMRMClientImpl does not update AMRM token properly. Contributed by Jason Lowe( || || + ||12522fd9cbd8da8c040a5b7bb71fcdaa256daf89 ||YARN-3103. AMRMClientImpl does not update AMRM token properly. Contributed by Jason Lowe( ||Applied locally || ||Yes || ||e8300957a75353849219d616cfac08b08c182db2 ||HDFS-7611. deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap || || - ||43b3b43cea1f620ce66521bcc1c4b6aec264aa9a ||MAPREDUCE-6230. 
Fixed RMContainerAllocator to update the new AMRMToken service name proper || || + ||43b3b43cea1f620ce66521bcc1c4b6aec264aa9a ||MAPREDUCE-6230. Fixed RMContainerAllocator to update the new AMRMToken service name proper ||Applied locally || ||Yes || ||c428d303f67bef3a7df12153947c6b0199a0938b ||HDFS-7707. Edit log corruption due to delayed block removal again. Contributed by Yongjun || || ||5807afed0a4f08b6b2acd88f424bace506e03707 ||HDFS-7733. NFS: readdir/readdirplus return null directory attribute on failure. (Contribut || || ||3d36d4737c160d7dc8829e9dd6b801ef6726c0c0 ||HADOOP-11506. Configuration variable expansion regex expensive for long values. (Gera Sheg ||Applied locally || ||Yes || - ||b1aad1d941c5855e5dd5f5e819e9218d8fc266ee ||MAPREDUCE-6237. Multiple mappers with DBInputFormat don't work because of reusing conectio || || + ||b1aad1d941c5855e5dd5f5e819e9218d8fc266ee ||MAPREDUCE-6237. Multiple mappers with DBInputFormat don't work because of reusing conectio ||Applied locally || ||Yes || -
[Hadoop Wiki] Trivial Update of "Release-2.6.1-Working-Notes" by VinodKumarVavilapalli
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The Release-2.6.1-Working-Notes page has been changed by VinodKumarVavilapalli: https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diffrev1=8rev2=9 ||dda1fc169db2e69964cca746be4ff8965eb8b56f ||HDFS-7531. Improve the concurrent access on FsVolumeList (Lei Xu via Colin P. McCabe) (che || || ||173664d70f0ed3b1852b6703d32e796778fb1c78 ||YARN-2964. RM prematurely cancels tokens for jobs that submit jobs (oozie). Contributed by ||Applied locally || ||Yes || ||bcaf15e2fa94db929b8cd11ed7c07085161bf950 ||HDFS-7443. Datanode upgrade to BLOCKID_BASED_LAYOUT fails if duplicate block files are pre || || - ||9180d11b3bbb2a49127d5d25f53b38c5113bf7ea ||YARN-2952. Fixed incorrect version check in StateStore. Contributed by Rohith Sharmaks (ch ||Applied locally || ||Yes || + ||9180d11b3bbb2a49127d5d25f53b38c5113bf7ea ||YARN-2952. Fixed incorrect version check in StateStore. Contributed by Rohith Sharmaks (ch ||Applied locally || ||No, minor merge conflict || ||8b398a66ca3728f47363fc8b2fcf7e556e6bbf5a ||YARN-2340. Fixed NPE when queue is stopped during RM restart. Contributed by Rohith Sharma || || ||ca0349b87ab1b2d0d2b9dc93de7806d26713165c ||YARN-2992. ZKRMStateStore crashes due to session expiry. Contributed by Karthik Kambatla( || || ||c116743bdda2b1792bf872020a5e2b14d772ac60 ||YARN-2922. ConcurrentModificationException in CapacityScheduler's LeafQueue. Contributed b || ||
[Hadoop Wiki] Update of "Release-2.6.1-Working-Notes" by AkiraAjisaka
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The Release-2.6.1-Working-Notes page has been changed by AkiraAjisaka: https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diffrev1=13rev2=14 Comment: Updating progress on local cherry-picks. ||756c2542930756fef1cbff82056b418070f8d55f ||MAPREDUCE-6238. MR2 can't run local jobs with -libjars command options which is a regressi || || ||1544c63602089b690e850e0e30af4589513a2371 ||HADOOP-11812. Implement listLocatedStatus for ViewFileSystem to speed up split calculation ||Applied locally || ||Yes || ||a6a5d1d6b5ee76c829ba7b54a4ad619f7b986681 ||HADOOP-11730. Regression: s3n read failure recovery broken. (Takenori Sato via stevel) ||Applied locally || ||Yes || - ||788b76761d5dfadf688406d50169e95401fe5d33 ||HADOOP-11802. DomainSocketWatcher thread terminates sometimes after there is an I/O error ||Skip ||Yes ||No, need to commit HDFS-7915 first. If we commit HDFS-7915, we need to commit HDFS-8070 as well. || + ||788b76761d5dfadf688406d50169e95401fe5d33 ||HADOOP-11802. DomainSocketWatcher thread terminates sometimes after there is an I/O error ||Skipped ||Yes ||No, need to commit HDFS-7915 first. If we commit HDFS-7915, we need to commit HDFS-8070 as well. || ||4045c41afe440b773d006e962bf8a5eae3fdc284 ||YARN-3464. Race condition in LocalizerRunner kills localizer before localizing all resourc || || ||32dc13d907a416049bdb7deff429725bd6dbcb49 ||MAPREDUCE-6324. Fixed MapReduce uber jobs to not fail the udpate of AM-RM tokens when they || || ||58970d69de8a1662e4548cd6d4ca460dd70562f8 ||HADOOP-11491. HarFs incorrectly declared as requiring an authority. (Brahma Reddy Battula ||Applied locally || ||Yes || @@ -118, +118 @@ ||d3193fd1d7395bf3e7c8dfa70d1aec08b0f147e6 ||Move YARN-2918 from 2.8.0 to 2.7.1 (cherry picked from commit 03f897fd1a3779251023bae3582 || || ||839f81a6326b2f8b3d5183178382c1551b0bc259 ||YARN-3700. 
Made generic history service load a number of latest applications according to || || ||25db34127811fbadb9a698fa3a76e24d426fb0f6 ||HDFS-8431. hdfs crypto class not found in Windows. Contributed by Anu Engineer. (cherry p || || - ||33648268ce0f79bf51facafa3d151612e3d00ddb ||HADOOP-11934. Use of JavaKeyStoreProvider in LdapGroupsMapping causes infinite loop. Contr || || ||No, cannot apply to branch-2.6 due to java.nio package. || + ||33648268ce0f79bf51facafa3d151612e3d00ddb ||HADOOP-11934. Use of JavaKeyStoreProvider in LdapGroupsMapping causes infinite loop. Contr ||Skipped || ||No, cannot apply to branch-2.6 due to java.nio package. || ||17fb442a4c4e43105374c97fccd68dd966729a19 ||HDFS-7609. Avoid retry cache collision when Standby NameNode loading edits. Contributed by || || ||4fee8b320276bac86278e1ae0a3397592a78aa18 ||YARN-2900. Application (Attempt and Container) Not Found in AHS results in Internal Server || || ||a3734f67d35e714690ecdf21d80bce8a355381e3 ||YARN-3725. App submission via REST API is broken in secure mode due to Timeline DT service || || @@ -127, +127 @@ ||752caa95a40d899e1bf98bc907e91aec2bb57073 ||YARN-3585. NodeManager cannot exit on SHUTDOWN event triggered and NM recovery is enabled. || || ||78d626fa892415023827e35ad549636e2a83275d ||YARN-3733. Fix DominantRC#compare() does not work as expected if cluster resource is empty || || ||344b7509153cdd993218cd5104c7e5c07cd35d3c ||Add missing test file of YARN-3733 (cherry picked from commit 405bbcf68c32d8fd8a83e46e686 || || - ||80697e4f324948ec32b4cad3faccba55287be652 ||HADOOP-7139. Allow appending to existing SequenceFiles (Contributed by kanaka kumar avvaru || || + ||80697e4f324948ec32b4cad3faccba55287be652 ||HADOOP-7139. Allow appending to existing SequenceFiles (Contributed by kanaka kumar avvaru ||Applied locally || ||Yes || ||cbd11681ce8a51d187d91748b67a708681e599de ||HDFS-8480. 
Fix performance and timeout issues in HDFS-7929 by using hard-links to preserve || || ||15b1800b1289d239cbebc5cfd66cfe156d45a2d3 ||YARN-3832. Resource Localization fails on a cluster due to existing cache directories. Con || || - ||55427fb66c6d52ce98b4d68a29b592a734014c28 ||HADOOP-8151. Error handling in snappy decompressor throws invalid exceptions. Contributed || || + ||55427fb66c6d52ce98b4d68a29b592a734014c28 ||HADOOP-8151. Error handling in snappy decompressor throws invalid exceptions. Contributed ||Applied locally || ||Yes || ||0221d19f4e398c386f4ca3990b0893562aa8dacf ||YARN-3850. NM fails to read files from full disks which can lead to container logs being l || || ||516bbf1c20547dc513126df0d9f0934bb65c10c7 ||HDFS-7314. When the DFSClient lease cannot be renewed, abort open-for-write files rather t || || ||c31e3ba92132f232bd56b257f3854ffe430fbab9 ||YARN-3990. AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected || || ||d2b941f94a835f7bdde7714d21a470b505aa582b
[Hadoop Wiki] Update of "Release-2.6.1-Working-Notes" by AkiraAjisaka
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The Release-2.6.1-Working-Notes page has been changed by AkiraAjisaka: https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diffrev1=10rev2=11 Comment: Skip HADOOP-11802 because we concerned about the dependency it pulls. ||756c2542930756fef1cbff82056b418070f8d55f ||MAPREDUCE-6238. MR2 can't run local jobs with -libjars command options which is a regressi || || ||1544c63602089b690e850e0e30af4589513a2371 ||HADOOP-11812. Implement listLocatedStatus for ViewFileSystem to speed up split calculation ||Applied locally || ||Yes || ||a6a5d1d6b5ee76c829ba7b54a4ad619f7b986681 ||HADOOP-11730. Regression: s3n read failure recovery broken. (Takenori Sato via stevel) ||Applied locally || ||Yes || - ||788b76761d5dfadf688406d50169e95401fe5d33 ||HADOOP-11802. DomainSocketWatcher thread terminates sometimes after there is an I/O error || || + ||788b76761d5dfadf688406d50169e95401fe5d33 ||HADOOP-11802. DomainSocketWatcher thread terminates sometimes after there is an I/O error ||Skip ||Yes ||No, need to commit HDFS-7915 first. If we commit HDFS-7915, we need to commit HDFS-8070 as well. || ||4045c41afe440b773d006e962bf8a5eae3fdc284 ||YARN-3464. Race condition in LocalizerRunner kills localizer before localizing all resourc || || ||32dc13d907a416049bdb7deff429725bd6dbcb49 ||MAPREDUCE-6324. Fixed MapReduce uber jobs to not fail the udpate of AM-RM tokens when they || || ||58970d69de8a1662e4548cd6d4ca460dd70562f8 ||HADOOP-11491. HarFs incorrectly declared as requiring an authority. (Brahma Reddy Battula || ||
[Hadoop Wiki] Update of "Release-2.6.1-Working-Notes" by AkiraAjisaka
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The Release-2.6.1-Working-Notes page has been changed by AkiraAjisaka: https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diffrev1=11rev2=12 Comment: Add HADOOP-10786 to the list. ||842a54a5f66e76eb79321b66cc3b8820fe66c5cd ||HDFS-7235. DataNode#transferBlock should report blocks that don't exist using reportBadBlo || || ||8bfef590295372a48bd447b1462048008810ee17 ||HDFS-7263. Snapshot read can reveal future bytes for appended files. Contributed by Tao Lu || || ||ec2621e907742aad0264c5f533783f0f18565880 ||HDFS-7035. Make adding a new data directory to the DataNode an atomic operation and improv || || + ||9e63cb4492896ffb78c84e27f263a61ca12148c8 ||HADOOP-10786. Fix UGI#reloginFromKeytab on Java 8. ||Applied locally || ||Yes || ||beb184ac580b0d89351a3f3a7201da34a26db1c1 ||YARN-2856. Fixed RMAppImpl to handle ATTEMPT_KILLED event at ACCEPTED state on app recover ||Applied locally || ||Yes || ||ad140d1fc831735fb9335e27b38d2fc040847af1 ||YARN-2816. NM fail to start with NPE during container recovery. Contributed by Zhihai Xu( ||Applied locally || ||Yes || ||242fd0e39ad1c5d51719cd0f6c197166066e3288 ||YARN-2414. RM web UI: app page will crash if app is failed before any attempt has been cre ||Applied locally || ||Yes ||
[Hadoop Wiki] Update of "Release-2.6.1-Working-Notes" by VinodKumarVavilapalli
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The Release-2.6.1-Working-Notes page has been changed by VinodKumarVavilapalli: https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes Comment: Creating the list of ordered commits for 2.6.1 New page: Ordered list of commits to cherrypick ||946463efefec9031cacb21d5a5367acd150ef904 ||HDFS-7213. processIncrementalBlockReport performance degradation. Contributed by Eric Payn || || ||842a54a5f66e76eb79321b66cc3b8820fe66c5cd ||HDFS-7235. DataNode#transferBlock should report blocks that don't exist using reportBadBlo || || ||8bfef590295372a48bd447b1462048008810ee17 ||HDFS-7263. Snapshot read can reveal future bytes for appended files. Contributed by Tao Lu || || ||ec2621e907742aad0264c5f533783f0f18565880 ||HDFS-7035. Make adding a new data directory to the DataNode an atomic operation and improv || || ||beb184ac580b0d89351a3f3a7201da34a26db1c1 ||YARN-2856. Fixed RMAppImpl to handle ATTEMPT_KILLED event at ACCEPTED state on app recover || || ||ad140d1fc831735fb9335e27b38d2fc040847af1 ||YARN-2816. NM fail to start with NPE during container recovery. Contributed by Zhihai Xu( || || ||242fd0e39ad1c5d51719cd0f6c197166066e3288 ||YARN-2414. RM web UI: app page will crash if app is failed before any attempt has been cre || || ||2e15754a92c6589308ccbbb646166353cc2f2456 ||HDFS-7225. Remove stale block invalidation work when DN re-registers with different UUID. || || ||db31ef7e7f55436bbf88c6d93e2273c4463ca9f0 ||YARN-2865. Fixed RM to always create a new RMContext when transtions from StandBy to Activ || || ||8d8eb8dcec94e92d94eedef883cdece8ba333087 ||HDFS-7425. NameNode block deletion logging uses incorrect appender. Contributed by Chris N || || ||946df98dce18975e37a6a14744ca7a5429f019ce ||HDFS-4882. Prevent the Namenode's LeaseManager from looping forever in checkLeases (Ravi P || || ||ae35b0e14d3438237f4b5d3b5d5268d45e549846 ||YARN-2906. 
CapacitySchedulerPage shows HTML tags for a queue's Active Users. Contributed b || || ||f6d1bf5ed1cf647d82e676df15587de42b1faa42 ||HADOOP-11333. Fix deadlock in DomainSocketWatcher when the notification pipe is full (zhao || || ||38ea1419f60d2b8176dba4931748f1f0e52ca84e ||YARN-2905. AggregatedLogsBlock page can infinitely loop if the aggregated log file is corr || || ||d21ef79707a0f32939d9a5af4fed2d9f5fe6f2ec ||YARN-2890. MiniYARNCluster should start the timeline server based on the configuration. Co || || ||06552a15d5172a2b0ad3d61aa7f9a849857385aa ||HDFS-7446. HDFS inotify should have the ability to determine what txid it has read up to ( || || ||d6f3d4893d750f19dd8c539fe28eecfab2a54576 ||YARN-2894. Fixed a bug regarding application view acl when RM fails over. Contributed by R || || ||25be97808b99148412c0efd4d87fc750db4d6607 ||YARN-2874. Dead lock in DelegationTokenRenewer which blocks RM to execute any further apps || || ||dabdd2d746d1e1194c124c5c7fe73fcc025e78d2 ||HADOOP-11343. Overflow is not properly handled in caclulating final iv for AES CTR. Contri || || ||deaa172e7a2ab09656cc9eb431a3e68a73e0bd96 ||HADOOP-11368. Fix SSLFactory truststore reloader thread leak in KMSClientProvider. Contrib || || ||a037d6030b5ae9422fdb265f5e4880d515be9e37 ||HDFS-7489. Incorrect locking in FsVolumeList#checkDirs can hang datanodes (Noah Lorang via || || ||1986ea8dd223267ced3e3aef69980b46e2fef740 ||YARN-2910. FSLeafQueue can throw ConcurrentModificationException. (Wilfred Spiegelenburg v || || ||e4f9ddfdbcdab1eacf568e8689448ffc10bbc2aa ||HDFS-7503. Namenode restart after large deletions can cause slow processReport (Arpit Agar || || ||41f0d20fcb4fec5b932b8947a44f93345205222c ||YARN-2917. Fixed potential deadlock when system.exit is called in AsyncDispatcher. Contrib || || ||b521d91c0f5b6d114630b6727f6a01db56dba4f1 ||HADOOP-11238. Update the NameNode's Group Cache in the background when possible (Chris Li || || ||7c7bccfc31ede6f4afc043d81c8204046148c02e ||MAPREDUCE-6166. 
Reducers do not validate checksum of map outputs when fetching directly to || || ||2d832ad2eb87e0ce7c50899c54d05f612666518a ||YARN-2964. FSLeafQueue#assignContainer - document the reason for using both write and read || || ||dda1fc169db2e69964cca746be4ff8965eb8b56f ||HDFS-7531. Improve the concurrent access on FsVolumeList (Lei Xu via Colin P. McCabe) (che || || ||173664d70f0ed3b1852b6703d32e796778fb1c78 ||YARN-2964. RM prematurely cancels tokens for jobs that submit jobs (oozie). Contributed by || || ||bcaf15e2fa94db929b8cd11ed7c07085161bf950 ||HDFS-7443. Datanode upgrade to BLOCKID_BASED_LAYOUT fails if duplicate block files are pre || || ||9180d11b3bbb2a49127d5d25f53b38c5113bf7ea ||YARN-2952. Fixed incorrect version check in StateStore. Contributed by Rohith Sharmaks (ch || || ||8b398a66ca3728f47363fc8b2fcf7e556e6bbf5a ||YARN-2340. Fixed NPE when queue is stopped during RM restart. Contributed by Rohith Sharma || ||
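The columns in the tables above record, for each commit in the ordered list, whether the cherry-pick onto the local release branch applied cleanly ("Applied locally" / "Yes" / "No, merge conflict in …"). A minimal self-contained sketch of that kind of workflow, using a hypothetical commit and a throwaway repository rather than the release managers' actual commands:

```shell
# Sketch of the cherry-pick bookkeeping behind the "Applied locally" column.
# The repo, branch contents, and HDFS-0000 commit are all hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -qb main
git config user.email rm@example.com
git config user.name "Release Manager"

echo base > file.txt
git add file.txt
git commit -qm "base"
git branch branch-2.6                     # release branch that receives the fix

echo fix >> file.txt
git add file.txt
git commit -qm "HDFS-0000. Hypothetical fix on trunk"
fix_hash=$(git rev-parse HEAD)

git checkout -q branch-2.6
# -x appends a "(cherry picked from commit ...)" note to the message,
# as seen in several entries above.
if git cherry-pick -x "$fix_hash" >/dev/null 2>&1; then
  echo "$fix_hash applied cleanly"
else
  git cherry-pick --abort
  echo "$fix_hash needs manual conflict resolution"
fi
```

Here the pick applies cleanly because branch-2.6 has not diverged from the commit's parent; entries marked "No, merge conflict" correspond to the else branch, where the conflict is resolved by hand before re-committing.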
[Hadoop Wiki] Trivial Update of "Release-2.6.1-Working-Notes" by VinodKumarVavilapalli
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The Release-2.6.1-Working-Notes page has been changed by VinodKumarVavilapalli: https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes?action=diffrev1=1rev2=2 Comment: Removed HDFS-7916 from the list ||fe693b72dec703ecbf4ab3919d61d06ea8735a9e ||HDFS-7884. Fix NullPointerException in BlockSender when the generation stamp provided by t || || ||2f46ee50bd4efc82ba3d30bd36f7637ea9d9714e ||HDFS-7960. The full block report should prune zombie storages even if they're not empty. C || || ||c4cedfc1d601127430c70ca8ca4d4e2ee2d1003d ||HDFS-7742. Favoring decommissioning node for replication can cause a block to stay underre || || - ||beb0fd0d601aff0ba993c2d48b83fe52edfb9065 ||HDFS-7916. 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for infini || || ||cacadea632f7ab6fe4fdb1432e1a2c48e8ebd55f ||MAPREDUCE-6303. Read timeout when retrying a fetch error can be fatal to a reducer. Contri || || ||a827089905524e10638c783ba908a895d621911d ||HDFS-7999. FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a very || || ||f0324738c9db4f45d2b1ec5cfb46c5f2b7669571 ||HDFS-8072. Reserved RBW space is not released if client terminates while writing block. (A || || @@ -98, +97 @@ ||e7cbecddc3e7ca5386c71aa4deb67f133611415c ||YARN-3493. RM fails to come up with error Failed to load/recover state when mem settings || || ||3316cd4357ff6ccc4c76584813092adb1c2b4d43 ||YARN-3487. CapacityScheduler scheduler lock obtained unnecessarily when calling getQueue ( || || ||756c2542930756fef1cbff82056b418070f8d55f ||MAPREDUCE-6238. MR2 can't run local jobs with -libjars command options which is a regressi || || - ||961051e569151d68d90f91055c8678c034c20207 ||HDFS-7916. 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for infini || || ||1544c63602089b690e850e0e30af4589513a2371 ||HADOOP-11812. 
Implement listLocatedStatus for ViewFileSystem to speed up split calculation || || ||a6a5d1d6b5ee76c829ba7b54a4ad619f7b986681 ||HADOOP-11730. Regression: s3n read failure recovery broken. (Takenori Sato via stevel) || || ||788b76761d5dfadf688406d50169e95401fe5d33 ||HADOOP-11802. DomainSocketWatcher thread terminates sometimes after there is an I/O error || || @@ -106, +104 @@ ||32dc13d907a416049bdb7deff429725bd6dbcb49 ||MAPREDUCE-6324. Fixed MapReduce uber jobs to not fail the udpate of AM-RM tokens when they || || ||58970d69de8a1662e4548cd6d4ca460dd70562f8 ||HADOOP-11491. HarFs incorrectly declared as requiring an authority. (Brahma Reddy Battula || || ||87c2d915f1cc799cb4020c945c04d3ecb82ee963 ||MAPREDUCE-5649. Reduce cannot use more than 2G memory for the final merge. Contributed by || || - ||01bdfd794cf460ae0a399649eaae54676d101214 ||HDFS-7916. 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for infini || || ||e68e8b3b5cff85bfd8bb5b00b9033f63577856d6 ||HDFS-8219. setStoragePolicy with folder behavior is different after cluster restart. (sure || || ||4e1f2eb3955a97a70cf127dc97ae49201a90f5e0 ||HDFS-7980. Incremental BlockReport will dramatically slow down namenode startup. Contribu || || ||d817fbb34d6e34991c6e512c20d71387750a98f4 ||YARN-2918. RM should not fail on startup if queue's configured labels do not exist in clus || || ||802a5775f3522c57c60ae29ecb9533dbbfecfe76 ||HDFS-7894. Rolling upgrade readiness is not updated in jmx until query command is issued. || || ||f264a5aeede7e144af11f5357c7f901993de8e12 ||HDFS-8245. Standby namenode doesn't process DELETED_BLOCK if the addblock request is in ed || || - ||50778f9d458443c2cbfead4502df4e1204c4d567 ||HDFS-7916. 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for infini || || ||fb5b0ebb459cc8812084090a7ce7ac29e2ad147c ||MAPREDUCE-6361. NPE issue in shuffle caused by concurrent issue between copySucceeded() in || || ||a81ad814610936a02e55964fbe08f7b33fe29b23 ||YARN-3641. 
NodeManager: stopRecoveryStore() shouldn't be skipped when exceptions happen in || || ||802676e1be350785d8c0ad35f6676eeb85b2467b ||YARN-3526. ApplicationMaster tracking URL is incorrectly redirected on a QJM cluster. Cont || || @@ -139, +135 @@ ||d2b941f94a835f7bdde7714d21a470b505aa582b ||HDFS-8850. VolumeScanner thread exits with exception if there is no block pool to be scann || || ||5950c1f6f8ac6f514f8d2e8bfbd1f71747b097de ||HADOOP-11932. MetricsSinkAdapter may hang when being stopped. Contributed by Brahma Reddy || || -
[Hadoop Wiki] Update of "Books" by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The Books page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diffrev1=24rev2=25 # Please do not go overboard in exaggerating the outcome of reading a book, readers of this book will become experts in advanced production-scale Hadoop MapReduce jobs. Such claims will be edited out. }}} + === Learning Hadoop 2 === + + '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/learning-hadoop/?utm_source=PODutm_medium=referralutm_campaign=1783285516|Learning Hadoop 2]] + + '''Authors:''' Garry Turkington, Gabriele Modena + + '''Publisher:''' Packt Publishing + + '''Date of Publishing:''' February, 2015 + + Learning Hadoop 2 is an introduction guide to building data-processing applications with the wide variety of tools supported by Hadoop 2. + + === Hadoop MapReduce v2 Cookbook - Second Edition === + + '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/hadoop-mapreduce-v2-cookbook-second-edition/?utm_source=PODutm_medium=referralutm_campaign=1783285478|Hadoop MapReduce v2 Cookbook - Second Edition]] + + '''Authors:''' Thilina Gunarathne + + '''Publisher:''' Packt Publishing + + '''Date of Publishing:''' February, 2015 + + Hadoop MapReduce v2 Cookbook - Second Edition is a beginner's guide to explore the Hadoop MapReduce v2 ecosystem to gain insights from very large datasets. 
+ === Scaling Big Data with Hadoop and Solr - Second Edition ===
+ '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/scaling-big-data-hadoop-and-solr-second-edition/?utm_source=POD&utm_medium=referral&utm_campaign=1783553391|Scaling Big Data with Hadoop and Solr - Second Edition]]
+ '''Authors:''' Hrishikesh Vijay Karambelkar
+ '''Hadoop Version:''' 2.6
+ '''Publisher:''' Packt Publishing
+ '''Date of Publishing:''' April, 2015
+ Scaling Big Data with Hadoop and Solr - Second Edition is aimed at developers, designers, and architects who would like to build big data enterprise search solutions for their customers or organizations.
+ === Hadoop for Finance Essentials ===
+ '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/hadoop-finance-essentials/?utm_source=POD&utm_medium=referral&utm_campaign=1784395161|Hadoop for Finance Essentials]]
+ '''Authors:''' Rajiv Tiwari
+ '''Publisher:''' Packt Publishing
+ '''Date of Publishing:''' April, 2015
+ Hadoop for Finance Essentials is for developers who would like to perform big data analytics with Hadoop for the financial sector.
+ === Monitoring Hadoop ===
'''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/monitoring-hadoop/?utm_source=POD&utm_medium=referral&utm_campaign=1783281553|Monitoring Hadoop]]
[Hadoop Wiki] Update of Books by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The Books page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff&rev1=20&rev2=21 }}}

+ === Hadoop Essentials ===
+ '''Name:''' [[https://www.packtpub.com/big-data-and-business-intelligence/hadoop-essentials/?utm_source=POD&utm_medium=referral&utm_campaign=1784396680|Hadoop Essentials]]
+ '''Authors:''' Shiva Achari
+ '''Hadoop Version:''' 2.6
+ '''Publisher:''' Packt Publishing
+ '''Date of Publishing:''' April 29, 2015
+ Hadoop Essentials explains the key concepts of Hadoop and gives a thorough understanding of the Hadoop ecosystem.

== Hadoop in Practice, Second Edition ==
'''Name:''' [[http://www.manning.com/holmes2/|Hadoop in Practice, Second Edition]]
[Hadoop Wiki] Update of Books by Packt Publishing
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The Books page has been changed by Packt Publishing: https://wiki.apache.org/hadoop/Books?action=diff&rev1=23&rev2=24

'''Date of Publishing:''' April 28, 2015
- Monitoring Hadoop is for Hadoop administrators who need to learn how to monitor and diagnose their clusters.
+ Monitoring Hadoop is for Hadoop administrators who want to learn how to monitor and diagnose their clusters.

=== Hadoop Backup and Recovery Solutions ===
[Hadoop Wiki] Update of PoweredBy by DavidTing
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The PoweredBy page has been changed by DavidTing: https://wiki.apache.org/hadoop/PoweredBy?action=diff&rev1=436&rev2=437

* ''Each (commodity) node has 8 cores and 12 TB of storage. ''
* ''We are heavy users of both streaming as well as the Java APIs. We have built a higher level data warehousing framework using these features called Hive (see the http://hadoop.apache.org/hive/). We have also developed a FUSE implementation over HDFS. ''
- * ''[[https://fnews.com/|fnews]] ''
+ * ''[[http://www.follownews.com/|FollowNews]] ''
* ''We use Hadoop for storing logs, news analysis, tag analysis. ''
* ''[[http://www.foxaudiencenetwork.com|FOX Audience Network]] ''
[Hadoop Wiki] Update of ContributorsGroup by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The ContributorsGroup page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/ContributorsGroup?action=diff&rev1=110&rev2=111

Comment: adding Packt Publishing

* OwenOMalley
* OlivierLamy
* Pacoffre
+ * Packt Publishing
* parthpatil
* PatrickHunt
* PatrickKling
[Hadoop Wiki] Update of SocketTimeout by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The SocketTimeout page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/SocketTimeout?action=diff&rev1=5&rev2=6

Comment: add that long haul networks can trigger transient outages

* The remote machine crashing. This cannot easily be distinguished from a network partitioning.
* A change in the firewall settings of one of the machines preventing communication.
* The settings are wrong and the client is trying to talk to the wrong machine, one that is not on the network. That could be an error in Hadoop configuration files, or an entry in the DNS tables or the /etc/hosts file.
+ * If it's over a long-haul network (i.e. out of cluster), it may be a transient failure due to the network playing up.
- * If using a client of an object store such as the Amazon S3 and OpenStack Swift clients, socket timeouts may be caused by remote-throttling of client requests: your program is making too many PUT/DELETE requests and is being deliberately blocked by the far end. This is most likely to happen when creating many small files, or performing bulk deletes (e.g. deleting a directory with many child entries).
+ * If using a client of an object store such as the Amazon S3 and OpenStack clients, socket timeouts may be caused by remote-throttling of client requests: your program is making too many PUT/DELETE requests and is being deliberately blocked by the far end. This is most likely to happen when creating many small files, or performing bulk deletes (e.g. deleting a directory with many child entries). It can also arise from a transient failure of the long-haul link.

Comparing this exception to the ConnectionRefused error, the latter indicates there is a server at the far end, but no program running on it can receive inbound connections on the chosen port.
A Socket Timeout usually means that there is something there, but it or the network is not working right.

1. Can you telnet to the target host and port?
1. Can you telnet to the target host and port from any other machine?
1. On the target machine, can you telnet to the port using localhost as the hostname? If this works but external network connections time out, it's usually a firewall issue.
- 1. If it is a remote object store: is the address correct? Does it only happen on bulk operations? If the latter, it's probably due to throttling at the far end.
+ 1. If it is a remote object store: is the address correct? Does it go away when you repeat the operation? Does it only happen on bulk operations? If the latter, it's probably due to throttling at the far end.

Remember: These are [[YourNetworkYourProblem|your network configuration problems]]. Only you can fix them.
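The read-timeout behaviour described above can be reproduced in a few lines. This is a minimal sketch (not part of the wiki page) using Python's standard socket module: a local server accepts the TCP connection but never sends a byte, so the client's read times out exactly the way a hung or throttled far end would.

```python
import socket

# A listening socket whose backlog completes the TCP handshake,
# but which never writes anything back to the client.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
server.listen(1)
host, port = server.getsockname()

# The timeout carries over to reads on the connected socket.
client = socket.create_connection((host, port), timeout=0.5)
try:
    client.recv(1)                     # blocks: nothing will ever arrive
    timed_out = False
except socket.timeout:                 # the SocketTimeoutException analogue
    timed_out = True
finally:
    client.close()
    server.close()
```

The same distinction from ConnectionRefused holds here: the connection itself succeeds (something is listening), and only the subsequent read stalls.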
[Hadoop Wiki] Update of BindException by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The BindException page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/BindException?action=diff&rev1=7&rev2=8

Comment: +add the caveat that it's a local config problem

As you cannot have more than one process listening on a TCP port, whatever is listening is stopping the service coming up. You will need to track down and stop that process, or change the service you are trying to start up to listen to a different port.

How to track down the problem

+ 1. Identify which host/IP address the program is trying to use.
1. Make sure the hostname is valid: try to ping it; use {{{ifconfig}}} to list the network interfaces and their IP addresses.
1. Make sure the hostname/IP address is one belonging to the host in question.
1. As root use {{{netstat -a -t --numeric-ports -p}}} to list the ports that are in use by number and process. (On OS/X you need to use {{{lsof}}}).
1. Change the configuration of one of the programs to listen on a different port.
+ Finally, this is not a Hadoop problem; it is a host, network or Hadoop configuration problem. As it is your cluster, [[YourNetworkYourProblem|only you can find out and track down the problem]]. Sorry.
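The "port already in use" case behind a BindException can be demonstrated directly. This is a hypothetical sketch, not from the wiki page, using Python's standard library: a second bind to a port that is already held fails with EADDRINUSE, which is what surfaces in Java as java.net.BindException.

```python
import errno
import socket

# First process: grab a free port and listen on it.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))           # port 0: the OS picks a free port
first.listen(1)
port = first.getsockname()[1]

# Second process: try to bind the very same address and port.
second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("127.0.0.1", port))   # must fail: the port is taken
    in_use = False
except OSError as e:
    in_use = (e.errno == errno.EADDRINUSE)
finally:
    second.close()
    first.close()
```

Note the contrast with the port-0 case described above: when a service asks for port 0 the OS hands out any free port, so a collision like this is almost never the cause.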
[Hadoop Wiki] Update of FAQ by GautamGopalakrishnan
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The FAQ page has been changed by GautamGopalakrishnan: https://wiki.apache.org/hadoop/FAQ?action=diff&rev1=113&rev2=114

== What does the message Operation category READ/WRITE is not supported in state standby mean? ==

In an HA-enabled cluster, DFS clients cannot know in advance which namenode is active at a given time. So when a client contacts a namenode and it happens to be the standby, the READ or WRITE operation will be refused and this message is logged. The client will then automatically contact the other namenode and try the operation again. As long as there is one active and one standby namenode in the cluster, this message can be safely ignored.

- If an application is configured to contact only one namenode always, this message indicates that the application is failing to perform any read/write operation. In such situations, the application would need to be modified to contact the other namenode.
+ If an application is configured to contact only one namenode always, this message indicates that the application is failing to perform any read/write operation. In such situations, the application would need to be modified to use the HA configuration for the cluster.

= Platform Specific =
== General ==
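The failover behaviour described in the FAQ entry above can be sketched in a few lines. Everything here is hypothetical stand-in code (StandbyException, make_namenode, ha_call are invented names, not HDFS APIs): the client tries each configured namenode in turn and fails over when one reports standby.

```python
class StandbyException(Exception):
    """Stand-in for 'Operation category READ is not supported in state standby'."""

def make_namenode(state):
    """Build a fake namenode: a callable that refuses ops unless active."""
    def handle(op):
        if state != "active":
            raise StandbyException(
                "Operation category READ is not supported in state standby")
        return f"{op} ok"
    return handle

def ha_call(namenodes, op):
    """Try each configured namenode in turn, failing over on standby refusals."""
    for nn in namenodes:
        try:
            return nn(op)
        except StandbyException:
            continue          # the refusal is logged and the client moves on
    raise RuntimeError("no active namenode found")

# First namenode is standby, second is active: the call still succeeds.
result = ha_call([make_namenode("standby"), make_namenode("active")], "read")
```

This is why the logged message is harmless in a healthy cluster: the refusal from the standby is simply the first leg of the retry loop.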
[Hadoop Wiki] Trivial Update of ContributorsGroup by QwertyManiac
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The ContributorsGroup page has been changed by QwertyManiac: https://wiki.apache.org/hadoop/ContributorsGroup?action=diff&rev1=109&rev2=110

Comment: Adding Gautam (needs to edit FAQ)

* GabrielReid
* Garrett Wu
* GaryHelmling
+ * GautamGopalakrishnan
* GavinMcDonald
* geisbruch
* GeorgePorter
[Hadoop Wiki] Update of FAQ by GautamGopalakrishnan
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The FAQ page has been changed by GautamGopalakrishnan: https://wiki.apache.org/hadoop/FAQ?action=diff&rev1=112&rev2=113

Comment: Added 3.17 (Operation category READ/WRITE is not supported in state standby)

See ConnectionRefused

== Why is the 'hadoop.tmp.dir' config default user.name dependent? ==
+ We need a directory that a user can write and also not to interfere with other users. If we didn't include the username, then different users would share the same tmp directory. This can cause authorization problems, if folks' default umask doesn't permit write by others. It can also result in folks stomping on each other, when they're, e.g., playing with HDFS and re-format their filesystem.
- We need a directory that a user can write and also not to interfere with other users.
- If we didn't include the username, then different users would share the same tmp directory.
- This can cause authorization problems, if folks' default umask doesn't permit write by others.
- It can also result in folks stomping on each other, when they're, e.g., playing with HDFS and
- re-format their filesystem.

== Does Hadoop require SSH? ==
Hadoop provided scripts (e.g., start-mapred.sh and start-dfs.sh) use ssh in order to start and stop the various daemons and some other utilities. The Hadoop framework in itself does not '''require''' ssh. Daemons (e.g. TaskTracker and DataNode) can also be started manually on each node without the script's help.

== What mailing lists are available for more help? ==
- A description of all the mailing lists are on the http://hadoop.apache.org/mailing_lists.html page. In general:
* general is for people interested in the administrivia of Hadoop (e.g., new release discussion).
* -dev mailing lists are for people who are changing the source code of the framework.
For example, if you are implementing a new file system and want to know about the FileSystem API, hdfs-dev would be the appropriate mailing list.

== What does NFS: Cannot create lock on (some dir) mean? ==
- This actually is not a problem with Hadoop, but represents a problem with the setup of the environment it is operating.
Usually, this error means that the NFS server to which the process is writing does not support file system locks. NFS prior to v4 requires a locking service daemon to run (typically rpc.lockd) in order to provide this functionality. NFSv4 has file system locks built into the protocol.

Hadoop currently does not have a method by which to do this automatically. To do this manually:
1. Shutdown the DataNode involved
- 2. Use the UNIX mv command to move the individual block replica and meta pairs from one directory to another on the selected host. On releases which have HDFS-6482 (Apache Hadoop 2.6.0+) you also need to ensure the subdir-named directory structure remains exactly the same when moving the blocks across the disks. For example, if the block replica and its meta pair were under '''/data/1'''/dfs/dn/current/BP-1788246909-172.23.1.202-1412278461680/current/finalized'''/subdir0/subdir1'''/, and you wanted to move it to /data/5/ disk, then it MUST be moved into the same subdirectory structure underneath that, i.e. '''/data/5'''/dfs/dn/current/BP-1788246909-172.23.1.202-1412278461680/current/finalized'''/subdir0/subdir1/'''. If this is not maintained, the DN will no longer be able to locate the replicas after the move.
+ 1. Use the UNIX mv command to move the individual block replica and meta pairs from one directory to another on the selected host. On releases which have HDFS-6482 (Apache Hadoop 2.6.0+) you also need to ensure the subdir-named directory structure remains exactly the same when moving the blocks across the disks.
For example, if the block replica and its meta pair were under '''/data/1'''/dfs/dn/current/BP-1788246909-172.23.1.202-1412278461680/current/finalized'''/subdir0/subdir1'''/, and you wanted to move it to /data/5/ disk, then it MUST be moved into the same subdirectory structure underneath that, i.e. '''/data/5'''/dfs/dn/current/BP-1788246909-172.23.1.202-1412278461680/current/finalized'''/subdir0/subdir1/'''. If this is not maintained, the DN will no longer be able to locate the replicas after the move.
- 3. Restart the DataNode.
+ 1. Restart the DataNode.

== What does file could only be replicated to 0 nodes, instead of 1 mean? ==
The NameNode does not have any available !DataNodes. This can be caused by a wide variety of reasons. Check the DataNode logs, the NameNode logs, network connectivity, ... Please see the page: CouldOnlyBeReplicatedTo

No. This is why it is very important to configure [[http://hadoop.apache.org/hdfs/docs/current/hdfs-default.html#dfs.namenode.name.dir|dfs.namenode.name.dir]]
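The manual block-move steps above hinge on one rule: the replica's path relative to the finalized/ directory must be identical on the destination disk. This is a hypothetical sketch (temp directories and the blk_ filename are made up for illustration) of that rule in Python:

```python
import os
import shutil
import tempfile

# The subdir path relative to each disk's datanode directory must be
# preserved, or the DataNode cannot locate the replica after the move.
REL = "current/BP-1788246909-172.23.1.202-1412278461680/current/finalized/subdir0/subdir1"

root = tempfile.mkdtemp()
src_dn = os.path.join(root, "data", "1", "dfs", "dn")   # stand-in for /data/1
dst_dn = os.path.join(root, "data", "5", "dfs", "dn")   # stand-in for /data/5

# Create a fake block replica on the source disk.
os.makedirs(os.path.join(src_dn, REL))
block = os.path.join(src_dn, REL, "blk_1073741825")     # hypothetical block name
open(block, "w").close()

# Recreate the SAME subdir structure on the target disk, then move.
os.makedirs(os.path.join(dst_dn, REL), exist_ok=True)
shutil.move(block, os.path.join(dst_dn, REL, "blk_1073741825"))

moved = os.path.exists(os.path.join(dst_dn, REL, "blk_1073741825"))
shutil.rmtree(root)
```

A real move would also carry the matching .meta file and only run while the DataNode is shut down, as the steps above require.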
[Hadoop Wiki] Update of HowToContribute by SomeOtherAccount
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The HowToContribute page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/HowToContribute?action=diff&rev1=110&rev2=111

In order to create a patch, type (from the base directory of hadoop):
{{{
- git diff --no-prefix trunk > HADOOP-1234.patch
+ git diff trunk > HADOOP-1234.patch
}}}
This will report all modifications done on Hadoop sources on your local disk and save them into the ''HADOOP-1234.patch'' file. Read the patch file. Make sure it includes ONLY the modifications required to fix a single issue.
[Hadoop Wiki] Update of ContributorsGroup by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The ContributorsGroup page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/ContributorsGroup?action=diff&rev1=108&rev2=109

Comment: add bibinchundatt

* BenConnors
* BenjaminReed
* Benipal Technologies
+ * bibinchundatt
* BradfordStephens
* BrandonHays
* BriceArnould
[Hadoop Wiki] Update of SSLException by bibinchundatt
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The SSLException page has been changed by bibinchundatt: https://wiki.apache.org/hadoop/SSLException

New page:
= SSLException =
[Hadoop Wiki] Update of SSLException by bibinchundatt
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The SSLException page has been changed by bibinchundatt: https://wiki.apache.org/hadoop/SSLException?action=diff&rev1=2&rev2=3

Indicates that the client and server could not negotiate the desired level of security

*The certificates specified in the server and client mismatch, or the certificate is not available in the JKS.
- *Recheck the truststore password and is correct or not.
+ *Recheck that the truststore password is correct.
- *Check SSL truststore location the file is not available.
+ *Check that the file is available at the SSL truststore location.

Use the command below to verify that the certificate is available in the truststore.<<BR>>
{{{keytool -list -v -keystore $ALL_JKS -storepass $CLIENT_TRUSTSTORE_PASSWORD}}}
[Hadoop Wiki] Update of SSLException by bibinchundatt
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The SSLException page has been changed by bibinchundatt: https://wiki.apache.org/hadoop/SSLException?action=diff&rev1=4&rev2=5

Probable causes for SSLException
- *The certificate specified in Server and client mismatch is happening or certificate not available in file.
+ *The certificates specified on the server and client do not match.
+ *The certificate is not available in the JKS file mentioned.
*The truststore password specified in the xml files is wrong.
- *In SSL truststore location the file is available.
+ *The file is not available at the SSL truststore location.
*Misconfiguration of the server or client SSL certificate and private key.
- *Check the hostname in certification is matching with actual server hostname.
+ *The hostname in the certificate does not match the actual server hostname.
*Common Name mismatch: the hostname in the URL you’re using for communication does not match one of the common names in the SSL certificate.
- *Expired Certificate can be a cause for SSLPeerUnverifiedException
+ *An expired certificate can be a cause of SSLPeerUnverifiedException.
*The particular cipher suite being used does not support authentication.
[Hadoop Wiki] Update of SSLException by bibinchundatt
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The SSLException page has been changed by bibinchundatt: https://wiki.apache.org/hadoop/SSLException?action=diff&rev1=3&rev2=4

= SSLException =
- Indicates some kind of error detected by an SSL subsystem.<<BR>>
+ Indicates some kind of error detected by an SSL subsystem.<<BR>>In most cases it is a misconfiguration: the keystores didn't contain the correct certificates, the certificate chain was incomplete, or the client didn't supply a valid certificate.<<BR>>

In the case of Hadoop, the SSL configuration is mainly done in core-site.xml, ssl-server.xml and ssl-client.xml

* ssl-server.xml

Each keystore file contains the private key for each certificate; the single truststore file contains all the keys of all certificates. The keystore file is used by the Hadoop HttpServer while the truststore file is used by the client HTTPS connections.

- '''SSLHandshakeException'''
- Indicates that the client and server could not negotiate the desired level of security
- *The certificate specified in Server and client mismatch is happening or certificate not available in JKS.
- *Recheck the truststore password is correct or not.
- *Check SSL truststore location the file is available.
- Use the below command to verify in truststore the certificate is available.<<BR>> {{{keytool -list -v -keystore $ALL_JKS -storepass $CLIENT_TRUSTSTORE_PASSWORD}}}
- '''SSLKeyException'''
- Reports a bad SSL key.
+ Probable causes for SSLException
+ *The certificate specified in Server and client mismatch is happening or certificate not available in file.
+ *Truststore password specified is wrong in xml files.
+ *In SSL truststore location the file is available.
- *Indicates misconfiguration of the server or client SSL certificate and private key.
+ *Misconfiguration of the server or client SSL certificate and private key.
- *Check the hostname in certification is matching with actual server hostname
+ *Check the hostname in certification is matching with actual server hostname.
*Common Name Mismatch or Host name in the URL you’re using for communication not matches one of the common names in the SSL certificate.
- '''SSLPeerUnverifiedException'''
- Indicates that the peer's identity has not been verified.
*Expired Certificate can be a cause for SSLPeerUnverifiedException
*The particular cipher suite being used does not support authentication
- *No peer authentication was established during SSL handshaking
- '''SSLProtocolException'''
- Reports an error in the operation of the SSL protocol. Normally this indicates a flaw in one of the protocol implementations.
[Hadoop Wiki] Update of BindException by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The BindException page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/BindException?action=diff&rev1=5&rev2=6

Comment: Because some people don't know what to do once they have that port/process

* The port is in use (likeliest)
* If the port number is below 1024, the OS may be preventing your program from binding to a trusted port
* If the configuration is a {{{hostname:port}}} value, it may be that the hostname is wrong -or its IP address isn't one your machine has.
+ * There is an instance of the service already running.

If the port is 0, then the OS is looking for any free port -so the port-in-use and port-below-1024 problems are highly unlikely to be the cause of the problem.

+ As you cannot have more than one process listening on a TCP port, whatever is listening is stopping the service coming up. You will need to track down and stop that process, or change the service you are trying to start up to listen to a different port.

+ How to track down the problem
1. identify which port the program is trying to bind to
- 1. as root use {{{netstat -a -t --numeric-ports -p}}} to list the ports that are in use by number and process. (On OS/X you need to use {{{lsof}}})
1. identify the port that is in use and the program that is in use
+ 1. as root use {{{netstat -a -t --numeric-ports -p}}} to list the ports that are in use by number and process. (On OS/X you need to use {{{lsof}}}).
1. Make sure the hostname is valid: try to ping it; use {{{ifconfig}}} to list the network interfaces and their IP addresses.
1. try and identify why it is in use. {{{telnet hostname port}}} and pointing a web browser at it are both good tricks.
1. change the configuration of one of the programs to listen on a different port.
[Hadoop Wiki] Trivial Update of HowToRelease by VinodKumarVavilapalli
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The HowToRelease page has been changed by VinodKumarVavilapalli: https://wiki.apache.org/hadoop/HowToRelease?action=diff&rev1=74&rev2=75

Comment: Minor updates to the notes.

tar xvf /www/www.apache.org/dist/hadoop/core/hadoop-${version}/hadoop-${version}.tar.gz
cp -rp hadoop-${version}/share/doc/hadoop publish/docs/r${version}
rm -r hadoop-${version}
+ cd publish/docs
+ # Update current2, current, stable and stable2 as needed.
+ # For example
rm current2 current
ln -s r${version} current2
ln -s current2 current
[Hadoop Wiki] Trivial Update of HowToRelease by VinodKumarVavilapalli
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The HowToRelease page has been changed by VinodKumarVavilapalli: https://wiki.apache.org/hadoop/HowToRelease?action=diff&rev1=73&rev2=74

Comment: Adding a note about tagging releases after voting finishes.

{{{
python ./dev-support/relnotes.py -v $(vers)
}}}
- If your release includes more than one version you may add additional -v options for each version. By default the previousVersion mentioned in the notes will be X.Y.Z-1; if this is not correct you can override this by setting the --previousVer option.
+ . If your release includes more than one version you may add additional -v options for each version. By default the previousVersion mentioned in the notes will be X.Y.Z-1; if this is not correct you can override this by setting the --previousVer option.

1. Update {{{releasenotes.html}}}
{{{
mv releasenotes.$(vers).html ./hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html
}}}
- Note that the script generates a set of notes for HDFS, HADOOP, MAPREDUCE, and YARN too, but only common is linked from the html documentation so the individual ones are ignored for now.
+ . Note that the script generates a set of notes for HDFS, HADOOP, MAPREDUCE, and YARN too, but only common is linked from the html documentation so the individual ones are ignored for now.

1. Commit these changes to branch-X.Y.Z
{{{
git commit -a -m "Preparing for release X.Y.Z"
}}}
{{{
git commit -a -m "Set the release date for X.Y.Z"
}}}
- 1. Tag the release:
+ 1. Tag the release. Do it from the release branch and push the created tag to the remote repository:
{{{
git tag -s release-X.Y.Z -m "Hadoop X.Y.Z release"
}}}
[Hadoop Wiki] Trivial Update of HowToRelease by VinodKumarVavilapalli
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The HowToRelease page has been changed by VinodKumarVavilapalli: https://wiki.apache.org/hadoop/HowToRelease?action=diff&rev1=72&rev2=73

Comment: Minor edit

## page was copied from HowToReleasePostMavenization
''This page is prepared for Hadoop Core committers. You need committer rights to create a new Hadoop Core release.''
- These instructions have been updated for Hadoop 2.5.1 and later releases to reflect the changes to version-control (git), build-scripts and mavenization.
+ These instructions have been updated for Hadoop 2.5.1 and later releases to reflect the changes to version-control (git), build-scripts and mavenization. Earlier versions of this document are at HowToReleaseWithSvnAndAnt and HowToReleasePostMavenization

<<TableOfContents(4)>>

= Preparation =
- 1. Bulk update Jira to unassign from this release all issues that are open non-blockers and send follow-up notification to the developer list that this was done.
+ 1. Bulk update Jira to unassign from this release all issues that are open non-blockers and send follow-up notification to the developer list that this was done.
- 1. If you have not already done so, [[http://www.apache.org/dev/release-signing.html#keys-policy|append your code signing key]] to the [[https://dist.apache.org/repos/dist/release/hadoop/common/KEYS|KEYS]] file. Once you commit your changes, they will automatically be propagated to the website. Also [[http://www.apache.org/dev/release-signing.html#keys-policy|upload your key to a public key server]] if you haven't. End users use the KEYS file (along with the [[http://www.apache.org/dev/release-signing.html#web-of-trust|web of trust]]) to validate that releases were done by an Apache committer.
For more details on signing releases, see [[http://www.apache.org/dev/release-signing.html|Signing Releases]] and [[http://www.apache.org/dev/mirror-step-by-step.html?Step-By-Step|Step-By-Step Guide to Mirroring Releases]].
+ 1. If you have not already done so, [[http://www.apache.org/dev/release-signing.html#keys-policy|append your code signing key]] to the [[https://dist.apache.org/repos/dist/release/hadoop/common/KEYS|KEYS]] file. Once you commit your changes, they will automatically be propagated to the website. Also [[http://www.apache.org/dev/release-signing.html#keys-policy|upload your key to a public key server]] if you haven't. End users use the KEYS file (along with the [[http://www.apache.org/dev/release-signing.html#web-of-trust|web of trust]]) to validate that releases were done by an Apache committer. For more details on signing releases, see [[http://www.apache.org/dev/release-signing.html|Signing Releases]] and [[http://www.apache.org/dev/mirror-step-by-step.html?Step-By-Step|Step-By-Step Guide to Mirroring Releases]].
- 1. To deploy artifacts to the Apache Maven repository create {{{~/.m2/settings.xml}}}:{{{
+ 1. To deploy artifacts to the Apache Maven repository create {{{~/.m2/settings.xml}}}:
+ {{{
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0
}}}

= Branching =
+ When releasing Hadoop X.Y.Z, the following branching changes are required. Note that a release can match more than one of the following if-conditions. For a major release, one needs to make the changes for minor and point releases as well. Similarly, a new minor release is also a new point release.
- When releasing Hadoop X.Y.Z, the following branching changes are required. Note that a release can match more than one of the following if-conditions. For a major release, one needs to make the changes for minor and point releases as well.
Similarly, a new minor release is also a new point release. + 1. Add the release X.Y.Z to CHANGES.txt files if it doesn't already exist (leave the date as unreleased for now). Commit these changes to trunk and any of branch-X, branch-X.Y if they exist. + {{{ + git commit -a -m "Adding release X.Y.Z to CHANGES.txt" + }}} + 1. If this is a new major release (i.e., Y = 0 and Z = 0) + 1. Create a new branch (branch-X) for all releases in this major release. + 1. Update the version on trunk to (X+1).0.0-SNAPSHOT + {{{ + mvn versions:set -DnewVersion=(X+1).0.0-SNAPSHOT + }}} + 1. Commit the version change to trunk. + {{{ + git commit -a -m "Preparing for (X+1).0.0 development" + }}} + 1. If this is a new minor release (i.e., Z = 0) + 1. Create a new branch (branch-X.Y) for all releases in this minor release. + 1. Update the version on branch-X to X.(Y+1).0-SNAPSHOT + {{{ + mvn versions:set -DnewVersion=X.(Y+1).0-SNAPSHOT + }}} + 1. Commit the version change to branch-X. + {{{ + git commit -a -m "Preparing for X.(Y+1).0 development" + }}} + 1. If this is a
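The version arithmetic in the branching steps above can be sketched as a small shell helper; the release number used here is just an example:

```shell
# Given a release X.Y.Z, print the -SNAPSHOT development versions that
# the branching steps above would set on trunk, branch-X, and branch-X.Y.
release=2.5.1
X=${release%%.*}
Z=${release##*.}
Y=${release#*.}; Y=${Y%%.*}
echo "trunk:      $((X + 1)).0.0-SNAPSHOT"
echo "branch-$X:   $X.$((Y + 1)).0-SNAPSHOT"
echo "branch-$X.$Y: $X.$Y.$((Z + 1))-SNAPSHOT"
```

For 2.5.1 this prints 3.0.0-SNAPSHOT, 2.6.0-SNAPSHOT, and 2.5.2-SNAPSHOT, matching the (X+1).0.0 / X.(Y+1).0 / X.Y.(Z+1) rules in the steps above.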
[Hadoop Wiki] Trivial Update of HowToRelease by VinodKumarVavilapalli
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The HowToRelease page has been changed by VinodKumarVavilapalli: https://wiki.apache.org/hadoop/HowToRelease?action=diff&rev1=71&rev2=72 Comment: Updating doc link to the website index page. 1. Commit the version change to branch-X.Y. {{{ git commit -a -m "Preparing for X.Y.(Z+1) development"}}} 1. Release branch (branch-X.Y.Z) updates: - 1. Update {{{hadoop-project/src/site/apt/index.apt.vm}}} to reflect the right versions, new features and big improvements. + 1. Update {{{hadoop-project/src/site/markdown/index.md.vm}}} to reflect the right versions, new features and big improvements. 1. Update the version on branch-X.Y.Z to X.Y.Z {{{ mvn versions:set -DnewVersion=X.Y.Z}}} 1. Generate {{{releasenotes.html}}} with release notes for this release. You generate these with: {{{
[Hadoop Wiki] Update of PoweredBy by JeanBaptisteNote
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The PoweredBy page has been changed by JeanBaptisteNote: https://wiki.apache.org/hadoop/PoweredBy?action=diff&rev1=435&rev2=436 * ''[[http://criteo.com|Criteo]] - Criteo is a global leader in online performance advertising '' * ''[[http://labs.criteo.com/blog|Criteo R&D]] uses Hadoop as a consolidated platform for storage, analytics and back-end processing, including Machine Learning algorithms '' - * ''We currently have a dedicated cluster of 850 nodes, 30PB storage, 65TB RAM, 16000 cores running full steam 24/7, and growing by the day '' + * ''We currently have a dedicated cluster of 1117 nodes, 39PB storage, 75TB RAM, 22000 cores running full steam 24/7, and growing by the day '' * ''Each node has 24 HT cores, 96GB RAM, 42TB HDD '' * ''Hardware and platform management is done through [[http://www.getchef.com/|Chef]], we run YARN '' * ''We run a mix of ad-hoc Hive queries for BI, [[http://www.cascading.org/|Cascading]] jobs, raw mapreduce jobs, and streaming [[http://www.mono-project.com/|Mono]] jobs, as well as some Pig '' + * ''To be delivered in Q2 2015: a second cluster of 600 nodes, each 48 HT cores, 256GB RAM, 96TB HDD '' * ''[[http://www.crs4.it|CRS4]] '' * ''Hadoop deployed dynamically on subsets of a 400-node cluster ''
[Hadoop Wiki] Update of HowToContribute by QwertyManiac
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The HowToContribute page has been changed by QwertyManiac: https://wiki.apache.org/hadoop/HowToContribute?action=diff&rev1=109&rev2=110 - = How to Contribute to Hadoop Common = + = How to Contribute to Hadoop = - This page describes the mechanics of ''how'' to contribute software to Hadoop Common. For ideas about ''what'' you might contribute, please see the ProjectSuggestions page. + This page describes the mechanics of ''how'' to contribute software to Apache Hadoop. For ideas about ''what'' you might contribute, please see the ProjectSuggestions page. TableOfContents(4) @@ -76, +76 @@ * Place your class in the {{{src/test}}} tree. * {{{TestFileSystem.java}}} and {{{TestMapRed.java}}} are examples of standalone MapReduce-based tests. * {{{TestPath.java}}} is an example of a non MapReduce-based test. - * You can run all the Common unit tests with {{{mvn test}}}, or a specific unit test with {{{mvn -Dtest=<class name without package prefix> test}}}. Run these commands from the {{{hadoop-trunk}}} directory. + * You can run all the project unit tests with {{{mvn test}}}, or a specific unit test with {{{mvn -Dtest=<class name without package prefix> test}}}. Run these commands from the {{{hadoop-trunk}}} directory. * If you modify the Unix shell scripts, see the UnixShellScriptProgrammingGuide. === Generating a patch ===
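The `-Dtest` convention above (class name without its package prefix) can be illustrated with a small, hypothetical helper that derives the value from a test-source path; the path is an example, and the echoed command is what you would then run with Maven, not output from any Hadoop tool:

```shell
# Derive the -Dtest value from a test file path: strip the directories
# and the .java suffix, leaving the bare class name (no package prefix).
path=src/test/org/apache/hadoop/fs/TestPath.java
cls=$(basename "$path" .java)
echo "mvn -Dtest=$cls test"
```

For `TestPath.java` this yields `mvn -Dtest=TestPath test`, the single-test form described above.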
[Hadoop Wiki] Update of ContributorsGroup by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The ContributorsGroup page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/ContributorsGroup?action=diff&rev1=107&rev2=108 Comment: add lresende * liangxie * linebeeLabs * LohitVijayarenu + * lresende * ltomuno * LukeLu * luisdans
[Hadoop Wiki] Update of Books by AlexHolmes
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The Books page has been changed by AlexHolmes: https://wiki.apache.org/hadoop/Books?action=diff&rev1=19&rev2=20 Comment: Added Hadoop in Action, Second Edition to the forthcoming books section. == Forthcoming Books == + + === Hadoop in Action, Second Edition === + + '''Name:''' [[http://www.manning.com/lam2/|Hadoop in Action, Second Edition]] + + '''Author:''' Chuck P. Lam, Mark W. Davis + + '''Hadoop Version:''' 2.x + + '''Publisher:''' Manning + + '''Date of Publishing (est.):''' October 2015 + + Hadoop in Action introduces the subject and shows how to write programs in the MapReduce style. It starts with a few easy examples and then moves quickly to show Hadoop use in more complex data analysis tasks. Included are best practices and design patterns of MapReduce programming. +
[Hadoop Wiki] Update of HowToContribute by ZhengShao
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The HowToContribute page has been changed by ZhengShao: https://wiki.apache.org/hadoop/HowToContribute?action=diff&rev1=108&rev2=109 Comment: Removed Hudson and replaced with Jenkins and https://builds.apache.org/view/All/ It's also OK to upload a new patch to Jira with the same name as an existing patch. If you select the Activity > All tab then the different versions are linked in the comment stream, providing context. However, many reviewers find it helpful to include a version number in the patch name (three-digit version number is recommended), '''so including a version number is the preferred style'''. === Testing your patch === - Before submitting your patch, you are encouraged to run the same tools that the automated Jenkins patch test system will run on your patch. This enables you to fix problems with your patch before you submit it. The {{{dev-support/test-patch.sh}}} script in the trunk directory will run your patch through the same checks that Hudson currently does ''except'' for executing the unit tests. + Before submitting your patch, you are encouraged to run the same tools that the automated Jenkins patch test system will run on your patch. This enables you to fix problems with your patch before you submit it. The {{{dev-support/test-patch.sh}}} script in the trunk directory will run your patch through the same checks that Jenkins currently does ''except'' for executing the unit tests. Run this command from a clean workspace (ie {{{git status}}} shows no modifications or additions) as follows: @@ -206, +206 @@ == Contributing your work == 1. Finally, patches should be ''attached'' to an issue report in [[http://issues.apache.org/jira/browse/HADOOP|Jira]] via the '''Attach File''' link on the issue's Jira. Please add a comment that asks for a code review following our [[CodeReviewChecklist|code review checklist]].
Please note that the attachment should be granted license to ASF for inclusion in ASF works (as per the [[http://www.apache.org/licenses/LICENSE-2.0|Apache License]] §5). - 1. When you believe that your patch is ready to be committed, select the '''Submit Patch''' link on the issue's Jira. Submitted patches will be automatically tested against trunk by [[http://hudson.zones.apache.org/hudson/view/Hadoop/|Hudson]], the project's continuous integration engine. Upon test completion, Hudson will add a success (+1) message or failure (-1) to your issue report in Jira. If your issue contains multiple patch versions, Hudson tests the last patch uploaded. It is preferable to upload the trunk version last. + 1. When you believe that your patch is ready to be committed, select the '''Submit Patch''' link on the issue's Jira. Submitted patches will be automatically tested against trunk by [[https://builds.apache.org/view/All/|Jenkins]], the project's continuous integration engine. Upon test completion, Jenkins will add a success (+1) message or failure (-1) to your issue report in Jira. If your issue contains multiple patch versions, Jenkins tests the last patch uploaded. It is preferable to upload the trunk version last. 1. Folks should run {{{mvn clean install javadoc:javadoc checkstyle:checkstyle}}} before selecting '''Submit Patch'''. 1. Tests must all pass. 1. Javadoc should report '''no''' warnings or errors. - 1. Checkstyle's error count should not exceed that listed at [[http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/lastSuccessfulBuild/artifact/trunk/build/test/checkstyle-errors.html|Checkstyle Errors]] + 1. Checkstyle's error count should not exceed that listed at lastSuccessfulBuild/artifact/trunk/build/test/checkstyle-errors.html - . Hudson's tests are meant to double-check things, and not be used as a primary patch tester, which would create too much noise on the mailing list and in Jira. 
Submitting patches that fail Hudson testing is frowned on, (unless the failure is not actually due to the patch). + . Jenkins's tests are meant to double-check things, and not be used as a primary patch tester, which would create too much noise on the mailing list and in Jira. Submitting patches that fail Jenkins testing is frowned on, (unless the failure is not actually due to the patch). 1. If your patch involves performance optimizations, they should be validated by benchmarks that demonstrate an improvement. 1. If your patch creates an incompatibility with the latest major release, then you must set the '''Incompatible change''' flag on the issue's Jira 'and' fill in the '''Release Note''' field with an explanation of the impact of the incompatibility and the necessary steps users must take. 1. If your patch implements a major feature or improvement, then you must fill in the '''Release Note''' field on the issue's Jira with an explanation of the
[Hadoop Wiki] Update of ContributorsGroup by SteveLoughran
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The ContributorsGroup page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/ContributorsGroup?action=diff&rev1=106&rev2=107 Comment: add busbey * BrockNoland * BrunoDumon * BryanBeaudreault + * busbey * CamiloGonzalez * CarlSteinbach * cbrooks
[Hadoop Wiki] Update of dineshs/IsolatingYarnAppsInDockerContainers by Abin Shahab
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The dineshs/IsolatingYarnAppsInDockerContainers page has been changed by Abin Shahab: https://wiki.apache.org/hadoop/dineshs/IsolatingYarnAppsInDockerContainers?action=diff&rev1=3&rev2=4 = Isolating YARN Applications in Docker Containers = The Docker executor for YARN involves work on YARN along with its counterpart in Docker to forge the necessary API end points. The purpose of this page is to collect related tickets across both projects in one location. + + == May 2015 Update == + + The initial implementation of Docker Container Executor required that + all YARN containers be launched in a Docker container. While this + approach allowed us to more quickly get hands-on experience bringing + Docker to YARN, it's not practical for production clusters. Also, we + noticed that a production-quality implementation of Docker Container + Executor would require borrowing a large amount of important -- and + security-sensitive -- code and configuration from Linux Container + Executor (and its supporting binary). + + As a result, we've concluded that the best way to bring Docker to a + production cluster is to add Docker as a feature of the Linux + Container Executor. With this feature, individual jobs -- and even + individual YARN containers -- can be configured to use Docker + containers, while other jobs can continue to use regular Linux + containers. + + Based on this conclusion, we have developed the following plan for + moving forward: + + * Add to the Linux Container Executor (LCE) the option to launch + containers using Docker. + + * Add this functionality in a way that leverages LCE's existing + ability to create cgroups to obtain cgroups for Docker containers. + + * Add the ability to load the Docker image from a localized tar + file (in addition to being able to load from a Docker registry).
+ + * Extend our Docker work to behave correctly in Kerberized clusters. + + * Verify that the distributed cache works correctly in Docker + containers (we think it does, but we haven't fully tested). + + We are targeting a beta release of this functionality for Hadoop 2.8. == Motivation ==
[Hadoop Wiki] Update of Roadmap by SomeOtherAccount
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The Roadmap page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/Roadmap?action=diff&rev1=54&rev2=55 * Removal of hftp in favor of webhdfs [[https://issues.apache.org/jira/browse/HDFS-5570|HDFS-5570]] * YARN * MAPREDUCE + * Derive heap size or mapreduce.*.memory.mb automatically [[https://issues.apache.org/jira/browse/MAPREDUCE-5785|MAPREDUCE-5785]] == Hadoop 2.x Releases ==
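The MAPREDUCE-5785 roadmap item above (deriving one of the JVM heap size and `mapreduce.*.memory.mb` from the other) can be sketched with an illustrative heuristic. The 20% non-heap headroom factor below is an assumption for illustration only, not the formula the JIRA actually implements:

```shell
# Illustrative only: compute a JVM -Xmx from a container memory limit,
# reserving ~20% headroom for non-heap usage (stack, JIT, native buffers).
memory_mb=2048
heap_mb=$(( memory_mb * 8 / 10 ))
echo "-Xmx${heap_mb}m"
```

With a 2048 MB container this yields `-Xmx1638m`; the point of the JIRA is that users would no longer have to keep two such settings consistent by hand.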
[Hadoop Wiki] Update of ContributorsGroup by SomeOtherAccount
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The ContributorsGroup page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/ContributorsGroup?action=diff&rev1=105&rev2=106 * AaronKimball * abhimehta + * Abin Shahab * AC * AdamKawa * adi @@ -37, +38 @@ * Apeksha * Arun C Murthy * AsankhaPerera - * ashahab * AshishThusoo * Asif Jan * AndreiSavu
[Hadoop Wiki] Trivial Update of GitAndHadoop by AkiraAjisaka
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The GitAndHadoop page has been changed by AkiraAjisaka: https://wiki.apache.org/hadoop/GitAndHadoop?action=diff&rev1=22&rev2=23 Comment: Fixes wrong command git co This gives you a local repository with two remote repositories: apache and github. Apache has the trunk branch, which you can update whenever you want to get the latest ASF version: {{{ - git co trunk + git checkout trunk git pull apache }}}
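The two-remote layout the page describes can be reproduced in miniature with throwaway repositories; the temp-dir paths below are placeholders, and the remote name `apache` stands in for the real ASF clone URL:

```shell
# Miniature version of the apache-remote update flow: a bare repo
# stands in for the ASF server, and we pull its trunk branch.
set -e
upstream=$(mktemp -d)
git init -q --bare "$upstream"
work=$(mktemp -d)
cd "$work"
git init -q .
git -c user.email=dev@example.com -c user.name=dev commit -q --allow-empty -m base
git branch -M trunk
git remote add apache "$upstream"   # placeholder for the ASF repo URL
git push -q apache trunk
git checkout -q trunk
git pull -q apache trunk            # the update step from the page
git remote
```

In a real checkout there would be a second remote named `github` as well; the checkout-then-pull pair at the end is exactly the corrected sequence shown in the diff.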
[Hadoop Wiki] Update of ContributorsGroup by SomeOtherAccount
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The ContributorsGroup page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/ContributorsGroup?action=diff&rev1=104&rev2=105 * AaronKimball * abhimehta - * ashahab * AC * AdamKawa * adi @@ -38, +37 @@ * Apeksha * Arun C Murthy * AsankhaPerera + * ashahab * AshishThusoo * Asif Jan * AndreiSavu
[Hadoop Wiki] Trivial Update of HowToContribute by AkiraAjisaka
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The HowToContribute page has been changed by AkiraAjisaka: https://wiki.apache.org/hadoop/HowToContribute?action=diff&rev1=107&rev2=108 Comment: Fix broken list git apply -p0 cool_patch.patch }}} - If you are an Eclipse user, you can apply a patch by: 1. Right click project name in Package Explorer, 2. Team - Apply Patch + If you are an Eclipse user, you can apply a patch by: + + 1. Right click project name in Package Explorer + 1. Team - Apply Patch === Changes that span projects === You may find that you need to modify both the common project and MapReduce or HDFS. Or perhaps you have changed something in common, and need to verify that these changes do not break the existing unit tests for HDFS and MapReduce. Hadoop's build system integrates with a local maven repository to support cross-project development. Use this general workflow for your development:
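The `git apply -p0` command quoted above pairs with a diff made with `--no-prefix`; a round-trip in a throwaway repo (file and patch names are examples) looks roughly like:

```shell
# Create a --no-prefix patch, revert the change, then re-apply it with
# -p0, which matches the prefix-less paths the patch contains.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo old > f.txt
git add f.txt
git -c user.email=dev@example.com -c user.name=dev commit -qm base
echo new > f.txt
git diff --no-prefix > cool_patch.patch   # paths appear without a/ b/
git checkout -q -- f.txt                  # discard the working change
git apply -p0 cool_patch.patch            # re-apply from the patch
cat f.txt
```

`-p0` tells `git apply` to strip zero leading path components, which is why it is the right flag for a `--no-prefix` diff; a default-format diff would need the default `-p1` instead.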
[Hadoop Wiki] Update of HowToContribute by AkiraAjisaka
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The HowToContribute page has been changed by AkiraAjisaka: https://wiki.apache.org/hadoop/HowToContribute?action=diff&rev1=106&rev2=107 Comment: Reflected HADOOP-11746 (rewrite test-patch.sh). Patches for trunk should be named according to the Jira, with a version number: '''jiraName.versionNum.patch''', e.g. HADOOP-1234.001.patch, HDFS-4321.002.patch. - Patches for a non-trunk branch should be named '''jiraName-branchName.versionNum.patch''', e.g. HDFS-1234-branch-0.23.003.patch. The branch name suffix should be the exact name of a git branch, such as branch-0.23. Please note that the Jenkins pre-commit build is only run against trunk. + Patches for a non-trunk branch should be named '''jiraName-branchName.versionNum.patch''', e.g. HDFS-1234-branch-2.003.patch. The branch name suffix should be the exact name of a git branch, such as branch-2. Jenkins will check the name of the patch and detect the appropriate branch for testing. + + Please note that the Jenkins pre-commit build is not run against minor branches (e.g. branch-2.7) or the branches starting with branch-0 (e.g. branch-0.23). It's also OK to upload a new patch to Jira with the same name as an existing patch. If you select the Activity > All tab then the different versions are linked in the comment stream, providing context. However, many reviewers find it helpful to include a version number in the patch name (three-digit version number is recommended), '''so including a version number is the preferred style'''.
@@ -164, +166 @@ Run this command from a clean workspace (ie {{{git status}}} shows no modifications or additions) as follows: {{{ - dev-support/test-patch.sh /path/to/my.patch + dev-support/test-patch.sh [options] patch-file | defect-number }}} - At the end, you should get a message on your console that is similar to the comment added to Jira by Jenkins's automated patch test system, listing +1 and -1 results. For non-trunk patches (prior to HADOOP-7435 being implemented), please copy this results summary into the Jira as a comment. Generally you should expect a +1 overall in order to have your patch committed; exceptions will be made for false positives that are unrelated to your patch. The scratch directory (which defaults to the value of {{{${user.home}/tmp}}}) will contain some output files that will be useful in determining cause if issues were found in the patch. + At the end, you should get a message on your console that is similar to the comment added to Jira by Jenkins's automated patch test system, listing +1 and -1 results. Generally you should expect a +1 overall in order to have your patch committed; exceptions will be made for false positives that are unrelated to your patch. The scratch directory (which defaults to the value of {{{${user.home}/tmp}}}) will contain some output files that will be useful in determining cause if issues were found in the patch. Some things to note:
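The patch-naming convention described above can be sketched in shell; the JIRA id, branch name, and version number below are the examples from the text, not values any tool produces:

```shell
# Compose a patch file name per the convention:
# jiraName.versionNum.patch for trunk, jiraName-branchName.versionNum.patch otherwise.
jira=HDFS-1234
branch=branch-2      # leave empty for a trunk patch
version=003
if [ -n "$branch" ]; then
  patch_name="${jira}-${branch}.${version}.patch"
else
  patch_name="${jira}.${version}.patch"
fi
echo "$patch_name"
```

This yields `HDFS-1234-branch-2.003.patch`, the non-trunk example from the diff; with `branch` empty it would yield the trunk form `HDFS-1234.003.patch`.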
[Hadoop Wiki] Trivial Update of ContributorsGroup by SomeOtherAccount
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The ContributorsGroup page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/ContributorsGroup?action=diff&rev1=102&rev2=103 * RaviPhulari * raviprak * Ravishankar + * RayChiang * rding * Remis * RickFarnell
[Hadoop Wiki] Update of AdminGroup by OwenOMalley
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The AdminGroup page has been changed by OwenOMalley: https://wiki.apache.org/hadoop/AdminGroup?action=diff&rev1=14&rev2=15 Comment: Add Allen Wittenauer (SomeOtherAccount) * QwertyManiac * SanjayRadia * SebastianBazley + * SomeOtherAccount * stack * SteveLoughran * SureshSrinivas
[Hadoop Wiki] Trivial Update of TestPatchTips by RayChiang
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The TestPatchTips page has been changed by RayChiang: https://wiki.apache.org/hadoop/TestPatchTips?action=diff&rev1=2&rev2=3 Comment: Fix typo in argument {{{ $ git diff --no-prefix trunk > /tmp/1.patch - $ dev-support/test-patch.sh --resetrepo --runtests --basedir=/test/repo /tmp/1.patch + $ dev-support/test-patch.sh --resetrepo --run-tests --basedir=/test/repo /tmp/1.patch }}} This will run the freshly built patch against the tests in a fresh repo.
[Hadoop Wiki] Update of HadoopStreaming by EricMoyer
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The HadoopStreaming page has been changed by EricMoyer: https://wiki.apache.org/hadoop/HadoopStreaming?action=diff&rev1=15&rev2=16 Comment: Changed link to current hadoop streaming docs == See Also == * HowToDebugMapReducePrograms * HadoopStreaming/AlternativeInterfaces - * [[http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/HadoopStreaming.html|Hadoop Streaming]] + * [[http://hadoop.apache.org/docs/current/hadoop-streaming/HadoopStreaming.html|Hadoop Streaming]]
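Hadoop Streaming runs the mapper and reducer as ordinary commands over stdin/stdout; outside Hadoop, the dataflow can be approximated with a plain pipeline. This is a local illustration of the model only, not a Hadoop invocation, using `/bin/cat` (the identity mapper from the streaming docs) and a tiny word-count reducer:

```shell
# mapper (/bin/cat) | sort standing in for the shuffle | reducer
# (uniq -c + awk, emitting key<TAB>count like a word-count reducer).
printf 'b\na\nb\n' | /bin/cat | sort | uniq -c | awk '{print $2 "\t" $1}'
```

The pipeline prints `a<TAB>1` and `b<TAB>2`; in real streaming, the `sort` stage is performed by Hadoop's shuffle between the mapper and reducer commands.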
[Hadoop Wiki] Update of TestPatchTips by MasatakeIwasaki
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The TestPatchTips page has been changed by MasatakeIwasaki: https://wiki.apache.org/hadoop/TestPatchTips?action=diff&rev1=1&rev2=2 }}} Download a patch from a JIRA and run just the basic checks in a checkout that can be destroyed: + {{{ - {{}}} - - {{{$ dev-support/test-patch.sh --resetrepo HADOOP-11820}}} + $ dev-support/test-patch.sh --resetrepo HADOOP-11820 - }}} '''Recommended Usage'''
[Hadoop Wiki] Trivial Update of 2015MayBugBash by SomeOtherAccount
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The 2015MayBugBash page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/2015MayBugBash?action=diff&rev1=6&rev2=7 Note that in some cases, the functionality in older patches may already exist. Please close these JIRA, preferably as a duplicate to the JIRA that added that functionality or as Invalid with a comment stating that you believe the issue is stale and already fixed. + There is also a possibility that the patch requires no further changes and is ready for the committer to review. In that case, just change the label. + Committers 1. Read through the [[https://issues.apache.org/jira/issues/?filter=12331695|Ready For Committer queue]]. 1. Find a JIRA to work on.
[Hadoop Wiki] Update of TestPatchTips by SomeOtherAccount
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The TestPatchTips page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/TestPatchTips New page: === Introduction === In the Hadoop source tree is {{{dev-support/test-patch.sh}}}. This script is used by the Jenkins servers to run the automated QA tests. It is possible and highly recommended to run this script locally prior to uploading a patch to JIRA. In order to get the full power of the tool set, you'll want to make sure that both {{{findbugs}}} and {{{shellcheck}}} are installed. == Using test-patch.sh == Running {{{test-patch.sh}}} will show a usage message that describes all of its options. While there are many listed, there are a few key ones: * {{{--basedir}}} = location of the source repo * {{{--dirty-workspace}}} = the repo isn't pristine, but run anyway * {{{--reset-repo}}} = the repo is allowed to be modified NOTE: This will '''DESTROY''' any changes in the given repo! * {{{--run-tests}}} = run appropriate unit tests * filename or JIRA # or HTTP URL = the location of the patch that needs to be tested Apply and run just the basic checks in a checkout that has other stuff in it: {{{ $ dev-support/test-patch.sh --dirty-workspace /tmp/patchfile }}} Apply and run the full unit test: {{{ $ dev-support/test-patch.sh --dirty-workspace --run-tests /tmp/patchfile }}} Download a patch from a JIRA and run just the basic checks in a checkout that can be destroyed: {{}}} {{{$ dev-support/test-patch.sh --resetrepo HADOOP-11820}}} }}} '''Recommended Usage''' In general, the easiest way to use {{{test-patch.sh}}} is to use two repos. One repo is used to build patches. The other repo is used to test them. {{{ $ git diff --no-prefix trunk > /tmp/1.patch $ dev-support/test-patch.sh --resetrepo --runtests --basedir=/test/repo /tmp/1.patch }}} This will run the freshly built patch against the tests in a fresh repo.
[Hadoop Wiki] Update of 2015MayBugBash by SomeOtherAccount
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The 2015MayBugBash page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/2015MayBugBash?action=diff&rev1=5&rev2=6 1. Name the patch file (something).patch 1. Verify the patch applies cleanly 1. Fix any pre-existing comments - 1. Test the patch locally using {{{test-patch.sh}}} + 1. Test the patch locally using {{{test-patch.sh}}} [See TestPatchTips for more!] 1. '''Before uploading''', did you run {{{test-patch.sh}}}? 1. Upload the reworked patch back into JIRA. 1. Set the label to '''BB2015-05-RFC'''.
[Hadoop Wiki] Update of 2015MayBugBash by SomeOtherAccount
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The 2015MayBugBash page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/2015MayBugBash?action=diff&rev1=3&rev2=4 TableOfContents(4) + + {{{#!wiki red/solid + IMPORTANT NOTE: Apache has limited QA capabilities. It is extremely important that everyone avoids submitting patches to Jenkins for testing unless they are absolutely certain that all relevant tests will pass. HDFS unit tests, for example, will tie up a test slot for '''over two hours'''. + }}} + == Information == - - With over 900 patches not yet reviewed and approved for Apache Hadoop, it's time to make some strong progress on the bug list! + With over 900 patches not yet reviewed and approved for Apache Hadoop, it's time to make some strong progress on the bug list! A number of Apache Hadoop committers and Hadoop-related tech companies are hosting an Apache Hadoop Community event on Friday, May 8th, after HBaseCon 2015. You are hereby invited for a fun, daylong event devoted to identifying and registering important patches and cleaning up the queue. @@ -18, +22 @@ 1. Hang out on the #hadoop channel on irc.freenode.net . == Procedures for that day == - === Source Code Contributions === - Non-committers - 1. Read through the [[https://issues.apache.org/jira/issues/?filter=12331694|To Be Reviewed queue]]. 1. Find a JIRA to work on. 1. Remove the '''BB2015-05-TBR''' label from that JIRA so that it leaves the queue. - 1. Work through the issues in that JIRA. Work make sure the patch applies cleanly, test pasts locally by running against test-patch.sh, any pre-existing committer comments are covered, etc. - 1. When you think it is ready, set the label to '''BB2015-05-RFC'''. + 1. Work through the issues in that JIRA: + 1. Make sure the patch applies cleanly + 1. Fix any pre-existing comments + 1.
'''Before uploading''', test the patch locally using {{{test-patch.sh}}} to prevent overloading the QA servers. + 1. Upload the reworked patch back into JIRA. + 1. Set the label to '''BB2015-05-RFC'''. Note that in some cases, the functionality in older patches may already exist. Please close these JIRA, preferably as a duplicate to the JIRA that added that functionality or as Invalid with a comment stating that you believe the issue is stale and already fixed. Committers - 1. Read through the [[https://issues.apache.org/jira/issues/?filter=12331695|Ready For Committer queue]]. 1. Find a JIRA to work on. 1. Remove the '''BB2015-05-RFC''' label from that JIRA so that it leaves the queue. 1. Review the patch. If it needs more work, add a comment and add the''' BB2015-05-TBR''' label so that it goes back into the non-committer queue. 1. Commit the patch as per usual if it is ready to go. - === Non-source Code Contributions === The vast majority of Hadoop content, including almost all of the documentation is part of the source tree. However, there are multiple ways in which those who are unfamiliar with developer environments can contribute:
[Hadoop Wiki] Update of 2015MayBugBash by SomeOtherAccount
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The 2015MayBugBash page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/2015MayBugBash?action=diff&rev1=4&rev2=5 1. Find a JIRA to work on. 1. Remove the '''BB2015-05-TBR''' label from that JIRA so that it leaves the queue. 1. Work through the issues in that JIRA: + 1. Name the patch file (something).patch - 1. Make sure the patch applies cleanly + 1. Verify the patch applies cleanly 1. Fix any pre-existing comments - 1. '''Before uploading''', test the patch locally using {{{test-patch.sh}}} to prevent overloading the QA servers. + 1. Test the patch locally using {{{test-patch.sh}}} + 1. '''Before uploading''', did you run {{{test-patch.sh}}}? 1. Upload the reworked patch back into JIRA. 1. Set the label to '''BB2015-05-RFC'''.
[Hadoop Wiki] Update of ProjectSuggestions by SomeOtherAccount
Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The ProjectSuggestions page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/ProjectSuggestions?action=diff&rev1=19&rev2=20 Here are some suggestions for some interesting Hadoop projects. For more information, please inquire on the [[http://hadoop.apache.org/core/mailing_lists.html#Developers|Hadoop mailing lists]]. Also, please update and add to these lists. - For good small JIRAs to get started on, see [[https://issues.apache.org/jira/issues/?jql=project%20in%20(%22HADOOP%22%2C%20%22MAPREDUCE%22%2C%20%22HDFS%22%2C%20%22YARN%22)%20and%20labels%20%3D%20%22newbie%22%20and%20resolution%20%3D%20Unresolved|this list of newbie jiras]] and [[https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+in+%28HADOOP%2C+MAPREDUCE%2C+HDFS%29+AND+resolution+%3D+Unresolved+AND+labels+%3D+test-fail+ORDER+BY+priority+DESC%2C+key+DESC|this list of test failures]]. + For good small JIRAs to get started on, see [[https://issues.apache.org/jira/issues/?jql=project%20in%20(%22HADOOP%22%2C%20%22MAPREDUCE%22%2C%20%22HDFS%22%2C%20%22YARN%22)%20and%20labels%20%3D%20%22newbie%22%20and%20resolution%20%3D%20Unresolved|this list of newbie jiras]] and [[https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+in+%28HADOOP%2C+MAPREDUCE%2C+HDFS%29+AND+resolution+%3D+Unresolved+AND+labels+%3D+test-fail+ORDER+BY+priority+DESC%2C+key+DESC|this list of test failures]]. There is also the list of [[https://issues.apache.org/jira/issues/?filter=12327844|all open Hadoop Issues with no patch]]. 1. [[#test_projects|Test Projects]] 1. [[#research_projects|Research Projects]]
[Hadoop Wiki] Update of 2015MayBugBash by SomeOtherAccount
The 2015MayBugBash page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/2015MayBugBash New page: == Information == With over 900 patches not yet reviewed and approved for Apache Hadoop, it's time to make some strong progress on the bug list! A number of Apache Hadoop committers and Hadoop-related tech companies are hosting an Apache Hadoop Community event on Friday, May 8th, after HBaseCon 2015. You are hereby invited for a fun, daylong event devoted to identifying and registering important patches and cleaning up the queue. Bring your own computing power. We hope to see you there! == How to Get Involved == For this first event, we are primarily targeting individuals with some experience either working with Hadoop or working with large, open source projects. In order to hit the ground running on that day, it is recommended that some pre-work be done first: 1. Have or create an account on [[https://issues.apache.org/jira/|Apache JIRA]]. 1. Be familiar with the instructions on HowToContribute. 1. Have a working development and test environment. See [[https://git-wip-us.apache.org/repos/asf?p=hadoop.git;a=blob;f=BUILDING.txt|BUILDING.txt]] for requirements as well as how to execute a script to create a Docker image that has all of the components. 1. Register with https://www.eventbrite.com/e/apache-hadoop-global-bug-bash-tickets-16507188445 1. Hang out on the #hadoop channel on irc.freenode.net. == Procedures for that day == === Non-source Code Contributions === The vast majority of Hadoop content, including almost all of the documentation, is part of the source tree. There are multiple ways in which those who are unfamiliar with developer environments can contribute: 1. Go through the open JIRA list and verify whether the functionality already exists or the bug has been fixed. 1. 
[[https://wiki.apache.org/hadoop/FrontPage?action=newaccount|Create a wiki account]] and update the documentation here. 1. Collaborate with another person in testing, documentation, and other non-source related tasks that almost all patches require. === Source Code Contributions === Non-committers: 1. Read through the [[https://issues.apache.org/jira/issues/?filter=12331694|To Be Reviewed queue]]. 1. Find a JIRA to work on. 1. Remove the '''BB2015-05-TBR''' label from that JIRA so that it leaves the queue. 1. Work through the issues in that JIRA: make sure the patch applies cleanly, tests pass locally by running test-patch.sh, and any pre-existing committer comments are covered. 1. When you think it is ready, set the label to '''BB2015-05-RFC'''. Note that in some cases, the functionality in older patches may already exist. Please close these JIRAs, preferably as a duplicate of the JIRA that added that functionality, or as Invalid with a comment stating that you believe the issue is stale and already fixed. Committers: 1. Read through the [[https://issues.apache.org/jira/issues/?filter=12331695|Ready For Committer queue]] 1. Find a JIRA to work on 1. Remove the '''BB2015-05-RFC''' label from that JIRA so that it leaves the queue. 1. Review the patch. If it needs more work, add a comment and add the '''BB2015-05-TBR''' label so that it goes back into the non-committer queue. 1. Commit the patch as usual if it is ready to go.
[Hadoop Wiki] Update of 2015MayBugBash by SomeOtherAccount
The 2015MayBugBash page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/2015MayBugBash?action=diff&rev1=1&rev2=2 + <<TableOfContents(4)>> + == Information == - With over 900 patches not yet reviewed and approved for Apache Hadoop, it's time to make some strong progress on the bug list! A number of Apache Hadoop committers and Hadoop-related tech companies are hosting a n Apache Hadoop Community event on Friday, May 8th, after HBaseCon 2015. You are hereby invited for a fun, daylong event devoted to identifying and registering important patches and cleaning up the queue. Bring your own computing power. We hope to see you there! + + With over 900 patches not yet reviewed and approved for Apache Hadoop, it's time to make some strong progress on the bug list! + + A number of Apache Hadoop committers and Hadoop-related tech companies are hosting an Apache Hadoop Community event on Friday, May 8th, after HBaseCon 2015. You are hereby invited for a fun, daylong event devoted to identifying and registering important patches and cleaning up the queue. == How to Get Involved == For this first event, we are primarily targeting individuals with some experience either working with Hadoop or working with large, open source projects. In order to hit the ground running on that day, it is recommended some pre-work be done first: @@ -13, +18 @@ 1. Hang out on the #hadoop channel on irc.freenode.net . == Procedures for that day == - === Non-source Code Contributions === - The vast majority of Hadoop content, including almost all of the documentation is part of the source tree. There are multiple ways in which those who are unfamiliar with developer environments can contribute: - - 1. Go through the open JIRA list and verify whether functionality already exists or bug fixed. - 1. 
[[https://wiki.apache.org/hadoop/FrontPage?action=newaccount|Create a wiki account]] and update the documentation here. - 1. Collaborate with another person in testing, documentation, and other non-source related tasks that almost all patches require. === Source Code Contributions === - Non-committers: + + Non-committers 1. Read through the [[https://issues.apache.org/jira/issues/?filter=12331694|To Be Reviewed queue]]. 1. Find a JIRA to work on. @@ -31, +31 @@ Note that in some cases, the functionality in older patches may already exist. Please close these JIRA, preferably as a duplicate to the JIRA that added that functionality or as Invalid with a comment stating that you believe the issue is stale and already fixed. - Committers: + Committers 1. Read through the [[https://issues.apache.org/jira/issues/?filter=12331695|Ready For Committer queue]] 1. Find a JIRA to work on @@ -39, +39 @@ 1. Review the patch. If it needs more work, add a comment and add the''' BB2015-05-TBR''' label so that it goes back into the non-committer queue. 1. Commit the patch as per usual if it is ready to go. + + === Non-source Code Contributions === + The vast majority of Hadoop content, including almost all of the documentation is part of the source tree. However, there are multiple ways in which those who are unfamiliar with developer environments can contribute: + + 1. Go through the open JIRA list and verify whether functionality already exists or bug fixed. + 1. [[https://wiki.apache.org/hadoop/FrontPage?action=newaccount|Create a wiki account]] and update the documentation here. + 1. Collaborate with another person in testing, documentation, and other non-source related tasks that almost all patches require. +
[Hadoop Wiki] Update of 2015MayBugBash by SomeOtherAccount
The 2015MayBugBash page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/2015MayBugBash?action=diff&rev1=2&rev2=3 Committers - 1. Read through the [[https://issues.apache.org/jira/issues/?filter=12331695|Ready For Committer queue]] + 1. Read through the [[https://issues.apache.org/jira/issues/?filter=12331695|Ready For Committer queue]]. - 1. Find a JIRA to work on + 1. Find a JIRA to work on. 1. Remove the '''BB2015-05-RFC''' label from that JIRA so that it leaves the queue. 1. Review the patch. If it needs more work, add a comment and add the '''BB2015-05-TBR''' label so that it goes back into the non-committer queue. 1. Commit the patch as per usual if it is ready to go. @@ -43, +43 @@ === Non-source Code Contributions === The vast majority of Hadoop content, including almost all of the documentation is part of the source tree. However, there are multiple ways in which those who are unfamiliar with developer environments can contribute: - 1. Go through the open JIRA list and verify whether functionality already exists or bug fixed. + 1. Go through the [[https://issues.apache.org/jira/issues/?filter=12327844|open JIRA list]] and verify whether it is functionality that already exists, a bug that has already been fixed, or is still a valid issue. 1. [[https://wiki.apache.org/hadoop/FrontPage?action=newaccount|Create a wiki account]] and update the documentation here. 1. Collaborate with another person in testing, documentation, and other non-source related tasks that almost all patches require.
[Hadoop Wiki] Update of Roadmap by VinodKumarVavilapalli
The Roadmap page has been changed by VinodKumarVavilapalli: https://wiki.apache.org/hadoop/Roadmap?action=diff&rev1=53&rev2=54 Comment: Adding initial proposal for Hadoop 2.8 == Hadoop 2.x Releases == + === hadoop-2.8 === + * HADOOP + * Support *both* JDK7 and JDK8 runtimes [[https://issues.apache.org/jira/browse/HADOOP-11090|HADOOP-11090]] + * Compatibility tools to catch backwards, forwards compatibility issues at patch submission, release times. Some of it is captured at YARN-3292. This also involves resurrecting jdiff (HADOOP-11776/YARN-3426/MAPREDUCE-6310) and/or investing in new tools. + * Classpath isolation for downstream clients [[https://issues.apache.org/jira/browse/HADOOP-11656|HADOOP-11656]] + * HDFS + * Support for Erasure Codes in HDFS [[https://issues.apache.org/jira/browse/HDFS-7285|HDFS-7285]] + * YARN + * Early work for disk and network isolation in YARN: [[https://issues.apache.org/jira/browse/YARN-2139|YARN-2139]], [[https://issues.apache.org/jira/browse/YARN-2140|YARN-2140]] + * YARN Timeline Service Next generation: [[https://issues.apache.org/jira/browse/YARN-2928|YARN-2928]]. + * Supporting non-exclusive node-labels: [[https://issues.apache.org/jira/browse/YARN-3214|YARN-3214]] + * Support priorities across applications within the same queue [[https://issues.apache.org/jira/browse/YARN-1963|YARN-1963]] + === hadoop-2.7 === * HADOOP * Move to JDK7+ [[https://issues.apache.org/jira/browse/HADOOP-10530|HADOOP-10530]] * Support JDK8 in Hadoop [[https://issues.apache.org/jira/browse/HADOOP-11090|HADOOP-11090]] * HDFS - * Support for Erasure Codes in HDFS [[https://issues.apache.org/jira/browse/HDFS-7285|HDFS-7285]] + * Removal of old Web UI [[https://issues.apache.org/jira/browse/HDFS-6252|HDFS-6252]] * YARN * Support disk as a resource in YARN for scheduling and isolation [[https://issues.apache.org/jira/browse/YARN-2139|YARN-2139]]
[Hadoop Wiki] Update of AmazonS3 by SteveLoughran
The AmazonS3 page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/AmazonS3?action=diff&rev1=18&rev2=19 Comment: remove all content on configuring the S3 filesystems - point to the markdown docs on github instead. = History = * The S3 block filesystem was introduced in Hadoop 0.10.0 ([[http://issues.apache.org/jira/browse/HADOOP-574|HADOOP-574]]). * The S3 native filesystem was introduced in Hadoop 0.18.0 ([[http://issues.apache.org/jira/browse/HADOOP-930|HADOOP-930]]) and rename support was added in Hadoop 0.19.0 ([[https://issues.apache.org/jira/browse/HADOOP-3361|HADOOP-3361]]). - * The S3A filesystem was introduced in Hadoop 2.6.0. Some issues were found and fixed for later Hadoop versions[[https://issues.apache.org/jira/browse/HADOOP-11571|HADOOP-11571]], so Hadoop-2.6.0's support of s3a must be considered an incomplete replacement for the s3n FS. + * The S3A filesystem was introduced in Hadoop 2.6.0. Some issues were found and fixed for later Hadoop versions [[https://issues.apache.org/jira/browse/HADOOP-11571|HADOOP-11571]]. - = Why you cannot use S3 as a replacement for HDFS = + = Configuring and using the S3 filesystem support = + + Consult the [[https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md|Latest Hadoop documentation]] for the specifics on using any of the S3 clients. + + + = Important: you cannot use S3 as a replacement for HDFS = + - You cannot use either of the S3 filesystem clients as a drop-in replacement for HDFS. Amazon S3 is an object store with + You cannot use any of the S3 filesystem clients as a drop-in replacement for HDFS. Amazon S3 is an object store with * eventual consistency: changes made by one application (creation, updates and deletions) will not be visible until some undefined time. * s3n and s3a: non-atomic rename and delete operations. 
Renaming or deleting large directories takes time proportional to the number of entries, and is visible to other processes during this time, and indeed, until the eventual consistency has been resolved. S3 is not a filesystem. The Hadoop S3 filesystem bindings make it pretend to be a filesystem, but it is not. It can act as a source of data, and as a destination, though in the latter case, you must remember that the output may not be immediately visible.
- 
- == Configuring to use s3/ s3n filesystems ==
- 
- Edit your `core-site.xml` file to include your S3 keys
- 
- {{{
- <property>
-   <name>fs.s3.awsAccessKeyId</name>
-   <value>ID</value>
- </property>
- <property>
-   <name>fs.s3.awsSecretAccessKey</name>
-   <value>SECRET</value>
- </property>
- }}}
- 
- You can then use URLs to your bucket: ``s3n://MYBUCKET/``, or directories and files inside it.
- 
- {{{
- s3n://BUCKET/
- s3n://BUCKET/dir
- s3n://BUCKET/dir/files.csv.tar.gz
- s3n://BUCKET/dir/*.gz
- }}}
- 
- Alternatively, you can put the access key ID and the secret access key into an ''s3n'' (or ''s3'') URI as the user info:
- 
- {{{
- s3n://ID:SECRET@BUCKET
- }}}
- 
- Note that since the secret access key can contain slashes, you must remember to escape them by replacing each slash `/` with the string `%2F`. Keys specified in the URI take precedence over any specified using the properties `fs.s3.awsAccessKeyId` and `fs.s3.awsSecretAccessKey`.
- 
- This option is less secure, as the URLs are likely to appear in output logs and error messages, and so be exposed to remote users.
= Security =
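The escaping rule from the removed text (each `/` in the secret access key becomes `%2F` before it is embedded in an s3n URI) can be sketched in shell; the access key, secret, and bucket below are made up for illustration:

```shell
# Hypothetical credentials for illustration only -- never use real keys here.
ACCESS_KEY=AKIAEXAMPLE
SECRET='abc/def/ghi'            # a secret access key containing slashes

# Replace every / with %2F before embedding the secret in the URI.
ESCAPED=$(printf '%s' "$SECRET" | sed 's,/,%2F,g')
URI="s3n://${ACCESS_KEY}:${ESCAPED}@MYBUCKET/dir/file.csv"
echo "$URI"    # s3n://AKIAEXAMPLE:abc%2Fdef%2Fghi@MYBUCKET/dir/file.csv
```

As the removed docs warned, a URI built this way ends up in logs and error messages, which is why the `core-site.xml` properties were the preferred place for credentials.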
[Hadoop Wiki] Update of HowToRelease by AndrewWang
The HowToRelease page has been changed by AndrewWang: https://wiki.apache.org/hadoop/HowToRelease?action=diff&rev1=70&rev2=71 Comment: no more svn up for the website = Preparation = 1. Bulk update Jira to unassign from this release all issues that are open non-blockers and send follow-up notification to the developer list that this was done. - 1. If you have not already done so, [[http://www.apache.org/dev/release-signing.html#keys-policy|append your code signing key]] to the [[https://dist.apache.org/repos/dist/release/hadoop/common/KEYS|KEYS]] file on the website. Also [[http://www.apache.org/dev/release-signing.html#keys-policy|upload your key to a public key server]] if you haven't. End users use the KEYS file (along with the [[http://www.apache.org/dev/release-signing.html#web-of-trust|web of trust]]) to validate that releases were done by an Apache committer. Once you commit your changes, log into {{{people.apache.org}}} and pull updates to {{{/www/www.apache.org/dist/hadoop/core}}}. For more details on signing releases, see [[http://www.apache.org/dev/release-signing.html|Signing Releases]] and [[http://www.apache.org/dev/mirror-step-by-step.html?Step-By-Step|Step-By-Step Guide to Mirroring Releases]]. + 1. If you have not already done so, [[http://www.apache.org/dev/release-signing.html#keys-policy|append your code signing key]] to the [[https://dist.apache.org/repos/dist/release/hadoop/common/KEYS|KEYS]] file. Once you commit your changes, they will automatically be propagated to the website. Also [[http://www.apache.org/dev/release-signing.html#keys-policy|upload your key to a public key server]] if you haven't. End users use the KEYS file (along with the [[http://www.apache.org/dev/release-signing.html#web-of-trust|web of trust]]) to validate that releases were done by an Apache committer. 
For more details on signing releases, see [[http://www.apache.org/dev/release-signing.html|Signing Releases]] and [[http://www.apache.org/dev/mirror-step-by-step.html?Step-By-Step|Step-By-Step Guide to Mirroring Releases]]. 1. To deploy artifacts to the Apache Maven repository create {{{~/.m2/settings.xml}}}:{{{ <settings xmlns="http://maven.apache.org/SETTINGS/1.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
[Hadoop Wiki] Update of HowToRelease by AndrewWang
The HowToRelease page has been changed by AndrewWang: https://wiki.apache.org/hadoop/HowToRelease?action=diff&rev1=69&rev2=70 Comment: update KEYS file instructions = Preparation = 1. Bulk update Jira to unassign from this release all issues that are open non-blockers and send follow-up notification to the developer list that this was done. + 1. If you have not already done so, [[http://www.apache.org/dev/release-signing.html#keys-policy|append your code signing key]] to the [[https://dist.apache.org/repos/dist/release/hadoop/common/KEYS|KEYS]] file on the website. Also [[http://www.apache.org/dev/release-signing.html#keys-policy|upload your key to a public key server]] if you haven't. End users use the KEYS file (along with the [[http://www.apache.org/dev/release-signing.html#web-of-trust|web of trust]]) to validate that releases were done by an Apache committer. Once you commit your changes, log into {{{people.apache.org}}} and pull updates to {{{/www/www.apache.org/dist/hadoop/core}}}. For more details on signing releases, see [[http://www.apache.org/dev/release-signing.html|Signing Releases]] and [[http://www.apache.org/dev/mirror-step-by-step.html?Step-By-Step|Step-By-Step Guide to Mirroring Releases]]. - 1. If you have not already done so, update your @apache.org account via [[http://id.apache.org/|id.apache.org]] with your key; also add and commit your public key to the Hadoop repository [[http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS|KEYS]], appending the output of the following commands:{{{ - gpg --armor --fingerprint --list-sigs <keyid> - gpg --armor --export <keyid> - }}} and publish your key at [[http://pgp.mit.edu/]]. Once you commit your changes, log into {{{people.apache.org}}} and pull updates to {{{/www/www.apache.org/dist/hadoop/core}}}. 
For more details on signing releases, see [[http://www.apache.org/dev/release-signing.html|Signing Releases]] and [[http://www.apache.org/dev/mirror-step-by-step.html?Step-By-Step|Step-By-Step Guide to Mirroring Releases]]. 1. To deploy artifacts to the Apache Maven repository create {{{~/.m2/settings.xml}}}:{{{ <settings xmlns="http://maven.apache.org/SETTINGS/1.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
[Hadoop Wiki] Trivial Update of HowToRelease by VinodKumarVavilapalli
The HowToRelease page has been changed by VinodKumarVavilapalli: https://wiki.apache.org/hadoop/HowToRelease?action=diff&rev1=68&rev2=69 1. Commit the changes {{{ svn ci -m "Publishing the bits for release ${version}" }}} - 1. In [[https://repository.apache.org|Nexus]], effect the release of artifacts by right-clicking the staged repository and select {{{Release}}} + 1. In [[https://repository.apache.org|Nexus]], effect the release of artifacts by selecting the staged repository and then clicking {{{Release}}} 1. Wait 24 hours for release to propagate to mirrors. 1. Edit the website. 1. Checkout the website if you haven't already {{{
[Hadoop Wiki] Update of LibHDFS by SteveLoughran
The LibHDFS page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/LibHDFS?action=diff&rev1=11&rev2=12 Comment: time to remove the @lucene email ref! <<Anchor(Contact)>> = Contact Information = - Please drop us an email at '''hadoop-us...@lucene.apache.org''' if you have any questions or any suggestions. Use [[http://issues.apache.org/jira/browse/HADOOP|Jira]] (component: dfs) to report bugs. + Please drop us an email at '''us...@hadoop.apache.org''' if you have any questions or any suggestions. Use [[http://issues.apache.org/jira/browse/HADOOP|Jira]] (component: hdfs) to report bugs. <<BR>> <<Anchor(Conclusion)>>
[Hadoop Wiki] Update of HowToContribute by SomeOtherAccount
The HowToContribute page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/HowToContribute?action=diff&rev1=105&rev2=106 * Keep your computer's clock up to date via an NTP server, and set up the time zone correctly. This is good for avoiding change-log confusion. == Making Changes == - Before you start, send a message to the [[http://hadoop.apache.org/core/mailing_lists.html|Hadoop developer mailing list]], or file a bug report in [[Jira]]. Describe your proposed changes and check that they fit in with what others are doing and have planned for the project. Be patient, it may take folks a while to understand your requirements. If you want to start with pre-existing issues, look for Jiras labeled `newbie`. + Before you start, send a message to the [[http://hadoop.apache.org/core/mailing_lists.html|Hadoop developer mailing list]], or file a bug report in [[Jira]]. Describe your proposed changes and check that they fit in with what others are doing and have planned for the project. Be patient, it may take folks a while to understand your requirements. If you want to start with pre-existing issues, look for Jiras labeled `newbie`. You can find them using [[https://issues.apache.org/jira/issues/?filter=12331506|this filter]]. Modify the source code and add some (very) nice features using your favorite IDE.<<BR>>
[Hadoop Wiki] Update of HadoopStreaming by EricMoyer
The HadoopStreaming page has been changed by EricMoyer: https://wiki.apache.org/hadoop/HadoopStreaming?action=diff&rev1=14&rev2=15 Comment: Fix hadoop streaming link == See Also == * HowToDebugMapReducePrograms * HadoopStreaming/AlternativeInterfaces - * [[http://hadoop.apache.org/mapreduce/docs/current/streaming.html|Hadoop Streaming]] + * [[http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/HadoopStreaming.html|Hadoop Streaming]]
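Hadoop Streaming wires arbitrary executables together over stdin/stdout as mapper and reducer, so the map/shuffle/reduce pipeline can be mimicked locally with ordinary Unix pipes. This word-count sketch is an analogy for how streaming jobs compose, not an actual hadoop invocation:

```shell
# mapper: emit "word<TAB>1" per word; sort: stand-in for the shuffle phase;
# final awk: the reducer, summing counts per word.
printf 'hello world\nhello hadoop\n' \
  | tr ' ' '\n' \
  | awk '{print $1 "\t1"}' \
  | sort \
  | awk -F'\t' '{c[$1] += $2} END {for (w in c) print w "\t" c[w]}' \
  | sort
```

A real streaming job passes the equivalent mapper and reducer commands to the hadoop-streaming jar via its `-mapper` and `-reducer` options.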
[Hadoop Wiki] Trivial Update of ContributorsGroup by QwertyManiac
The ContributorsGroup page has been changed by QwertyManiac: https://wiki.apache.org/hadoop/ContributorsGroup?action=diff&rev1=101&rev2=102 * ErikBakker * Erik Gorset * EricHwang + * EricMoyer * EricYang * ermanpattuk * EstebanMolinaEstolano
[Hadoop Wiki] Update of SocketException by SteveLoughran
The SocketException page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/SocketException?action=diff&rev1=1&rev2=2 Comment: Connection Reset = SocketException = - Low level socket exceptions. Some diagnostics may be provided. + Low level socket exceptions. Some diagnostics may be provided. + + == Host is Down == Example: {{{ - java.io.IOException: Failed on local exception: java.net.SocketException: Host is down; Host Details : local host is: client1.example.org/192.168.1.86; destination host is: hdfs.example.org:8020; + java.io.IOException: Failed on local exception: java.net.SocketException: Host is down; Host Details : local host is: client1.example.org/192.168.1.86; destination host is: hdfs.example.org:8020; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) at org.apache.hadoop.ipc.Client.call(Client.java:1472) at org.apache.hadoop.ipc.Client.call(Client.java:1399) @@ -34, +36 @@ 1. the destination host is hdfs.example.org:8020 . This is the host to look for 1. The exception is triggered by an HDFS call. (see {{{org.apache.hadoop.hdfs}}} at the bottom of the stack trace). - That information is enough to hint to us that an HDFS operation is failing as the HDFS server hdfs.example.org is down. + That information is enough to hint to us that an HDFS operation is failing as the HDFS server hdfs.example.org is down. It's not guaranteed to be the cause, as there could be other reasons, including configuration ones: 1. The URI to HDFS, as set in {{{core-site.xml}}}, could be wrong; the client is trying to talk to the wrong host, one that is down. 1. The IP address of the host, as set in DNS or {{{/etc/hosts}}}, is wrong. The client is trying to talk to a machine at the wrong IP address, a machine that the network stack thinks is down. 
+ == Connection Reset == + + The connection was reset at the TCP layer. + + There is good coverage of this issue on [[http://stackoverflow.com/questions/62929/java-net-socketexception-connection-reset|StackOverflow]]. Remember: These are [[YourNetworkYourProblem|your network configuration problems]]. Only you can fix them.
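The two misconfiguration causes listed above (wrong URI, wrong address entry) can be narrowed down with quick client-side checks. The host and port come from the example stack trace; the `/dev/tcp` redirection assumes bash:

```shell
HOST=hdfs.example.org   # destination host from the exception message
PORT=8020               # NameNode IPC port from the exception message

# 1. Does DNS (or /etc/hosts) resolve the name to the IP you expect?
getent hosts "$HOST" || echo "no address found for $HOST"

# 2. Will the network stack open a TCP connection to that host and port?
if (exec 3<>"/dev/tcp/$HOST/$PORT") 2>/dev/null; then
  echo "$HOST:$PORT is reachable"
else
  echo "$HOST:$PORT is unreachable"
fi
```

If the resolved address is wrong, fix DNS or {{{/etc/hosts}}}; if it is right but the port is unreachable, look at the server and any firewalls in between.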
[Hadoop Wiki] Update of ContributorsGroup by SteveLoughran
The ContributorsGroup page has been changed by SteveLoughran: https://wiki.apache.org/hadoop/ContributorsGroup?action=diff&rev1=100&rev2=101 Comment: +JoshBaer * JonathanHsieh * JonathanSmith * joshdsullivan + * JoshBaer * JoydeepSensarma * JunpingDu * karthikkambatla
[Hadoop Wiki] Trivial Update of PoweredBy by JoshBaer
The PoweredBy page has been changed by JoshBaer: https://wiki.apache.org/hadoop/PoweredBy?action=diff&rev1=434&rev2=435 Comment: Updated Spotify specs * ''[[http://www.spotify.com|Spotify]] '' * ''We use Apache Hadoop for content generation, data aggregation, reporting and analysis (see more: [[http://files.meetup.com/5139282/SHUG%201%20-%20Hadoop%20at%20Spotify.pdf|Hadoop at Spotify]]) '' - * ''690 node cluster = 8280 physical cores, 38TB RAM, 28 PB storage (read more about our Hadoop issues while growing fast: [[http://www.slideshare.net/AdamKawa/hadoop-adventures-at-spotify-strata-conference-hadoop-world-2013|Hadoop Adventures At Spotify]])'' + * ''1300 node cluster : 15,600 physical cores, ~70TB RAM, ~60 PB storage (read more about our Hadoop issues while growing fast: [[http://www.slideshare.net/AdamKawa/hadoop-adventures-at-spotify-strata-conference-hadoop-world-2013|Hadoop Adventures At Spotify]])'' - * ''+7,500 daily Hadoop jobs (scheduled by Luigi, our home-grown and recently open-sourced job scheduler - [[https://github.com/spotify/luigi|code]] and [[https://vimeo.com/63435580|video]])'' + * ''+8,000 daily Hadoop jobs (scheduled by Luigi, our open-sourced job orchestrator - [[https://github.com/spotify/luigi|code]] and [[https://vimeo.com/63435580|video]])'' * ''[[http://stampedehost.com/|Stampede Data Solutions (Stampedehost.com)]] '' * ''Hosted Apache Hadoop data warehouse solution provider ''
[Hadoop Wiki] Update of UnixShellScriptProgrammingGuide by SomeOtherAccount
The UnixShellScriptProgrammingGuide page has been changed by SomeOtherAccount: https://wiki.apache.org/hadoop/UnixShellScriptProgrammingGuide?action=diff&rev1=18&rev2=19 In addition to all of the variables documented in `*-env.sh` and `hadoop-layout.sh`, there are a handful of special env vars: - * `JAVA_HEAP_MAX` - This is the Xmx parameter to be passed to Java. (e.g., `-Xmx1g`). This is present for backward compatibility, however it should be added to `HADOOP_OPTS` via `hadoop_add_param HADOOP_OPTS Xmx ${JAVA_HEAP_MAX}` prior to calling `hadoop_finalize`. + * `HADOOP_HEAP_MAX` - This is the Xmx parameter to be passed to Java. (e.g., `-Xmx1g`). This is present for backward compatibility, however it should be added to `HADOOP_OPTS` via `hadoop_add_param HADOOP_OPTS Xmx ${JAVA_HEAP_MAX}` prior to calling `hadoop_finalize`. * `HADOOP_DAEMON_MODE` - This will be set to `start` or `stop` based upon what `hadoop-config.sh` has determined from the command line options.
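The de-duplication behaviour that `hadoop_add_param` provides in the example above can be sketched in a few lines of shell. This is a simplified stand-in for illustration, not a copy of the real function in hadoop-functions.sh:

```shell
# Simplified sketch: append a value to the variable named by $1 unless an
# entry matching the $2 pattern is already present.
hadoop_add_param () {
  varname=$1; check=$2; value=$3
  eval current=\"\$$varname\"
  case " $current " in
    *"$check"*) : ;;   # a matching entry already exists: do nothing
    *) eval "$varname=\"\${current:+\$current }\$value\"" ;;
  esac
}

JAVA_HEAP_MAX=-Xmx1g
HADOOP_OPTS=
hadoop_add_param HADOOP_OPTS Xmx "${JAVA_HEAP_MAX}"
echo "$HADOOP_OPTS"                        # -Xmx1g
hadoop_add_param HADOOP_OPTS Xmx -Xmx2g    # ignored: an Xmx entry is already set
echo "$HADOOP_OPTS"                        # still -Xmx1g
```

This is why the guide says to route the heap setting through `hadoop_add_param` rather than appending to `HADOOP_OPTS` directly: repeated invocations cannot stack conflicting `-Xmx` flags.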