Re: [VOTE] Release Apache Hadoop 2.4.0

2014-04-03 Thread Tsuyoshi OZAWA
Hi, Updated a test result log based on the result of 2.4.0-rc0: https://gist.github.com/oza/9965197 IMO, there are some blockers to be fixed: * MAPREDUCE-5815 (TestMRAppMaster failure) * YARN-1872 (TestDistributedShell failure) * HDFS: TestSymlinkLocalFSFileSystem failure on Linux (I cannot find JI

[jira] [Created] (HADOOP-10461) Runtime DI based injector for FileSystem tests

2014-04-03 Thread jay vyas (JIRA)
jay vyas created HADOOP-10461: - Summary: Runtime DI based injector for FileSystem tests Key: HADOOP-10461 URL: https://issues.apache.org/jira/browse/HADOOP-10461 Project: Hadoop Common Issue Type

[jira] [Resolved] (HADOOP-10409) Bzip2 error message isn't clear

2014-04-03 Thread Mohammad Kamrul Islam (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-10409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Kamrul Islam resolved HADOOP-10409. Resolution: Won't Fix > Bzip2 error message isn't clear > ---

Re: [VOTE] Release Apache Hadoop 2.4.0

2014-04-03 Thread Tsuyoshi OZAWA
Hi, I ran the tests and confirmed that some tests (TestSymlinkLocalFSFileSystem) fail. The log of the test failure is as follows: https://gist.github.com/oza/9965197 Should we fix or disable the feature? Thanks, - Tsuyoshi On Mon, Mar 31, 2014 at 6:22 PM, Arun C Murthy wrote: > Folks, > > I've creat

Re: [VOTE] Release Apache Hadoop 2.4.0

2014-04-03 Thread Tsuyoshi OZAWA
Azuryy, can you file the RM-failover related bug as a JIRA? I'm going to reproduce it locally. Thanks, - Tsuyoshi On Fri, Apr 4, 2014 at 7:47 AM, Azuryy wrote: > Did you tested RM failover on Hive? There is bug. > > > Sent from my iPhone5s > >> On 2014年4月4日, at 2:12, Xuan Gong wrote: >> >>

Re: [VOTE] Release Apache Hadoop 2.4.0

2014-04-03 Thread Haohui Mai
HDFS-6180 seems to be a blocker of 2.4. I'll post a patch later today. ~Haohui On Thu, Apr 3, 2014 at 3:47 PM, Azuryy wrote: > Did you tested RM failover on Hive? There is bug. > > > Sent from my iPhone5s > > > On 2014年4月4日, at 2:12, Xuan Gong wrote: > > > > +1 non-binding > > > > Built from

Re: Yarn / mapreduce scheduling

2014-04-03 Thread Sandy Ryza
The MapReduce application master reads the split info from HDFS and then submits requests to the scheduler based on the locations there. On Thu, Apr 3, 2014 at 1:22 PM, Brad Childs wrote: > Sandy/Shekhar Thank you very much for the helpful responses. > > One last question/clarification- the get
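
A minimal sketch of the idea described above, assuming each input split is represented only by the list of hosts that hold its data. The names `build_requests` and `ANY` are hypothetical and are not taken from the Hadoop source; the sketch only shows how split locations could be turned into per-host container request counts, plus a catch-all entry for "any node".

```python
from collections import Counter

# Hypothetical wildcard key meaning "a container on any node is acceptable".
ANY = "*"

def build_requests(splits):
    """Count container requests per host from split locations.

    `splits` is a list where each element is the list of hosts
    holding the data for one input split (one map task per split).
    """
    requests = Counter()
    for hosts in splits:
        # One request on every host that holds this split's data.
        for host in set(hosts):
            requests[host] += 1
        # One catch-all request so the task can still run elsewhere.
        requests[ANY] += 1
    return requests

example = build_requests([["n1", "n2"], ["n2", "n3"]])
```

Here `example` holds one request each for `n1` and `n3`, two for `n2`, and two under the wildcard, one per split.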

Re: Yarn / mapreduce scheduling

2014-04-03 Thread Brad Childs
Sandy/Shekhar, thank you very much for the helpful responses. One last question/clarification: the getFileBlockLocations(..) in the FileSystem API is the only file-->node mapping that I'm aware of, and it seems the only place it's called is in the map/reduce client (FileInputFormat, MultiFileSpli

Re: [VOTE] Release Apache Hadoop 2.4.0

2014-04-03 Thread Karthik Kambatla
We should definitely get YARN-1726 in, if we are spinning a new RC. Otherwise, I guess we could include it in 2.4.1, along with any other critical issues that get reported on 2.4.0 or have missed this RC. On Thu, Apr 3, 2014 at 12:45 PM, Sandy Ryza wrote: > While the Scheduler Load Simulator is

Re: [VOTE] Release Apache Hadoop 2.4.0

2014-04-03 Thread Sandy Ryza
While the Scheduler Load Simulator isn't part of YARN's core, it's a tool that YARN includes, and it's broken entirely in the current RC. YARN-1726 seems to me like something worth including in the release. -Sandy On Thu, Apr 3, 2014 at 11:12 AM, Xuan Gong wrote: > +1 non-binding > > Built fr

Re: Yarn / mapreduce scheduling

2014-04-03 Thread Sandy Ryza
The equivalent code in the Fair Scheduler is in AppSchedulable.java, under assignContainer(FSSchedulerNode node, boolean reserved). YARN uses delay scheduling ( http://people.csail.mit.edu/matei/papers/2010/eurosys_delay_scheduling.pdf) for achieving data-locality. -Sandy On Thu, Apr 3, 2014 at
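
A rough sketch of the basic delay scheduling rule from the paper linked above, for a single job. It is not the Fair Scheduler's code: `Job`, `assign`, and `max_skips` are hypothetical names chosen for illustration. When a node offers a free slot, the job launches a task whose data is on that node if it has one; otherwise it skips the offer until it has been skipped `max_skips` times, after which it accepts a non-local slot.

```python
from dataclasses import dataclass

@dataclass
class Job:
    # Maps each pending task name to the set of nodes holding its input data.
    pending: dict
    # Number of scheduling offers this job has skipped while waiting for locality.
    skip_count: int = 0

def assign(job, node, max_skips):
    """Return the task to launch on `node`, or None if the job skips this offer."""
    if not job.pending:
        return None
    # Prefer a task whose input data lives on the offering node.
    local = next((t for t, hosts in job.pending.items() if node in hosts), None)
    if local is not None:
        del job.pending[local]
        job.skip_count = 0
        return local
    # No local task: give up locality only after waiting long enough.
    if job.skip_count >= max_skips:
        task = next(iter(job.pending))
        del job.pending[task]
        return task
    job.skip_count += 1
    return None

job = Job({"t1": {"a"}, "t2": {"b"}})
offers = [assign(job, n, max_skips=2) for n in ["c", "c", "c", "b", "b"]]
```

In this run the job skips node `c` twice, launches `t1` there non-locally on the third offer, then launches `t2` locally on node `b`, and has nothing left for the final offer.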

Re: Yarn / mapreduce scheduling

2014-04-03 Thread Shekhar Gupta
Hi Brad, YARN scheduling does take care of data locality. In YARN, tasks are not assigned based on capacity. Instead, a certain number of containers are allocated on every node based on the node's capacity. Tasks are executed in those containers. While scheduling tasks on containers YARN scheduler s

Re: [VOTE] Release Apache Hadoop 2.4.0

2014-04-03 Thread Xuan Gong
+1 non-binding Built from source code, tested on a single node cluster. Successfully ran a few MR sample jobs. Tested RM failover while a job is running. Thanks Xuan Gong On Wed, Apr 2, 2014 at 10:21 PM, Zhijie Shen wrote: > +1 non-binding > > I built from source code, and setup a single node

Re: [VOTE] Release Apache Hadoop 2.4.0

2014-04-03 Thread Wei Yan
The Hadoop-SLS (scheduler load simulator) fails in 2.4.0-rc0 due to the change in the scheduler interface hierarchy. A fix patch is available in YARN-1726. Does anybody have a chance to look into it? Thanks, Wei 2014-04-03 0:21 GMT-05:00 Zhijie Shen : > +1 non-binding > > I built from source code, a

Yarn / mapreduce scheduling

2014-04-03 Thread Brad Childs
Sorry if this is the wrong list, I am looking for deep technical/Hadoop source help :) How does job scheduling work on the YARN framework for MapReduce jobs? I see the YARN scheduler discussed here: https://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-site/YARN.html which leads me to

[jira] [Created] (HADOOP-10460) Please update on-line documentation for hadoop 2.3

2014-04-03 Thread Darek (JIRA)
Darek created HADOOP-10460: -- Summary: Please update on-line documentation for hadoop 2.3 Key: HADOOP-10460 URL: https://issues.apache.org/jira/browse/HADOOP-10460 Project: Hadoop Common Issue Type:

Build failed in Jenkins: Hadoop-Common-trunk #1088

2014-04-03 Thread Apache Jenkins Server
See Changes: [atm] Move MAPREDUCE-5014 to the right section now that it's been merged to branch-2. [atm] HADOOP-10459. distcp V2 doesn't preserve root dir's attributes when -p is specified. Contributed by Yongjun Zhang. [wheat9]