[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs

2013-05-16 Thread Sergey Tryuber (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659278#comment-13659278
 ] 

Sergey Tryuber commented on MAPREDUCE-3859:
---

Mike, that's great! So I think this task can be closed, unless someone from 
Cloudera (their MR1 in CDH4 is still affected) wants to take care of this 
issue and port the fix for the old Capacity Scheduler into their sources.

For others who face this issue, below are brief step-by-step instructions 
for CDH4.1.2:
   
* Download the sources from https://ccp.cloudera.com/display/SUPPORT/CDH+Downloads. 
Note: you need the hadoop-0.20-mapreduce-0.20.2+1265 tarball.
* Unpack it and go to the root directory.
* Apply the changes from the first comment and the test case from the attached patch. 
* Also add the following lines:
{code}
reactor.repo=https\://repository.cloudera.com/content/repositories/snapshots
version=2.0.0-mr1-cdh4.1.2
{code}
into src/contrib/index/ivy/libraries.properties and 
src/contrib/capacity-scheduler/ivy/libraries.properties files.
* Test fixes that were made:
{code}
ant test-contrib
{code}
* Build a jar file:
{code}
cd src/contrib/capacity-scheduler/
ant jar
cd -
{code}
* The resulting file will be placed at 
build/contrib/capacity-scheduler/hadoop-capacity-scheduler-2.0.0-mr1-cdh4.1.2.jar.
* Replace the original file with the fixed one on the node where the JobTracker 
runs. The original file is located in the 
/usr/lib/hadoop-0.20-mapreduce/contrib/capacity-scheduler/ directory.

 CapacityScheduler incorrectly utilizes extra-resources of queue for 
 high-memory jobs
 

 Key: MAPREDUCE-3859
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: capacity-sched
Affects Versions: 1.0.0
 Environment: CDH3u1
Reporter: Sergey Tryuber
Assignee: Sergey Tryuber
 Attachments: test-to-fail.patch.txt


 Imagine we have a queue A with a capacity of 10 slots and 20 as extra-capacity. 
 Jobs that use 3 map slots will never consume more than 9 slots, regardless 
 of how many free slots there are on the cluster.
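
To make the arithmetic in the description concrete, here is a minimal sketch. It assumes that each task of such a job occupies 3 slots and that a task is admitted only while the queue's usage plus the task's slots stays within some limit. The class and method names are hypothetical, and this is not the CapacityScheduler code.

```java
// Hypothetical illustration only; not taken from the CapacityScheduler sources.
class SlotLimitSketch {
    /**
     * Greedily admits multi-slot tasks while the next task still fits
     * under the given limit, and returns the slots in use afterwards.
     */
    static int slotsConsumed(int limit, int slotsPerTask) {
        int used = 0;
        while (used + slotsPerTask <= limit) {
            used += slotsPerTask;
        }
        return used;
    }
}
```

Under these assumptions, SlotLimitSketch.slotsConsumed(10, 3) stops at 9: a fourth task would need 12 slots, which exceeds a limit equal to the queue capacity of 10, so the extra-capacity is never touched.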

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-5234:
-

Attachment: MAPREDUCE-5234-trunk-3.patch

Fixing Warning

Thanks,
Mayank

 Signature changes for getTaskId of TaskReport in mapred
 ---

 Key: MAPREDUCE-5234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-5234-trunk-1.patch, 
 MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch


 TaskReport in mapred of MR2 extends TaskReport in mapreduce and inherits 
 getTaskId, which returns a TaskID object. In MR1, this function returns a String.
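
The incompatibility can be illustrated with a small sketch. All class and method names below are hypothetical stand-ins, not the Hadoop classes, and the separately named accessor is only one possible way around the problem, not necessarily what the attached patches do.

```java
// Hypothetical stand-ins for illustration; these are not the Hadoop classes.
class SketchTaskID {
    private final String text;

    SketchTaskID(String text) {
        this.text = text;
    }

    @Override
    public String toString() {
        return text;
    }
}

// Plays the role of the newer-API report: the accessor returns an object.
class SketchNewReport {
    private final SketchTaskID id;

    SketchNewReport(SketchTaskID id) {
        this.id = id;
    }

    public SketchTaskID getTaskId() {
        return id;
    }
}

// Plays the role of the older-API report that extends the newer one.
// Java does not allow overriding getTaskId() with a String return type,
// because String is not a subtype of SketchTaskID, so the string form
// is exposed here under a different (hypothetical) name.
class SketchOldReport extends SketchNewReport {
    SketchOldReport(SketchTaskID id) {
        super(id);
    }

    public String getTaskIdString() {
        return getTaskId().toString();
    }
}
```

A caller written against the String-returning signature, such as String id = report.getTaskId(), would not compile against SketchOldReport, which is the kind of break the issue describes.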



[jira] [Updated] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-5234:
-

Status: Open  (was: Patch Available)

 Signature changes for getTaskId of TaskReport in mapred
 ---

 Key: MAPREDUCE-5234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-5234-trunk-1.patch, 
 MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch


 TaskReport in mapred of MR2 extends TaskReport in mapreduce and inherits 
 getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Updated] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-5234:
-

Status: Patch Available  (was: Open)

 Signature changes for getTaskId of TaskReport in mapred
 ---

 Key: MAPREDUCE-5234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-5234-trunk-1.patch, 
 MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch


 TaskReport in mapred of MR2 extends TaskReport in mapreduce and inherits 
 getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Commented] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659308#comment-13659308
 ] 

Hadoop QA commented on MAPREDUCE-5234:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12583438/MAPREDUCE-5234-trunk-3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3644//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3644//console

This message is automatically generated.

 Signature changes for getTaskId of TaskReport in mapred
 ---

 Key: MAPREDUCE-5234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-5234-trunk-1.patch, 
 MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch


 TaskReport in mapred of MR2 extends TaskReport in mapreduce and inherits 
 getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Updated] (MAPREDUCE-5240) inside of FileOutputCommitter the initialized Credentials cache appears to be empty

2013-05-16 Thread Roman Shaposhnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Shaposhnik updated MAPREDUCE-5240:


Attachment: MAPREDUCE-5240.2.0.4.rvs.patch.txt

Attaching a modified patch for branch-2.0.4

 inside of FileOutputCommitter the initialized Credentials cache appears to be 
 empty
 ---

 Key: MAPREDUCE-5240
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5240
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.4-alpha
Reporter: Roman Shaposhnik
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker
  Labels: 2.0.4.1
 Fix For: 2.0.5-beta, 2.0.4.1-alpha

 Attachments: LostCreds.java, MAPREDUCE-5240-20130512.txt, 
 MAPREDUCE-5240-20130513.txt, MAPREDUCE-5240.2.0.4.rvs.patch.txt


 I am attaching a modified wordcount job that clearly demonstrates the problem 
 we've encountered in running Sqoop2 on YARN (BIGTOP-949).
 Here's what running it produces:
 {noformat}
 $ hadoop fs -mkdir in
 $ hadoop fs -put /etc/passwd in
 $ hadoop jar ./bug.jar org.myorg.LostCreds
 13/05/12 03:13:46 WARN mapred.JobConf: The variable mapred.child.ulimit is no 
 longer used.
 numberOfSecretKeys: 1
 numberOfTokens: 0
 ..
 ..
 ..
 13/05/12 03:05:35 INFO mapreduce.Job: Job job_1368318686284_0013 failed with 
 state FAILED due to: Job commit failed: java.io.IOException:
 numberOfSecretKeys: 0
 numberOfTokens: 0
   at 
 org.myorg.LostCreds$DestroyerFileOutputCommitter.commitJob(LostCreds.java:43)
   at 
 org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:249)
   at 
 org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:212)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:619)
 {noformat}
 As you can see, even though we've clearly initialized the creds via:
 {noformat}
 job.getCredentials().addSecretKey(new Text(mykey), mysecret.getBytes());
 {noformat}
 It doesn't seem to appear later in the job.
 This is a pretty critical issue for Sqoop 2 since it appears to be DOA for 
 YARN in Hadoop 2.0.4-alpha



[jira] [Updated] (MAPREDUCE-5240) inside of FileOutputCommitter the initialized Credentials cache appears to be empty

2013-05-16 Thread Roman Shaposhnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Shaposhnik updated MAPREDUCE-5240:


Status: Patch Available  (was: Reopened)

 inside of FileOutputCommitter the initialized Credentials cache appears to be 
 empty
 ---

 Key: MAPREDUCE-5240
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5240
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.4-alpha
Reporter: Roman Shaposhnik
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker
  Labels: 2.0.4.1
 Fix For: 2.0.5-beta, 2.0.4.1-alpha

 Attachments: LostCreds.java, MAPREDUCE-5240-20130512.txt, 
 MAPREDUCE-5240-20130513.txt, MAPREDUCE-5240.2.0.4.rvs.patch.txt





[jira] [Commented] (MAPREDUCE-5240) inside of FileOutputCommitter the initialized Credentials cache appears to be empty

2013-05-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659321#comment-13659321
 ] 

Hadoop QA commented on MAPREDUCE-5240:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12583446/MAPREDUCE-5240.2.0.4.rvs.patch.txt
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3645//console

This message is automatically generated.

 inside of FileOutputCommitter the initialized Credentials cache appears to be 
 empty
 ---

 Key: MAPREDUCE-5240
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5240
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.4-alpha
Reporter: Roman Shaposhnik
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker
  Labels: 2.0.4.1
 Fix For: 2.0.5-beta, 2.0.4.1-alpha

 Attachments: LostCreds.java, MAPREDUCE-5240-20130512.txt, 
 MAPREDUCE-5240-20130513.txt, MAPREDUCE-5240.2.0.4.rvs.patch.txt





[jira] [Commented] (MAPREDUCE-5124) AM lacks flow control for task events

2013-05-16 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659596#comment-13659596
 ] 

Robert Joseph Evans commented on MAPREDUCE-5124:


I believe in most cases it is enough to restrict it on the server side and 
retry on the client side, but some RPC calls are different and perhaps should 
be handled slightly differently.  YARN-309 went in to try to throttle the 
heartbeats instead of rejecting them and asking them to retry.  I think this 
is preferable for heartbeats over an outright rejection, simply because we 
know that the heartbeats are going to come regularly, and asking the next one 
to wait does not reduce the total amount of work that we are going to need to 
do.

So I would throw a ToBusyRetryLater type of exception for one-time RPC calls 
when the AsyncDispatcher's queue is over a high water mark, but for heartbeats 
I would want them to scale the frequency based on how busy the 
AsyncDispatcher is.  
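
The two policies described above could be sketched roughly as follows. Everything here is hypothetical: the class, the exception, and in particular the linear scaling formula are assumptions chosen for illustration and are not taken from YARN-309 or the attached prototype.

```java
// Hypothetical sketch of the two policies; not code from the AM or YARN.
class FlowControlSketch {
    /** Hypothetical exception telling a client to retry a one-time call later. */
    static class RetryLaterException extends RuntimeException {
        RetryLaterException(String message) {
            super(message);
        }
    }

    /** Rejects a one-time call when the event queue is above the high water mark. */
    static void admitOneTimeCall(int queueSize, int highWaterMark) {
        if (queueSize > highWaterMark) {
            throw new RetryLaterException(
                "queue size " + queueSize + " exceeds high water mark " + highWaterMark);
        }
    }

    /**
     * Stretches the heartbeat interval linearly with queue depth instead of
     * rejecting the heartbeat. The linear policy is an assumption; it expects
     * highWaterMark to be greater than zero.
     */
    static long nextHeartbeatIntervalMs(long baseMs, int queueSize, int highWaterMark) {
        return baseMs + baseMs * queueSize / highWaterMark;
    }
}
```

With a base interval of 1000 ms and a high water mark of 1000 events, an empty queue keeps the interval at 1000 ms, while a queue of 500 events stretches it to 1500 ms. One-time calls are only turned away once the queue is above the mark.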

 AM lacks flow control for task events
 -

 Key: MAPREDUCE-5124
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5124
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.0.3-alpha, 0.23.5
Reporter: Jason Lowe
 Attachments: MAPREDUCE-5124-prototype.txt


 The AM does not have any flow control to limit the incoming rate of events 
 from tasks.  If the AM is unable to keep pace with the rate of incoming 
 events for a sufficient period of time then it will eventually exhaust the 
 heap and crash.  MAPREDUCE-5043 addressed a major bottleneck for event 
 processing, but the AM could still get behind if it's starved for CPU and/or 
 handling a very large job with tens of thousands of active tasks.



[jira] [Assigned] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp reassigned MAPREDUCE-5199:
--

Assignee: Daryn Sharp  (was: Vinod Kumar Vavilapalli)

 AppTokens file can/should be removed
 

 Key: MAPREDUCE-5199
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Reporter: Vinod Kumar Vavilapalli
Assignee: Daryn Sharp

 All the required tokens are propagated to AMs and containers via 
 startContainer(), so there is no need to explicitly create the app-token file 
 that we have today.



[jira] [Updated] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated MAPREDUCE-5199:
---

 Priority: Blocker  (was: Major)
 Target Version/s: 3.0.0, 2.0.5-beta
Affects Version/s: 2.0.5-beta
   3.0.0

Moving to blocker because Oozie cannot launch child jobs on a secure cluster.

 AppTokens file can/should be removed
 

 Key: MAPREDUCE-5199
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0, 2.0.5-beta
Reporter: Vinod Kumar Vavilapalli
Assignee: Daryn Sharp
Priority: Blocker

 All the required tokens are propagated to AMs and containers via 
 startContainer(), so there is no need to explicitly create the app-token file 
 that we have today.



[jira] [Commented] (MAPREDUCE-4366) mapred metrics shows negative count of waiting maps and reduces

2013-05-16 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659621#comment-13659621
 ] 

Arun C Murthy commented on MAPREDUCE-4366:
--

Sorry, I've had a hard time coming around to this.

{quote}
There didn't seem to be a clear definition of speculative(Map|Reduce)Tasks, so 
the one I came up with is that the number of speculative(Map|Reduce)Tasks is 
the number of attempts running that are not on the critical path of the job 
completing. This makes sense in the context of computing pending(Map|Reduce)s, 
which is the only place the variable is used.
{quote}

Thanks for the explanation.

The definition of speculative(Map|Reduce)Tasks, at least in my head, has been 
the number of task-attempts that have an alternate... no, it's not a great one, or a 
documented one! *smile* 

However, this has been the basis for a number of assumptions related to 
computing pending tasks etc. in various schedulers. (See call hierarchy for 
JIP.pendingTasks).

Since your change re-defines this, I'm afraid it breaks schedulers e.g. 
CapacityScheduler. Hence, I'm against the change.

I fully agree it isn't ideal, but I'd rather not make invasive changes in MR1 - 
the JT/JIP/Scheduler nexus scares me a lot... in fact, I'm officially terrified 
of it! *smile*

Now, to get around the metrics problem, how about making a more local change in 
JIP.garbageCollect? 

An option is to just call decWaiting(Maps|Reduces) in JIP.garbageCollect with 
JIP.num(Maps|Reduces)... currently, if you follow the opposite side, i.e. 
addWaiting(Maps|Reduces), the additions are just static and are done at 
JIP.initTasks with num(Maps|Reduces). Would that solve the immediate problem at hand?

Thoughts?



Thanks again for checking in with me, and being patient in working through the 
mess we have!
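
The symmetric bookkeeping suggested in the comment above can be sketched as follows. The class is a hypothetical stand-in for the metrics gauge, not JobInProgress or the JobTracker metrics code. It only shows that adding the static task count once at initialization and removing the same count once at cleanup brings the gauge back to zero.

```java
// Hypothetical stand-in for a waiting-tasks gauge; not the JobTracker metrics code.
class WaitingGaugeSketch {
    private int waitingMaps;

    /** Mirrors a one-shot registration of all maps at job initialization. */
    void addWaitingMaps(int count) {
        waitingMaps += count;
    }

    /** Mirrors a one-shot removal of the same static count at job cleanup. */
    void decWaitingMaps(int count) {
        waitingMaps -= count;
    }

    int getWaitingMaps() {
        return waitingMaps;
    }

    /** Runs one job through init and cleanup and returns the gauge afterwards. */
    static int afterOneJob(int numMaps) {
        WaitingGaugeSketch gauge = new WaitingGaugeSketch();
        gauge.addWaitingMaps(numMaps); // at initialization
        gauge.decWaitingMaps(numMaps); // at cleanup, with the same static count
        return gauge.getWaitingMaps();
    }
}
```

Because both sides use the same static number, the gauge cannot drift negative on account of this pair of calls, whatever happened to individual attempts in between.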

 mapred metrics shows negative count of waiting maps and reduces
 ---

 Key: MAPREDUCE-4366
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4366
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.0.2
Reporter: Thomas Graves
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-4366-branch-1-1.patch, 
 MAPREDUCE-4366-branch-1.patch


 Negative waiting_maps and waiting_reduces counts are observed in the mapred 
 metrics.  MAPREDUCE-1238 partially fixed this, but it appears there are still 
 issues, as we are still seeing it, though not as badly.



[jira] [Updated] (MAPREDUCE-4366) mapred metrics shows negative count of waiting maps and reduces

2013-05-16 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4366:
-

Status: Open  (was: Patch Available)

 mapred metrics shows negative count of waiting maps and reduces
 ---

 Key: MAPREDUCE-4366
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4366
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.0.2
Reporter: Thomas Graves
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-4366-branch-1-1.patch, 
 MAPREDUCE-4366-branch-1.patch


 Negative waiting_maps and waiting_reduces counts are observed in the mapred 
 metrics.  MAPREDUCE-1238 partially fixed this, but it appears there are still 
 issues, as we are still seeing it, though not as badly.



[jira] [Updated] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated MAPREDUCE-5199:
---

Attachment: MAPREDUCE-5199.patch

* child job is guaranteed to not acquire the parent job's app token
* AM uses full complement of tokens from the container launch context passed 
via the UGI.  
* AM strips out the app token from the credentials of tasks

The patch appears to work in preliminary testing.  Later today I will report 
the results of testing with Oozie on a secure cluster.

 AppTokens file can/should be removed
 

 Key: MAPREDUCE-5199
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0, 2.0.5-beta
Reporter: Vinod Kumar Vavilapalli
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: MAPREDUCE-5199.patch


 All the required tokens are propagated to AMs and containers via 
 startContainer(), so there is no need to explicitly create the app-token file 
 that we have today.



[jira] [Updated] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated MAPREDUCE-5199:
---

Status: Patch Available  (was: Open)

 AppTokens file can/should be removed
 

 Key: MAPREDUCE-5199
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0, 2.0.5-beta
Reporter: Vinod Kumar Vavilapalli
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: MAPREDUCE-5199.patch


 All the required tokens are propagated to AMs and containers via 
 startContainer(), so there is no need to explicitly create the app-token file 
 that we have today.



[jira] [Commented] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value

2013-05-16 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659666#comment-13659666
 ] 

Gopal V commented on MAPREDUCE-5028:


[~kkambatl], I took some time to go through your patch.

The patch contains 2 different fixes, which deserve their own tests & commits. 

Good catch on the overflow with the kvindex, kvend variables. That is a bug in 
the mapper with large buffers. That is a good & clean fix.

But for the second issue, I found out it triggered when the inline Combiner is 
run when there are  3 spills in the SpillThread. This wasn't tested in 
[~acmurthy]'s test-app (but the word-count sum combiner does trigger it 
cleanly).

And there I found your fix to be suspect. So, for the sake of data, I logged 
every call to reset and crawled a 13 GB log file to find the offenders in reset 
(i.e. where (long)start + (long)length > input.length).

This particular back-trace stood out as a key offender. I found that to be 
significant instead of merely locating the overflow cases.

{code}
org.apache.hadoop.mapred.MapTask$MapOutputBuffer$MRResultIterator.getKey(MapTask.java:1784)
org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:138)
{code}

I will take a closer look at that code, it might be cleaner to tackle the issue 
at the first-cause location.
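
For readers unfamiliar with this class of bug, the following sketch shows why the bounds check is written with long arithmetic. The class and method names are hypothetical, and the code is not taken from MapTask or from the patch.

```java
// Hypothetical illustration of int overflow in a bounds check; not MapTask code.
class BoundsCheckSketch {
    /** Overflow-prone check: start + length can wrap around to a negative int. */
    static boolean exceedsWithIntMath(int start, int length, int inputLength) {
        return start + length > inputLength;
    }

    /** Widened check, in the form (long)start + (long)length > input.length. */
    static boolean exceedsWithLongMath(int start, int length, int inputLength) {
        return (long) start + (long) length > inputLength;
    }
}
```

With start = Integer.MAX_VALUE - 10 and length = 100, the int sum wraps to a negative value, so the first check reports no overrun for a 1,500,000,000-byte input, while the widened check reports it correctly.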

 Maps fail when io.sort.mb is set to high value
 --

 Key: MAPREDUCE-5028
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5028
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1, 2.0.3-alpha, 0.23.5
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Critical
 Fix For: 1.2.0, 2.0.5-beta

 Attachments: mr-5028-branch1.patch, mr-5028-branch1.patch, 
 mr-5028-branch1.patch, MR-5028_testapp.patch, mr-5028-trunk.patch, 
 mr-5028-trunk.patch, mr-5028-trunk.patch, repro-mr-5028.patch


 Verified the problem exists on branch-1 with the following configuration:
 Pseudo-dist mode: 2 maps/ 1 reduce, mapred.child.java.opts=-Xmx2048m, 
 io.sort.mb=1280, dfs.block.size=2147483648
 Run teragen to generate 4 GB data
 Maps fail when you run wordcount on this configuration with the following 
 error: 
 {noformat}
 java.io.IOException: Spill failed
   at 
 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031)
   at 
 org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
   at 
 org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
   at 
 org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45)
   at 
 org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34)
   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: java.io.EOFException
   at java.io.DataInputStream.readInt(DataInputStream.java:375)
   at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38)
   at 
 org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
   at 
 org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
   at 
 org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
   at 
 org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
   at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
   at 
 org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1505)
   at 
 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1438)
   at 
 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:855)
   at 
 org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1346)
 {noformat}



[jira] [Commented] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659684#comment-13659684
 ] 

Hadoop QA commented on MAPREDUCE-5199:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12583491/MAPREDUCE-5199.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3646//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3646//console

This message is automatically generated.

 AppTokens file can/should be removed
 

 Key: MAPREDUCE-5199
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0, 2.0.5-beta
Reporter: Vinod Kumar Vavilapalli
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: MAPREDUCE-5199.patch


 All the required tokens are propagated to AMs and containers via 
 startContainer(), no need for explicitly creating the app-token file that we 
 have today..



[jira] [Commented] (MAPREDUCE-4366) mapred metrics shows negative count of waiting maps and reduces

2013-05-16 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13659714#comment-13659714
 ] 

Sandy Ryza commented on MAPREDUCE-4366:
---

Thanks for delving into this with me, Arun.  First, please excuse in advance any 
errors I'm about to make here.  I'm trying to be careful, but the counting code 
is subtle and has been hard to think about.

bq. An option is to just call decWaiting(Maps|Reduces) in JIP.garbageCollect 
with JIP.num(Maps|Reduces)... currently if you follow the opposite side i.e 
addWaiting(Maps|Reduces), they are just static and are done at JIP.initTasks 
with num(Maps|Reduces). That would solve the immediate problem at hand?

Waiting maps and reduces are updated in the job tracker metrics every time 
a task is launched or fails/completes, so this would not work unless I am 
missing something.

bq. The definition of speculative(Map|Reduce)Tasks, at least in my head, has 
been the number of task-attempts have an alternate...

This definition can lead to thinking there are fewer pending tasks than there 
actually are.  Consider the following situation:
My job has two maps.  Attempts are run for both of them.  One map gets a 
speculative attempt because it's running slow.  The other map's attempt fails.  
The speculative one completes.  initialMaps=2 + speculativeMaps=0 - 
runningMaps=1 - finishedMaps=1 - failedMaps=0.  So pendingMaps is now 0 even 
though we have a pending map task.  The way this has not caused jobs to starve 
is that the running speculative map will fail later on and bring pendingMaps 
back up to 1.
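
The arithmetic in the scenario above can be checked with a small sketch. The 
class and method names here are hypothetical and are not taken from 
JobInProgress; only the formula and the numbers come from the comment.

```java
// Hypothetical stand-in for the pending-map calculation described above.
final class PendingMapsSketch {

  // pending = initial + speculative - running - finished - failed
  static int pendingMaps(int initialMaps, int speculativeMaps,
                         int runningMaps, int finishedMaps, int failedMaps) {
    return initialMaps + speculativeMaps - runningMaps - finishedMaps - failedMaps;
  }

  public static void main(String[] args) {
    // State right after the speculative attempt completes: prints 0,
    // even though one map task still has to be re-run.
    System.out.println(pendingMaps(2, 0, 1, 1, 0));
    // Once the remaining running attempt goes away: prints 1.
    System.out.println(pendingMaps(2, 0, 0, 1, 0));
  }
}
```

The first call gives the misleading zero; the second shows the count 
recovering only after the leftover running attempt is no longer counted.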

Wanted to make sure it was clear that the current behavior is wrong in an 
objective way.  If your stance is still that the code has been working so far 
and messing with it is just a bad idea, I trust your experience.  In that case, 
we could keep speculativeMapTasks how it is and have a separate variable, 
nonCriticalRunningTasks, that is used for updating the metrics?

 mapred metrics shows negative count of waiting maps and reduces
 ---

 Key: MAPREDUCE-4366
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4366
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.0.2
Reporter: Thomas Graves
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-4366-branch-1-1.patch, 
 MAPREDUCE-4366-branch-1.patch


 Negative waiting_maps and waiting_reduces counts are observed in the mapred 
 metrics.  MAPREDUCE-1238 partially fixed this, but it appears there are still 
 issues: we are still seeing it, though not as badly.



[jira] [Created] (MAPREDUCE-5254) Fix exception unwrapping and unit tests using UndeclaredThrowable

2013-05-16 Thread Siddharth Seth (JIRA)
Siddharth Seth created MAPREDUCE-5254:
-

 Summary: Fix exception unwrapping and unit tests using 
UndeclaredThrowable
 Key: MAPREDUCE-5254
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5254
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.4-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth


Follow up to YARN-628. Exception unwrapping for MRClientProtocol needs some 
work. Also, there's a bunch of MR tests still relying on 
UndeclaredThrowableException which should no longer be thrown.
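
For readers unfamiliar with where UndeclaredThrowableException comes from, 
here is a minimal JDK-only sketch. It does not use Hadoop's RPC classes; the 
Service interface and the helper names are hypothetical. It shows a dynamic 
proxy wrapping a checked exception that the interface does not declare, and 
one way a caller could unwrap it.

```java
import java.io.IOException;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.lang.reflect.UndeclaredThrowableException;

// Hypothetical illustration; not the MRClientProtocol code.
final class UnwrapSketch {

  // The interface declares no checked exceptions.
  interface Service {
    String call();
  }

  // A proxy whose handler always throws a checked IOException.
  static Service failingProxy() {
    InvocationHandler handler = (proxy, method, args) -> {
      throw new IOException("remote failure");
    };
    return (Service) Proxy.newProxyInstance(
        Service.class.getClassLoader(), new Class<?>[] {Service.class}, handler);
  }

  // Return the wrapped exception if there is one, else the original.
  static Throwable unwrap(Throwable t) {
    if (t instanceof UndeclaredThrowableException) {
      Throwable inner = ((UndeclaredThrowableException) t).getUndeclaredThrowable();
      return inner != null ? inner : t;
    }
    return t;
  }

  // Invoke the proxy and hand back what the caller would see after unwrapping.
  static Throwable callAndUnwrap() {
    try {
      failingProxy().call();
      return null;
    } catch (RuntimeException e) {
      return unwrap(e);
    }
  }
}
```

A test written against the wrapper type breaks as soon as the underlying 
exception is thrown directly, which is the kind of dependency the issue 
describes.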



[jira] [Commented] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13659728#comment-13659728
 ] 

Zhijie Shen commented on MAPREDUCE-5234:


It's mapreduce.TaskReport, not mapred.TaskReport, as the test class is in the 
mapreduce package as well.

{code}
--- 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/TestJob.java
+++ 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/TestJob.java
{code}

{code}
+TaskReport treport =
+new TaskReport(tid1, 0.0f, State.FAILED.toString(), null,
+  TIPStatus.FAILED, 100, 100, new Counters());
{code}

The following code seems irrelevant. TaskReport can be tested independently.

{code}
+Cluster cluster = mock(Cluster.class);
+ClientProtocol client = mock(ClientProtocol.class);
{code}

{code}
+when(client.getJobStatus(jobid)).thenReturn(status);
+TaskReport[] tr = new TaskReport[1];
+tr[0] = treport;
+when(client.getTaskReports(jobid, TaskType.MAP)).thenReturn(tr);
+when(client.getTaskReports(jobid, TaskType.REDUCE)).thenReturn(tr);
+when(client.getTaskCompletionEvents(jobid, 0, 10)).thenReturn(
+  new TaskCompletionEvent[0]);
+Job job = Job.getInstance(cluster, status, new JobConf());
+Assert.assertNotNull(job.toString());
+TaskReport[] tr1 = client.getTaskReports(jobid, TaskType.MAP);
{code}

 Signature changes for getTaskId of TaskReport in mapred
 ---

 Key: MAPREDUCE-5234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-5234-trunk-1.patch, 
 MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch


 TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
 getTaskId, which returns a TaskID object. In MR1, this function returns a String.
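
 A rough sketch of the incompatibility, using hypothetical stand-in classes 
 rather than the real Hadoop ones. Java cannot overload on return type alone, 
 so a subclass cannot redeclare an inherited getTaskId() to return String if 
 the parent returns a TaskID. One possible shape, which is an assumption here 
 and not necessarily what the attached patches do, is to keep the String 
 accessor and expose the typed ID under a separate name.

```java
// Hypothetical stand-ins; not the org.apache.hadoop classes.
final class TaskReportSketch {

  static final class TaskID {
    private final String text;

    TaskID(String text) {
      this.text = text;
    }

    @Override
    public String toString() {
      return text;
    }
  }

  // Plays the role of the newer-API report.
  static class NewApiTaskReport {
    private final TaskID id;

    NewApiTaskReport(TaskID id) {
      this.id = id;
    }

    // Typed accessor under a distinct (assumed) name.
    TaskID getTaskID() {
      return id;
    }

    // String accessor, matching the MR1-style return type.
    String getTaskId() {
      return id.toString();
    }
  }

  // Plays the role of the older-API report; it inherits the String accessor,
  // so MR1-style callers that expect a String keep compiling.
  static class OldApiTaskReport extends NewApiTaskReport {
    OldApiTaskReport(TaskID id) {
      super(id);
    }
  }
}
```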



[jira] [Updated] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated MAPREDUCE-5234:
---

Status: Open  (was: Patch Available)

 Signature changes for getTaskId of TaskReport in mapred
 ---

 Key: MAPREDUCE-5234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-5234-trunk-1.patch, 
 MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch


 TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
 getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Commented] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13659730#comment-13659730
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-5199:


Looking at the patch. I don't understand the problem yet, though I'm not ruling 
out that there is one.

bq. MAPREDUCE-5205 fixed the AM to pick up the app token, but jobs launching 
jobs (ex. oozie) still fail. The child job reads in the appTokens file 
generated by the parent job which causes the child to overwrite the app token 
with that of the parent job.
I don't understand this. AMRMToken is never part of the appTokens files. Where 
is the child job failing? Can you share some exception traces?

 AppTokens file can/should be removed
 

 Key: MAPREDUCE-5199
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0, 2.0.5-beta
Reporter: Vinod Kumar Vavilapalli
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: MAPREDUCE-5199.patch


 All the required tokens are propagated to AMs and containers via 
 startContainer(), no need for explicitly creating the app-token file that we 
 have today..



[jira] [Commented] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13659804#comment-13659804
 ] 

Daryn Sharp commented on MAPREDUCE-5199:


The referenced jiras add the app token to the launch context, which causes the 
app token to leak to the task.  When the task launches a child job, it dumps 
out its credentials (including the leaked app token) to the appTokens file.  
The new AM gets its credentials from the login UGI which contains a new app 
token from the RM, but when it reads the appTokens file the new app token is 
squashed with the parent job's app token.  The AM never starts up.
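
The overwrite described above can be modelled with plain maps. This is only 
a sketch of the claimed mechanism: the key name and both helper methods are 
hypothetical, and nothing here uses Hadoop's Credentials class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "file contents squash the login credentials".
final class TokenClobberSketch {

  static final String APP_TOKEN = "app-token"; // assumed key, for illustration

  // Entries read from the file replace entries already held (the reported behaviour).
  static Map<String, String> loadOverwriting(Map<String, String> loginCreds,
                                             Map<String, String> fileCreds) {
    Map<String, String> merged = new HashMap<>(loginCreds);
    merged.putAll(fileCreds);
    return merged;
  }

  // For contrast only: entries already held win over entries from the file.
  static Map<String, String> loadPreserving(Map<String, String> loginCreds,
                                            Map<String, String> fileCreds) {
    Map<String, String> merged = new HashMap<>(loginCreds);
    fileCreds.forEach(merged::putIfAbsent);
    return merged;
  }
}
```

With a login map holding the child's token and a file map holding the 
parent's, loadOverwriting ends up with the parent's token, which matches the 
failure as described. The contrast method is not the approach proposed in 
this issue, which is to remove the file altogether.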

 AppTokens file can/should be removed
 

 Key: MAPREDUCE-5199
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0, 2.0.5-beta
Reporter: Vinod Kumar Vavilapalli
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: MAPREDUCE-5199.patch


 All the required tokens are propagated to AMs and containers via 
 startContainer(), no need for explicitly creating the app-token file that we 
 have today..



[jira] [Commented] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13659840#comment-13659840
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-5199:


bq. The referenced jiras add the app token to the launch context, which causes 
the app token to leak to the task. When the task launches a child job, it dumps 
out its credentials (including the leaked app token) to the appTokens file.
That's not true either. Even after the referred patches, the only tokens that 
are passed to tasks are the MR-specific JobToken and FSTokens (see the TaskImpl 
constructor and where the credentials field comes from: job.fsTokens, which 
comes from MRAppMaster.fsTokens, which only has tokens from the AppTokensFile, 
which *does not* have the AMRMToken).

The patches only add the AMRMToken to MRAppMaster's UGI, which isn't what tasks 
are given via the launch-context.

I am clearly missing something. Let me run it by Sid too, who understands this 
code equally well.

 AppTokens file can/should be removed
 

 Key: MAPREDUCE-5199
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0, 2.0.5-beta
Reporter: Vinod Kumar Vavilapalli
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: MAPREDUCE-5199.patch


 All the required tokens are propagated to AMs and containers via 
 startContainer(), no need for explicitly creating the app-token file that we 
 have today..



[jira] [Updated] (MAPREDUCE-5130) Add missing job config options to mapred-default.xml

2013-05-16 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5130:
---

Status: Open  (was: Patch Available)

We can close MAPREDUCE-5236 as a duplicate.

Comments on the latest patch:

Overall, don't make more changes than necessary. Just change the defaults and 
deprecate the DISABLED_MEMORY_LIMIT. If you do just this, the following issues 
will be taken care of automatically:
 - You can keep the comments on getMemoryForMapTask() and 
getMemoryForReduceTask() and, instead of -1, refer to the default property.
 - JobConf.normalizeMemoryConfigValue() is public, so it shouldn't be removed.
 - When someone gives a negative value for the vmem properties, we should just 
use the default one.
 - Given the above, testNegativeValuesForMemoryParams() should be modified 
instead of being removed completely.

One minor question:
 - Why remove the timeout for testJobConf()?

You missed this
bq. mapreduce.job.jvm.numtasks isn't supported in MR over YARN.

 Add missing job config options to mapred-default.xml
 

 Key: MAPREDUCE-5130
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5130
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.0.4-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-5130-1.patch, MAPREDUCE-5130-1.patch, 
 MAPREDUCE-5130-2.patch, MAPREDUCE-5130-3.patch, MAPREDUCE-5130.patch


 I noticed that mapreduce.map.child.java.opts and 
 mapreduce.reduce.child.java.opts were missing in mapred-default.xml.  I'll do 
 a fuller sweep to see what else is missing before posting a patch.
 List so far:
 mapreduce.map/reduce.child.java.opts
 mapreduce.map/reduce.memory.mb
 mapreduce.job.jvm.numtasks
 mapreduce.input.lineinputformat.linespermap
 mapreduce.task.combine.progress.records



[jira] [Updated] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-5234:
-

Attachment: MAPREDUCE-5234-trunk-4.patch

Incorporating Zhijie's comments.

Thanks,
Mayank

 Signature changes for getTaskId of TaskReport in mapred
 ---

 Key: MAPREDUCE-5234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-5234-trunk-1.patch, 
 MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
 MAPREDUCE-5234-trunk-4.patch


 TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
 getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Updated] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-5234:
-

Status: Patch Available  (was: Open)

 Signature changes for getTaskId of TaskReport in mapred
 ---

 Key: MAPREDUCE-5234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-5234-trunk-1.patch, 
 MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
 MAPREDUCE-5234-trunk-4.patch


 TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
 getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Resolved] (MAPREDUCE-5236) references to JobConf.DISABLE_MEMORY_LIMIT don't make sense in the context of MR2

2013-05-16 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza resolved MAPREDUCE-5236.
---

Resolution: Duplicate

 references to JobConf.DISABLE_MEMORY_LIMIT don't make sense in the context of 
 MR2
 -

 Key: MAPREDUCE-5236
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5236
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.4-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza

 In MR1, a special value of -1 could be given for 
 mapreduce.job.map|reduce.memory.mb when memory limits were disabled.  In MR2, 
 this makes no sense, as with slots gone, this value is used for requesting 
 resources and scheduling.



[jira] [Commented] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13659891#comment-13659891
 ] 

Hadoop QA commented on MAPREDUCE-5234:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12583529/MAPREDUCE-5234-trunk-4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3647//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3647//console

This message is automatically generated.

 Signature changes for getTaskId of TaskReport in mapred
 ---

 Key: MAPREDUCE-5234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-5234-trunk-1.patch, 
 MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
 MAPREDUCE-5234-trunk-4.patch


 TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
 getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Commented] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13659924#comment-13659924
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-5234:


Please don't make the constructor public, we don't want users to create 
TaskReports. You can move the test to other test-cases in the same package.

 Signature changes for getTaskId of TaskReport in mapred
 ---

 Key: MAPREDUCE-5234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-5234-trunk-1.patch, 
 MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
 MAPREDUCE-5234-trunk-4.patch


 TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
 getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Commented] (MAPREDUCE-5130) Add missing job config options to mapred-default.xml

2013-05-16 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13659936#comment-13659936
 ] 

Sandy Ryza commented on MAPREDUCE-5130:
---

bq. Why remove the timeout for testJobConf()?
Sorry, careless error.  Had removed it to debug something.

Will upload a patch that leaves out mapreduce.job.jvm.numtasks for real this 
time, puts back the comments for getMemoryForMapTask() and 
getMemoryForReduceTask(), puts back normalizeMemoryConfigValue(), and puts back 
a modified version of testNegativeValuesForMemoryParams().

bq. When someone gives a negative value for the vmem properties, we should just 
use the default one.
By this you mean that we should check to see whether the configured number is 
invalid and silently return the default if it is?  I haven't seen this for 
other properties.
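
To make the question concrete, the behaviour being asked about could look 
roughly like this. It is a sketch only: it uses java.util.Properties instead 
of JobConf, the helper name is hypothetical, and the 1024 MB default is an 
assumption for illustration.

```java
import java.util.Properties;

// Hypothetical sketch of "fall back to the default on a negative value".
final class MemoryConfigSketch {

  static final String MAP_MEMORY_KEY = "mapreduce.map.memory.mb";
  static final long ASSUMED_DEFAULT_MB = 1024L; // assumption, not read from mapred-default.xml

  // Returns the configured value, or defaultMb when the key is unset or negative.
  // A malformed number still throws NumberFormatException; only the
  // negative-value case is silently replaced.
  static long memoryMbOrDefault(Properties conf, String key, long defaultMb) {
    String raw = conf.getProperty(key);
    if (raw == null) {
      return defaultMb;
    }
    long value = Long.parseLong(raw.trim());
    return value < 0 ? defaultMb : value;
  }
}
```

Whether to substitute silently, as here, or to log or reject the value is 
exactly the open design question in this comment.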

 Add missing job config options to mapred-default.xml
 

 Key: MAPREDUCE-5130
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5130
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.0.4-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-5130-1.patch, MAPREDUCE-5130-1.patch, 
 MAPREDUCE-5130-2.patch, MAPREDUCE-5130-3.patch, MAPREDUCE-5130.patch


 I noticed that mapreduce.map.child.java.opts and 
 mapreduce.reduce.child.java.opts were missing in mapred-default.xml.  I'll do 
 a fuller sweep to see what else is missing before posting a patch.
 List so far:
 mapreduce.map/reduce.child.java.opts
 mapreduce.map/reduce.memory.mb
 mapreduce.job.jvm.numtasks
 mapreduce.input.lineinputformat.linespermap
 mapreduce.task.combine.progress.records



[jira] [Commented] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value

2013-05-16 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13659951#comment-13659951
 ] 

Gopal V commented on MAPREDUCE-5028:


I ran the tests again because something didn't seem right - my '+' operation 
was turning into a string concat operation in logging (*ugh*).

{code}
2013-05-15 18:52:47,876 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: input.length = 1342177280, start = 
687161440, length = 687161444
2013-05-15 18:52:47,876 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: count math 687161440 + 687161444 = 
1374322884
2013-05-15 18:52:47,876 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: 
org.apache.hadoop.io.DataInputBuffer$Buffer.reset(DataInputBuffer.java:58)
2013-05-15 18:52:47,876 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: 
org.apache.hadoop.io.DataInputBuffer.reset(DataInputBuffer.java:92)
2013-05-15 18:52:47,876 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: 
org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:144)
2013-05-15 18:52:47,876 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: 
org.apache.hadoop.mapreduce.task.ReduceContextImpl$ValueIterator.next(ReduceContextImpl.java:237)

2013-05-15 18:52:47,861 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: input.length = 1342177280, start = 
905211353, length = 905211357
2013-05-15 18:52:47,861 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: count math 905211353 + 905211357 = 
1810422710
2013-05-15 18:52:47,861 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: 
org.apache.hadoop.io.DataInputBuffer$Buffer.reset(DataInputBuffer.java:58)
2013-05-15 18:52:47,861 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: 
org.apache.hadoop.io.DataInputBuffer.reset(DataInputBuffer.java:92)
2013-05-15 18:52:47,861 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: 
org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:144)
2013-05-15 18:52:47,861 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: 
org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121)
{code}

Those are wrong, definitely wrong.
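
For anyone who has not hit the logging pitfall mentioned at the top of this 
comment, here is a minimal sketch. The class and method names are 
hypothetical; the two numbers are the start and length values from the first 
log entry above.

```java
// Hypothetical illustration of '+' becoming string concatenation in a log message.
final class LogConcatSketch {

  // Evaluated left to right: the String absorbs start, then length, as text.
  static String concatenated(int start, int length) {
    return "count math " + start + length;
  }

  // Parentheses force the integer addition before the concatenation.
  static String summed(int start, int length) {
    return "count math " + (start + length);
  }

  public static void main(String[] args) {
    System.out.println(concatenated(687161440, 687161444)); // count math 687161440687161444
    System.out.println(summed(687161440, 687161444));       // count math 1374322884
  }
}
```

The summed value, 1374322884, is larger than the logged input.length of 
1342177280, which is consistent with the values being flagged as wrong.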

 Maps fail when io.sort.mb is set to high value
 --

 Key: MAPREDUCE-5028
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5028
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1, 2.0.3-alpha, 0.23.5
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Critical
 Fix For: 1.2.0, 2.0.5-beta

 Attachments: mr-5028-branch1.patch, mr-5028-branch1.patch, 
 mr-5028-branch1.patch, MR-5028_testapp.patch, mr-5028-trunk.patch, 
 mr-5028-trunk.patch, mr-5028-trunk.patch, repro-mr-5028.patch


 Verified the problem exists on branch-1 with the following configuration:
 Pseudo-dist mode: 2 maps/ 1 reduce, mapred.child.java.opts=-Xmx2048m, 
 io.sort.mb=1280, dfs.block.size=2147483648
 Run teragen to generate 4 GB data
 Maps fail when you run wordcount on this configuration with the following 
 error: 
 {noformat}
 java.io.IOException: Spill failed
   at 
 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031)
   at 
 org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
   at 
 org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
   at 
 org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45)
   at 
 org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34)
   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: java.io.EOFException
   at java.io.DataInputStream.readInt(DataInputStream.java:375)
   at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38)
   at 
 org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
   at 
 org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
   at 
 org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
   at 
 org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
  

[jira] [Commented] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660019#comment-13660019
 ] 

Zhijie Shen commented on MAPREDUCE-5234:


The following code seems unnecessary as well. Neither cluster nor client is used 
afterwards.
{code}
+Cluster cluster = mock(Cluster.class);
+ClientProtocol client = mock(ClientProtocol.class);
+when(cluster.getClient()).thenReturn(client);
{code}

 Signature changes for getTaskId of TaskReport in mapred
 ---

 Key: MAPREDUCE-5234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-5234-trunk-1.patch, 
 MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
 MAPREDUCE-5234-trunk-4.patch


 TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
 getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Commented] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660039#comment-13660039
 ] 

Siddharth Seth commented on MAPREDUCE-5199:
---

Daryn, I'd looked at the AM code a couple of days ago as part of our offline 
conversation. I'm still not sure how the AppToken gets clobbered in the Task. 
From looking at the AM code, it doesn't look like the CLC for the Task (Oozie 
launcher map task) gets anything from the AM's ugi. It only gets the job token 
generated by the AM, and the tokens from the 'appTokens' file. This file is 
written out by the client - at which point the AMToken is not available.
Has Oozie, by any chance, changed to launch its tasks via an AM itself (which 
has similar ugi magic as the MR AM)?

 AppTokens file can/should be removed
 

 Key: MAPREDUCE-5199
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0, 2.0.5-beta
Reporter: Vinod Kumar Vavilapalli
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: MAPREDUCE-5199.patch


 All the required tokens are propagated to AMs and containers via 
 startContainer(), so there is no need to explicitly create the app-token file 
 that we have today.



[jira] [Updated] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-5234:
-

Status: Patch Available  (was: Open)

 Signature changes for getTaskId of TaskReport in mapred
 ---

 Key: MAPREDUCE-5234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-5234-trunk-1.patch, 
 MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
 MAPREDUCE-5234-trunk-4.patch, MAPREDUCE-5234-trunk-5.patch


 TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
 getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Updated] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-5234:
-

Attachment: MAPREDUCE-5234-trunk-5.patch

fixed.

Thanks,
Mayank

 Signature changes for getTaskId of TaskReport in mapred
 ---

 Key: MAPREDUCE-5234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-5234-trunk-1.patch, 
 MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
 MAPREDUCE-5234-trunk-4.patch, MAPREDUCE-5234-trunk-5.patch


 TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
 getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Commented] (MAPREDUCE-5253) Whitespace value entry in mapred-site.xml for name=mapred.reduce.child.java.opts causes child tasks to fail at launch

2013-05-16 Thread Karl D. Gierach (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660100#comment-13660100
 ] 

Karl D. Gierach commented on MAPREDUCE-5253:


The patch also should be applied to this file under the current trunk (as noted 
by Chris Nauroth).

https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/MapReduceChildJVM.java#L156

 Whitespace value entry in mapred-site.xml for 
 name=mapred.reduce.child.java.opts causes child tasks to fail at launch
 -

 Key: MAPREDUCE-5253
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5253
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: task
Affects Versions: 1.1.2
 Environment: Centos 6.2 32Bit, OpenJDK
Reporter: Karl D. Gierach
 Fix For: 1.1.3


 Hi,
 Below is a patch for Hadoop v1.1.2.  I'm new to this list, so if I need to 
 write up a JIRA ticket for this, please let me know.
 The defect scenario is that if you enter any white space within values in 
 this file:
 /etc/hadoop/mapred-site.xml
 e.g.: (a white space prior to the -X...)
   <property>
     <name>mapred.reduce.child.java.opts</name>
     <value> -Xmx1G</value>
   </property>
 All of the child jobs fail, and each child gets an error in the stderr log 
 like:
 Could not find the main class: . Program will exit.
 The root cause is obvious in the patch below - the split on the value was 
 done on whitespace, and any preceding whitespace ultimately becomes a 
 zero-length entry on the child jvm command line, causing the jvm to think 
 that a '' argument is the main class.   The patch just skips over any 
 zero-length entries prior to adding them to the jvm vargs list.  I looked in 
 trunk as well, to see if the patch would apply there but it looks like Tasks 
 were refactored and this code file is not present any more.
 This error occurred on Open JDK, Centos 6.2, 32 bit.
 Regards,
 Karl
 Index: src/mapred/org/apache/hadoop/mapred/TaskRunner.java
 ===
 --- src/mapred/org/apache/hadoop/mapred/TaskRunner.java (revision 1482686)
 +++ src/mapred/org/apache/hadoop/mapred/TaskRunner.java (working copy)
 @@ -437,7 +437,9 @@
        vargs.add("-Djava.library.path=" + libraryPath);
      }
      for (int i = 0; i < javaOptsSplit.length; i++) {
 -      vargs.add(javaOptsSplit[i]);
 +      if( javaOptsSplit[i].trim().length() > 0 ) {
 +        vargs.add(javaOptsSplit[i]);
 +      }
  }
  
  Path childTmpDir = createChildTmpDir(workDir, conf, false);
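
For readers wondering why a single leading space produces an empty JVM argument, here is a minimal standalone sketch of the behaviour described above. It assumes the option string is split on a single space character (the description only says "split on whitespace"). The class JavaOptsSplitDemo and the helper toVargs are hypothetical names for illustration and are not part of TaskRunner.

```java
import java.util.ArrayList;
import java.util.List;

public class JavaOptsSplitDemo {

    // Hypothetical helper mirroring the patched loop: copy only the
    // entries that are non-empty after trimming.
    static List<String> toVargs(String javaOpts) {
        // Assumption: the options are split on a single space.
        String[] javaOptsSplit = javaOpts.split(" ");
        List<String> vargs = new ArrayList<>();
        for (int i = 0; i < javaOptsSplit.length; i++) {
            if (javaOptsSplit[i].trim().length() > 0) {
                vargs.add(javaOptsSplit[i]);
            }
        }
        return vargs;
    }

    public static void main(String[] args) {
        String value = " -Xmx1G"; // note the leading space

        // String.split keeps the empty string in front of a leading delimiter,
        // so the unpatched loop would pass "" to the child JVM command line.
        String[] raw = value.split(" ");
        System.out.println(raw.length);       // prints 2
        System.out.println(raw[0].isEmpty()); // prints true

        // With the length check, only the real option survives.
        System.out.println(toVargs(value));   // prints [-Xmx1G]
    }
}
```

The same check also drops empty entries produced by repeated spaces between options, which a plain split on " " would otherwise pass through.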



[jira] [Commented] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660102#comment-13660102
 ] 

Hadoop QA commented on MAPREDUCE-5234:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12583553/MAPREDUCE-5234-trunk-5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3648//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3648//console

This message is automatically generated.

 Signature changes for getTaskId of TaskReport in mapred
 ---

 Key: MAPREDUCE-5234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-5234-trunk-1.patch, 
 MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
 MAPREDUCE-5234-trunk-4.patch, MAPREDUCE-5234-trunk-5.patch


 TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
 getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Updated] (MAPREDUCE-4927) Historyserver 500 error due to NPE when accessing specific counters page for failed job

2013-05-16 Thread Ashwin Shankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashwin Shankar updated MAPREDUCE-4927:
--

Attachment: MAPREDUCE-4927.txt

The problem is that a failed task has no counters, but the code assumes that 
counters are always present, which causes an NPE. I've added a null check for 
counters to fix this, and I've changed a unit test to cover this case.
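
As a rough illustration of the kind of null check described (the attached patch itself is not reproduced here), the sketch below uses hypothetical stand-in types. The Task class, the counterValue method, and the choice to report 0 for a missing counter are all assumptions made for the example, not the history server's actual code.

```java
import java.util.HashMap;
import java.util.Map;

public class CounterLookupDemo {

    // Hypothetical stand-in for a task; counters is null for a failed task.
    static final class Task {
        final Map<String, Long> counters;

        Task(Map<String, Long> counters) {
            this.counters = counters;
        }
    }

    // Returns the counter value, or 0 when the task has no counters at all
    // or the named counter is absent (assumed fallback for this sketch).
    static long counterValue(Task task, String name) {
        Map<String, Long> counters = task.counters;
        if (counters == null) {
            // Without this guard, counters.get(name) would throw an NPE.
            return 0L;
        }
        Long value = counters.get(name);
        return value == null ? 0L : value;
    }

    public static void main(String[] args) {
        Map<String, Long> ok = new HashMap<>();
        ok.put("BYTES_READ", 42L);

        System.out.println(counterValue(new Task(ok), "BYTES_READ"));   // prints 42
        System.out.println(counterValue(new Task(null), "BYTES_READ")); // prints 0
    }
}
```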

 Historyserver 500 error due to NPE when accessing specific counters page for 
 failed job
 ---

 Key: MAPREDUCE-4927
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4927
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 2.0.3-alpha, 0.23.6
Reporter: Jason Lowe
 Attachments: MAPREDUCE-4927.txt


 Went to the historyserver page for a job that failed and examined the 
 counters page.  When I clicked on a specific counter, the historyserver 
 returned a 500 error.  The historyserver logs showed it encountered an NPE 
 error, full traceback to follow.



[jira] [Updated] (MAPREDUCE-4927) Historyserver 500 error due to NPE when accessing specific counters page for failed job

2013-05-16 Thread Ashwin Shankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashwin Shankar updated MAPREDUCE-4927:
--

Assignee: Ashwin Shankar
Target Version/s: 2.0.5-beta, 0.23.8
  Status: Patch Available  (was: Open)

 Historyserver 500 error due to NPE when accessing specific counters page for 
 failed job
 ---

 Key: MAPREDUCE-4927
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4927
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 0.23.6, 2.0.3-alpha
Reporter: Jason Lowe
Assignee: Ashwin Shankar
 Attachments: MAPREDUCE-4927.txt


 Went to the historyserver page for a job that failed and examined the 
 counters page.  When I clicked on a specific counter, the historyserver 
 returned a 500 error.  The historyserver logs showed it encountered an NPE 
 error, full traceback to follow.



[jira] [Commented] (MAPREDUCE-4927) Historyserver 500 error due to NPE when accessing specific counters page for failed job

2013-05-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660126#comment-13660126
 ] 

Hadoop QA commented on MAPREDUCE-4927:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12583567/MAPREDUCE-4927.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3649//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3649//console

This message is automatically generated.

 Historyserver 500 error due to NPE when accessing specific counters page for 
 failed job
 ---

 Key: MAPREDUCE-4927
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4927
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 2.0.3-alpha, 0.23.6
Reporter: Jason Lowe
Assignee: Ashwin Shankar
 Attachments: MAPREDUCE-4927.txt


 Went to the historyserver page for a job that failed and examined the 
 counters page.  When I clicked on a specific counter, the historyserver 
 returned a 500 error.  The historyserver logs showed it encountered an NPE 
 error, full traceback to follow.



[jira] [Commented] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660131#comment-13660131
 ] 

Zhijie Shen commented on MAPREDUCE-5234:


+1. It looks good to me.

 Signature changes for getTaskId of TaskReport in mapred
 ---

 Key: MAPREDUCE-5234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-5234-trunk-1.patch, 
 MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
 MAPREDUCE-5234-trunk-4.patch, MAPREDUCE-5234-trunk-5.patch


 TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
 getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Commented] (MAPREDUCE-5240) inside of FileOutputCommitter the initialized Credentials cache appears to be empty

2013-05-16 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660195#comment-13660195
 ] 

Konstantin Boudnik commented on MAPREDUCE-5240:
---

Roman, thanks a lot for the backport of the original patch. It applies nicely. 
I am building Hadoop right now and will do some tests right after that.

 inside of FileOutputCommitter the initialized Credentials cache appears to be 
 empty
 ---

 Key: MAPREDUCE-5240
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5240
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.4-alpha
Reporter: Roman Shaposhnik
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker
  Labels: 2.0.4.1
 Fix For: 2.0.5-beta, 2.0.4.1-alpha

 Attachments: LostCreds.java, MAPREDUCE-5240-20130512.txt, 
 MAPREDUCE-5240-20130513.txt, MAPREDUCE-5240.2.0.4.rvs.patch.txt


 I am attaching a modified wordcount job that clearly demonstrates the problem 
 we've encountered in running Sqoop2 on YARN (BIGTOP-949).
 Here's what running it produces:
 {noformat}
 $ hadoop fs -mkdir in
 $ hadoop fs -put /etc/passwd in
 $ hadoop jar ./bug.jar org.myorg.LostCreds
 13/05/12 03:13:46 WARN mapred.JobConf: The variable mapred.child.ulimit is no 
 longer used.
 numberOfSecretKeys: 1
 numberOfTokens: 0
 ..
 ..
 ..
 13/05/12 03:05:35 INFO mapreduce.Job: Job job_1368318686284_0013 failed with 
 state FAILED due to: Job commit failed: java.io.IOException:
 numberOfSecretKeys: 0
 numberOfTokens: 0
   at 
 org.myorg.LostCreds$DestroyerFileOutputCommitter.commitJob(LostCreds.java:43)
   at 
 org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:249)
   at 
 org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:212)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:619)
 {noformat}
 As you can see, even though we've clearly initialized the creds via:
 {noformat}
 job.getCredentials().addSecretKey(new Text(mykey), mysecret.getBytes());
 {noformat}
 It doesn't seem to appear later in the job.
 This is a pretty critical issue for Sqoop 2 since it appears to be DOA for 
 YARN in Hadoop 2.0.4-alpha



[jira] [Updated] (MAPREDUCE-4085) Kill task attempts longer than a configured queue max time

2013-05-16 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-4085:


Attachment: MAPREDUCE-4085-branch-1.0.4.txt

Here's an updated version for anyone who wants it.  This one also includes the 
ability for users to set a smaller task time limit 
(mapred.job.{map|reduce}.task-wallclock-limit) in case they want something 
faster, e.g., I know my task should finish in 5 minutes, so kill it if it 
doesn't.  Of course, the queue timeout will still kick in if the user-provided 
time is longer.
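
The interaction between the two limits amounts to taking the smaller of the queue limit and the user limit. The sketch below is one way to express that, not code from the attached patch. The class and method names are hypothetical, and treating a non-positive value as "not set" is an assumption made for the example.

```java
public class WallclockLimitDemo {

    // Picks the limit that applies to a task attempt, in milliseconds.
    // Assumption: a value <= 0 means that limit is not configured.
    static long effectiveLimitMs(long queueLimitMs, long userLimitMs) {
        if (userLimitMs <= 0) {
            return queueLimitMs;
        }
        if (queueLimitMs <= 0) {
            return userLimitMs;
        }
        // The user may tighten the limit but can never exceed the queue's.
        return Math.min(queueLimitMs, userLimitMs);
    }

    // An attempt is killed only when a limit is configured and exceeded.
    static boolean shouldKill(long runtimeMs, long limitMs) {
        return limitMs > 0 && runtimeMs > limitMs;
    }

    public static void main(String[] args) {
        long queueLimit = 30 * 60 * 1000L; // 30 minutes
        long userLimit = 5 * 60 * 1000L;   // 5 minutes

        long limit = effectiveLimitMs(queueLimit, userLimit);
        System.out.println(limit);                              // prints 300000
        System.out.println(shouldKill(6 * 60 * 1000L, limit));  // prints true

        // A user limit longer than the queue limit has no effect.
        System.out.println(effectiveLimitMs(queueLimit, 60 * 60 * 1000L)); // prints 1800000
    }
}
```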

 Kill task attempts longer than a configured queue max time
 --

 Key: MAPREDUCE-4085
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4085
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: task
Reporter: Allen Wittenauer
 Attachments: MAPREDUCE-4085-branch-1.0.4.txt, 
 MAPREDUCE-4085-branch-1.0.txt


 For some environments, it is desirable to have certain queues have an SLA 
 with regard to task turnover.  (i.e., a slot will be free in X minutes and 
 scheduled to the appropriate job)  Queues should have a 'task time limit' 
 that would cause task attempts over this time to be killed. This leaves open 
 the possibility that if the task was on a bad node, it could still be 
 rescheduled up to max.task.attempt times.
