[jira] [Updated] (MAPREDUCE-4085) Kill task attempts longer than a configured queue max time

2013-05-16 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-4085:


Attachment: MAPREDUCE-4085-branch-1.0.4.txt

Here's an updated version for anyone who wants it.  This one also adds the 
ability for users to set a smaller task time limit 
(mapred.job.{map|reduce}.task-wallclock-limit) in case they want something 
tighter, e.g., "I know my task should finish in 5 minutes, so kill it if it 
doesn't."  Of course, the queue timeout will still kick in if the 
user-provided time is longer.
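
As an illustrative sketch, the per-job override described above might look like 
this in mapred-site.xml (property names are taken from the comment; the values 
and their units are assumptions, not taken from the patch):

{noformat}
<!-- Illustrative only: whether values are milliseconds is an assumption. -->
<property>
  <name>mapred.job.map.task-wallclock-limit</name>
  <value>300000</value> <!-- e.g., 5 minutes, if milliseconds -->
</property>
<property>
  <name>mapred.job.reduce.task-wallclock-limit</name>
  <value>300000</value>
</property>
{noformat}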

> Kill task attempts longer than a configured queue max time
> --
>
> Key: MAPREDUCE-4085
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4085
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: task
>Reporter: Allen Wittenauer
> Attachments: MAPREDUCE-4085-branch-1.0.4.txt, 
> MAPREDUCE-4085-branch-1.0.txt
>
>
> For some environments, it is desirable for certain queues to have an SLA 
> with regard to task turnover (i.e., a slot will be free in X minutes and 
> scheduled to the appropriate job). Queues should have a 'task time limit' 
> that would cause task attempts running over this time to be killed. This 
> leaves open the possibility that if the task was on a bad node, it could 
> still be rescheduled up to max.task.attempt times.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5240) inside of FileOutputCommitter the initialized Credentials cache appears to be empty

2013-05-16 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660195#comment-13660195
 ] 

Konstantin Boudnik commented on MAPREDUCE-5240:
---

Roman, thanks a lot for the backport of the original patch. It applies nicely; 
I am building Hadoop right now and will do some tests right after that.

> inside of FileOutputCommitter the initialized Credentials cache appears to be 
> empty
> ---
>
> Key: MAPREDUCE-5240
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5240
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.4-alpha
>Reporter: Roman Shaposhnik
>Assignee: Vinod Kumar Vavilapalli
>Priority: Blocker
>  Labels: 2.0.4.1
> Fix For: 2.0.5-beta, 2.0.4.1-alpha
>
> Attachments: LostCreds.java, MAPREDUCE-5240-20130512.txt, 
> MAPREDUCE-5240-20130513.txt, MAPREDUCE-5240.2.0.4.rvs.patch.txt
>
>
> I am attaching a modified wordcount job that clearly demonstrates the problem 
> we've encountered in running Sqoop2 on YARN (BIGTOP-949).
> Here's what running it produces:
> {noformat}
> $ hadoop fs -mkdir in
> $ hadoop fs -put /etc/passwd in
> $ hadoop jar ./bug.jar org.myorg.LostCreds
> 13/05/12 03:13:46 WARN mapred.JobConf: The variable mapred.child.ulimit is no 
> longer used.
> numberOfSecretKeys: 1
> numberOfTokens: 0
> ..
> ..
> ..
> 13/05/12 03:05:35 INFO mapreduce.Job: Job job_1368318686284_0013 failed with 
> state FAILED due to: Job commit failed: java.io.IOException:
> numberOfSecretKeys: 0
> numberOfTokens: 0
>   at 
> org.myorg.LostCreds$DestroyerFileOutputCommitter.commitJob(LostCreds.java:43)
>   at 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:249)
>   at 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:212)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:619)
> {noformat}
> As you can see, even though we've clearly initialized the creds via:
> {noformat}
> job.getCredentials().addSecretKey(new Text("mykey"), "mysecret".getBytes());
> {noformat}
> It doesn't seem to appear later in the job.
> This is a pretty critical issue for Sqoop 2 since it appears to be DOA for 
> YARN in Hadoop 2.0.4-alpha.



[jira] [Commented] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660131#comment-13660131
 ] 

Zhijie Shen commented on MAPREDUCE-5234:


+1. It looks good to me.

> Signature changes for getTaskId of TaskReport in mapred
> ---
>
> Key: MAPREDUCE-5234
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-5234-trunk-1.patch, 
> MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
> MAPREDUCE-5234-trunk-4.patch, MAPREDUCE-5234-trunk-5.patch
>
>
> TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
> getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Commented] (MAPREDUCE-4927) Historyserver 500 error due to NPE when accessing specific counters page for failed job

2013-05-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660126#comment-13660126
 ] 

Hadoop QA commented on MAPREDUCE-4927:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12583567/MAPREDUCE-4927.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3649//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3649//console

This message is automatically generated.

> Historyserver 500 error due to NPE when accessing specific counters page for 
> failed job
> ---
>
> Key: MAPREDUCE-4927
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4927
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 2.0.3-alpha, 0.23.6
>Reporter: Jason Lowe
>Assignee: Ashwin Shankar
> Attachments: MAPREDUCE-4927.txt
>
>
> Went to the historyserver page for a job that failed and examined the 
> counters page.  When I clicked on a specific counter, the historyserver 
> returned a 500 error.  The historyserver logs showed it encountered an NPE 
> error, full traceback to follow.



[jira] [Updated] (MAPREDUCE-4927) Historyserver 500 error due to NPE when accessing specific counters page for failed job

2013-05-16 Thread Ashwin Shankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashwin Shankar updated MAPREDUCE-4927:
--

Assignee: Ashwin Shankar
Target Version/s: 2.0.5-beta, 0.23.8
  Status: Patch Available  (was: Open)

> Historyserver 500 error due to NPE when accessing specific counters page for 
> failed job
> ---
>
> Key: MAPREDUCE-4927
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4927
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 0.23.6, 2.0.3-alpha
>Reporter: Jason Lowe
>Assignee: Ashwin Shankar
> Attachments: MAPREDUCE-4927.txt
>
>
> Went to the historyserver page for a job that failed and examined the 
> counters page.  When I clicked on a specific counter, the historyserver 
> returned a 500 error.  The historyserver logs showed it encountered an NPE 
> error, full traceback to follow.



[jira] [Updated] (MAPREDUCE-4927) Historyserver 500 error due to NPE when accessing specific counters page for failed job

2013-05-16 Thread Ashwin Shankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashwin Shankar updated MAPREDUCE-4927:
--

Attachment: MAPREDUCE-4927.txt

The problem is that a failed task doesn't have counters, but we assume we 
always get counters, which causes an NPE. I've added a null check for counters 
to fix this. I've also updated a unit test to cover this case.
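
The null guard described above can be sketched standalone as follows (the 
names getCounters/safeCounters and the map-based stand-in are hypothetical 
illustrations, not the real Hadoop history-server API; the actual change is in 
MAPREDUCE-4927.txt):

```java
// Hypothetical standalone sketch of the fix: a failed task attempt may have
// no counters, so guard against null before rendering the counters page.
import java.util.Collections;
import java.util.Map;

public class CounterPageSketch {
    // Stub: failed attempts return null, successful ones return counters.
    static Map<String, Long> getCounters(boolean failed) {
        return failed ? null : Collections.singletonMap("RECORDS", 42L);
    }

    // The null check: fall back to an empty view instead of throwing an NPE.
    static Map<String, Long> safeCounters(boolean failed) {
        Map<String, Long> counters = getCounters(failed);
        return (counters == null) ? Collections.<String, Long>emptyMap() : counters;
    }

    public static void main(String[] args) {
        System.out.println(safeCounters(true));   // empty map rather than a 500/NPE
        System.out.println(safeCounters(false));
    }
}
```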

> Historyserver 500 error due to NPE when accessing specific counters page for 
> failed job
> ---
>
> Key: MAPREDUCE-4927
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4927
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 2.0.3-alpha, 0.23.6
>Reporter: Jason Lowe
> Attachments: MAPREDUCE-4927.txt
>
>
> Went to the historyserver page for a job that failed and examined the 
> counters page.  When I clicked on a specific counter, the historyserver 
> returned a 500 error.  The historyserver logs showed it encountered an NPE 
> error, full traceback to follow.



[jira] [Commented] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660102#comment-13660102
 ] 

Hadoop QA commented on MAPREDUCE-5234:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12583553/MAPREDUCE-5234-trunk-5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3648//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3648//console


> Signature changes for getTaskId of TaskReport in mapred
> ---
>
> Key: MAPREDUCE-5234
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-5234-trunk-1.patch, 
> MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
> MAPREDUCE-5234-trunk-4.patch, MAPREDUCE-5234-trunk-5.patch
>
>
> TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
> getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Commented] (MAPREDUCE-5253) Whitespace value entry in mapred-site.xml for name=mapred.reduce.child.java.opts causes child tasks to fail at launch

2013-05-16 Thread Karl D. Gierach (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660100#comment-13660100
 ] 

Karl D. Gierach commented on MAPREDUCE-5253:


The patch also should be applied to this file under the current trunk (as noted 
by Chris Nauroth).

https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/MapReduceChildJVM.java#L156

> Whitespace value entry in mapred-site.xml for 
> name=mapred.reduce.child.java.opts causes child tasks to fail at launch
> -
>
> Key: MAPREDUCE-5253
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5253
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Affects Versions: 1.1.2
> Environment: Centos 6.2 32Bit, OpenJDK
>Reporter: Karl D. Gierach
> Fix For: 1.1.3
>
>
> Hi,
> Below is a patch for Hadoop v1.1.2.  I'm new to this list, so if I need to 
> write up a JIRA ticket for this, please let me know.
> The defect scenario is that if you enter any white space within values in 
> this file:
> /etc/hadoop/mapred-site.xml
> e.g.: (a white space prior to the -X...)
>   
> mapred.reduce.child.java.opts
>  -Xmx1G
>   
> All of the child jobs fail, and each child gets an error in the stderr log 
> like:
> Could not find the main class: . Program will exit.
> The root cause is obvious in the patch below - the split on the value was 
> done on whitespace, and any preceding whitespace ultimately becomes a 
> zero-length entry on the child jvm command line, causing the jvm to think 
> that a '' argument is the main class.   The patch just skips over any 
> zero-length entries prior to adding them to the jvm vargs list.  I looked in 
> trunk as well, to see if the patch would apply there but it looks like Tasks 
> were refactored and this code file is not present any more.
> This error occurred on Open JDK, Centos 6.2, 32 bit.
> Regards,
> Karl
> Index: src/mapred/org/apache/hadoop/mapred/TaskRunner.java
> ===
> --- src/mapred/org/apache/hadoop/mapred/TaskRunner.java(revision 1482686)
> +++ src/mapred/org/apache/hadoop/mapred/TaskRunner.java(working copy)
> @@ -437,7 +437,9 @@
>vargs.add("-Djava.library.path=" + libraryPath);
>  }
>  for (int i = 0; i < javaOptsSplit.length; i++) {
> -  vargs.add(javaOptsSplit[i]);
> +  if( javaOptsSplit[i].trim().length() > 0 ) {
> +vargs.add(javaOptsSplit[i]);
> +  }
>  }
>  
>  Path childTmpDir = createChildTmpDir(workDir, conf, false);



[jira] [Updated] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-5234:
-

Status: Patch Available  (was: Open)

> Signature changes for getTaskId of TaskReport in mapred
> ---
>
> Key: MAPREDUCE-5234
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-5234-trunk-1.patch, 
> MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
> MAPREDUCE-5234-trunk-4.patch, MAPREDUCE-5234-trunk-5.patch
>
>
> TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
> getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Updated] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-5234:
-

Attachment: MAPREDUCE-5234-trunk-5.patch

Fixed.

Thanks,
Mayank

> Signature changes for getTaskId of TaskReport in mapred
> ---
>
> Key: MAPREDUCE-5234
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-5234-trunk-1.patch, 
> MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
> MAPREDUCE-5234-trunk-4.patch, MAPREDUCE-5234-trunk-5.patch
>
>
> TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
> getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Updated] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-5234:
-

Status: Open  (was: Patch Available)

> Signature changes for getTaskId of TaskReport in mapred
> ---
>
> Key: MAPREDUCE-5234
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-5234-trunk-1.patch, 
> MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
> MAPREDUCE-5234-trunk-4.patch, MAPREDUCE-5234-trunk-5.patch
>
>
> TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
> getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Commented] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660039#comment-13660039
 ] 

Siddharth Seth commented on MAPREDUCE-5199:
---

Daryn, I'd looked at the AM code a couple of days ago as part of our offline 
conversation. I'm still not sure how the AppToken gets clobbered in the Task. 
From looking at the AM code, it doesn't look like the CLC for the Task (Oozie 
launcher map task) gets anything from the AM's ugi. It only gets the job token 
generated by the AM, and the tokens from the 'appTokens' file. This file is 
written out by the client - at which point the AMToken is not available.
Has Oozie, by any chance, changed to launch its tasks via an AM itself (which 
has similar ugi magic as the MR AM)?

> AppTokens file can/should be removed
> 
>
> Key: MAPREDUCE-5199
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 3.0.0, 2.0.5-beta
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: MAPREDUCE-5199.patch
>
>
> All the required tokens are propagated to AMs and containers via 
> startContainer(), no need for explicitly creating the app-token file that we 
> have today..



[jira] [Commented] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660019#comment-13660019
 ] 

Zhijie Shen commented on MAPREDUCE-5234:


The following code seems unnecessary as well. Neither cluster nor client is 
used afterwards.
{code}
+Cluster cluster = mock(Cluster.class);
+ClientProtocol client = mock(ClientProtocol.class);
+when(cluster.getClient()).thenReturn(client);
{code}

> Signature changes for getTaskId of TaskReport in mapred
> ---
>
> Key: MAPREDUCE-5234
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-5234-trunk-1.patch, 
> MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
> MAPREDUCE-5234-trunk-4.patch
>
>
> TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
> getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Commented] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value

2013-05-16 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659951#comment-13659951
 ] 

Gopal V commented on MAPREDUCE-5028:


I ran the tests again because something didn't seem right - my '+' operation 
was turning into a string concat operation in logging (*ugh*).

{code}
2013-05-15 18:52:47,876 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: input.length = 1342177280, start = 
687161440, length = 687161444
2013-05-15 18:52:47,876 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: count math 687161440 + 687161444 = 
1374322884
2013-05-15 18:52:47,876 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: 
org.apache.hadoop.io.DataInputBuffer$Buffer.reset(DataInputBuffer.java:58)
2013-05-15 18:52:47,876 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: 
org.apache.hadoop.io.DataInputBuffer.reset(DataInputBuffer.java:92)
2013-05-15 18:52:47,876 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: 
org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:144)
2013-05-15 18:52:47,876 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: 
org.apache.hadoop.mapreduce.task.ReduceContextImpl$ValueIterator.next(ReduceContextImpl.java:237)

2013-05-15 18:52:47,861 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: input.length = 1342177280, start = 
905211353, length = 905211357
2013-05-15 18:52:47,861 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: count math 905211353 + 905211357 = 
1810422710
2013-05-15 18:52:47,861 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: 
org.apache.hadoop.io.DataInputBuffer$Buffer.reset(DataInputBuffer.java:58)
2013-05-15 18:52:47,861 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: 
org.apache.hadoop.io.DataInputBuffer.reset(DataInputBuffer.java:92)
2013-05-15 18:52:47,861 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: 
org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:144)
2013-05-15 18:52:47,861 INFO [SpillThread] 
org.apache.hadoop.io.DataInputBuffer: 
org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121)
{code}

Those are wrong, definitely wrong.
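
The logging pitfall mentioned above can be shown in a standalone sketch (this 
is a hypothetical illustration, not the actual DataInputBuffer logging code): 
Java's '+' is left-associative, so once a String operand appears, every later 
'+' means concatenation unless parentheses force numeric addition first.

```java
// '+' after a String concatenates; parenthesize to get arithmetic.
public class ConcatPitfall {
    static String logLine(int start, int length) {
        return "count math = " + start + length;   // concat: digits glued together
    }

    static String fixedLogLine(int start, int length) {
        return "count math = " + (start + length); // parentheses force addition
    }

    public static void main(String[] args) {
        System.out.println(logLine(687161440, 687161444));      // count math = 687161440687161444
        System.out.println(fixedLogLine(687161440, 687161444)); // count math = 1374322884
    }
}
```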

> Maps fail when io.sort.mb is set to high value
> --
>
> Key: MAPREDUCE-5028
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5028
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1.1.1, 2.0.3-alpha, 0.23.5
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Critical
> Fix For: 1.2.0, 2.0.5-beta
>
> Attachments: mr-5028-branch1.patch, mr-5028-branch1.patch, 
> mr-5028-branch1.patch, MR-5028_testapp.patch, mr-5028-trunk.patch, 
> mr-5028-trunk.patch, mr-5028-trunk.patch, repro-mr-5028.patch
>
>
> Verified the problem exists on branch-1 with the following configuration:
> Pseudo-dist mode: 2 maps/ 1 reduce, mapred.child.java.opts=-Xmx2048m, 
> io.sort.mb=1280, dfs.block.size=2147483648
> Run teragen to generate 4 GB data
> Maps fail when you run wordcount on this configuration with the following 
> error: 
> {noformat}
> java.io.IOException: Spill failed
>   at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031)
>   at 
> org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
>   at 
> org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>   at 
> org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45)
>   at 
> org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.io.EOFException
>   at java.io.DataInputStream.readInt(DataInputStream.java:375)
>   at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38)
>   at 
> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>   at 
> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
>   at 
> org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
>   at 
> org.apache.hadoop.

[jira] [Commented] (MAPREDUCE-5130) Add missing job config options to mapred-default.xml

2013-05-16 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659936#comment-13659936
 ] 

Sandy Ryza commented on MAPREDUCE-5130:
---

bq. Why remove the timeout for testJobConf()?
Sorry, careless error.  Had removed it to debug something.

Will upload a patch that leaves out mapreduce.job.jvm.numtasks for real this 
time, puts back in the comments for getMemoryForMapTask() and 
getMemoryForReduceTask(), puts back in normalizeConfigValue(), and puts back in 
a modified version of testNegativeValuesForMemoryParams().

bq. When someone gives a negative value for the vmem properties, we should just 
use the default one.
By this you mean that we should check to see whether the configured number is 
invalid and silently return the default if it is?  I haven't seen this for 
other properties.

> Add missing job config options to mapred-default.xml
> 
>
> Key: MAPREDUCE-5130
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5130
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.0.4-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-5130-1.patch, MAPREDUCE-5130-1.patch, 
> MAPREDUCE-5130-2.patch, MAPREDUCE-5130-3.patch, MAPREDUCE-5130.patch
>
>
> I came across that mapreduce.map.child.java.opts and 
> mapreduce.reduce.child.java.opts were missing in mapred-default.xml.  I'll do 
> a fuller sweep to see what else is missing before posting a patch.
> List so far:
> mapreduce.map/reduce.child.java.opts
> mapreduce.map/reduce.memory.mb
> mapreduce.job.jvm.numtasks
> mapreduce.input.lineinputformat.linespermap
> mapreduce.task.combine.progress.records



[jira] [Commented] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659924#comment-13659924
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-5234:


Please don't make the constructor public, we don't want users to create 
TaskReports. You can move the test to other test-cases in the same package.

> Signature changes for getTaskId of TaskReport in mapred
> ---
>
> Key: MAPREDUCE-5234
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-5234-trunk-1.patch, 
> MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
> MAPREDUCE-5234-trunk-4.patch
>
>
> TaskReport in mapred of MR2 extends TaskReport in mapreduce, and inherits 
> getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Commented] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659891#comment-13659891
 ] 

Hadoop QA commented on MAPREDUCE-5234:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12583529/MAPREDUCE-5234-trunk-4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3647//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3647//console

This message is automatically generated.

> Signature changes for getTaskId of TaskReport in mapred
> ---
>
> Key: MAPREDUCE-5234
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-5234-trunk-1.patch, 
> MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
> MAPREDUCE-5234-trunk-4.patch
>
>
> TaskReport in mapred of MR2 extends TaskReport in mapreduce and inherits 
> getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Resolved] (MAPREDUCE-5236) references to JobConf.DISABLE_MEMORY_LIMIT don't make sense in the context of MR2

2013-05-16 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza resolved MAPREDUCE-5236.
---

Resolution: Duplicate

> references to JobConf.DISABLE_MEMORY_LIMIT don't make sense in the context of 
> MR2
> -
>
> Key: MAPREDUCE-5236
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5236
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.4-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
>
> In MR1, a special value of -1 could be given for 
> mapreduce.job.map|reduce.memory.mb when memory limits were disabled.  In MR2, 
> this makes no sense, as with slots gone, this value is used for requesting 
> resources and scheduling.



[jira] [Updated] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-5234:
-

Attachment: MAPREDUCE-5234-trunk-4.patch

Incorporating Zhijie's comments.

Thanks,
Mayank

> Signature changes for getTaskId of TaskReport in mapred
> ---
>
> Key: MAPREDUCE-5234
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-5234-trunk-1.patch, 
> MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
> MAPREDUCE-5234-trunk-4.patch
>
>
> TaskReport in mapred of MR2 extends TaskReport in mapreduce and inherits 
> getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Updated] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-5234:
-

Status: Patch Available  (was: Open)

> Signature changes for getTaskId of TaskReport in mapred
> ---
>
> Key: MAPREDUCE-5234
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-5234-trunk-1.patch, 
> MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch, 
> MAPREDUCE-5234-trunk-4.patch
>
>
> TaskReport in mapred of MR2 extends TaskReport in mapreduce and inherits 
> getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Updated] (MAPREDUCE-5130) Add missing job config options to mapred-default.xml

2013-05-16 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5130:
---

Status: Open  (was: Patch Available)

We can close MAPREDUCE-5236 as a duplicate.

Comments on the latest patch:

Overall, don't make more changes than necessary. Just change the defaults and 
deprecate DISABLED_MEMORY_LIMIT. If you do just that, the following issues 
will be taken care of automatically:
 - You can keep the comment in getMemoryForMapTask() and 
getMemoryForReduceTask(), and refer to the default property instead of -1.
 - JobConf.normalizeMemoryConfigValue() is public and shouldn't be removed.
 - When someone gives a negative value for the vmem properties, we should just 
use the default.
 - Given the above, testNegativeValuesForMemoryParams() should be modified 
instead of removed completely.

One minor question:
 - Why remove the timeout for testJobConf()?

You missed this
bq. mapreduce.job.jvm.numtasks isn't supported in MR over YARN.
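The negative-value handling suggested in the comments above (treat a negative vmem setting as "use the default" rather than "disabled") might look roughly like this; the property name is real, but the lookup helper is a simplified stand-in for the actual JobConf logic:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for JobConf's memory lookup (not the real code):
// a negative configured value no longer means "disabled"; it simply
// falls back to the default.
public class MemoryConfigSketch {
    static final long DEFAULT_MAP_MEMORY_MB = 1024;  // hypothetical default
    static final Map<String, Long> conf = new HashMap<>();

    static long getMemoryForMapTask() {
        Long v = conf.get("mapreduce.map.memory.mb");
        // Negative or unset -> use the default, per the review comment.
        return (v == null || v < 0) ? DEFAULT_MAP_MEMORY_MB : v;
    }

    public static void main(String[] args) {
        conf.put("mapreduce.map.memory.mb", -1L);  // the old "disabled" value
        System.out.println(getMemoryForMapTask()); // prints 1024
    }
}
```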

> Add missing job config options to mapred-default.xml
> 
>
> Key: MAPREDUCE-5130
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5130
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.0.4-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-5130-1.patch, MAPREDUCE-5130-1.patch, 
> MAPREDUCE-5130-2.patch, MAPREDUCE-5130-3.patch, MAPREDUCE-5130.patch
>
>
> I noticed that mapreduce.map.child.java.opts and 
> mapreduce.reduce.child.java.opts were missing from mapred-default.xml.  I'll do 
> a fuller sweep to see what else is missing before posting a patch.
> List so far:
> mapreduce.map/reduce.child.java.opts
> mapreduce.map/reduce.memory.mb
> mapreduce.job.jvm.numtasks
> mapreduce.input.lineinputformat.linespermap
> mapreduce.task.combine.progress.records



[jira] [Commented] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659840#comment-13659840
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-5199:


bq. The referenced jiras add the app token to the launch context, which causes 
the app token to leak to the task. When the task launches a child job, it dumps 
out its credentials (including the leaked app token) to the appTokens file.
That's not true either. Even after the referred patches, the only tokens 
passed to tasks are the MR-specific JobToken and FSTokens (see the TaskImpl 
constructor and where the credentials field comes from: job.fsTokens, which 
comes from MRAppMaster.fsTokens, which only has tokens from the AppTokensFile, 
which *does not* have the AMRMToken).

The patches only add the AMRMToken to MRAppMaster's UGI, which isn't what tasks 
are given via the launch context.

I am clearly missing something. Let me run it by Sid too, who understands this 
code equally well.

> AppTokens file can/should be removed
> 
>
> Key: MAPREDUCE-5199
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 3.0.0, 2.0.5-beta
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: MAPREDUCE-5199.patch
>
>
> All the required tokens are propagated to AMs and containers via 
> startContainer(), no need for explicitly creating the app-token file that we 
> have today.



[jira] [Commented] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659804#comment-13659804
 ] 

Daryn Sharp commented on MAPREDUCE-5199:


The referenced jiras add the app token to the launch context, which causes the 
app token to leak to the task.  When the task launches a child job, it dumps 
out its credentials (including the leaked app token) to the appTokens file.  
The new AM gets its credentials from the login UGI, which contains a new app 
token from the RM, but when it reads the appTokens file, the new app token is 
squashed by the parent job's app token.  The AM never starts up.
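The squashing described here is, at heart, an alias collision: credentials are keyed by alias, and the AMRMToken always uses the same one, so merging the parent's token file clobbers the child AM's fresh token. A Hadoop-free illustration using a plain map (names and values are illustrative, not the real Credentials API):

```java
import java.util.HashMap;
import java.util.Map;

public class TokenSquashDemo {
    public static void main(String[] args) {
        // Child AM starts with a fresh app token from the RM,
        // stored under a fixed alias.
        Map<String, String> creds = new HashMap<>();
        creds.put("AMRMToken", "child-token");

        // Tokens read from the parent's appTokens file use the
        // same alias, so merging overwrites the child's token.
        Map<String, String> fromParentFile = new HashMap<>();
        fromParentFile.put("AMRMToken", "parent-token");
        creds.putAll(fromParentFile);

        // The child AM now presents the parent's (wrong) token
        // when registering with the RM, and never starts up.
        System.out.println(creds.get("AMRMToken")); // prints parent-token
    }
}
```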

> AppTokens file can/should be removed
> 
>
> Key: MAPREDUCE-5199
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 3.0.0, 2.0.5-beta
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: MAPREDUCE-5199.patch
>
>
> All the required tokens are propagated to AMs and containers via 
> startContainer(), no need for explicitly creating the app-token file that we 
> have today.



[jira] [Commented] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659730#comment-13659730
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-5199:


Looking at the patch. I don't understand the problem yet, but I'm not 
discounting that there is one.

bq. MAPREDUCE-5205 fixed the AM to pick up the app token, but jobs launching 
jobs (ex. oozie) still fail. The child job reads in the appTokens file 
generated by the parent job which causes the child to overwrite the app token 
with that of the parent job.
I don't understand this. AMRMToken is never part of the appTokens files. Where 
is the child job failing? Can you share some exception traces?

> AppTokens file can/should be removed
> 
>
> Key: MAPREDUCE-5199
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 3.0.0, 2.0.5-beta
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: MAPREDUCE-5199.patch
>
>
> All the required tokens are propagated to AMs and containers via 
> startContainer(), no need for explicitly creating the app-token file that we 
> have today.



[jira] [Updated] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated MAPREDUCE-5234:
---

Status: Open  (was: Patch Available)

> Signature changes for getTaskId of TaskReport in mapred
> ---
>
> Key: MAPREDUCE-5234
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-5234-trunk-1.patch, 
> MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch
>
>
> TaskReport in mapred of MR2 extends TaskReport in mapreduce and inherits 
> getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Commented] (MAPREDUCE-5234) Signature changes for getTaskId of TaskReport in mapred

2013-05-16 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659728#comment-13659728
 ] 

Zhijie Shen commented on MAPREDUCE-5234:


It's mapreduce.TaskReport, not mapred.TaskReport, as the test class is in the 
mapreduce package as well.

{code}
--- 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/TestJob.java
+++ 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/TestJob.java
{code}

{code}
+TaskReport treport =
+new TaskReport(tid1, 0.0f, State.FAILED.toString(), null,
+  TIPStatus.FAILED, 100, 100, new Counters());
{code}

The following code seems irrelevant. TaskReport can be tested independently.

{code}
+Cluster cluster = mock(Cluster.class);
+ClientProtocol client = mock(ClientProtocol.class);
{code}

{code}
+when(client.getJobStatus(jobid)).thenReturn(status);
+TaskReport[] tr = new TaskReport[1];
+tr[0] = treport;
+when(client.getTaskReports(jobid, TaskType.MAP)).thenReturn(tr);
+when(client.getTaskReports(jobid, TaskType.REDUCE)).thenReturn(tr);
+when(client.getTaskCompletionEvents(jobid, 0, 10)).thenReturn(
+  new TaskCompletionEvent[0]);
+Job job = Job.getInstance(cluster, status, new JobConf());
+Assert.assertNotNull(job.toString());
+TaskReport[] tr1 = client.getTaskReports(jobid, TaskType.MAP);
{code}

> Signature changes for getTaskId of TaskReport in mapred
> ---
>
> Key: MAPREDUCE-5234
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5234
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-5234-trunk-1.patch, 
> MAPREDUCE-5234-trunk-2.patch, MAPREDUCE-5234-trunk-3.patch
>
>
> TaskReport in mapred of MR2 extends TaskReport in mapreduce and inherits 
> getTaskId, which returns a TaskID object. In MR1, this function returns a String.



[jira] [Created] (MAPREDUCE-5254) Fix exception unwrapping and unit tests using UndeclaredThrowable

2013-05-16 Thread Siddharth Seth (JIRA)
Siddharth Seth created MAPREDUCE-5254:
-

 Summary: Fix exception unwrapping and unit tests using 
UndeclaredThrowable
 Key: MAPREDUCE-5254
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5254
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.4-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth


Follow-up to YARN-628. Exception unwrapping for MRClientProtocol needs some 
work. Also, a bunch of MR tests still rely on 
UndeclaredThrowableException, which should no longer be thrown.



[jira] [Commented] (MAPREDUCE-4366) mapred metrics shows negative count of waiting maps and reduces

2013-05-16 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659714#comment-13659714
 ] 

Sandy Ryza commented on MAPREDUCE-4366:
---

Thanks for delving into this with me, Arun.  First, please excuse in advance 
any errors I'm about to make here.  I'm trying to be careful, but the counting 
code is subtle and has been hard to think about.

bq. An option is to just call decWaiting(Maps|Reduces) in JIP.garbageCollect 
with JIP.num(Maps|Reduces)... currently if you follow the opposite side i.e 
addWaiting(Maps|Reduces), they are just static and are done at JIP.initTasks 
with num(Maps|Reduces). That would solve the immediate problem at hand?

Waiting maps and reduces are updated in the JobTracker metrics every time a 
task is launched, fails, or completes, so this would not work unless I am 
missing something.

bq. The definition of speculative(Map|Reduce)Tasks, at least in my head, has 
been the number of task-attempts have an alternate...

This definition can lead to thinking there are fewer pending tasks than there 
actually are.  Consider the following situation:
My job has two maps.  Attempts are run for both of them.  One map gets a 
speculative attempt because it's running slow.  The other map's attempt fails.  
The speculative one completes.  initialMaps=2 + speculativeMaps=0 - 
runningMaps=1 - finishedMaps=1 - failedMaps=0.  So pendingMaps is now 0 even 
though we have a pending map task.  The way this has not caused jobs to starve 
is that the running speculative map will fail later on and bring pendingMaps 
back up to 1.

I wanted to make sure it was clear that the current behavior is objectively 
wrong.  If your stance is still that the code has been working so far and 
messing with it is just a bad idea, I trust your experience.  In that case, 
could we keep speculativeMapTasks as it is and add a separate variable, 
nonCriticalRunningTasks, to use when updating the metrics?
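The scenario above can be replayed numerically. Using the definition pending = initial + speculative - running - finished - failed from the comment, the counts Sandy gives come out to zero even though one map still needs rescheduling (a minimal sketch, not JobInProgress code):

```java
public class PendingMapsDemo {
    // The pending-task formula as described in the comment thread.
    static int pending(int initial, int speculative, int running,
                       int finished, int failed) {
        return initial + speculative - running - finished - failed;
    }

    public static void main(String[] args) {
        // Two maps; one attempt failed, the speculative attempt of the
        // other map completed, and its original attempt is still running.
        int initialMaps = 2, speculativeMaps = 0,
            runningMaps = 1, finishedMaps = 1, failedMaps = 0;
        int pendingMaps = pending(initialMaps, speculativeMaps,
                                  runningMaps, finishedMaps, failedMaps);
        // Zero, even though the failed map still needs to be rescheduled.
        System.out.println(pendingMaps); // prints 0
    }
}
```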

> mapred metrics shows negative count of waiting maps and reduces
> ---
>
> Key: MAPREDUCE-4366
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4366
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1.0.2
>Reporter: Thomas Graves
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-4366-branch-1-1.patch, 
> MAPREDUCE-4366-branch-1.patch
>
>
> Negative waiting_maps and waiting_reduces counts are observed in the mapred 
> metrics.  MAPREDUCE-1238 partially fixed this, but there are still issues, as 
> we are still seeing it, though not as badly.



[jira] [Commented] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659684#comment-13659684
 ] 

Hadoop QA commented on MAPREDUCE-5199:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12583491/MAPREDUCE-5199.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3646//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3646//console

This message is automatically generated.

> AppTokens file can/should be removed
> 
>
> Key: MAPREDUCE-5199
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 3.0.0, 2.0.5-beta
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: MAPREDUCE-5199.patch
>
>
> All the required tokens are propagated to AMs and containers via 
> startContainer(), no need for explicitly creating the app-token file that we 
> have today.



[jira] [Commented] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value

2013-05-16 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659666#comment-13659666
 ] 

Gopal V commented on MAPREDUCE-5028:


[~kkambatl], I took some time to go through your patch.

The patch contains two different fixes, which deserve their own tests and 
commits.

Good catch on the overflow with the kvindex/kvend variables. That is a real 
bug in the mapper with large buffers, and a good, clean fix.

For the second issue, I found it is triggered when the inline Combiner runs 
while there are > 3 spills in the SpillThread. This wasn't exercised by 
[~acmurthy]'s test app (but the word-count sum combiner does trigger it 
cleanly).

There I found your fix to be suspect. So, for the sake of data, I logged every 
call to reset and crawled a 13 GB log file to find the offenders in reset (i.e. 
where (long)start + (long)length > input.length).

The following back-trace stood out as a key offender, which I found more 
significant than merely locating the overflow cases.

{code}
org.apache.hadoop.mapred.MapTask$MapOutputBuffer$MRResultIterator.getKey(MapTask.java:1784)
org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:138)
{code}

I will take a closer look at that code; it might be cleaner to tackle the 
issue at the first-cause location.
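The overflow class of bug discussed above comes from int arithmetic wrapping silently before any bounds check runs; widening to long first, as in the logged condition (long)start + (long)length > input.length, exposes it. A self-contained illustration (buffer sizes and offsets are hypothetical, not MapTask code):

```java
public class ResetOverflowDemo {
    public static void main(String[] args) {
        int bufferLength = Integer.MAX_VALUE - 10; // ~2 GB backing buffer
        int start  = Integer.MAX_VALUE - 100;      // record near the end
        int length = 200;                          // record spills past the end

        // int arithmetic wraps around silently...
        int intSum = start + length;               // negative: overflowed

        // ...while widening to long before adding reveals the bad reset,
        // mirroring the condition logged in the comment above.
        boolean badReset = (long) start + (long) length > bufferLength;

        System.out.println(intSum < 0);  // prints true
        System.out.println(badReset);    // prints true
    }
}
```

A bounds check written with plain int addition would miss this case entirely, since the wrapped sum compares as smaller than the buffer length.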

> Maps fail when io.sort.mb is set to high value
> --
>
> Key: MAPREDUCE-5028
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5028
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1.1.1, 2.0.3-alpha, 0.23.5
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Critical
> Fix For: 1.2.0, 2.0.5-beta
>
> Attachments: mr-5028-branch1.patch, mr-5028-branch1.patch, 
> mr-5028-branch1.patch, MR-5028_testapp.patch, mr-5028-trunk.patch, 
> mr-5028-trunk.patch, mr-5028-trunk.patch, repro-mr-5028.patch
>
>
> Verified the problem exists on branch-1 with the following configuration:
> Pseudo-dist mode: 2 maps/ 1 reduce, mapred.child.java.opts=-Xmx2048m, 
> io.sort.mb=1280, dfs.block.size=2147483648
> Run teragen to generate 4 GB data
> Maps fail when you run wordcount on this configuration with the following 
> error: 
> {noformat}
> java.io.IOException: Spill failed
>   at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031)
>   at 
> org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
>   at 
> org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>   at 
> org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45)
>   at 
> org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.io.EOFException
>   at java.io.DataInputStream.readInt(DataInputStream.java:375)
>   at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38)
>   at 
> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>   at 
> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
>   at 
> org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
>   at 
> org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
>   at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
>   at 
> org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1505)
>   at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1438)
>   at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:855)
>   at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1346)
> {noformat}



[jira] [Updated] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated MAPREDUCE-5199:
---

Status: Patch Available  (was: Open)

> AppTokens file can/should be removed
> 
>
> Key: MAPREDUCE-5199
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 3.0.0, 2.0.5-beta
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: MAPREDUCE-5199.patch
>
>
> All the required tokens are propagated to AMs and containers via 
> startContainer(), no need for explicitly creating the app-token file that we 
> have today.



[jira] [Updated] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated MAPREDUCE-5199:
---

Attachment: MAPREDUCE-5199.patch

* child job is guaranteed to not acquire the parent job's app token
* AM uses full complement of tokens from the container launch context passed 
via the UGI.  
* AM strips out the app token from the credentials of tasks

The patch appears to work in preliminary testing.  Later today I will report 
the results of testing with Oozie on a secure cluster.

> AppTokens file can/should be removed
> 
>
> Key: MAPREDUCE-5199
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 3.0.0, 2.0.5-beta
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: MAPREDUCE-5199.patch
>
>
> All the required tokens are propagated to AMs and containers via 
> startContainer(), no need for explicitly creating the app-token file that we 
> have today.



[jira] [Updated] (MAPREDUCE-4366) mapred metrics shows negative count of waiting maps and reduces

2013-05-16 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4366:
-

Status: Open  (was: Patch Available)

> mapred metrics shows negative count of waiting maps and reduces
> ---
>
> Key: MAPREDUCE-4366
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4366
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1.0.2
>Reporter: Thomas Graves
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-4366-branch-1-1.patch, 
> MAPREDUCE-4366-branch-1.patch
>
>
> Negative waiting_maps and waiting_reduces counts are observed in the mapred 
> metrics.  MAPREDUCE-1238 partially fixed this, but there are still issues, as 
> we are still seeing it, though not as badly.



[jira] [Commented] (MAPREDUCE-4366) mapred metrics shows negative count of waiting maps and reduces

2013-05-16 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659621#comment-13659621
 ] 

Arun C Murthy commented on MAPREDUCE-4366:
--

Sorry, I've had a hard time coming around to this.

{quote}
There didn't seem to be a clear definition of speculative(Map|Reduce)Tasks, so 
the one I came up with is that the number of speculative(Map|Reduce)Tasks is 
the number of attempts running that are not on the critical path of the job 
completing. This makes sense in the context of computing pending(Map|Reduce)s, 
which is the only place the variable is used.
{quote}

Thanks for the explanation.

The definition of speculative(Map|Reduce)Tasks, at least in my head, has been 
the number of task-attempts that have an alternate... no, it's not a great 
definition, or a documented one! *smile* 

However, this has been the basis for a number of assumptions related to 
computing pending tasks etc. in various schedulers. (See call hierarchy for 
JIP.pendingTasks).

Since your change re-defines this, I'm afraid it breaks schedulers e.g. 
CapacityScheduler. Hence, I'm against the change.

I fully agree it isn't ideal, but I'd rather not make invasive changes in MR1 - 
the JT/JIP/Scheduler nexus scares me a lot... in fact, I'm officially terrified 
of it! *smile*

Now, to get around the metrics problem, how about making a more local change in 
JIP.garbageCollect? 

An option is to just call decWaiting(Maps|Reduces) in JIP.garbageCollect with 
JIP.num(Maps|Reduces)... currently, if you follow the opposite side, i.e. 
addWaiting(Maps|Reduces), the calls are just static and are done at 
JIP.initTasks with num(Maps|Reduces). Would that solve the immediate problem 
at hand?

Thoughts?
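The symmetric accounting proposed here can be sketched as follows (class and method names are hypothetical, modeled on the comment rather than the actual JobInProgress code): whatever initTasks adds to the waiting gauge, garbageCollect subtracts, so the gauge returns to zero at job teardown regardless of what happened to individual attempts in between.

```java
// Sketch of symmetric waiting-task accounting: a static add at job
// start paired with a matching static subtract at job cleanup.
public class WaitingMetricsSketch {
    private int waitingMaps;

    void initTasks(int numMaps) {
        waitingMaps += numMaps;      // static add when the job initializes
    }

    void garbageCollect(int numMaps) {
        waitingMaps -= numMaps;      // matching static subtract at cleanup
    }

    int waitingMaps() { return waitingMaps; }

    public static void main(String[] args) {
        WaitingMetricsSketch metrics = new WaitingMetricsSketch();
        metrics.initTasks(4);
        metrics.garbageCollect(4);
        System.out.println(metrics.waitingMaps()); // prints 0
    }
}
```

Note Sandy's follow-up elsewhere in this thread: the real gauge is also adjusted as tasks launch, fail, and complete, so the pairing is not this simple in practice.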



Thanks again for checking in with me, and being patient in working through the 
mess we have!

> mapred metrics shows negative count of waiting maps and reduces
> ---
>
> Key: MAPREDUCE-4366
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4366
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1.0.2
>Reporter: Thomas Graves
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-4366-branch-1-1.patch, 
> MAPREDUCE-4366-branch-1.patch
>
>
> Negative waiting_maps and waiting_reduces count is observed in the mapred 
> metrics.  MAPREDUCE-1238 partially fixed this, but it appears there are still 
> issues; we are still seeing it, though not as badly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated MAPREDUCE-5199:
---

 Priority: Blocker  (was: Major)
 Target Version/s: 3.0.0, 2.0.5-beta
Affects Version/s: 2.0.5-beta
   3.0.0

Moving to blocker because oozie cannot launch child jobs on a secure cluster.

> AppTokens file can/should be removed
> 
>
> Key: MAPREDUCE-5199
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 3.0.0, 2.0.5-beta
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Daryn Sharp
>Priority: Blocker
>
> All the required tokens are propagated to AMs and containers via 
> startContainer(), no need for explicitly creating the app-token file that we 
> have today..



[jira] [Assigned] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-16 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp reassigned MAPREDUCE-5199:
--

Assignee: Daryn Sharp  (was: Vinod Kumar Vavilapalli)

> AppTokens file can/should be removed
> 
>
> Key: MAPREDUCE-5199
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: security
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Daryn Sharp
>
> All the required tokens are propagated to AMs and containers via 
> startContainer(), no need for explicitly creating the app-token file that we 
> have today..



[jira] [Commented] (MAPREDUCE-5124) AM lacks flow control for task events

2013-05-16 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659596#comment-13659596
 ] 

Robert Joseph Evans commented on MAPREDUCE-5124:


I believe in most cases it is enough to restrict it at the server side and 
retry at the client side, but there are some RPC calls that are different and 
perhaps should be handled slightly differently.  YARN-309 went in to try to 
throttle the heartbeats instead of rejecting them and asking them to retry.  I 
think this is preferable for heartbeats over an outright rejection, simply 
because we know that the heartbeats are going to come regularly, and asking the 
next one to wait does not reduce the total amount of work that we are going to 
need to do.

So I would throw a TooBusyRetryLater type of exception for one-time RPC calls 
when the AsyncDispatcher's queue is over a high-water mark, but for heartbeats 
I would want them to scale the frequency based on how busy the 
AsyncDispatcher is.  
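A rough sketch of the two behaviors being discussed (the exception name, dispatcher shape, and scaling formula here are assumptions for illustration, not the actual MR-AM code): one-shot RPCs get rejected outright past a high-water mark, while heartbeats are always accepted but have their next interval stretched in proportion to queue occupancy.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Illustrative sketch only; not the real AsyncDispatcher.
class DispatcherModel {
    static class TooBusyException extends RuntimeException {}

    private final ConcurrentLinkedQueue<Object> eventQueue = new ConcurrentLinkedQueue<>();
    private final int highWaterMark;

    DispatcherModel(int highWaterMark) {
        this.highWaterMark = highWaterMark;
    }

    // One-shot RPCs: reject when backed up; the client retries later.
    void handleOneShotRpc(Object event) {
        if (eventQueue.size() >= highWaterMark) {
            throw new TooBusyException();
        }
        eventQueue.add(event);
    }

    // Heartbeats: never reject; instead tell the task to slow down by
    // scaling its next-heartbeat interval with queue occupancy.
    long handleHeartbeat(Object event, long baseIntervalMs) {
        eventQueue.add(event);
        double load = (double) eventQueue.size() / highWaterMark;
        return (long) (baseIntervalMs * Math.max(1.0, 1.0 + load));
    }
}
```

The key asymmetry: rejecting a heartbeat saves nothing, since the same report arrives on the next beat anyway, whereas rejecting a one-shot RPC genuinely sheds load until the caller retries.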

> AM lacks flow control for task events
> -
>
> Key: MAPREDUCE-5124
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5124
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.0.3-alpha, 0.23.5
>Reporter: Jason Lowe
> Attachments: MAPREDUCE-5124-prototype.txt
>
>
> The AM does not have any flow control to limit the incoming rate of events 
> from tasks.  If the AM is unable to keep pace with the rate of incoming 
> events for a sufficient period of time then it will eventually exhaust the 
> heap and crash.  MAPREDUCE-5043 addressed a major bottleneck for event 
> processing, but the AM could still get behind if it's starved for CPU and/or 
> handling a very large job with tens of thousands of active tasks.



[jira] [Commented] (MAPREDUCE-5240) inside of FileOutputCommitter the initialized Credentials cache appears to be empty

2013-05-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659321#comment-13659321
 ] 

Hadoop QA commented on MAPREDUCE-5240:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12583446/MAPREDUCE-5240.2.0.4.rvs.patch.txt
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3645//console

This message is automatically generated.

> inside of FileOutputCommitter the initialized Credentials cache appears to be 
> empty
> ---
>
> Key: MAPREDUCE-5240
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5240
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.4-alpha
>Reporter: Roman Shaposhnik
>Assignee: Vinod Kumar Vavilapalli
>Priority: Blocker
>  Labels: 2.0.4.1
> Fix For: 2.0.5-beta, 2.0.4.1-alpha
>
> Attachments: LostCreds.java, MAPREDUCE-5240-20130512.txt, 
> MAPREDUCE-5240-20130513.txt, MAPREDUCE-5240.2.0.4.rvs.patch.txt
>
>
> I am attaching a modified wordcount job that clearly demonstrates the problem 
> we've encountered in running Sqoop2 on YARN (BIGTOP-949).
> Here's what running it produces:
> {noformat}
> $ hadoop fs -mkdir in
> $ hadoop fs -put /etc/passwd in
> $ hadoop jar ./bug.jar org.myorg.LostCreds
> 13/05/12 03:13:46 WARN mapred.JobConf: The variable mapred.child.ulimit is no 
> longer used.
> numberOfSecretKeys: 1
> numberOfTokens: 0
> ..
> ..
> ..
> 13/05/12 03:05:35 INFO mapreduce.Job: Job job_1368318686284_0013 failed with 
> state FAILED due to: Job commit failed: java.io.IOException:
> numberOfSecretKeys: 0
> numberOfTokens: 0
>   at 
> org.myorg.LostCreds$DestroyerFileOutputCommitter.commitJob(LostCreds.java:43)
>   at 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:249)
>   at 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:212)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:619)
> {noformat}
> As you can see, even though we've clearly initialized the creds via:
> {noformat}
> job.getCredentials().addSecretKey(new Text("mykey"), "mysecret".getBytes());
> {noformat}
> It doesn't seem to appear later in the job.
> This is a pretty critical issue for Sqoop 2 since it appears to be DOA for 
> YARN in Hadoop 2.0.4-alpha
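The invariant the attached job exercises can be stated with a minimal stand-in for the Credentials secret-key cache (this is a toy, not the Hadoop `org.apache.hadoop.security.Credentials` class; the point is only that a key added under an alias before submission should be readable under the same alias in commitJob):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for the Credentials secret-key cache; illustrative only.
class CredentialsModel {
    private final Map<String, byte[]> secretKeys = new HashMap<>();

    void addSecretKey(String alias, byte[] key) {
        secretKeys.put(alias, key);
    }

    byte[] getSecretKey(String alias) {
        return secretKeys.get(alias);
    }

    int numberOfSecretKeys() {
        return secretKeys.size();
    }
}
```

In the bug report, the job-side cache shows numberOfSecretKeys: 1 but the committer-side cache shows 0, i.e. the round trip below fails inside FileOutputCommitter.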



[jira] [Updated] (MAPREDUCE-5240) inside of FileOutputCommitter the initialized Credentials cache appears to be empty

2013-05-16 Thread Roman Shaposhnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Shaposhnik updated MAPREDUCE-5240:


Status: Patch Available  (was: Reopened)

> inside of FileOutputCommitter the initialized Credentials cache appears to be 
> empty
> ---
>
> Key: MAPREDUCE-5240
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5240
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.4-alpha
>Reporter: Roman Shaposhnik
>Assignee: Vinod Kumar Vavilapalli
>Priority: Blocker
>  Labels: 2.0.4.1
> Fix For: 2.0.5-beta, 2.0.4.1-alpha
>
> Attachments: LostCreds.java, MAPREDUCE-5240-20130512.txt, 
> MAPREDUCE-5240-20130513.txt, MAPREDUCE-5240.2.0.4.rvs.patch.txt
>
>



[jira] [Updated] (MAPREDUCE-5240) inside of FileOutputCommitter the initialized Credentials cache appears to be empty

2013-05-16 Thread Roman Shaposhnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Shaposhnik updated MAPREDUCE-5240:


Attachment: MAPREDUCE-5240.2.0.4.rvs.patch.txt

Attaching a modified patch for branch-2.0.4

> inside of FileOutputCommitter the initialized Credentials cache appears to be 
> empty
> ---
>
> Key: MAPREDUCE-5240
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5240
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.4-alpha
>Reporter: Roman Shaposhnik
>Assignee: Vinod Kumar Vavilapalli
>Priority: Blocker
>  Labels: 2.0.4.1
> Fix For: 2.0.5-beta, 2.0.4.1-alpha
>
> Attachments: LostCreds.java, MAPREDUCE-5240-20130512.txt, 
> MAPREDUCE-5240-20130513.txt, MAPREDUCE-5240.2.0.4.rvs.patch.txt
>
>
