[jira] [Updated] (MAPREDUCE-1380) Adaptive Scheduler

2014-02-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordà Polo updated MAPREDUCE-1380:
--

Attachment: MAPREDUCE-1380-branch-1.2.patch

Attaching a more up-to-date version of the scheduler that should apply cleanly 
against 1.2.x.

 Adaptive Scheduler
 --

 Key: MAPREDUCE-1380
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1380
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Jordà Polo
Priority: Minor
 Attachments: MAPREDUCE-1380-branch-1.2.patch, 
 MAPREDUCE-1380_0.1.patch, MAPREDUCE-1380_1.1.patch, MAPREDUCE-1380_1.1.pdf


 The Adaptive Scheduler is a pluggable Hadoop scheduler that automatically 
 adjusts the amount of used resources depending on the performance of jobs and 
 on user-defined high-level business goals.
 Existing Hadoop schedulers are focused on managing large, static clusters in 
 which nodes are added or removed manually. On the other hand, the goal of 
 this scheduler is to improve the integration of Hadoop and the applications 
 that run on top of it with environments that allow a more dynamic 
 provisioning of resources.
 The current implementation is quite straightforward. Users specify a deadline 
 at job submission time, and the scheduler adjusts the resources to meet that 
 deadline (at the moment, the scheduler can be configured to either minimize 
 or maximize the amount of resources). If multiple jobs are run 
 simultaneously, the scheduler prioritizes them by deadline. Note that the 
 current approach to estimate the completion time of jobs is quite simplistic: 
 it is based on the time it takes to finish each task, so it works well with 
 regular jobs, but there is still room for improvement for unpredictable jobs.
 The idea is to further integrate it with cloud-like and virtual environments 
 (such as Amazon EC2, Emotive, etc.) so that if, for instance, a job isn't 
 able to meet its deadline, the scheduler automatically requests more 
 resources.
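
As a rough illustration of the estimation approach described above (not the patch's actual code), the sketch below estimates a job's remaining time from the average duration of its finished tasks, finds the smallest number of slots that still meets the deadline, and orders jobs by deadline. All names here (DeadlineSketch, JobInfo, slots) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class DeadlineSketch {
    /** Hypothetical per-job bookkeeping; not taken from the patch. */
    public static final class JobInfo {
        public final String name;
        public final long deadlineMillis;   // absolute deadline
        public final int pendingTasks;      // tasks not yet finished
        public final double avgTaskMillis;  // mean duration of finished tasks

        public JobInfo(String name, long deadlineMillis, int pendingTasks, double avgTaskMillis) {
            this.name = name;
            this.deadlineMillis = deadlineMillis;
            this.pendingTasks = pendingTasks;
            this.avgTaskMillis = avgTaskMillis;
        }
    }

    /** Estimated remaining time if the job runs on the given number of parallel slots. */
    public static double estimateRemainingMillis(JobInfo job, int slots) {
        if (slots <= 0) {
            throw new IllegalArgumentException("slots must be positive");
        }
        int waves = (job.pendingTasks + slots - 1) / slots;  // ceiling division
        return waves * job.avgTaskMillis;
    }

    /** Smallest slot count that meets the deadline, or -1 if maxSlots is not enough. */
    public static int minSlotsToMeetDeadline(JobInfo job, long nowMillis, int maxSlots) {
        for (int slots = 1; slots <= maxSlots; slots++) {
            if (nowMillis + estimateRemainingMillis(job, slots) <= job.deadlineMillis) {
                return slots;
            }
        }
        return -1;
    }

    /** Returns a copy of the jobs ordered by earliest deadline first. */
    public static List<JobInfo> byDeadline(List<JobInfo> jobs) {
        List<JobInfo> sorted = new ArrayList<>(jobs);
        sorted.sort(Comparator.comparingLong((JobInfo j) -> j.deadlineMillis));
        return sorted;
    }
}
```

Because the estimate assumes every remaining task takes the average time, it is only as good as the regularity of the job, which matches the caveat about unpredictable jobs.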



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (MAPREDUCE-4950) MR App Master fails to write the history due to AvroTypeException

2014-02-24 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910215#comment-13910215
 ] 

Rohith commented on MAPREDUCE-4950:
---

In my Hadoop 2.3 cluster, I also encountered a similar exception: the 
JobHistoryEvents write failed. I am not sure what the cause is. :-(

{noformat}
2014-02-21 22:10:33,841 INFO [Thread-355] org.apache.hadoop.service.AbstractService: Service JobHistoryEventHandler failed in state STOPPED; cause: org.apache.avro.AvroTypeException: Attempt to process a enum when a string was expected.
org.apache.avro.AvroTypeException: Attempt to process a enum when a string was expected.
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:93)
    at org.apache.avro.io.JsonEncoder.writeEnum(JsonEncoder.java:217)
    at org.apache.avro.specific.SpecificDatumWriter.writeEnum(SpecificDatumWriter.java:54)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:67)
    at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
    at org.apache.hadoop.mapreduce.jobhistory.EventWriter.write(EventWriter.java:66)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$MetaInfo.writeEvent(JobHistoryEventHandler.java:870)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:517)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:332)
    at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
    at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
    at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
    at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:159)
    at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1386)
    at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:550)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:602)
{noformat}

 MR App Master fails to write the history due to AvroTypeException
 -

 Key: MAPREDUCE-4950
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4950
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mr-am
Reporter: Devaraj K
Priority: Critical

 {code:xml}
 2013-01-19 19:31:27,269 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: In stop, writing event MAP_ATTEMPT_STARTED
 2013-01-19 19:31:27,269 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.service.CompositeService: Error stopping JobHistoryEventHandler
 org.apache.avro.AvroTypeException: Attempt to process a enum when a array-start was expected.
   at org.apache.avro.io.parsing.Parser.advance(Parser.java:93)
   at org.apache.avro.io.JsonEncoder.writeEnum(JsonEncoder.java:210)
   at org.apache.avro.specific.SpecificDatumWriter.writeEnum(SpecificDatumWriter.java:54)
   at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
   at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104)
   at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:65)
   at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:57)
   at org.apache.hadoop.mapreduce.jobhistory.EventWriter.write(EventWriter.java:66)
   at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$MetaInfo.writeEvent(JobHistoryEventHandler.java:825)
   at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:517)
   at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.stop(JobHistoryEventHandler.java:346)
   at org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:99)
   at org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:89)
   at 
 

[jira] [Created] (MAPREDUCE-5764) Potential NullPointerException in YARNRunner.killJob(JobID arg0)

2014-02-24 Thread Rohith (JIRA)
Rohith created MAPREDUCE-5764:
-

 Summary: Potential NullPointerException in 
YARNRunner.killJob(JobID arg0)
 Key: MAPREDUCE-5764
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5764
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Rohith
Assignee: Rohith


I found that YARNRunner.killJob(JobID arg0) can throw a NullPointerException 
if the job status is null. 
bq. clientCache.getClient(arg0).getJobStatus(arg0);  can be null.
This can happen when the history write has failed because of HDFS errors, or 
when the staging directory is different from the history server's.
 
We need a null check; otherwise killJob() is prone to throwing an NPE, which 
causes the job client to exit.

{noformat}
@Override
public void killJob(JobID arg0) throws IOException, InterruptedException {
  /* check if the status is not running, if not send kill to RM */
  JobStatus status = clientCache.getClient(arg0).getJobStatus(arg0);
  if (status.getState() != JobStatus.State.RUNNING) {
    try {
      resMgrDelegate.killApplication(TypeConverter.toYarn(arg0).getAppId());
    } catch (YarnException e) {
      throw new IOException(e);
    }
    return;
  }
  ...
  ..
  ...
}
{noformat}
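
A minimal sketch of the proposed null check. It uses stand-in types rather than Hadoop's classes so that it compiles on its own; KillJobSketch, StatusSource, and Killer are hypothetical names. Treating a null status like a non-running job (sending the kill straight to the RM) is an assumption, and the actual fix may handle that case differently.

```java
public class KillJobSketch {
    public enum State { RUNNING, SUCCEEDED, FAILED }

    /** Stand-in for Hadoop's JobStatus. */
    public static final class JobStatus {
        private final State state;

        public JobStatus(State state) {
            this.state = state;
        }

        public State getState() {
            return state;
        }
    }

    /** Stand-in for clientCache.getClient(id).getJobStatus(id); may return null. */
    public interface StatusSource {
        JobStatus getJobStatus(String jobId);
    }

    /** Stand-in for resMgrDelegate.killApplication(...). */
    public interface Killer {
        void killApplication(String jobId);
    }

    /** Returns true if the kill was sent directly to the resource manager. */
    public static boolean killJob(String jobId, StatusSource source, Killer rm) {
        JobStatus status = source.getJobStatus(jobId);
        // Guard against a missing status before dereferencing it.
        if (status == null || status.getState() != State.RUNNING) {
            rm.killApplication(jobId);
            return true;
        }
        // The RUNNING path is elided in the original snippet, so it is not modeled here.
        return false;
    }
}
```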





[jira] [Resolved] (MAPREDUCE-5764) Potential NullPointerException in YARNRunner.killJob(JobID arg0)

2014-02-24 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-5764.
---

Resolution: Duplicate

This is a duplicate of MAPREDUCE-5542.

 Potential NullPointerException in YARNRunner.killJob(JobID arg0)
 

 Key: MAPREDUCE-5764
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5764
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Rohith
Assignee: Rohith

 I found that YARNRunner.killJob(JobID arg0) can throw a NullPointerException 
 if the job status is null. 
 bq. clientCache.getClient(arg0).getJobStatus(arg0);  can be null.
 This can happen when the history write has failed because of HDFS errors, or 
 when the staging directory is different from the history server's.
  
 We need a null check; otherwise killJob() is prone to throwing an NPE, which 
 causes the job client to exit.
 {noformat}
 @Override
 public void killJob(JobID arg0) throws IOException, InterruptedException {
   /* check if the status is not running, if not send kill to RM */
   JobStatus status = clientCache.getClient(arg0).getJobStatus(arg0);
   if (status.getState() != JobStatus.State.RUNNING) {
     try {
       resMgrDelegate.killApplication(TypeConverter.toYarn(arg0).getAppId());
     } catch (YarnException e) {
       throw new IOException(e);
     }
     return;
   }
   ...
   ..
   ...
 }
 {noformat}





[jira] [Updated] (MAPREDUCE-5665) Add audience annotations to MiniMRYarnCluster and MiniMRCluster

2014-02-24 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated MAPREDUCE-5665:
-

Attachment: MAPREDUCE-5665.001.patch

Patch

 Add audience annotations to MiniMRYarnCluster and MiniMRCluster
 ---

 Key: MAPREDUCE-5665
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5665
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 2.2.0
Reporter: Sandy Ryza
Assignee: Anubhav Dhoot
  Labels: newbie
 Attachments: MAPREDUCE-5665.001.patch


 We should make it clear whether these are public interfaces.





[jira] [Updated] (MAPREDUCE-5665) Add audience annotations to MiniMRYarnCluster and MiniMRCluster

2014-02-24 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated MAPREDUCE-5665:
-

Status: Patch Available  (was: Open)

 Add audience annotations to MiniMRYarnCluster and MiniMRCluster
 ---

 Key: MAPREDUCE-5665
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5665
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 2.2.0
Reporter: Sandy Ryza
Assignee: Anubhav Dhoot
  Labels: newbie

 We should make it clear whether these are public interfaces.





[jira] [Created] (MAPREDUCE-5765) Update hadoop-pipes examples README

2014-02-24 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created MAPREDUCE-5765:
--

 Summary: Update hadoop-pipes examples README
 Key: MAPREDUCE-5765
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5765
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: pipes
Affects Versions: 3.0.0, 2.5.0
Reporter: Jonathan Eagles


wordcount-simple is in the native/examples directory





[jira] [Commented] (MAPREDUCE-5765) Update hadoop-pipes examples README

2014-02-24 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910653#comment-13910653
 ] 

Jonathan Eagles commented on MAPREDUCE-5765:


hadoop-tools/hadoop-pipes/src/main/native/examples/README.txt is the file with 
the error

 Update hadoop-pipes examples README
 ---

 Key: MAPREDUCE-5765
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5765
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: pipes
Affects Versions: 3.0.0, 2.5.0
Reporter: Jonathan Eagles
Assignee: Mit Desai
  Labels: documentation

 wordcount-simple is in the native/examples directory





[jira] [Updated] (MAPREDUCE-5765) Update hadoop-pipes examples README

2014-02-24 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated MAPREDUCE-5765:
---

Priority: Minor  (was: Major)

 Update hadoop-pipes examples README
 ---

 Key: MAPREDUCE-5765
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5765
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: pipes
Affects Versions: 3.0.0, 2.5.0
Reporter: Jonathan Eagles
Assignee: Mit Desai
Priority: Minor
  Labels: documentation

 wordcount-simple is in the native/examples directory





[jira] [Assigned] (MAPREDUCE-5765) Update hadoop-pipes examples README

2014-02-24 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai reassigned MAPREDUCE-5765:


Assignee: Mit Desai

 Update hadoop-pipes examples README
 ---

 Key: MAPREDUCE-5765
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5765
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: pipes
Affects Versions: 3.0.0, 2.5.0
Reporter: Jonathan Eagles
Assignee: Mit Desai
  Labels: documentation

 wordcount-simple is in the native/examples directory





[jira] [Created] (MAPREDUCE-5766) Ping messages from attempts should be moved to DEBUG

2014-02-24 Thread Ramya Sunil (JIRA)
Ramya Sunil created MAPREDUCE-5766:
--

 Summary: Ping messages from attempts should be moved to DEBUG
 Key: MAPREDUCE-5766
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5766
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Reporter: Ramya Sunil
Priority: Minor
 Fix For: 0.24.0


Messages such as org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from 
attempt_1391416522080_0015_m_00_0 in AM logs should be moved to DEBUG.
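
A minimal sketch of the requested change, assuming java.util.logging as a stand-in for Hadoop's logging API (its FINE level is the closest analogue of DEBUG). PingLogSketch and logPing are hypothetical names, not the code in TaskAttemptListenerImpl.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class PingLogSketch {
    private static final Logger LOG = Logger.getLogger(PingLogSketch.class.getName());

    /**
     * Logs the ping at FINE instead of INFO, so it is suppressed under the
     * default INFO configuration. Returns whether the message was loggable.
     */
    public static boolean logPing(String attemptId) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("Ping from " + attemptId);
            return true;
        }
        return false;
    }
}
```

The level guard also avoids building the message string for every ping when debug logging is off.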






[jira] [Assigned] (MAPREDUCE-5766) Ping messages from attempts should be moved to DEBUG

2014-02-24 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He reassigned MAPREDUCE-5766:
--

Assignee: Jian He

 Ping messages from attempts should be moved to DEBUG
 

 Key: MAPREDUCE-5766
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5766
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Reporter: Ramya Sunil
Assignee: Jian He
Priority: Minor
 Fix For: 0.24.0


 Messages such as org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from 
 attempt_1391416522080_0015_m_00_0 in AM logs should be moved to DEBUG.





[jira] [Updated] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them

2014-02-24 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-5044:
-

Attachment: MAPREDUCE-5044.v04.patch

v04 to apply on top of YARN-1515.v05. It now makes sure that a thread dump is 
created in uber mode. 

Added unit tests for a normal MR job and an uber MR job.

While working on this I realized that we actually need to discuss how 
mapreduce.task.timeout is treated in uber mode. Right now it is basically 
ignored because the AM does not kill itself: LocalContainerLauncher processes 
CONTAINER_REMOTE_CLEANUP inline with the stuck task in SubtaskRunner.  The 
liveness monitor for the AM in the RM does not catch the problem either, 
because RMCommunicator heartbeats in a separate allocator thread. 

I am considering two options:
- move heartbeat() into SubtaskRunner for uber mode so that the liveness 
monitor catches the stuck ubertask.
- do System.exit(errorcode) when TA_TIMEOUT occurs.

 

 Have AM trigger jstack on task attempts that timeout before killing them
 

 Key: MAPREDUCE-5044
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5044
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mr-am
Affects Versions: 2.1.0-beta
Reporter: Jason Lowe
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5044.v01.patch, MAPREDUCE-5044.v02.patch, 
 MAPREDUCE-5044.v03.patch, MAPREDUCE-5044.v04.patch, Screen Shot 2013-11-12 at 
 1.05.32 PM.png, Screen Shot 2013-11-12 at 1.06.04 PM.png


 When an AM expires a task attempt it would be nice if it triggered a jstack 
 output via SIGQUIT before killing the task attempt.  This would be invaluable 
 for helping users debug their hung tasks, especially if they do not have 
 shell access to the nodes.





[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them

2014-02-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910771#comment-13910771
 ] 

Hadoop QA commented on MAPREDUCE-5044:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12630788/MAPREDUCE-5044.v04.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4367//console

This message is automatically generated.

 Have AM trigger jstack on task attempts that timeout before killing them
 

 Key: MAPREDUCE-5044
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5044
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mr-am
Affects Versions: 2.1.0-beta
Reporter: Jason Lowe
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5044.v01.patch, MAPREDUCE-5044.v02.patch, 
 MAPREDUCE-5044.v03.patch, MAPREDUCE-5044.v04.patch, Screen Shot 2013-11-12 at 
 1.05.32 PM.png, Screen Shot 2013-11-12 at 1.06.04 PM.png


 When an AM expires a task attempt it would be nice if it triggered a jstack 
 output via SIGQUIT before killing the task attempt.  This would be invaluable 
 for helping users debug their hung tasks, especially if they do not have 
 shell access to the nodes.





[jira] [Commented] (MAPREDUCE-5665) Add audience annotations to MiniMRYarnCluster and MiniMRCluster

2014-02-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910777#comment-13910777
 ] 

Hadoop QA commented on MAPREDUCE-5665:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12630739/MAPREDUCE-5665.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4366//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4366//console

This message is automatically generated.

 Add audience annotations to MiniMRYarnCluster and MiniMRCluster
 ---

 Key: MAPREDUCE-5665
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5665
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 2.2.0
Reporter: Sandy Ryza
Assignee: Anubhav Dhoot
  Labels: newbie
 Attachments: MAPREDUCE-5665.001.patch


 We should make it clear whether these are public interfaces.





[jira] [Updated] (MAPREDUCE-4868) Allow multiple iteration for map

2014-02-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4868:
-

Fix Version/s: (was: 2.3.0)
   2.4.0

 Allow multiple iteration for map
 

 Key: MAPREDUCE-4868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4868
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Jerry Chen
 Fix For: 3.0.0, 2.4.0

   Original Estimate: 168h
  Remaining Estimate: 168h

 Currently, the Mapper class allows advanced users to override the public void 
 run(Context context) method for more control over the execution of the 
 mapper, but the Context interface limits the operations available on the 
 data, which are the foundation of that control.
 One use case: while working on a Hive optimization problem, I want to make 
 two passes over the input data instead of using another job or task (which 
 may slow down the whole process). Each pass does the same thing but with 
 different parameters.
 This is a new paradigm of Map Reduce usage and can be achieved easily by 
 extending the Context interface slightly with more control over the data, 
 such as resetting the input.
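
To make the two-pass idea concrete, here is a minimal sketch under stated assumptions. ResettableInput is a hypothetical stand-in for a Context that can rewind its input (the issue says the current Context cannot), and the multiplication stands in for whatever per-pass parameter the mapper would use.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPassSketch {
    /** Hypothetical rewindable input; not part of Hadoop's Context interface. */
    public static final class ResettableInput {
        private final List<Integer> records;
        private int pos = 0;

        public ResettableInput(List<Integer> records) {
            this.records = new ArrayList<>(records);
        }

        public boolean hasNext() {
            return pos < records.size();
        }

        public int next() {
            return records.get(pos++);
        }

        /** Rewinds to the first record so the input can be read again. */
        public void reset() {
            pos = 0;
        }
    }

    /** Runs the same map logic once per parameter, rewinding the input before each pass. */
    public static List<Integer> run(ResettableInput in, int[] params) {
        List<Integer> out = new ArrayList<>();
        for (int param : params) {
            in.reset();
            while (in.hasNext()) {
                out.add(in.next() * param);
            }
        }
        return out;
    }
}
```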





[jira] [Updated] (MAPREDUCE-5728) Check NPE for serializer/deserializer in MapTask

2014-02-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-5728:
-

Fix Version/s: (was: 2.3.0)
   2.4.0

 Check NPE for serializer/deserializer in MapTask
 

 Key: MAPREDUCE-5728
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5728
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Affects Versions: 2.2.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 2.2.1, 2.4.0

 Attachments: MAPREDUCE-5728-trunk.patch


 Currently we will get NPE if the serializer/deserializer is not configured 
 correctly.
 {code}
 14/01/14 11:52:35 INFO mapred.JobClient: Task Id : attempt_201401072154_0027_m_02_2, Status : FAILED
 java.lang.NullPointerException
     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:944)
     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.init(MapTask.java:672)
     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:740)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:368)
     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
     at java.security.AccessController.doPrivileged(AccessController.java:362)
     at javax.security.auth.Subject.doAs(Subject.java:573)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1502)
     at org.apache.hadoop.mapred.Child.main(Child.java:249)
 {code}
 serializationFactory.getSerializer and serializationFactory.getDeserializer 
 return null in this case.
 Let's add null checks for the serializer/deserializer in MapTask so that we 
 don't get a meaningless NPE.
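
One way such a check could look, sketched with a stand-in registry instead of Hadoop's SerializationFactory so that it compiles on its own. SerializerCheckSketch and getSerializerOrFail are hypothetical names, and the unchecked exception type is an assumption; the attached patch may report the error differently.

```java
import java.util.HashMap;
import java.util.Map;

public class SerializerCheckSketch {
    /** Stand-in registry: maps a class to a serializer name; missing entries yield null. */
    private final Map<Class<?>, String> serializers = new HashMap<>();

    public void register(Class<?> type, String serializerName) {
        serializers.put(type, serializerName);
    }

    /** Looks up a serializer and fails with a descriptive error instead of a later NPE. */
    public String getSerializerOrFail(Class<?> type) {
        String serializer = serializers.get(type);
        if (serializer == null) {
            throw new IllegalStateException("Could not find a serializer for class '"
                + type.getName() + "'. Check that the serialization configuration is correct.");
        }
        return serializer;
    }
}
```

Failing at lookup time puts the offending class name in the error, which the bare NullPointerException in the trace above does not.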





[jira] [Updated] (MAPREDUCE-4462) Enhance readability of TestFairScheduler.java

2014-02-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4462:
-

Fix Version/s: (was: 2.3.0)
   2.4.0

 Enhance readability of TestFairScheduler.java
 -

 Key: MAPREDUCE-4462
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4462
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: scheduler, test
Reporter: Ryan Hennig
Priority: Minor
  Labels: comments, test
 Fix For: 2.4.0

 Attachments: MAPREDUCE-4462.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 While reading over the unit tests for the Fair Scheduler introduced by 
 MAPREDUCE-3451, I added comments to make the logic of the test easier to grok 
 quickly.





[jira] [Updated] (MAPREDUCE-5559) Reconsidering the policy of ignoring the blacklist after reaching the threshold

2014-02-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-5559:
-

Fix Version/s: (was: 2.3.0)
   2.4.0

 Reconsidering the policy of ignoring the blacklist after reaching the 
 threshold
 ---

 Key: MAPREDUCE-5559
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5559
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.1.1-beta
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Fix For: 2.4.0


 Currently, when the MR AM finds that the number of blacklisted nodes has 
 reached a threshold, the blacklist is ignored entirely, and containers newly 
 assigned on blacklisted nodes are allocated. This may not be the best 
 practice. We need to reconsider it.
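
The policy described above can be sketched roughly as follows. Modeling the threshold as a percentage of cluster nodes is an assumption made for illustration, and BlacklistPolicySketch and its methods are hypothetical names rather than the AM's actual code.

```java
public class BlacklistPolicySketch {
    /** True if so many nodes are blacklisted that the blacklist is ignored altogether. */
    public static boolean ignoreBlacklist(int blacklistedNodes, int clusterNodes,
                                          int thresholdPercent) {
        if (clusterNodes <= 0) {
            return false;
        }
        return blacklistedNodes * 100 / clusterNodes >= thresholdPercent;
    }

    /** True if a container on this node may be used under the current policy. */
    public static boolean mayAllocateOn(boolean nodeBlacklisted, int blacklistedNodes,
                                        int clusterNodes, int thresholdPercent) {
        return !nodeBlacklisted
            || ignoreBlacklist(blacklistedNodes, clusterNodes, thresholdPercent);
    }
}
```

The sketch shows the cliff that the issue questions: one more blacklisted node flips every blacklisted node back to being usable.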





[jira] [Updated] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value

2014-02-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-5028:
-

Fix Version/s: (was: 2.3.0)
   2.4.0

 Maps fail when io.sort.mb is set to high value
 --

 Key: MAPREDUCE-5028
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5028
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1, 2.0.3-alpha, 0.23.5
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Critical
 Fix For: 1.2.0, 2.4.0

 Attachments: MR-5028_testapp.patch, mr-5028-branch1.patch, 
 mr-5028-branch1.patch, mr-5028-branch1.patch, mr-5028-trunk.patch, 
 mr-5028-trunk.patch, mr-5028-trunk.patch, repro-mr-5028.patch


 Verified the problem exists on branch-1 with the following configuration:
 Pseudo-dist mode: 2 maps/ 1 reduce, mapred.child.java.opts=-Xmx2048m, 
 io.sort.mb=1280, dfs.block.size=2147483648
 Run teragen to generate 4 GB of data.
 Maps fail when you run wordcount on this configuration with the following 
 error: 
 {noformat}
 java.io.IOException: Spill failed
   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031)
   at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
   at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
   at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45)
   at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34)
   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: java.io.EOFException
   at java.io.DataInputStream.readInt(DataInputStream.java:375)
   at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38)
   at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
   at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
   at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
   at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
   at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
   at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1505)
   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1438)
   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:855)
   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1346)
 {noformat}





[jira] [Updated] (MAPREDUCE-5036) Default shuffle handler port should not be 8080

2014-02-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-5036:
-

Fix Version/s: (was: 2.3.0)
   2.4.0

 Default shuffle handler port should not be 8080
 ---

 Key: MAPREDUCE-5036
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5036
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 2.4.0

 Attachments: MAPREDUCE-5036-13562.patch, MAPREDUCE-5036-2.patch, 
 MAPREDUCE-5036.patch


 The shuffle handler port (mapreduce.shuffle.port) defaults to 8080.  This is 
 a pretty common port for web services, and is likely to cause unnecessary 
 port conflicts.





[jira] [Updated] (MAPREDUCE-4253) Tests for mapreduce-client-core are lying under mapreduce-client-jobclient

2014-02-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4253:
-

Fix Version/s: (was: 2.3.0)
   2.4.0

 Tests for mapreduce-client-core are lying under mapreduce-client-jobclient
 --

 Key: MAPREDUCE-4253
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4253
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: client
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Tsuyoshi OZAWA
 Fix For: 2.4.0

 Attachments: MR-4253.1.patch, MR-4253.2.patch, 
 crossing_project_checker.rb, result.txt


 Many of the tests for client libs from mapreduce-client-core are lying under 
 mapreduce-client-jobclient.
 We should investigate if this is the right thing to do and if not, move the 
 tests back into client-core.





[jira] [Updated] (MAPREDUCE-4468) Encapsulate FairScheduler preemption logic into helper class

2014-02-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4468:
-

Fix Version/s: (was: 2.3.0)
   2.4.0

 Encapsulate FairScheduler preemption logic into helper class
 

 Key: MAPREDUCE-4468
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4468
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: scheduler
Reporter: Ryan Hennig
Priority: Minor
  Labels: refactoring, scheduler
 Fix For: 2.4.0

 Attachments: MAPREDUCE-4468.patch

   Original Estimate: 4h
  Remaining Estimate: 4h

 I've extracted the preemption logic from the Fair Scheduler into a helper 
 class so that FairScheduler is closer to following the Single Responsibility 
 Principle.  This may eventually evolve into a generalized preemption module 
 which could be leveraged by other schedulers.





[jira] [Updated] (MAPREDUCE-5761) Add a log message like encrypted shuffle is ON in nodemanager logs

2014-02-24 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5761:
---

Status: Open  (was: Patch Available)

The patch looks fine and trivial except for the message itself. Call it 
"encrypted shuffle" instead of "encryption shuffle"?

 Add a log message like encrypted shuffle is ON in nodemanager logs
 

 Key: MAPREDUCE-5761
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5761
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Yesha Vora
Assignee: Jian He
 Attachments: YARN-1739.1.patch


 Currently no log message gets printed for encrypted shuffle which can 
 determine if encrypted shuffle is On or Off.
 Need to add message at Info level such as encrypted shuffle is ON





[jira] [Updated] (MAPREDUCE-5761) Add a log message like encrypted shuffle is ON in nodemanager logs

2014-02-24 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated MAPREDUCE-5761:
---

Attachment: MAPREDUCE-5761.1.patch

Updated the log as suggested

 Add a log message like encrypted shuffle is ON in nodemanager logs
 

 Key: MAPREDUCE-5761
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5761
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Yesha Vora
Assignee: Jian He
 Attachments: MAPREDUCE-5761.1.patch, YARN-1739.1.patch


 Currently no log message gets printed for encrypted shuffle which can 
 determine if encrypted shuffle is On or Off.
 Need to add message at Info level such as encrypted shuffle is ON





[jira] [Updated] (MAPREDUCE-5542) Killing a job just as it finishes can generate an NPE in client

2014-02-24 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5542:
---

Target Version/s: 2.4.0

 Killing a job just as it finishes can generate an NPE in client
 ---

 Key: MAPREDUCE-5542
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5542
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 2.1.0-beta, 0.23.9
Reporter: Jason Lowe

 If a client tries to kill a job just as the job is finishing then the client 
 can crash with an NPE.





[jira] [Commented] (MAPREDUCE-5761) Add a log message like encrypted shuffle is ON in nodemanager logs

2014-02-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910949#comment-13910949
 ] 

Hadoop QA commented on MAPREDUCE-5761:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12630818/MAPREDUCE-5761.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4368//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4368//console

This message is automatically generated.

 Add a log message like encrypted shuffle is ON in nodemanager logs
 

 Key: MAPREDUCE-5761
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5761
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Yesha Vora
Assignee: Jian He
 Attachments: MAPREDUCE-5761.1.patch, YARN-1739.1.patch


 Currently no log message gets printed for encrypted shuffle which can 
 determine if encrypted shuffle is On or Off.
 Need to add message at Info level such as encrypted shuffle is ON





[jira] [Commented] (MAPREDUCE-5754) Preserve Job diagnostics in history

2014-02-24 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910961#comment-13910961
 ] 

Jason Lowe commented on MAPREDUCE-5754:
---

The patch has gone stale and no longer applies cleanly.  My apologies for not 
getting to this sooner.

Some other comments on the patch:

- Nit: Is it appropriate to put a particular diagnostic string in the Job 
interface?  Wondering if this is more appropriately placed in JobImpl, as the 
wording of diagnostic messages seems implementation-specific and not something 
that necessarily belongs in the interface.
- Timeout for TestEvents.testEvent was commented out.
- Why are we joining diagnostics with ','?  JobImpl.getReport() joins with '\n' 
so there would be some inconsistency.
- Why does JobUnsuccessfulCompletionEvent.getDiagnostics() explicitly check for 
"N/A" and translate it to an empty string?  Can we just have the Avro spec 
default to "" and remove this check or is there another use-case for this 
transform?
- Did you run a test where the history server tried to parse an old history 
file generated before this change?  It *should* work given the Avro default, 
but it would be nice to confirm since there were issues in the past with 
compatibility.
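
To make the separator and default-value points concrete, here is a minimal sketch. DiagnosticsSketch and join() are hypothetical names and are not taken from the patch; the sketch only assumes that diagnostics are held as a list of strings, joins them with '\n' to match what JobImpl.getReport() is described as doing above, and uses an empty string as the default so that no "N/A" translation is needed.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the patch: illustrates the review points only.
final class DiagnosticsSketch {
  private DiagnosticsSketch() {}

  // Join diagnostics with '\n', the separator the review says JobImpl.getReport() uses.
  static String join(List<String> diagnostics) {
    if (diagnostics == null || diagnostics.isEmpty()) {
      return ""; // empty default, so callers never need to translate "N/A"
    }
    return String.join("\n", diagnostics);
  }

  public static void main(String[] args) {
    // Prints the two messages on separate lines.
    System.out.println(join(Arrays.asList("Task failed", "Job killed by user")));
  }
}
```

With a single separator and an empty default, the history event and the job report would format diagnostics the same way.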

 Preserve Job diagnostics in history
 ---

 Key: MAPREDUCE-5754
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5754
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver, mr-am
Affects Versions: 2.2.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5754.v01.patch, MAPREDUCE-5754.v02.patch


 History does not store the runtime diagnostics information. JobHistoryParser 
 tries to blame a task. We propose to preserve the original runtime 
 diagnostics that covers all the cases including the job being killed. This is 
 particularly important in the context of user-supplied diagnostic message as 
 in MAPREDUCE-5648. 





[jira] [Updated] (MAPREDUCE-5762) Port MAPREDUCE-3223 (Remove MRv1 config from mapred-default.xml) to branch-2

2014-02-24 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5762:
-

Status: Patch Available  (was: Open)

 Port MAPREDUCE-3223 (Remove MRv1 config from mapred-default.xml) to branch-2
 

 Key: MAPREDUCE-5762
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5762
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.3.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
 Attachments: MAPREDUCE-5762-branch-2.patch


 MRv1 configs are removed in trunk, but they are not removed in branch-2.





[jira] [Commented] (MAPREDUCE-5762) Port MAPREDUCE-3223 (Remove MRv1 config from mapred-default.xml) to branch-2

2014-02-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911029#comment-13911029
 ] 

Hadoop QA commented on MAPREDUCE-5762:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12630415/MAPREDUCE-5762-branch-2.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4369//console

This message is automatically generated.

 Port MAPREDUCE-3223 (Remove MRv1 config from mapred-default.xml) to branch-2
 

 Key: MAPREDUCE-5762
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5762
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.3.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
 Attachments: MAPREDUCE-5762-branch-2.patch


 MRv1 configs are removed in trunk, but they are not removed in branch-2.





[jira] [Updated] (MAPREDUCE-5761) Add a log message like encrypted shuffle is ON in nodemanager logs

2014-02-24 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5761:
---

Priority: Trivial  (was: Major)

 Add a log message like encrypted shuffle is ON in nodemanager logs
 

 Key: MAPREDUCE-5761
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5761
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Yesha Vora
Assignee: Jian He
Priority: Trivial
 Fix For: 2.4.0

 Attachments: MAPREDUCE-5761.1.patch, YARN-1739.1.patch


 Currently no log message gets printed for encrypted shuffle which can 
 determine if encrypted shuffle is On or Off.
 Need to add message at Info level such as encrypted shuffle is ON





[jira] [Updated] (MAPREDUCE-5761) Add a log message like encrypted shuffle is ON in nodemanager logs

2014-02-24 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5761:
---

   Resolution: Fixed
Fix Version/s: 2.4.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed this to trunk, branch-2 and branch-2.4. Thanks Jian!

 Add a log message like encrypted shuffle is ON in nodemanager logs
 

 Key: MAPREDUCE-5761
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5761
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Yesha Vora
Assignee: Jian He
 Fix For: 2.4.0

 Attachments: MAPREDUCE-5761.1.patch, YARN-1739.1.patch


 Currently no log message gets printed for encrypted shuffle which can 
 determine if encrypted shuffle is On or Off.
 Need to add message at Info level such as encrypted shuffle is ON





[jira] [Commented] (MAPREDUCE-5754) Preserve Job diagnostics in history

2014-02-24 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1392#comment-1392
 ] 

Gera Shegalov commented on MAPREDUCE-5754:
--

Thanks [~jlowe] for a review. I tested the change on an old history file as 
this is a major concern for upgrades. I'll follow up on your other points as 
well.

 Preserve Job diagnostics in history
 ---

 Key: MAPREDUCE-5754
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5754
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver, mr-am
Affects Versions: 2.2.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5754.v01.patch, MAPREDUCE-5754.v02.patch


 History does not store the runtime diagnostics information. JobHistoryParser 
 tries to blame a task. We propose to preserve the original runtime 
 diagnostics that covers all the cases including the job being killed. This is 
 particularly important in the context of user-supplied diagnostic message as 
 in MAPREDUCE-5648. 





[jira] [Commented] (MAPREDUCE-5761) Add a log message like encrypted shuffle is ON in nodemanager logs

2014-02-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1396#comment-1396
 ] 

Hudson commented on MAPREDUCE-5761:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #5217 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5217/])
MAPREDUCE-5761. Added a simple log message to denote when encrypted shuffle is 
on in the shuffle-handler. Contributed by Jian He. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1571514)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java


 Add a log message like encrypted shuffle is ON in nodemanager logs
 

 Key: MAPREDUCE-5761
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5761
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Yesha Vora
Assignee: Jian He
Priority: Trivial
 Fix For: 2.4.0

 Attachments: MAPREDUCE-5761.1.patch, YARN-1739.1.patch


 Currently no log message gets printed for encrypted shuffle which can 
 determine if encrypted shuffle is On or Off.
 Need to add message at Info level such as encrypted shuffle is ON





[jira] [Updated] (MAPREDUCE-5766) Ping messages from attempts should be moved to DEBUG

2014-02-24 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated MAPREDUCE-5766:
---

Attachment: MAPREDUCE-5766.1.patch

 Ping messages from attempts should be moved to DEBUG
 

 Key: MAPREDUCE-5766
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5766
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Reporter: Ramya Sunil
Assignee: Jian He
Priority: Minor
 Fix For: 0.24.0

 Attachments: MAPREDUCE-5766.1.patch


 Messages such as org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from 
 attempt_1391416522080_0015_m_00_0 in AM logs should be moved to DEBUG.





[jira] [Commented] (MAPREDUCE-5766) Ping messages from attempts should be moved to DEBUG

2014-02-24 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911153#comment-13911153
 ] 

Jian He commented on MAPREDUCE-5766:


Trivial patch to move some logging into debug level.

 Ping messages from attempts should be moved to DEBUG
 

 Key: MAPREDUCE-5766
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5766
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Reporter: Ramya Sunil
Assignee: Jian He
Priority: Minor
 Fix For: 0.24.0

 Attachments: MAPREDUCE-5766.1.patch


 Messages such as org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from 
 attempt_1391416522080_0015_m_00_0 in AM logs should be moved to DEBUG.





[jira] [Commented] (MAPREDUCE-5756) CombineFileInputFormat.getSplits() including directories in its results

2014-02-24 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911155#comment-13911155
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-5756:


git blame tells me this is introduced by MAPREDUCE-1981 which was committed 
to 0.23.10 and 2.1.1-beta. This is the interesting bit of that patch:
{code}
@@ -169,13 +171,17 @@ public static PathFilter getInputPathFilter(JobConf conf) {
   protected void addInputPathRecursively(List<FileStatus> result,
       FileSystem fs, Path path, PathFilter inputFilter)
       throws IOException {
-    for(FileStatus stat: fs.listStatus(path, inputFilter)) {
-      if (stat.isDirectory()) {
-        addInputPathRecursively(result, fs, stat.getPath(), inputFilter);
-      } else {
-        result.add(stat);
+    RemoteIterator<LocatedFileStatus> iter = fs.listLocatedStatus(path);
+    while (iter.hasNext()) {
+      LocatedFileStatus stat = iter.next();
+      if (inputFilter.accept(stat.getPath())) {
+        if (stat.isDirectory()) {
+          addInputPathRecursively(result, fs, stat.getPath(), inputFilter);
+        } else {
+          result.add(stat);
+        }
       }
-    }
+    }
   }
{code}

Clearly, before 0.23.10 and 2.1.1-beta, the behavior was to exclude 
directories. So should we treat it as incorrect behavior and fix it?
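
For reference, the directory-excluding behaviour can be sketched with a toy in-memory tree. This is only an illustration under stated assumptions: FakeStatus and ListStatusSketch are hypothetical stand-ins, not Hadoop classes, and the logic follows the change proposed in the issue description (a directory is either descended into or skipped, never added to the results).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy stand-in for FileStatus; hypothetical, not the Hadoop API.
final class FakeStatus {
  final String path;
  final boolean directory;
  final List<FakeStatus> children;

  private FakeStatus(String path, boolean directory, List<FakeStatus> children) {
    this.path = path;
    this.directory = directory;
    this.children = children;
  }

  static FakeStatus file(String path) {
    return new FakeStatus(path, false, new ArrayList<FakeStatus>());
  }

  static FakeStatus dir(String path, FakeStatus... children) {
    return new FakeStatus(path, true, Arrays.asList(children));
  }
}

final class ListStatusSketch {
  private ListStatusSketch() {}

  // Collects only plain files; directories are recursed into or skipped.
  static void addInputPaths(List<String> result, FakeStatus dir, boolean recursive) {
    for (FakeStatus stat : dir.children) {
      if (stat.directory) {
        if (recursive) {
          addInputPaths(result, stat, true);
        }
        // Non-recursive case: the directory is dropped rather than added.
      } else {
        result.add(stat.path);
      }
    }
  }
}
```

With this shape, a non-recursive listing of a directory that holds one file and one subdirectory yields just the file, which is the pre-0.23.10 behaviour described above.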

 CombineFileInputFormat.getSplits() including directories in its results
 ---

 Key: MAPREDUCE-5756
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5756
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Jason Dere

 Trying to track down HIVE-6401, where we see some "is not a file" errors 
 because getSplits() is giving us directories.  I believe the culprit is 
 FileInputFormat.listStatus():
 {code}
 if (recursive && stat.isDirectory()) {
   addInputPathRecursively(result, fs, stat.getPath(),
   inputFilter);
 } else {
   result.add(stat);
 }
 {code}
 Which seems to be allowing directories to be added to the results if 
 recursive is false.  Is this meant to return directories? If not, I think it 
 should look like this:
 {code}
 if (stat.isDirectory()) {
  if (recursive) {
   addInputPathRecursively(result, fs, stat.getPath(),
   inputFilter);
  }
 } else {
   result.add(stat);
 }
 {code}





[jira] [Commented] (MAPREDUCE-5764) Potential NullPointerException in YARNRunner.killJob(JobID arg0)

2014-02-24 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911172#comment-13911172
 ] 

Rohith commented on MAPREDUCE-5764:
---

Yes Jason Lowe, it is a duplicate. I shall take more care before raising an 
issue, to avoid wasting Jira IDs :-(

 Potential NullPointerException in YARNRunner.killJob(JobID arg0)
 

 Key: MAPREDUCE-5764
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5764
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Rohith
Assignee: Rohith

 I found YARNRunner.killJob(JobID arg0) can throw a NullPointerException if the 
 job status is null. 
 bq. clientCache.getClient(arg0).getJobStatus(arg0);  can be null.
 This can happen when the history write has failed because of HDFS errors, 
 or when the staging directory is different from the history server.
  
 We need a null check; otherwise killJob() is prone to throwing an NPE, which 
 causes the jobclient to exit.
 {noformat}
 @Override
 public void killJob(JobID arg0) throws IOException, InterruptedException {
   /* check if the status is not running, if not send kill to RM */
   JobStatus status = clientCache.getClient(arg0).getJobStatus(arg0);
   if (status.getState() != JobStatus.State.RUNNING) {
     try {
       resMgrDelegate.killApplication(TypeConverter.toYarn(arg0).getAppId());
     } catch (YarnException e) {
       throw new IOException(e);
     }
     return;
   }
   ...
   ..
   ...
 }
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Assigned] (MAPREDUCE-5542) Killing a job just as it finishes can generate an NPE in client

2014-02-24 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith reassigned MAPREDUCE-5542:
-

Assignee: Rohith

 Killing a job just as it finishes can generate an NPE in client
 ---

 Key: MAPREDUCE-5542
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5542
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 2.1.0-beta, 0.23.9
Reporter: Jason Lowe
Assignee: Rohith

 If a client tries to kill a job just as the job is finishing then the client 
 can crash with an NPE.





[jira] [Commented] (MAPREDUCE-5766) Ping messages from attempts should be moved to DEBUG

2014-02-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911179#comment-13911179
 ] 

Hadoop QA commented on MAPREDUCE-5766:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12630868/MAPREDUCE-5766.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4370//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4370//console

This message is automatically generated.

 Ping messages from attempts should be moved to DEBUG
 

 Key: MAPREDUCE-5766
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5766
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Reporter: Ramya Sunil
Assignee: Jian He
Priority: Minor
 Fix For: 0.24.0

 Attachments: MAPREDUCE-5766.1.patch


 Messages such as org.apache.hadoop.mapred.TaskAttemptListenerImpl: Ping from 
 attempt_1391416522080_0015_m_00_0 in AM logs should be moved to DEBUG.





[jira] [Commented] (MAPREDUCE-5542) Killing a job just as it finishes can generate an NPE in client

2014-02-24 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911186#comment-13911186
 ] 

Rohith commented on MAPREDUCE-5542:
---

There are 2 ways we can solve this.
1. Just have a null check and return with a log warning ("No Jobs to kill").
JobStatus status = clientCache.getClient(arg0).getJobStatus(arg0);
if (status == null) {
  LOG.warn("Attempting to Kill Job which is Not Running.");
  return;
}

2. Maintain consistency across other APIs such as 
org.apache.hadoop.mapreduce.Job.updateStatus().
JobStatus status = clientCache.getClient(arg0).getJobStatus(arg0);
if (status == null) {
  throw new IOException("Job status not available ");
} 

I prefer the 2nd option, letting the client handle this exception.
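
The difference between the two options can be sketched in isolation. This is a minimal sketch, not the real YARNRunner code: KillJobSketch, its State enum and both method names are hypothetical stand-ins, and the "kill" itself is left out so that only the handling of a null status is shown.

```java
import java.io.IOException;

// Hypothetical stand-ins for the client types; not the real YARNRunner code.
final class KillJobSketch {
  enum State { RUNNING, SUCCEEDED }

  private KillJobSketch() {}

  // Option 1: treat a missing status as "nothing to kill" and return quietly.
  // Returns true when the caller should go on to send the kill.
  static boolean killOrIgnore(State status) {
    if (status == null) {
      System.err.println("WARN: job status not available, nothing to kill");
      return false;
    }
    return true;
  }

  // Option 2: surface the missing status to the caller as an IOException.
  static void killOrThrow(State status) throws IOException {
    if (status == null) {
      throw new IOException("Job status not available");
    }
    // The caller would go on to send the kill here.
  }
}
```

Option 1 hides the condition from the caller, while option 2 leaves the decision to the client, which is the trade-off behind the stated preference.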

 Killing a job just as it finishes can generate an NPE in client
 ---

 Key: MAPREDUCE-5542
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5542
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 2.1.0-beta, 0.23.9
Reporter: Jason Lowe
Assignee: Rohith

 If a client tries to kill a job just as the job is finishing then the client 
 can crash with an NPE.





[jira] [Updated] (MAPREDUCE-5542) Killing a job just as it finishes can generate an NPE in client

2014-02-24 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated MAPREDUCE-5542:
--

Status: Patch Available  (was: Open)

 Killing a job just as it finishes can generate an NPE in client
 ---

 Key: MAPREDUCE-5542
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5542
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 0.23.9, 2.1.0-beta
Reporter: Jason Lowe
Assignee: Rohith
 Attachments: MAPREDUCE-5542.1.patch


 If a client tries to kill a job just as the job is finishing then the client 
 can crash with an NPE.





[jira] [Updated] (MAPREDUCE-5542) Killing a job just as it finishes can generate an NPE in client

2014-02-24 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated MAPREDUCE-5542:
--

Attachment: MAPREDUCE-5542.1.patch

Attaching a patch for the 2nd option. Please review.

 Killing a job just as it finishes can generate an NPE in client
 ---

 Key: MAPREDUCE-5542
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5542
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 2.1.0-beta, 0.23.9
Reporter: Jason Lowe
Assignee: Rohith
 Attachments: MAPREDUCE-5542.1.patch


 If a client tries to kill a job just as the job is finishing then the client 
 can crash with an NPE.


