[jira] [Commented] (MAPREDUCE-6485) MR job hanged forever because all resources are taken up by reducers and the last map attempt never get resource to run

2015-10-02 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941211#comment-14941211
 ] 

Rohith Sharma K S commented on MAPREDUCE-6485:
--

Committing shortly.. 

> MR job hanged forever because all resources are taken up by reducers and the 
> last map attempt never get resource to run
> ---
>
> Key: MAPREDUCE-6485
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6485
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 3.0.0, 2.4.1, 2.6.0, 2.7.1
>Reporter: Bob
>Assignee: Xianyin Xin
>Priority: Critical
> Attachments: MAPREDUCE-6485.001.patch, MAPREDUCE-6485.004.patch, 
> MAPREDUCE-6485.005.patch, MAPREDUCE-6485.006.patch, MAPREDUCE-6845.002.patch, 
> MAPREDUCE-6845.003.patch
>
>
> The scenario is like this:
> With mapreduce.job.reduce.slowstart.completedmaps=0.8 configured, reduces 
> take resources and start to run before all the maps have finished. 
> It can then happen that all the resources are taken up by running 
> reduces while one map is still unfinished. 
> Under this condition, the last map has two task attempts.
> The first attempt was killed due to a timeout (mapreduce.task.timeout), 
> and its state transitioned from RUNNING to FAIL_CONTAINER_CLEANUP and then to 
> FAILED, but the failed map attempt is not restarted because there is still 
> one speculative map attempt in progress. 
> The second attempt, which was started because map task speculation is 
> enabled, is pending in the UNASSIGNED state because no resources are available. 
> But the second map attempt's request has a lower priority than the reduces, so 
> preemption does not happen.
> As a result, none of the reduces can finish because one map is left, 
> and the last map hangs because no resources are available, so the job 
> never finishes.
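
The role that request priority plays in the scenario above can be shown with a
small sketch. This is not Hadoop code: the class, the Request type, and the
numeric priority values are assumptions made up for illustration (a smaller
number means a more urgent request), and the queue only mimics an allocator
that serves the most urgent pending request first.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

public class PrioritySketch {
    // Illustrative priority values (assumed); a smaller number is served first.
    static final int FAILED_MAP = 5;
    static final int REDUCE = 10;
    static final int MAP = 20;

    // A pending container request: who asks, and how urgently.
    static final class Request {
        final String name;
        final int priority;

        Request(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }
    }

    // Returns the name of the request a priority-ordered allocator serves next.
    static String nextServed(Request... pending) {
        PriorityQueue<Request> queue =
            new PriorityQueue<>(Comparator.comparingInt((Request r) -> r.priority));
        for (Request r : pending) {
            queue.add(r);
        }
        return queue.peek().name;
    }

    public static void main(String[] args) {
        Request reduce = new Request("reduce", REDUCE);
        // Attempt requested at normal map priority: the reduce is more urgent.
        System.out.println(nextServed(reduce, new Request("last-map", MAP)));
        // Same attempt requested at failed-map priority: the map is more urgent.
        System.out.println(nextServed(reduce, new Request("last-map", FAILED_MAP)));
    }
}
```

Under these assumed values, a replacement attempt requested at the failed-map
priority outranks the reduces, whereas one requested at the normal map priority
does not.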



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (MAPREDUCE-6488) Make buffer size in PipeMapRed configurable

2015-10-02 Thread He Tianyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Tianyi resolved MAPREDUCE-6488.
--
Resolution: Invalid

> Make buffer size in PipeMapRed configurable
> ---
>
> Key: MAPREDUCE-6488
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6488
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: He Tianyi
>Assignee: He Tianyi
>
> The default buffer size in {{PipeMapRed}} is 128K.
> When a mapper input record is too large to fit in the buffer, 
> {{MapRunner}} blocks until the record is written. If the child process and 
> the input reader are both slow (due to computation and decompression), then 
> decoding and reading will rarely overlap, hurting performance.
> I suppose we should make the buffer size configurable.
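
A minimal sketch of what a configurable buffer size could look like, assuming
a plain java.util.Properties object stands in for the job configuration. The
property name below is hypothetical and is not an actual Hadoop key; only the
128K default comes from the description above.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Properties;

public class BufferSizeSketch {
    // Hypothetical property name, used only for this illustration.
    static final String KEY = "example.pipemapred.buffer.size";
    // 128K, the default mentioned in the issue description.
    static final int DEFAULT_BUFFER_SIZE = 128 * 1024;

    // Reads the buffer size from the configuration, falling back to the default.
    static int bufferSize(Properties conf) {
        String raw = conf.getProperty(KEY);
        if (raw == null) {
            return DEFAULT_BUFFER_SIZE;
        }
        int size = Integer.parseInt(raw.trim());
        if (size <= 0) {
            throw new IllegalArgumentException(KEY + " must be positive: " + size);
        }
        return size;
    }

    public static void main(String[] args) throws IOException {
        Properties conf = new Properties();
        conf.setProperty(KEY, "1048576"); // 1M, large enough for bigger records
        // Wrap the stream to the child process with the configured buffer size.
        try (OutputStream toChild =
                 new BufferedOutputStream(new ByteArrayOutputStream(), bufferSize(conf))) {
            toChild.write("record".getBytes());
        }
        System.out.println(bufferSize(conf));
    }
}
```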





[jira] [Commented] (MAPREDUCE-6302) Incorrect headroom can lead to a deadlock between map and reduce allocations

2015-10-02 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941135#comment-14941135
 ] 

Jason Lowe commented on MAPREDUCE-6302:
---

Since the old code also doesn't preempt if there's room for one map then I'm OK 
with the current logic.  I just didn't want a regression.  And as for SHUFFLE 
phase awareness, I agree that's best left for a followup JIRA.


> Incorrect headroom can lead to a deadlock between map and reduce allocations 
> -
>
> Key: MAPREDUCE-6302
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6302
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: mai shurong
>Assignee: Karthik Kambatla
>Priority: Critical
> Attachments: AM_log_head10.txt.gz, AM_log_tail10.txt.gz, 
> log.txt, mr-6302-1.patch, mr-6302-2.patch, mr-6302-3.patch, mr-6302-4.patch, 
> mr-6302-prelim.patch, queue_with_max163cores.png, queue_with_max263cores.png, 
> queue_with_max333cores.png
>
>
> I submitted a big job, which has 500 maps and 350 reduces, to a 
> queue (fairscheduler) with a maximum of 300 cores. When the big mapreduce job 
> shows its maps at 100%, 300 reduces occupy all 300 cores in the queue. 
> Then a map fails and is retried, waiting for a core, while the 300 reduces 
> wait for the failed map to finish. So a deadlock occurs. As a result, the 
> job is blocked, and later jobs in the queue cannot run because no cores are 
> available in the queue.
> I think there is a similar issue for the memory of a queue.
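
The deadlock condition in the description can be written down as a small
check. This is only a sketch under stated assumptions: the class, the method,
and its parameters are hypothetical, resources are reduced to a single core
count, and the real allocator logic in the patch may differ. It reflects the
point made in the comment above that no preemption is needed while there is
still room for one map.

```java
public class HeadroomSketch {
    // Decides whether a running reduce should be preempted so a pending map can run.
    // All parameters are simplified, hypothetical stand-ins for allocator state.
    static boolean shouldPreemptReduce(int pendingMaps, int headroomCores,
                                       int coresPerMap, int runningReduces) {
        // If one map still fits in the headroom, the scheduler can place it.
        boolean mapCannotFit = headroomCores < coresPerMap;
        return pendingMaps > 0 && mapCannotFit && runningReduces > 0;
    }

    public static void main(String[] args) {
        // The reported case: 300 reduces hold all 300 cores, one map is retried.
        System.out.println(shouldPreemptReduce(1, 0, 1, 300));
        // Room for one map: leave the reduces alone.
        System.out.println(shouldPreemptReduce(1, 1, 1, 299));
    }
}
```

If the headroom reported to the application is larger than what is actually
free, mapCannotFit is wrongly false and the check never fires, which is the
kind of stall the issue title describes.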





[jira] [Updated] (MAPREDUCE-6485) MR job hanged forever because all resources are taken up by reducers and the last map attempt never get resource to run

2015-10-02 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated MAPREDUCE-6485:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Committed to branch-2/trunk. Thanks [~xinxianyin] for the contribution, and 
[~kasha] for the additional review.

> MR job hanged forever because all resources are taken up by reducers and the 
> last map attempt never get resource to run
> ---
>
> Key: MAPREDUCE-6485
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6485
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 3.0.0, 2.4.1, 2.6.0, 2.7.1
>Reporter: Bob
>Assignee: Xianyin Xin
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: MAPREDUCE-6485.001.patch, MAPREDUCE-6485.004.patch, 
> MAPREDUCE-6485.005.patch, MAPREDUCE-6485.006.patch, MAPREDUCE-6845.002.patch, 
> MAPREDUCE-6845.003.patch


[jira] [Commented] (MAPREDUCE-6485) MR job hanged forever because all resources are taken up by reducers and the last map attempt never get resource to run

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941241#comment-14941241
 ] 

Hudson commented on MAPREDUCE-6485:
---

FAILURE: Integrated in Hadoop-trunk-Commit #8554 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8554/])
MAPREDUCE-6485. Create a new task attempt with failed map task priority 
(rohithsharmaks: rev 439f43ad3defbac907eda2d139a793f153544430)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java




[jira] [Commented] (MAPREDUCE-6485) MR job hanged forever because all resources are taken up by reducers and the last map attempt never get resource to run

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941295#comment-14941295
 ] 

Hudson commented on MAPREDUCE-6485:
---

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #479 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/479/])
MAPREDUCE-6485. Create a new task attempt with failed map task priority 
(rohithsharmaks: rev 439f43ad3defbac907eda2d139a793f153544430)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* hadoop-mapreduce-project/CHANGES.txt




[jira] [Commented] (MAPREDUCE-6485) MR job hanged forever because all resources are taken up by reducers and the last map attempt never get resource to run

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941433#comment-14941433
 ] 

Hudson commented on MAPREDUCE-6485:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #471 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/471/])
MAPREDUCE-6485. Create a new task attempt with failed map task priority 
(rohithsharmaks: rev 439f43ad3defbac907eda2d139a793f153544430)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java




[jira] [Commented] (MAPREDUCE-6485) MR job hanged forever because all resources are taken up by reducers and the last map attempt never get resource to run

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941511#comment-14941511
 ] 

Hudson commented on MAPREDUCE-6485:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2414 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2414/])
MAPREDUCE-6485. Create a new task attempt with failed map task priority 
(rohithsharmaks: rev 439f43ad3defbac907eda2d139a793f153544430)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java




[jira] [Commented] (MAPREDUCE-6485) MR job hanged forever because all resources are taken up by reducers and the last map attempt never get resource to run

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941455#comment-14941455
 ] 

Hudson commented on MAPREDUCE-6485:
---

SUCCESS: Integrated in Hadoop-Yarn-trunk #1209 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1209/])
MAPREDUCE-6485. Create a new task attempt with failed map task priority 
(rohithsharmaks: rev 439f43ad3defbac907eda2d139a793f153544430)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java




[jira] [Commented] (MAPREDUCE-6491) Environment variable handling assumes values should be appended

2015-10-02 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941585#comment-14941585
 ] 

Dustin Cote commented on MAPREDUCE-6491:


[~jlowe], yes I'll check it out now.  I was building against trunk and it 
looked clean there.  Let me see how it goes with branch-2.

> Environment variable handling assumes values should be appended
> ---
>
> Key: MAPREDUCE-6491
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6491
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jason Lowe
>Assignee: Dustin Cote
> Attachments: YARN-2369-1.patch, YARN-2369-2.patch, YARN-2369-3.patch, 
> YARN-2369-4.patch, YARN-2369-5.patch, YARN-2369-6.patch, YARN-2369-7.patch, 
> YARN-2369-8.patch, YARN-2369-9.patch
>
>
> When processing environment variables for a container context the code 
> assumes that the value should be appended to any pre-existing value in the 
> environment.  This may be desired behavior for handling path-like environment 
> variables such as PATH, LD_LIBRARY_PATH, CLASSPATH, etc. but it is a 
> non-intuitive and harmful way to handle any variable that does not have 
> path-like semantics.
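
The distinction drawn in the description can be sketched as follows. This is
an illustration, not the code under review: the class, the method, and the
fixed set of path-like names are assumptions, and the actual patch may decide
between appending and replacing in a different way.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class EnvSketch {
    // Assumed list of variables with path-like semantics (from the description).
    static final Set<String> PATH_LIKE = Set.of("PATH", "LD_LIBRARY_PATH", "CLASSPATH");

    // Appends to an existing value only for path-like variables; otherwise replaces.
    static void setEnv(Map<String, String> env, String name, String value, String separator) {
        String existing = env.get(name);
        if (existing != null && PATH_LIKE.contains(name)) {
            env.put(name, existing + separator + value);
        } else {
            env.put(name, value);
        }
    }

    public static void main(String[] args) {
        Map<String, String> env = new HashMap<>();
        env.put("PATH", "/usr/bin");
        env.put("JAVA_HOME", "/opt/java-old");

        setEnv(env, "PATH", "/opt/tools/bin", ":");     // path-like: appended
        setEnv(env, "JAVA_HOME", "/opt/java-new", ":"); // not path-like: replaced

        System.out.println(env.get("PATH"));
        System.out.println(env.get("JAVA_HOME"));
    }
}
```

Appending unconditionally would instead leave JAVA_HOME holding both values
joined by the separator, which is the harmful behavior the description warns
about.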





[jira] [Commented] (MAPREDUCE-6485) MR job hanged forever because all resources are taken up by reducers and the last map attempt never get resource to run

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941598#comment-14941598
 ] 

Hudson commented on MAPREDUCE-6485:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #445 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/445/])
MAPREDUCE-6485. Create a new task attempt with failed map task priority 
(rohithsharmaks: rev 439f43ad3defbac907eda2d139a793f153544430)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java




[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941676#comment-14941676
 ] 

Hadoop QA commented on MAPREDUCE-6451:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m  7s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 54s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |  10m  7s | There were no new javadoc warning messages. |
| {color:red}-1{color} | release audit |   0m 16s | The applied patch generated 1 release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 26s | The applied patch generated  3 new checkstyle issues (total was 36, now 28). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 47s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | tools/hadoop tests |   6m 32s | Tests passed in hadoop-distcp. |
| | |  44m 19s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12764830/MAPREDUCE-6451-v4.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 1037ee5 |
| Release Audit | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6050/artifact/patchprocess/patchReleaseAuditProblems.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6050/artifact/patchprocess/diffcheckstylehadoop-distcp.txt |
| hadoop-distcp test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6050/artifact/patchprocess/testrun_hadoop-distcp.txt |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6050/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6050/console |


This message was automatically generated.

> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 
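
The failure mode described above, static state that is set for the first job
and never refreshed, can be reproduced in miniature. This sketch does not use
DistCp classes: the class, both methods, and the /tmp/chunks path are
hypothetical and only illustrate why a second job in the same JVM would see
the first job's path.

```java
public class StaticStateSketch {
    // Static field: survives across jobs started from the same JVM.
    static String cachedChunkDir;

    // Problematic pattern: initialised once, so later jobs reuse the first job's path.
    static String chunkDirInitOnce(String jobId) {
        if (cachedChunkDir == null) {
            cachedChunkDir = "/tmp/chunks/" + jobId;
        }
        return cachedChunkDir;
    }

    // Per-job pattern: the path is derived from the current job every time.
    static String chunkDirPerJob(String jobId) {
        return "/tmp/chunks/" + jobId;
    }

    public static void main(String[] args) {
        System.out.println(chunkDirInitOnce("job_1"));
        System.out.println(chunkDirInitOnce("job_2")); // still points at job_1
        System.out.println(chunkDirPerJob("job_2"));   // points at job_2
    }
}
```

With the init-once variant, the second job looks for chunks where the first
job left them, which matches the report that later jobs finish without
copying anything.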





[jira] [Updated] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-02 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla updated MAPREDUCE-6451:
---
Attachment: MAPREDUCE-6451-v4.patch

Thank you [~eepayne] for your review comments. I have updated the patch. 
Regarding the checkstyle issue of the missing package-info.java: the file did 
not exist for the mapred/lib directory before, so I did not add one now. There 
is already one for the tools package. I was referring to the wrong .class for 
the new file and have corrected that. Requesting review.



[jira] [Commented] (MAPREDUCE-6485) MR job hanged forever because all resources are taken up by reducers and the last map attempt never get resource to run

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941658#comment-14941658
 ] 

Hudson commented on MAPREDUCE-6485:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk #2385 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2385/])
MAPREDUCE-6485. Create a new task attempt with failed map task priority 
(rohithsharmaks: rev 439f43ad3defbac907eda2d139a793f153544430)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java


> MR job hanged forever because all resources are taken up by reducers and the 
> last map attempt never get resource to run
> ---
>
> Key: MAPREDUCE-6485
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6485
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: applicationmaster
> Affects Versions: 3.0.0, 2.4.1, 2.6.0, 2.7.1
> Reporter: Bob
> Assignee: Xianyin Xin
> Priority: Critical
> Fix For: 2.8.0
>
> Attachments: MAPREDUCE-6485.001.patch, MAPREDUCE-6485.004.patch, 
> MAPREDUCE-6485.005.patch, MAPREDUCE-6485.006.patch, MAPREDUCE-6845.002.patch, 
> MAPREDUCE-6845.003.patch
>
>
> The scenario is as follows:
> With mapreduce.job.reduce.slowstart.completedmaps=0.8 configured, reduces 
> will take resources and start to run before all the maps have finished. 
> It can then happen that all the resources are taken up by running 
> reduces while one map is still unfinished. 
> Under this condition, the last map has two task attempts.
> The first attempt was killed due to a timeout (mapreduce.task.timeout), 
> and its state transitioned from RUNNING to FAIL_CONTAINER_CLEANUP and then to 
> FAILED, but the failed map attempt is not restarted because there is still 
> one speculative map attempt in progress. 
> The second attempt, which was started because map task speculation is 
> enabled, is pending in the UNASSIGNED state because no resources are available. 
> However, the second map attempt's request has lower priority than the reduces, 
> so preemption does not happen.
> As a result, none of the reduces can finish because one map is left, 
> and the last map hangs because no resources are available. So the job 
> never finishes.
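
The scenario above, together with the commit summary "Create a new task attempt with failed map task priority", can be sketched as a toy model. This is not the application master's allocator logic: the numeric priority values, the rule that a reducer is preempted only for a request more urgent than a reduce, and all names are assumptions made for illustration.

```java
/** Toy model of the starvation scenario. Priorities and rules are hypothetical. */
final class MapStarvationSketch {
    // Assumed priority values for this sketch: a smaller number is more urgent.
    static final int FAILED_MAP = 5;
    static final int REDUCE = 10;
    static final int MAP = 20;

    /**
     * Returns true if a pending map request can obtain a container.
     * Toy rule: a free container is used if one exists; otherwise a running
     * reducer is preempted only when the request is more urgent than REDUCE.
     */
    static boolean mapCanRun(int freeContainers, int runningReducers, int requestPriority) {
        if (freeContainers > 0) {
            return true;
        }
        return runningReducers > 0 && requestPriority < REDUCE;
    }

    public static void main(String[] args) {
        // Cluster is full of reducers; the last map is requested at normal map priority.
        System.out.println("normal map priority: " + mapCanRun(0, 4, MAP));
        // Same cluster, but the new attempt is requested at failed-map priority.
        System.out.println("failed-map priority: " + mapCanRun(0, 4, FAILED_MAP));
    }
}
```

Under these assumptions the request at normal map priority never gets a container while reducers hold the whole cluster, whereas a request at the failed-map priority outranks the reducers and can trigger preemption.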





[jira] [Created] (MAPREDUCE-6500) DynamicInputChunk and DynamicRecordReader class has no unit tests

2015-10-02 Thread Kuhu Shukla (JIRA)
Kuhu Shukla created MAPREDUCE-6500:
--

 Summary: DynamicInputChunk and DynamicRecordReader class has no 
unit tests
 Key: MAPREDUCE-6500
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6500
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: distcp
Reporter: Kuhu Shukla
Assignee: Kuhu Shukla
Priority: Minor


The Dynamic strategy of DistCp has test coverage only for its InputFormat 
class. It would be nice to have coverage for the DynamicRecordReader and 
DynamicInputChunk classes as well.





[jira] [Updated] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-02 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla updated MAPREDUCE-6451:
---
Attachment: MAPREDUCE-6451-v4.patch

The release audit warning:
{noformat}
 !? 
/home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/hadoop-common-project/hadoop-common/CHANGES-HDFS-EC-7285.txt
Lines that start with ? in the release audit  report indicate files that do 
not have an Apache license header.
{noformat}
seems unrelated. 

Fixed the whitespace issue. The rest is the same as before.

> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: distcp
> Affects Versions: 2.6.0
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 





[jira] [Updated] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-02 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla updated MAPREDUCE-6451:
---
Attachment: (was: MAPREDUCE-6451-v4.patch)

> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: distcp
> Affects Versions: 2.6.0
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 





[jira] [Updated] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-02 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla updated MAPREDUCE-6451:
---
Attachment: MAPREDUCE-6451-v5.patch

> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: distcp
> Affects Versions: 2.6.0
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 





[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941784#comment-14941784
 ] 

Hadoop QA commented on MAPREDUCE-6451:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 16m 23s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. |
| {color:green}+1{color} | javac | 7m 57s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 11m 43s | There were no new javadoc warning messages. |
| {color:red}-1{color} | release audit | 0m 18s | The applied patch generated 1 release audit warnings. |
| {color:red}-1{color} | checkstyle | 0m 27s | The applied patch generated 2 new checkstyle issues (total was 36, now 27). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 48s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 40s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 0m 56s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | tools/hadoop tests | 7m 40s | Tests passed in hadoop-distcp. |
| | | 47m 55s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12764842/MAPREDUCE-6451-v5.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / fdf02d1 |
| Release Audit | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6051/artifact/patchprocess/patchReleaseAuditProblems.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6051/artifact/patchprocess/diffcheckstylehadoop-distcp.txt |
| hadoop-distcp test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6051/artifact/patchprocess/testrun_hadoop-distcp.txt |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6051/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6051/console |


This message was automatically generated.

> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: distcp
> Affects Versions: 2.6.0
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 


