[jira] [Commented] (MAPREDUCE-6257) Document encrypted spills

2015-08-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661837#comment-14661837
 ] 

Hudson commented on MAPREDUCE-6257:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #269 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/269/])
MAPREDUCE-6257. Document encrypted spills (Bibin A Chundatt via aw) (aw: rev 
fb1be0b3100cdd69f6dc1987585fcadd4e7c8a2a)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/EncryptedShuffle.md
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml


 Document encrypted spills
 -

 Key: MAPREDUCE-6257
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6257
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security
Reporter: Allen Wittenauer
Assignee: Bibin A Chundatt
 Fix For: 3.0.0

 Attachments: 0001-MAPREDUCE-6257.patch, 0002-MAPREDUCE-6257.patch, 
 0003-MAPREDUCE-6257.patch, EncryptedShuffle.html


 Encrypted spills appear to be completely undocumented.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6443) Add JvmPauseMonitor to Job History Server

2015-08-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661841#comment-14661841
 ] 

Hudson commented on MAPREDUCE-6443:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #269 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/269/])
MAPREDUCE-6443. Add JvmPauseMonitor to JobHistoryServer. Contributed by Robert 
Kanter. (junping_du: rev e73a928a6360f68aaee2ed58b3a8d180f4051407)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java
* hadoop-mapreduce-project/CHANGES.txt


 Add JvmPauseMonitor to Job History Server
 -

 Key: MAPREDUCE-6443
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6443
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 2.8.0
Reporter: Robert Kanter
Assignee: Robert Kanter
 Fix For: 2.8.0

 Attachments: MAPREDUCE-6443.001.patch, MAPREDUCE-6443.002.patch


 We should add the {{JvmPauseMonitor}} from HADOOP-9618 to the Job History 
 Server.
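
 For context, a pause monitor of this kind can be sketched as a thread that
 sleeps for a fixed interval and measures how much longer the sleep took than
 requested; a large overshoot suggests a GC pause or a stalled host. The sketch
 below only illustrates that idea and is not the HADOOP-9618 code: the class
 name PauseMonitorSketch, its method names, and the interval and threshold
 values are all assumptions made for the example.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative sketch only; names and thresholds are hypothetical, not Hadoop's. */
final class PauseMonitorSketch {

    /** Sleeps for sleepMs and returns how many extra milliseconds the sleep took. */
    static long extraSleepMillis(long sleepMs) {
        long start = System.nanoTime();
        try {
            Thread.sleep(sleepMs);
        } catch (InterruptedException e) {
            // Preserve the interrupt so the caller can react to it.
            Thread.currentThread().interrupt();
        }
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        // An early wake-up is reported as "no extra time" rather than a negative value.
        return Math.max(0L, elapsedMs - sleepMs);
    }

    /** Maps the overshoot to a log level using two caller-supplied thresholds. */
    static String classify(long extraMs, long infoThresholdMs, long warnThresholdMs) {
        if (extraMs > warnThresholdMs) {
            return "WARN";
        }
        if (extraMs > infoThresholdMs) {
            return "INFO";
        }
        return "OK";
    }

    public static void main(String[] args) {
        // Assumed values: 500 ms sleep, 1 s info threshold, 10 s warn threshold.
        long extra = extraSleepMillis(500L);
        System.out.println(classify(extra, 1_000L, 10_000L) + " extra=" + extra + "ms");
    }
}
```

 A real monitor would run this check in a loop on a daemon thread for the
 lifetime of the service and log the result instead of printing it.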





[jira] [Commented] (MAPREDUCE-6257) Document encrypted spills

2015-08-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661860#comment-14661860
 ] 

Hudson commented on MAPREDUCE-6257:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #277 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/277/])
MAPREDUCE-6257. Document encrypted spills (Bibin A Chundatt via aw) (aw: rev 
fb1be0b3100cdd69f6dc1987585fcadd4e7c8a2a)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/EncryptedShuffle.md


 Document encrypted spills
 -

 Key: MAPREDUCE-6257
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6257
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security
Reporter: Allen Wittenauer
Assignee: Bibin A Chundatt
 Fix For: 3.0.0

 Attachments: 0001-MAPREDUCE-6257.patch, 0002-MAPREDUCE-6257.patch, 
 0003-MAPREDUCE-6257.patch, EncryptedShuffle.html


 Encrypted spills appear to be completely undocumented.





[jira] [Moved] (MAPREDUCE-6446) Support SSL for AM webapp

2015-08-07 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena moved YARN-3651 to MAPREDUCE-6446:
---

Affects Version/s: (was: 2.7.0)
   2.7.0
  Component/s: (was: applications)
   (was: resourcemanager)
   resourcemanager
  Key: MAPREDUCE-6446  (was: YARN-3651)
  Project: Hadoop Map/Reduce  (was: Hadoop YARN)

 Support SSL for AM webapp
 -

 Key: MAPREDUCE-6446
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6446
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.7.0
 Environment: Suse 11 Sp3
Reporter: Bibin A Chundatt
Priority: Minor

 The application URL shown by the application CLI is wrong.
 Steps to reproduce
 ==
 1. Start HA setup insecure mode
 2. Configure HTTPS_ONLY
 3. Submit an application to the cluster
 4. Execute the command ./yarn application -list
 5. Observe the tracking URL shown
 {code}
 15/05/15 13:34:38 INFO client.AHSProxy: Connecting to Application History 
 server at /IP:45034
 Total number of applications (application-types: [] and states: [SUBMITTED, 
 ACCEPTED, RUNNING]):1
 Application-Id --- Tracking-URL
 application_1431672734347_0003   *http://host-10-19-92-117:13013*
 {code}
 *Expected*
 https://IP:64323/proxy/application_1431672734347_0003 /





[jira] [Commented] (MAPREDUCE-6443) Add JvmPauseMonitor to Job History Server

2015-08-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661632#comment-14661632
 ] 

Hudson commented on MAPREDUCE-6443:
---

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #280 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/280/])
MAPREDUCE-6443. Add JvmPauseMonitor to JobHistoryServer. Contributed by Robert 
Kanter. (junping_du: rev e73a928a6360f68aaee2ed58b3a8d180f4051407)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java
* hadoop-mapreduce-project/CHANGES.txt


 Add JvmPauseMonitor to Job History Server
 -

 Key: MAPREDUCE-6443
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6443
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 2.8.0
Reporter: Robert Kanter
Assignee: Robert Kanter
 Fix For: 2.8.0

 Attachments: MAPREDUCE-6443.001.patch, MAPREDUCE-6443.002.patch


 We should add the {{JvmPauseMonitor}} from HADOOP-9618 to the Job History 
 Server.





[jira] [Commented] (MAPREDUCE-6257) Document encrypted spills

2015-08-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661628#comment-14661628
 ] 

Hudson commented on MAPREDUCE-6257:
---

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #280 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/280/])
MAPREDUCE-6257. Document encrypted spills (Bibin A Chundatt via aw) (aw: rev 
fb1be0b3100cdd69f6dc1987585fcadd4e7c8a2a)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/EncryptedShuffle.md
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* hadoop-mapreduce-project/CHANGES.txt


 Document encrypted spills
 -

 Key: MAPREDUCE-6257
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6257
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security
Reporter: Allen Wittenauer
Assignee: Bibin A Chundatt
 Fix For: 3.0.0

 Attachments: 0001-MAPREDUCE-6257.patch, 0002-MAPREDUCE-6257.patch, 
 0003-MAPREDUCE-6257.patch, EncryptedShuffle.html


 Encrypted spills appear to be completely undocumented.





[jira] [Commented] (MAPREDUCE-6443) Add JvmPauseMonitor to Job History Server

2015-08-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661642#comment-14661642
 ] 

Hudson commented on MAPREDUCE-6443:
---

FAILURE: Integrated in Hadoop-Yarn-trunk #1010 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1010/])
MAPREDUCE-6443. Add JvmPauseMonitor to JobHistoryServer. Contributed by Robert 
Kanter. (junping_du: rev e73a928a6360f68aaee2ed58b3a8d180f4051407)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java
* hadoop-mapreduce-project/CHANGES.txt


 Add JvmPauseMonitor to Job History Server
 -

 Key: MAPREDUCE-6443
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6443
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 2.8.0
Reporter: Robert Kanter
Assignee: Robert Kanter
 Fix For: 2.8.0

 Attachments: MAPREDUCE-6443.001.patch, MAPREDUCE-6443.002.patch


 We should add the {{JvmPauseMonitor}} from HADOOP-9618 to the Job History 
 Server.





[jira] [Commented] (MAPREDUCE-6257) Document encrypted spills

2015-08-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661638#comment-14661638
 ] 

Hudson commented on MAPREDUCE-6257:
---

FAILURE: Integrated in Hadoop-Yarn-trunk #1010 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1010/])
MAPREDUCE-6257. Document encrypted spills (Bibin A Chundatt via aw) (aw: rev 
fb1be0b3100cdd69f6dc1987585fcadd4e7c8a2a)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/EncryptedShuffle.md
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* hadoop-mapreduce-project/CHANGES.txt


 Document encrypted spills
 -

 Key: MAPREDUCE-6257
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6257
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security
Reporter: Allen Wittenauer
Assignee: Bibin A Chundatt
 Fix For: 3.0.0

 Attachments: 0001-MAPREDUCE-6257.patch, 0002-MAPREDUCE-6257.patch, 
 0003-MAPREDUCE-6257.patch, EncryptedShuffle.html


 Encrypted spills appear to be completely undocumented.





[jira] [Commented] (MAPREDUCE-6445) Shuffle hang

2015-08-07 Thread Peng Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661599#comment-14661599
 ] 

Peng Zhang commented on MAPREDUCE-6445:
---

I found that most tasks of this job (94 of 100) failed as in MAPREDUCE-6303, 
so this may be related. I'll backport MAPREDUCE-6303 and test on our cluster.

 Shuffle hang
 

 Key: MAPREDUCE-6445
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6445
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Peng Zhang

 The scale cluster has run for months with 2.6.0.
 2 of 200 reduces hang on shuffle.
 The instance 1 log appears to loop on 1 map output:
 {noformat}
 2015-08-06 21:54:14,649 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 2 of 2 to node-132.bj:22408 to fetcher#1
 2015-08-06 21:54:14,651 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: for url=22408/mapOutput?job=job_1438689528746_10193&reduce=20&map=attempt_1438689528746_10193_m_13_0,attempt_1438689528746_10193_m_20_0 sent hash and received reply
 2015-08-06 21:54:14,651 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#1 - MergeManager returned status WAIT ...
 2015-08-06 21:54:14,651 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: node-132.bj:22408 freed by fetcher#1 in 2ms
 2015-08-06 21:54:14,651 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning node-132.bj:22408 with 2 to fetcher#5
 2015-08-06 21:54:14,651 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 2 of 2 to node-132.bj:22408 to fetcher#5
 2015-08-06 21:54:14,656 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.Fetcher: for url=22408/mapOutput?job=job_1438689528746_10193&reduce=20&map=attempt_1438689528746_10193_m_13_0,attempt_1438689528746_10193_m_20_0 sent hash and received reply
 2015-08-06 21:54:14,656 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#5 - MergeManager returned status WAIT ...
 2015-08-06 21:54:14,656 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: node-132.bj:22408 freed by fetcher#5 in 4ms
 2015-08-06 21:54:14,656 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning node-132.bj:22408 with 2 to fetcher#5
 2015-08-06 21:54:14,656 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 2 of 2 to node-132.bj:22408 to fetcher#5
 2015-08-06 21:54:14,660 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.Fetcher: for url=22408/mapOutput?job=job_1438689528746_10193&reduce=20&map=attempt_1438689528746_10193_m_13_0,attempt_1438689528746_10193_m_20_0 sent hash and received reply
 2015-08-06 21:54:14,660 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#5 - MergeManager returned status WAIT ...
 2015-08-06 21:54:14,660 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: node-132.bj:22408 freed by fetcher#5 in 5ms
 2015-08-06 21:54:14,660 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning node-132.bj:22408 with 2 to fetcher#5
 2015-08-06 21:54:14,660 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 2 of 2 to node-132.bj:22408 to fetcher#5
 {noformat}
 The node 2 log appears to loop on 5 map outputs:
 {noformat}
 2015-08-06 21:43:33,626 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning node-172.bj:22408 with 1 to fetcher#5
 2015-08-06 21:43:33,626 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 1 of 1 to node-172.bj:22408 to fetcher#5
 2015-08-06 21:43:33,627 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.Fetcher: for url=22408/mapOutput?job=job_1438689528746_10193&reduce=85&map=attempt_1438689528746_10193_m_13_0,attempt_1438689528746_10193_m_20_0 sent hash and received reply
 2015-08-06 21:43:33,627 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#3 - MergeManager returned status WAIT ...
 2015-08-06 21:43:33,627 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: node-132.bj:22408 freed by fetcher#3 in 5ms
 2015-08-06 21:43:33,627 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning node-179.bj:22408 with 1 to fetcher#3
 2015-08-06 21:43:33,627 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 1 of 1 to node-179.bj:22408 to fetcher#3
 2015-08-06 21:43:33,627 INFO [fetcher#4] 
 org.apache.hadoop.mapreduce.task.reduce.Fetcher: for 
 

[jira] [Updated] (MAPREDUCE-6357) MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute

2015-08-07 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated MAPREDUCE-6357:
---
Attachment: (was: MAPREDUCE-6357-1.patch)

 MultipleOutputs.write() API should document that output committing is not 
 utilized when input path is absolute
 --

 Key: MAPREDUCE-6357
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6357
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.6.0
Reporter: Ivan Mitic
Assignee: Dustin Cote

 After spending the afternoon debugging a user job where reduce tasks were 
 failing on retry with the below exception, I think it would be worthwhile to 
 add a note in the MultipleOutputs.write() documentation, saying that absolute 
 paths may cause improper execution of tasks on retry or when MR speculative 
 execution is enabled. 
 {code}
 2015-04-28 23:13:10,452 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: File already exists:wasb://full20150...@bgtstoragefull.blob.core.windows.net/user/hadoop/some/path/block-r-00299.bz2
     at org.apache.hadoop.fs.azure.NativeAzureFileSystem.create(NativeAzureFileSystem.java:1354)
     at org.apache.hadoop.fs.azure.NativeAzureFileSystem.create(NativeAzureFileSystem.java:1195)
     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
     at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:135)
     at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.getRecordWriter(MultipleOutputs.java:475)
     at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:433)
     at com.ancestry.bigtree.hadoop.LevelReducer.processValue(LevelReducer.java:91)
     at com.ancestry.bigtree.hadoop.LevelReducer.reduce(LevelReducer.java:69)
     at com.ancestry.bigtree.hadoop.LevelReducer.reduce(LevelReducer.java:14)
     at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
     at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:415)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
 {code}
 As discussed in MAPREDUCE-3772, when the baseOutputPath passed to 
 MultipleOutputs.write() is an absolute path (or more precisely a path that 
 resolves outside of the job output-dir), the concept of output committing is 
 not utilized. 
 In this case, the user read through the MultipleOutputs docs and assumed 
 that everything would work fine, as there are blog posts saying that 
 MultipleOutputs does handle output commit. 
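
 The distinction above (a path that resolves outside of the job output-dir)
 can be checked before calling MultipleOutputs.write(). The following is a
 minimal sketch, assuming plain java.net.URI resolution as a stand-in for
 Hadoop's own path handling. The class OutputPathCheck and its method are
 hypothetical helpers, not part of the MultipleOutputs API, and the paths used
 are made-up examples.

```java
import java.net.URI;
import java.util.Objects;

/** Hypothetical pre-flight check; not part of Hadoop. */
final class OutputPathCheck {

    /**
     * Returns true when baseOutputPath, resolved against jobOutputDir, stays
     * under jobOutputDir. Inputs must be valid URI strings (no raw spaces);
     * otherwise URI.create throws IllegalArgumentException.
     */
    static boolean resolvesInsideOutputDir(String jobOutputDir, String baseOutputPath) {
        // A trailing slash makes the URI resolve children under the directory
        // instead of replacing its last segment.
        String dir = jobOutputDir.endsWith("/") ? jobOutputDir : jobOutputDir + "/";
        URI outputDir = URI.create(dir).normalize();
        // resolve() returns absolute URIs unchanged and collapses ".." segments.
        URI resolved = outputDir.resolve(baseOutputPath).normalize();

        return Objects.equals(outputDir.getScheme(), resolved.getScheme())
                && Objects.equals(outputDir.getAuthority(), resolved.getAuthority())
                && resolved.getPath() != null
                && resolved.getPath().startsWith(outputDir.getPath());
    }

    public static void main(String[] args) {
        String outputDir = "hdfs://nn/user/hadoop/out"; // made-up job output directory
        String[] candidates = {
            "level/block",                  // relative: resolves under the output dir
            "/user/hadoop/some/path/block", // absolute: resolves elsewhere
            "../other/block"                // relative, but escapes the output dir
        };
        for (String candidate : candidates) {
            System.out.println(candidate + " -> inside output dir: "
                    + resolvesInsideOutputDir(outputDir, candidate));
        }
    }
}
```

 A job could refuse or warn on any baseOutputPath for which this check returns
 false, since those are the paths for which, per the description above, output
 committing is not utilized.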





[jira] [Updated] (MAPREDUCE-6357) MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute

2015-08-07 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated MAPREDUCE-6357:
---
Attachment: MAPREDUCE-6357-1.patch

Submitting the javadoc changes.  Please let me know if anything looks amiss.  
Thanks!

 MultipleOutputs.write() API should document that output committing is not 
 utilized when input path is absolute
 --

 Key: MAPREDUCE-6357
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6357
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.6.0
Reporter: Ivan Mitic
Assignee: Dustin Cote
 Attachments: MAPREDUCE-6357-1.patch







[jira] [Updated] (MAPREDUCE-6357) MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute

2015-08-07 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated MAPREDUCE-6357:
---
Status: Open  (was: Patch Available)

 MultipleOutputs.write() API should document that output committing is not 
 utilized when input path is absolute
 --

 Key: MAPREDUCE-6357
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6357
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.6.0
Reporter: Ivan Mitic
Assignee: Dustin Cote
 Attachments: MAPREDUCE-6357-1.patch







[jira] [Commented] (MAPREDUCE-6257) Document encrypted spills

2015-08-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661911#comment-14661911
 ] 

Hudson commented on MAPREDUCE-6257:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk #2207 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2207/])
MAPREDUCE-6257. Document encrypted spills (Bibin A Chundatt via aw) (aw: rev 
fb1be0b3100cdd69f6dc1987585fcadd4e7c8a2a)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/EncryptedShuffle.md
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml


 Document encrypted spills
 -

 Key: MAPREDUCE-6257
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6257
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security
Reporter: Allen Wittenauer
Assignee: Bibin A Chundatt
 Fix For: 3.0.0

 Attachments: 0001-MAPREDUCE-6257.patch, 0002-MAPREDUCE-6257.patch, 
 0003-MAPREDUCE-6257.patch, EncryptedShuffle.html


 Encrypted spills appear to be completely undocumented.





[jira] [Commented] (MAPREDUCE-6357) MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute

2015-08-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661962#comment-14661962
 ] 

Hadoop QA commented on MAPREDUCE-6357:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 16m 36s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests.  Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac | 7m 52s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 53s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 0m 46s | The applied patch generated 3 new checkstyle issues (total was 29, now 32). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 23s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 1m 26s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | mapreduce tests | 1m 47s | Tests passed in hadoop-mapreduce-client-core. |
| | | 40m 43s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12749275/MAPREDUCE-6357-1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / b6265d3 |
| checkstyle | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5931/artifact/patchprocess/diffcheckstylehadoop-mapreduce-client-core.txt |
| hadoop-mapreduce-client-core test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5931/artifact/patchprocess/testrun_hadoop-mapreduce-client-core.txt |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5931/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5931/console |


This message was automatically generated.

 MultipleOutputs.write() API should document that output committing is not 
 utilized when input path is absolute
 --

 Key: MAPREDUCE-6357
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6357
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.6.0
Reporter: Ivan Mitic
Assignee: Dustin Cote
 Attachments: MAPREDUCE-6357-1.patch



[jira] [Updated] (MAPREDUCE-6357) MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute

2015-08-07 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated MAPREDUCE-6357:
---
Attachment: MAPREDUCE-6357-1.patch

Fixing a typo and checkstyle warning.  No tests since this is a doc change.

 MultipleOutputs.write() API should document that output committing is not 
 utilized when input path is absolute
 --

 Key: MAPREDUCE-6357
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6357
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.6.0
Reporter: Ivan Mitic
Assignee: Dustin Cote
 Attachments: MAPREDUCE-6357-1.patch


 After spending the afternoon debugging a user job where reduce tasks were 
 failing on retry with the below exception, I think it would be worthwhile to 
 add a note in the MultipleOutputs.write() documentation, saying that absolute 
 paths may cause improper execution of tasks on retry or when MR speculative 
 execution is enabled. 
 {code}
 2015-04-28 23:13:10,452 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: File already exists:wasb://full20150...@bgtstoragefull.blob.core.windows.net/user/hadoop/some/path/block-r-00299.bz2
     at org.apache.hadoop.fs.azure.NativeAzureFileSystem.create(NativeAzureFileSystem.java:1354)
     at org.apache.hadoop.fs.azure.NativeAzureFileSystem.create(NativeAzureFileSystem.java:1195)
     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
     at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:135)
     at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.getRecordWriter(MultipleOutputs.java:475)
     at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:433)
     at com.ancestry.bigtree.hadoop.LevelReducer.processValue(LevelReducer.java:91)
     at com.ancestry.bigtree.hadoop.LevelReducer.reduce(LevelReducer.java:69)
     at com.ancestry.bigtree.hadoop.LevelReducer.reduce(LevelReducer.java:14)
     at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
     at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:415)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
 {code}
 As discussed in MAPREDUCE-3772, when the baseOutputPath passed to 
 MultipleOutputs.write() is an absolute path (or, more precisely, a path that 
 resolves outside of the job output-dir), output committing is not used. 
 In this case, the user had read through the MultipleOutputs docs and assumed 
 that everything would work fine, as there are blog posts saying that 
 MultipleOutputs does handle output commit. 
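
 The path resolution behind this can be illustrated with a plain-Java sketch. 
 This is not Hadoop code: java.net.URI stands in for org.apache.hadoop.fs.Path, 
 and the class name, helper names, and hdfs://nn/... locations are all made up 
 for illustration. It assumes the output file name is resolved as a child of 
 the task attempt's work directory, so that an absolute child replaces the 
 parent path entirely and the file lands where no committer will promote or 
 clean it up.

```java
import java.net.URI;

// Hypothetical model, not part of Hadoop: java.net.URI stands in for
// org.apache.hadoop.fs.Path to show how a child path resolves against a
// task attempt's work directory.
class BaseOutputPathSketch {

    // Resolve the baseOutputPath as a child of the work directory.
    // An absolute child discards the parent's path component.
    static URI resolveAgainstWorkDir(URI workDir, String baseOutputPath) {
        return workDir.resolve(baseOutputPath);
    }

    // True when the resolved file is still located under the work directory,
    // which is the only place a committer could promote or discard it.
    static boolean staysUnderWorkDir(URI workDir, String baseOutputPath) {
        URI resolved = resolveAgainstWorkDir(workDir, baseOutputPath).normalize();
        return resolved.toString().startsWith(workDir.normalize().toString());
    }

    public static void main(String[] args) {
        // Made-up work directory of one task attempt (note the trailing slash).
        URI workDir = URI.create("hdfs://nn/job-out/_temporary/attempt_0/");

        String relative = "levels/block-r-00299";
        String absolute = "/user/hadoop/some/path/block-r-00299";

        // Relative path: stays inside the attempt's work directory.
        System.out.println(resolveAgainstWorkDir(workDir, relative)
                + " under work dir: " + staysUnderWorkDir(workDir, relative));

        // Absolute path: escapes the work directory, so a retried or
        // speculative attempt would try to create the very same file.
        System.out.println(resolveAgainstWorkDir(workDir, absolute)
                + " under work dir: " + staysUnderWorkDir(workDir, absolute));
    }
}
```

 In this model, two attempts of the same task given an absolute baseOutputPath 
 resolve to the same final file, which is consistent with the "File already 
 exists" failure on retry shown above.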



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6357) MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute

2015-08-07 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated MAPREDUCE-6357:
---
Status: Patch Available  (was: Open)

 MultipleOutputs.write() API should document that output committing is not 
 utilized when input path is absolute
 --

 Key: MAPREDUCE-6357
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6357
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.6.0
Reporter: Ivan Mitic
Assignee: Dustin Cote
 Attachments: MAPREDUCE-6357-1.patch







[jira] [Commented] (MAPREDUCE-5870) Support for passing Job priority through Application Submission Context in Mapreduce Side

2015-08-07 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14662064#comment-14662064
 ] 

Eric Payne commented on MAPREDUCE-5870:
---

[~sunilg], for what it's worth, I downloaded the latest patch (version 003) and 
tested it together with the changes that were made for YARN-2003.

I ran two sleep jobs with 10 tasks each on a one-node cluster that can run 5 
containers at once:
- I submitted sleep job1 to the default queue, setting 
{{-Dmapreduce.job.priority=LOW}}.
- Job1 started running 5 containers, with 5 tasks pending.
- I submitted sleep job2 to the default queue, setting 
{{-Dmapreduce.job.priority=HIGH}}.
- All 10 job2 tasks were pending.
- As tasks from job1 completed, job2 got the containers. Although job1 still had 
5 tasks pending, its number of running tasks stayed at 0 until job2 had no more 
pending tasks and job2's running tasks began to complete.
- At that point, job1's tasks began to receive containers again.

I also verified that you can specify {{-Dmapreduce.job.priority=_number_}}, and 
that the container allocations go to the higher-numbered jobs.
Finally, I verified that if you set the priority higher than the cluster max, 
the job priority is silently capped at the cluster max.
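
The behavior observed in this test can be modeled with a small plain-Java 
sketch. It is only an illustration, not the YARN scheduler or the patch: the 
class, the SleepJob type, the helper names, and the numeric priorities are all 
hypothetical. It assumes that each free container goes to the highest-priority 
job that still has pending tasks, and that a requested priority above the 
cluster max is capped.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of the observed behavior; not YARN or MapReduce code.
class PriorityAllocationSketch {

    static final class SleepJob {
        final String name;
        final int priority; // higher number means higher priority
        int pending;
        int running;

        SleepJob(String name, int priority, int tasks) {
            this.name = name;
            this.priority = priority;
            this.pending = tasks;
        }
    }

    // Cap a requested priority at the cluster maximum.
    static int clampToClusterMax(int requested, int clusterMax) {
        return Math.min(requested, clusterMax);
    }

    // Hand each free container to the highest-priority job with pending tasks.
    static void allocate(List<SleepJob> jobs, int freeContainers) {
        while (freeContainers > 0) {
            SleepJob next = jobs.stream()
                    .filter(j -> j.pending > 0)
                    .max(Comparator.comparingInt((SleepJob j) -> j.priority))
                    .orElse(null);
            if (next == null) {
                return; // nothing left to schedule
            }
            next.pending--;
            next.running++;
            freeContainers--;
        }
    }

    public static void main(String[] args) {
        List<SleepJob> jobs = new ArrayList<>();

        // job1 (low priority) arrives first and takes all 5 containers.
        SleepJob job1 = new SleepJob("job1", 1, 10);
        jobs.add(job1);
        allocate(jobs, 5);

        // job2 (high priority) arrives while the cluster is full.
        SleepJob job2 = new SleepJob("job2", clampToClusterMax(9, 4), 10);
        jobs.add(job2);

        // job1's 5 running tasks finish; the freed containers are reallocated.
        job1.running -= 5;
        allocate(jobs, 5);

        for (SleepJob j : jobs) {
            System.out.println(j.name + " priority=" + j.priority
                    + " running=" + j.running + " pending=" + j.pending);
        }
    }
}
```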

So, the bottom line is LGTM :-)
+1

 Support for passing Job priority through Application Submission Context in 
 Mapreduce Side
 -

 Key: MAPREDUCE-5870
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5870
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Reporter: Sunil G
Assignee: Sunil G
 Attachments: 0001-MAPREDUCE-5870.patch, 0002-MAPREDUCE-5870.patch, 
 0003-MAPREDUCE-5870.patch, Yarn-2002.1.patch


 Job priority can be set from the client side in two ways [configuration and API]:
   a.  JobConf.getJobPriority() and 
 Job.setPriority(JobPriority priority) 
   b.  We can also use the configuration property 
 mapreduce.job.priority.
   This job priority can now be passed in the Application Submission 
 Context from the client side.
   Here we can reuse the MRJobConfig.PRIORITY configuration. 





[jira] [Commented] (MAPREDUCE-6357) MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute

2015-08-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14662080#comment-14662080
 ] 

Hadoop QA commented on MAPREDUCE-6357:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  16m 15s | Findbugs (version ) appears to be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear to include any new or modified tests.  Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac |   7m 44s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |  10m 16s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 29s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 27s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | mapreduce tests |   1m 52s | Tests passed in hadoop-mapreduce-client-core. |
| | |  40m 38s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12749289/MAPREDUCE-6357-1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / b6265d3 |
| hadoop-mapreduce-client-core test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5932/artifact/patchprocess/testrun_hadoop-mapreduce-client-core.txt |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5932/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5932/console |


This message was automatically generated.

 MultipleOutputs.write() API should document that output committing is not 
 utilized when input path is absolute
 --

 Key: MAPREDUCE-6357
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6357
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.6.0
Reporter: Ivan Mitic
Assignee: Dustin Cote
 Attachments: MAPREDUCE-6357-1.patch



[jira] [Commented] (MAPREDUCE-6257) Document encrypted spills

2015-08-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661887#comment-14661887
 ] 

Hudson commented on MAPREDUCE-6257:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2226 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2226/])
MAPREDUCE-6257. Document encrypted spills (Bibin A Chundatt via aw) (aw: rev fb1be0b3100cdd69f6dc1987585fcadd4e7c8a2a)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/EncryptedShuffle.md
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml


 Document encrypted spills
 -

 Key: MAPREDUCE-6257
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6257
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security
Reporter: Allen Wittenauer
Assignee: Bibin A Chundatt
 Fix For: 3.0.0

 Attachments: 0001-MAPREDUCE-6257.patch, 0002-MAPREDUCE-6257.patch, 
 0003-MAPREDUCE-6257.patch, EncryptedShuffle.html


 Encrypted spills appear to be completely undocumented.





[jira] [Commented] (MAPREDUCE-6443) Add JvmPauseMonitor to Job History Server

2015-08-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661915#comment-14661915
 ] 

Hudson commented on MAPREDUCE-6443:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk #2207 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2207/])
MAPREDUCE-6443. Add JvmPauseMonitor to JobHistoryServer. Contributed by Robert Kanter. (junping_du: rev e73a928a6360f68aaee2ed58b3a8d180f4051407)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java


 Add JvmPauseMonitor to Job History Server
 -

 Key: MAPREDUCE-6443
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6443
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 2.8.0
Reporter: Robert Kanter
Assignee: Robert Kanter
 Fix For: 2.8.0

 Attachments: MAPREDUCE-6443.001.patch, MAPREDUCE-6443.002.patch


 We should add the {{JvmPauseMonitor}} from HADOOP-9618 to the Job History 
 Server.


