[jira] [Commented] (MAPREDUCE-4946) Type conversion of map completion events leads to performance problems with large jobs

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560547#comment-13560547
 ] 

Hudson commented on MAPREDUCE-4946:
---

Integrated in Hadoop-Yarn-trunk #105 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/105/])
MAPREDUCE-4946. Fix a performance problem for large jobs by reducing the 
number of map completion event type conversions. Contributed by Jason Lowe. 
(Revision 1437103)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437103
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/Job.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MockJobs.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRuntimeEstimators.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/CompletedJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/MockHistoryJobs.java


 Type conversion of map completion events leads to performance problems with 
 large jobs
 --

 Key: MAPREDUCE-4946
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4946
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Fix For: 2.0.3-alpha, 0.23.7

 Attachments: MAPREDUCE-4946-branch-0.23.patch, MAPREDUCE-4946.patch


 We've seen issues with large jobs (e.g.: 13,000 maps and 3,500 reduces) where 
 reducers fail to connect back to the AM after being launched due to 
 connection timeout.  Looking at stack traces of the AM during this time we 
 see a lot of IPC servers stuck waiting for a lock to get the application ID 
 while type converting the map completion events.  What's odd is that normally 
 getting the application ID should be very cheap, but in this case we're 
 type-converting thousands of map completion events for *each* reducer 
 connecting.  That means we end up type-converting the map completion events 
 over 45 million times during the lifetime of the example job (13,000 * 3,500).
 We either need to make the type conversion much cheaper (i.e.: lockless or at 
 least read-write locked) or, even better, store the completion events in a 
 form that does not require type conversion when serving them up to reducers.
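
A minimal sketch of the second option (keeping the completion events in the
form the reducers consume, behind a read-write lock) follows. The class and
field names are hypothetical stand-ins for the real MR types; this illustrates
the idea, not the committed patch.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical event type standing in for the real map completion event.
class CompletionEvent {
    final int mapId;
    CompletionEvent(int mapId) { this.mapId = mapId; }
}

class CompletionEventStore {
    private final List<CompletionEvent> events = new ArrayList<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Called once per finished map attempt (write path is rare).
    void add(CompletionEvent e) {
        lock.writeLock().lock();
        try {
            events.add(e);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Called on every reducer request (read path is hot); readers do not block
    // each other, and no per-call type conversion happens here.
    CompletionEvent[] getEvents(int fromIndex, int max) {
        lock.readLock().lock();
        try {
            int to = Math.min(events.size(), fromIndex + max);
            if (fromIndex >= to) {
                return new CompletionEvent[0];
            }
            return events.subList(fromIndex, to).toArray(new CompletionEvent[0]);
        } finally {
            lock.readLock().unlock();
        }
    }
}
{code}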

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560548#comment-13560548
 ] 

Hudson commented on MAPREDUCE-4808:
---

Integrated in Hadoop-Yarn-trunk #105 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/105/])
MAPREDUCE-4808. Refactor MapOutput and MergeManager to facilitate reuse by 
Shuffle implementations. (masokan via tucu) (Revision 1436936)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1436936
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/dev-support/findbugs-exclude.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryMapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryReader.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManagerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeThread.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestMergeManager.java


 Refactor MapOutput and MergeManager to facilitate reuse by Shuffle 
 implementations
 --

 Key: MAPREDUCE-4808
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Arun C Murthy
Assignee: Mariappan Asokan
 Fix For: 3.0.0

 Attachments: COMBO-mapreduce-4809-4812-4808.patch, M4808-0.patch, 
 M4808-1.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
 mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
 mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
 MergeManagerPlugin.pdf, MR-4808.patch


 Now that Shuffle is pluggable (MAPREDUCE-4049), it would be convenient for 
 alternate implementations to be able to reuse portions of the default 
 implementation. 
 This would come with the strong caveat that these classes are LimitedPrivate 
 and Unstable.
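
As a rough illustration of what such reuse could look like (all names below
are hypothetical and simplified, not the real MAPREDUCE-4808 classes): the
merge machinery sits behind an interface with a stock implementation that an
alternate Shuffle can instantiate directly.

{code:java}
// Hypothetical sketch only: an interface plus a default implementation, so a
// custom shuffle reuses the stock merge path and replaces only the fetch layer.
interface MergeManager<K, V> {
    void reserve(String mapId, long requestedSize); // admit one map output
    void commit(String mapId);                      // hand it to the merger
}

class DefaultMergeManager<K, V> implements MergeManager<K, V> {
    @Override public void reserve(String mapId, long requestedSize) {
        // choose in-memory vs. on-disk staging (elided)
    }
    @Override public void commit(String mapId) {
        // schedule intermediate merges as thresholds are crossed (elided)
    }
}

// An RDMA (or other) shuffle could construct the stock merge manager directly
// instead of re-implementing the merge machinery.
class CustomShuffle<K, V> {
    private final MergeManager<K, V> merger = new DefaultMergeManager<>();

    void onMapOutputFetched(String mapId, long size) {
        merger.reserve(mapId, size);
        merger.commit(mapId);
    }
}
{code}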

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4949) Enable multiple pi jobs to run in parallel

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560552#comment-13560552
 ] 

Hudson commented on MAPREDUCE-4949:
---

Integrated in Hadoop-Yarn-trunk #105 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/105/])
MAPREDUCE-4949. Enable multiple pi jobs to run in parallel. (sandyr via 
tucu) (Revision 1437029)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437029
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/QuasiMonteCarlo.java


 Enable multiple pi jobs to run in parallel
 --

 Key: MAPREDUCE-4949
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4949
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: examples
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Minor
 Fix For: 2.0.3-alpha

 Attachments: MAPREDUCE-4949.patch


 Currently the hadoop pi example uses a hardcoded temporary directory to store 
 its inputs and outputs.  This makes it so that only one pi job can run at a 
 time, and that if it is cancelled, the temporary directory must be manually 
 deleted.
 I propose using a temporary directory based on a timestamp and random number 
 to avoid these conflicts.
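
A hedged sketch of the proposed naming scheme, assuming an illustrative prefix
(not necessarily what the committed change to QuasiMonteCarlo does):

{code:java}
import java.util.Random;

// Derive a per-job temporary directory from a timestamp plus a random number
// so concurrent pi jobs do not collide on a shared hardcoded path.
class PiTempDir {
    static String uniqueTmpDir(String prefix) {
        long now = System.currentTimeMillis();
        int rand = new Random().nextInt(Integer.MAX_VALUE);
        return prefix + "_" + now + "_" + rand;
    }

    public static void main(String[] args) {
        // e.g. QuasiMonteCarlo_1358900000000_123456789
        System.out.println(uniqueTmpDir("QuasiMonteCarlo"));
    }
}
{code}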

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4929) mapreduce.task.timeout is ignored

2013-01-23 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-4929:
-

  Resolution: Fixed
   Fix Version/s: 1.2.0
Target Version/s:   (was: 1.1.2)
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Sandy!

 mapreduce.task.timeout is ignored
 -

 Key: MAPREDUCE-4929
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4929
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 1.2.0

 Attachments: MAPREDUCE-4929-branch-1.patch


 In MR1, only mapred.task.timeout works.  Both should be made to work.
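
One way to honor both keys is a simple fallback read, sketched below with
java.util.Properties standing in for the Hadoop Configuration. The property
names come from the issue; everything else is illustrative and not necessarily
how the committed branch-1 fix works.

{code:java}
import java.util.Properties;

class TaskTimeoutConfig {
    static final String NEW_KEY = "mapreduce.task.timeout";
    static final String OLD_KEY = "mapred.task.timeout";
    static final long DEFAULT_MS = 600_000L; // 10 minutes

    // Prefer the new key, fall back to the legacy key, then to the default.
    static long getTimeoutMs(Properties conf) {
        String value = conf.getProperty(NEW_KEY, conf.getProperty(OLD_KEY));
        return value == null ? DEFAULT_MS : Long.parseLong(value.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(NEW_KEY, "120000");
        System.out.println(getTimeoutMs(conf)); // 120000
    }
}
{code}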

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4946) Type conversion of map completion events leads to performance problems with large jobs

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560645#comment-13560645
 ] 

Hudson commented on MAPREDUCE-4946:
---

Integrated in Hadoop-Hdfs-0.23-Build #503 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/503/])
MAPREDUCE-4946. Fix a performance problem for large jobs by reducing the 
number of map completion event type conversions. Contributed by Jason Lowe. 
(Revision 1437105)

 Result = FAILURE
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437105
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/Job.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MockJobs.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRuntimeEstimators.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/CompletedJob.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/MockHistoryJobs.java


 Type conversion of map completion events leads to performance problems with 
 large jobs
 --

 Key: MAPREDUCE-4946
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4946
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Fix For: 2.0.3-alpha, 0.23.7

 Attachments: MAPREDUCE-4946-branch-0.23.patch, MAPREDUCE-4946.patch


 We've seen issues with large jobs (e.g.: 13,000 maps and 3,500 reduces) where 
 reducers fail to connect back to the AM after being launched due to 
 connection timeout.  Looking at stack traces of the AM during this time we 
 see a lot of IPC servers stuck waiting for a lock to get the application ID 
 while type converting the map completion events.  What's odd is that normally 
 getting the application ID should be very cheap, but in this case we're 
 type-converting thousands of map completion events for *each* reducer 
 connecting.  That means we end up type-converting the map completion events 
 over 45 million times during the lifetime of the example job (13,000 * 3,500).
 We either need to make the type conversion much cheaper (i.e.: lockless or at 
 least read-write locked) or, even better, store the completion events in a 
 form that does not require type conversion when serving them up to reducers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560656#comment-13560656
 ] 

Hudson commented on MAPREDUCE-4808:
---

Integrated in Hadoop-Hdfs-trunk #1294 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1294/])
MAPREDUCE-4808. Refactor MapOutput and MergeManager to facilitate reuse by 
Shuffle implementations. (masokan via tucu) (Revision 1436936)

 Result = FAILURE
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1436936
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/dev-support/findbugs-exclude.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryMapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryReader.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManagerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeThread.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestMergeManager.java


 Refactor MapOutput and MergeManager to facilitate reuse by Shuffle 
 implementations
 --

 Key: MAPREDUCE-4808
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Arun C Murthy
Assignee: Mariappan Asokan
 Fix For: 3.0.0

 Attachments: COMBO-mapreduce-4809-4812-4808.patch, M4808-0.patch, 
 M4808-1.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
 mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
 mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
 MergeManagerPlugin.pdf, MR-4808.patch


 Now that Shuffle is pluggable (MAPREDUCE-4049), it would be convenient for 
 alternate implementations to be able to reuse portions of the default 
 implementation. 
 This would come with the strong caveat that these classes are LimitedPrivate 
 and Unstable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4946) Type conversion of map completion events leads to performance problems with large jobs

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560655#comment-13560655
 ] 

Hudson commented on MAPREDUCE-4946:
---

Integrated in Hadoop-Hdfs-trunk #1294 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1294/])
MAPREDUCE-4946. Fix a performance problem for large jobs by reducing the 
number of map completion event type conversions. Contributed by Jason Lowe. 
(Revision 1437103)

 Result = FAILURE
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437103
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/Job.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MockJobs.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRuntimeEstimators.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/CompletedJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/MockHistoryJobs.java


 Type conversion of map completion events leads to performance problems with 
 large jobs
 --

 Key: MAPREDUCE-4946
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4946
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Fix For: 2.0.3-alpha, 0.23.7

 Attachments: MAPREDUCE-4946-branch-0.23.patch, MAPREDUCE-4946.patch


 We've seen issues with large jobs (e.g.: 13,000 maps and 3,500 reduces) where 
 reducers fail to connect back to the AM after being launched due to 
 connection timeout.  Looking at stack traces of the AM during this time we 
 see a lot of IPC servers stuck waiting for a lock to get the application ID 
 while type converting the map completion events.  What's odd is that normally 
 getting the application ID should be very cheap, but in this case we're 
 type-converting thousands of map completion events for *each* reducer 
 connecting.  That means we end up type-converting the map completion events 
 over 45 million times during the lifetime of the example job (13,000 * 3,500).
 We either need to make the type conversion much cheaper (i.e.: lockless or at 
 least read-write locked) or, even better, store the completion events in a 
 form that does not require type conversion when serving them up to reducers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4949) Enable multiple pi jobs to run in parallel

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560660#comment-13560660
 ] 

Hudson commented on MAPREDUCE-4949:
---

Integrated in Hadoop-Hdfs-trunk #1294 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1294/])
MAPREDUCE-4949. Enable multiple pi jobs to run in parallel. (sandyr via 
tucu) (Revision 1437029)

 Result = FAILURE
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437029
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/QuasiMonteCarlo.java


 Enable multiple pi jobs to run in parallel
 --

 Key: MAPREDUCE-4949
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4949
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: examples
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Minor
 Fix For: 2.0.3-alpha

 Attachments: MAPREDUCE-4949.patch


 Currently the hadoop pi example uses a hardcoded temporary directory to store 
 its inputs and outputs.  This makes it so that only one pi job can run at a 
 time, and that if it is cancelled, the temporary directory must be manually 
 deleted.
 I propose using a temporary directory based on a timestamp and random number 
 to avoid these conflicts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560704#comment-13560704
 ] 

Hudson commented on MAPREDUCE-4808:
---

Integrated in Hadoop-Mapreduce-trunk #1322 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1322/])
MAPREDUCE-4808. Refactor MapOutput and MergeManager to facilitate reuse by 
Shuffle implementations. (masokan via tucu) (Revision 1436936)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1436936
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/dev-support/findbugs-exclude.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryMapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryReader.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManagerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeThread.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestMergeManager.java


 Refactor MapOutput and MergeManager to facilitate reuse by Shuffle 
 implementations
 --

 Key: MAPREDUCE-4808
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Arun C Murthy
Assignee: Mariappan Asokan
 Fix For: 3.0.0

 Attachments: COMBO-mapreduce-4809-4812-4808.patch, M4808-0.patch, 
 M4808-1.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
 mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
 mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
 MergeManagerPlugin.pdf, MR-4808.patch


 Now that Shuffle is pluggable (MAPREDUCE-4049), it would be convenient for 
 alternate implementations to be able to reuse portions of the default 
 implementation. 
 This would come with the strong caveat that these classes are LimitedPrivate 
 and Unstable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4946) Type conversion of map completion events leads to performance problems with large jobs

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560703#comment-13560703
 ] 

Hudson commented on MAPREDUCE-4946:
---

Integrated in Hadoop-Mapreduce-trunk #1322 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1322/])
MAPREDUCE-4946. Fix a performance problem for large jobs by reducing the 
number of map completion event type conversions. Contributed by Jason Lowe. 
(Revision 1437103)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437103
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/Job.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MockJobs.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRuntimeEstimators.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/CompletedJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/MockHistoryJobs.java


 Type conversion of map completion events leads to performance problems with 
 large jobs
 --

 Key: MAPREDUCE-4946
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4946
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Fix For: 2.0.3-alpha, 0.23.7

 Attachments: MAPREDUCE-4946-branch-0.23.patch, MAPREDUCE-4946.patch


 We've seen issues with large jobs (e.g.: 13,000 maps and 3,500 reduces) where 
 reducers fail to connect back to the AM after being launched due to 
 connection timeout.  Looking at stack traces of the AM during this time we 
 see a lot of IPC servers stuck waiting for a lock to get the application ID 
 while type converting the map completion events.  What's odd is that normally 
 getting the application ID should be very cheap, but in this case we're 
 type-converting thousands of map completion events for *each* reducer 
 connecting.  That means we end up type-converting the map completion events 
 over 45 million times during the lifetime of the example job (13,000 * 3,500).
 We either need to make the type conversion much cheaper (i.e.: lockless or at 
 least read-write locked) or, even better, store the completion events in a 
 form that does not require type conversion when serving them up to reducers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4949) Enable multiple pi jobs to run in parallel

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560708#comment-13560708
 ] 

Hudson commented on MAPREDUCE-4949:
---

Integrated in Hadoop-Mapreduce-trunk #1322 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1322/])
MAPREDUCE-4949. Enable multiple pi jobs to run in parallel. (sandyr via 
tucu) (Revision 1437029)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437029
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/QuasiMonteCarlo.java


 Enable multiple pi jobs to run in parallel
 --

 Key: MAPREDUCE-4949
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4949
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: examples
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Minor
 Fix For: 2.0.3-alpha

 Attachments: MAPREDUCE-4949.patch


 Currently the hadoop pi example uses a hardcoded temporary directory to store 
 its inputs and outputs.  This makes it so that only one pi job can run at a 
 time, and that if it is cancelled, the temporary directory must be manually 
 deleted.
 I propose using a temporary directory based on a timestamp and random number 
 to avoid these conflicts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4951) Container preemption interpreted as task failure

2013-01-23 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560736#comment-13560736
 ] 

Jason Lowe commented on MAPREDUCE-4951:
---

bq. having the RM ask the AM to kill the container in case of preemption would 
likely not work as the AM cannot be trusted.

Agreed, I was thinking of exactly the alternative you propose, where preemption 
has potentially two phases: a "please, AM, preempt that container you have" 
request, with a watchdog timer so the RM kills the container forcefully if the 
AM does not comply in a reasonable amount of time.  This eliminates the race 
where the container can fail because of the preemption, and it provides a way 
for the AM to potentially checkpoint the state of the container for faster 
recovery.  However, it does mean the latency for container availability would 
be higher, since the AM will have a grace period before relinquishing the 
resources.
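
A hypothetical sketch of that watchdog flow (all class and method names are
illustrative; this is not the YARN API):

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Phase 1: ask the AM to release the container; phase 2: kill it forcefully
// if the AM has not complied within the grace period.
class PreemptionWatchdog {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();
    private final ConcurrentHashMap<String, ScheduledFuture<?>> pending =
            new ConcurrentHashMap<>();

    void requestPreemption(String containerId, long graceMs,
                           Runnable askAm, Runnable forceKill) {
        askAm.run(); // "please preempt that container"
        ScheduledFuture<?> kill = timer.schedule(() -> {
            pending.remove(containerId);
            forceKill.run(); // the RM kills it itself
        }, graceMs, TimeUnit.MILLISECONDS);
        pending.put(containerId, kill);
    }

    // Called when the AM reports it released the container in time.
    void onContainerReleased(String containerId) {
        ScheduledFuture<?> kill = pending.remove(containerId);
        if (kill != null) {
            kill.cancel(false);
        }
    }
}
{code}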

 Container preemption interpreted as task failure
 

 Key: MAPREDUCE-4951
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4951
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mr-am, mrv2
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-4951-1.patch, MAPREDUCE-4951-2.patch, 
 MAPREDUCE-4951.patch


 When YARN reports a completed container to the MR AM, it always interprets it 
 as a failure.  This can lead to a job failing because too many of its tasks 
 failed, when in fact they only failed because the scheduler preempted them.
 MR needs to recognize the special exit code value of -100 and interpret it as 
 a container being killed instead of a container failure.
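
A minimal sketch of the requested classification, assuming -100 is the
preemption exit status mentioned above; the class and enum are illustrative,
not the MR AM's actual types:

{code:java}
class ContainerExitClassifier {
    static final int PREEMPTED_EXIT_STATUS = -100;

    enum Outcome { TASK_FAILED, TASK_KILLED }

    // A preempted container counts as a kill, not a task attempt failure,
    // so it does not push the job toward its failure limit.
    static Outcome classify(int exitStatus) {
        return exitStatus == PREEMPTED_EXIT_STATUS
                ? Outcome.TASK_KILLED : Outcome.TASK_FAILED;
    }

    public static void main(String[] args) {
        System.out.println(classify(-100)); // TASK_KILLED
        System.out.println(classify(1));    // TASK_FAILED
    }
}
{code}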

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4951) Container preemption interpreted as task failure

2013-01-23 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560903#comment-13560903
 ] 

Bikas Saha commented on MAPREDUCE-4951:
---

We might be digressing from this jira here, but I really don't think the 2-step 
approach is worth its complexity. The main scenario where it makes sense is 
when the task has the ability to checkpoint its work before getting preempted. 
I haven't seen this capability outside of basic research prototypes. It's much 
simpler to have the preemption be an RM-only action. We do need to fix the 
action and information loop so that AMs can get correct information about the 
infrastructure's actions.

 Container preemption interpreted as task failure
 

 Key: MAPREDUCE-4951
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4951
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mr-am, mrv2
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-4951-1.patch, MAPREDUCE-4951-2.patch, 
 MAPREDUCE-4951.patch


 When YARN reports a completed container to the MR AM, it always interprets it 
 as a failure.  This can lead to a job failing because too many of its tasks 
 failed, when in fact they only failed because the scheduler preempted them.
 MR needs to recognize the special exit code value of -100 and interpret it as 
 a container being killed instead of a container failure.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service

2013-01-23 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560916#comment-13560916
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4049:
---

Avner, sounds good. We can do another review iteration on your updated patch; 
it will be easier looking at concrete code. Thx

 plugin for generic shuffle service
 --

 Key: MAPREDUCE-4049
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: performance, task, tasktracker
Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
Reporter: Avner BenHanoch
Assignee: Avner BenHanoch
  Labels: merge, plugin, rdma, shuffle
 Fix For: 3.0.0

 Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, 
 MAPREDUCE-4049--branch-1.patch, mapreduce-4049.patch


 Support a generic shuffle service as a set of two plugins: ShuffleProvider & 
 ShuffleConsumer.
 This will satisfy the following needs:
 # Better shuffle and merge performance. For example: we are working on a 
 shuffle plugin that performs the shuffle over RDMA in fast networks (10GbE, 
 40GbE, or InfiniBand) instead of using the current HTTP shuffle. Based on the 
 fast RDMA shuffle, the plugin can also utilize a suitable merge approach 
 during the intermediate merges, and hence get much better performance.
 # Satisfy MAPREDUCE-3060 - a generic shuffle service that avoids a hidden 
 dependency of the NodeManager on a specific version of the mapreduce shuffle 
 (currently targeted to 0.24.0).
 References:
 # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu 
 from Auburn University with others, 
 [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
 # I am attaching 2 documents with the suggested Top Level Design for both 
 plugins (currently based on the 1.0 branch).
 # I am providing a link for downloading UDA - Mellanox's open source plugin 
 that implements the generic shuffle service using RDMA and levitated merge. 
 Note: at this phase the code is in C++ through JNI and you should consider it 
 beta only. Still, it can serve anyone who wants to implement or contribute to 
 levitated merge. (Please be advised that levitated merge is mostly suited to 
 very fast networks.) - 
 [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]
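
A hypothetical sketch of the ShuffleProvider/ShuffleConsumer split described
above. The interface shapes, property keys, and class names are invented for
illustration and are not the API proposed in the attached design documents.

{code:java}
import java.util.Iterator;
import java.util.Map;
import java.util.Properties;

// The node-side plugin serves map outputs (e.g. over RDMA or HTTP).
interface ShuffleProvider {
    void initialize(Properties conf);
    void serve(String mapId, int reduceId);
}

// The reduce-side plugin fetches map outputs and merges them.
interface ShuffleConsumer<K, V> {
    void initialize(Properties conf);
    Iterator<Map.Entry<K, V>> fetchAndMerge(int reduceId);
}

class ShufflePluginLoader {
    // Illustrative configuration keys for selecting a provider/consumer pair:
    //   mapreduce.shuffle.provider.class = com.example.RdmaShuffleProvider
    //   mapreduce.shuffle.consumer.class = com.example.RdmaShuffleConsumer
    static Object load(Properties conf, String key, String defaultClass)
            throws Exception {
        String clazz = conf.getProperty(key, defaultClass);
        return Class.forName(clazz).getDeclaredConstructor().newInstance();
    }
}
{code}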

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4838) Add extra info to JH files

2013-01-23 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561105#comment-13561105
 ] 

Siddharth Seth commented on MAPREDUCE-4838:
---

Took a quick look. This patch looks better but still needs some fixes.
- Unit tests should use the new properties defined in MRJobConfig.*
- In the case of reduce tasks, the container host doesn't need to be resolved 
(nor in the case where dataLocalHosts is empty).
- The null check is still required in the history events, since these values 
don't need to be set.
- TaskAttemptImpl has a repeated dataLocalHosts assignment.
- RMContainerAllocator has an unused import.
- Formatting fix needed in TestJobImpl (patch line 377).
- Since resolveHosts has been changed to work with a HashSet, TaskAttemptImpl 
itself could store that set instead of iterating over an array to match the 
container host.

Most of the test changes look good as well. I need to take a better look at 
some of them, though. Thanks.

 Add extra info to JH files
 --

 Key: MAPREDUCE-4838
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4838
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Arun C Murthy
Assignee: Zhijie Shen
 Attachments: MAPREDUCE-4838_1.patch, MAPREDUCE-4838_2.patch, 
 MAPREDUCE-4838.patch


 It will be useful to add more task-info to JH for analytics.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-4956) The Additional JH Info Should Be Exposed

2013-01-23 Thread Zhijie Shen (JIRA)
Zhijie Shen created MAPREDUCE-4956:
--

 Summary: The Additional JH Info Should Be Exposed
 Key: MAPREDUCE-4956
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4956
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Zhijie Shen
Assignee: Zhijie Shen


In MAPREDUCE-4838, the additional info has been added to JH. It would be useful 
to expose this info, at least via the UI.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4838) Add extra info to JH files

2013-01-23 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated MAPREDUCE-4838:
---

Attachment: MAPREDUCE-4838_3.patch

Hi Sid,

Your new comments have been addressed. Please have a look at the latest 
attached patch file.

Thanks,
Zhijie

 Add extra info to JH files
 --

 Key: MAPREDUCE-4838
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4838
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Arun C Murthy
Assignee: Zhijie Shen
 Attachments: MAPREDUCE-4838_1.patch, MAPREDUCE-4838_2.patch, 
 MAPREDUCE-4838_3.patch, MAPREDUCE-4838.patch


 It will be useful to add more task-info to JH for analytics.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and mapreduce.framework.name is local

2013-01-23 Thread yi (JIRA)
yi created MAPREDUCE-4957:
-

 Summary: Throw FileNotFoundException when running in single node 
and mapreduce.framework.name is local
 Key: MAPREDUCE-4957
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4957
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: yi
Assignee: yi
Priority: Minor


java.io.FileNotFoundException: File does not exist: 
/root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar 
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
 
at 
org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
 
at 
org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
 
at 
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
 
at 
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
 
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
 
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) 
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:396) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
 
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) 
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) 
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:396) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
 
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) 
at 
org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446) 
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 
at java.lang.reflect.Method.invoke(Method.java:597) 
at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
Job Submission failed with exception 'java.io.FileNotFoundException(File does 
not exist: 
/root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and mapreduce.framework.name is local

2013-01-23 Thread yi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561367#comment-13561367
 ] 

yi commented on MAPREDUCE-4957:
---

The file path added by DistributedCache.addFileToClassPath should be 
qualified against the correct filesystem.
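
A hedged illustration of that point, assuming hadoop-common on the classpath
(this is not necessarily the actual MAPREDUCE-4957 patch): qualify the entry
against the filesystem it actually lives on before it is added to the
distributed cache, e.g. the local filesystem when running with the local
framework, so the job submitter does not later look the path up on HDFS.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class QualifyClasspathEntry {
    // Returns e.g. file:/root/proj/hive-trunk/build/dist/lib/... instead of a
    // scheme-less path that the default (HDFS) filesystem would be asked to
    // resolve during job submission.
    public static Path qualifyOnLocalFs(String file, Configuration conf)
            throws Exception {
        FileSystem localFs = FileSystem.getLocal(conf);
        return localFs.makeQualified(new Path(file));
    }
}
{code}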

 Throw FileNotFoundException when running in single node and 
 mapreduce.framework.name is local
 ---

 Key: MAPREDUCE-4957
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4957
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: yi
Assignee: yi
Priority: Minor

 java.io.FileNotFoundException: File does not exist: 
 /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar 
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
  
 at 
 org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
  
 at 
 org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
  
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
  
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
  
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
  
 at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) 
 at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) 
 at java.security.AccessController.doPrivileged(Native Method) 
 at javax.security.auth.Subject.doAs(Subject.java:396) 
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
  
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) 
 at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) 
 at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) 
 at java.security.AccessController.doPrivileged(Native Method) 
 at javax.security.auth.Subject.doAs(Subject.java:396) 
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
  
 at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) 
 at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446) 
 at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683) 
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  
 at java.lang.reflect.Method.invoke(Method.java:597) 
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
 Job Submission failed with exception 'java.io.FileNotFoundException(File does 
 not exist: 
 /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and mapreduce.framework.name is local

2013-01-23 Thread yi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yi updated MAPREDUCE-4957:
--

Attachment: MAPREDUCE-4957.patch

 Throw FileNotFoundException when running in single node and 
 mapreduce.framework.name is local
 ---

 Key: MAPREDUCE-4957
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4957
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: yi
Assignee: yi
Priority: Minor
 Attachments: MAPREDUCE-4957.patch


 java.io.FileNotFoundException: File does not exist: 
 /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar 
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
  
 at 
 org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
  
 at 
 org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
  
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
  
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
  
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
  
 at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) 
 at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) 
 at java.security.AccessController.doPrivileged(Native Method) 
 at javax.security.auth.Subject.doAs(Subject.java:396) 
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
  
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) 
 at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) 
 at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) 
 at java.security.AccessController.doPrivileged(Native Method) 
 at javax.security.auth.Subject.doAs(Subject.java:396) 
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
  
 at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) 
 at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446) 
 at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683) 
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  
 at java.lang.reflect.Method.invoke(Method.java:597) 
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
 Job Submission failed with exception 'java.io.FileNotFoundException(File does 
 not exist: 
 /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and mapreduce.framework.name is local

2013-01-23 Thread yi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yi updated MAPREDUCE-4957:
--

Status: Patch Available  (was: Open)

A patch has been attached; please help review it.

 Throw FileNotFoundException when running in single node and 
 mapreduce.framework.name is local
 ---

 Key: MAPREDUCE-4957
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4957
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: yi
Assignee: yi
Priority: Minor
 Attachments: MAPREDUCE-4957.patch


 Running on a single node with mapreduce.framework.name set to local produces 
 the following error:
 java.io.FileNotFoundException: File does not exist: 
 /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar 
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
  
 at 
 org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
  
 at 
 org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
  
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
  
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
  
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
  
 at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) 
 at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) 
 at java.security.AccessController.doPrivileged(Native Method) 
 at javax.security.auth.Subject.doAs(Subject.java:396) 
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
  
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) 
 at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) 
 at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) 
 at java.security.AccessController.doPrivileged(Native Method) 
 at javax.security.auth.Subject.doAs(Subject.java:396) 
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
  
 at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) 
 at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446) 
 at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683) 
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  
 at java.lang.reflect.Method.invoke(Method.java:597) 
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
 Job Submission failed with exception 'java.io.FileNotFoundException(File does 
 not exist: 
 /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and mapreduce.framework.name is local

2013-01-23 Thread yi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yi updated MAPREDUCE-4957:
--

Description: 
When running on a single node with mapreduce.framework.name set to local, the following error occurs:

java.io.FileNotFoundException: File does not exist: /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446)
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'

  was:
java.io.FileNotFoundException: File does not exist: /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446)
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'

[jira] [Commented] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and mapreduce.framework.name is local

2013-01-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561407#comment-13561407
 ] 

Hadoop QA commented on MAPREDUCE-4957:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12566240/MAPREDUCE-4957.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 2024 javac 
compiler warnings (more than the trunk's current 2021 warnings).

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3265//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3265//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3265//console

This message is automatically generated.

 Throw FileNotFoundException when running in single node and 
 mapreduce.framework.name is local
 ---

 Key: MAPREDUCE-4957
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4957
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: yi
Assignee: yi
Priority: Minor
 Attachments: MAPREDUCE-4957.patch


 When running on a single node with mapreduce.framework.name set to local, the following error occurs:
 java.io.FileNotFoundException: File does not exist: /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar
 at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
 at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
 at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
 at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
 at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
 at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
 at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
 at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215)
 at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617)
 at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
 at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612)
 at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446)
 at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Created] (MAPREDUCE-4958) close method of RawKeyValueIterator is not called after use

2013-01-23 Thread Jerry Chen (JIRA)
Jerry Chen created MAPREDUCE-4958:
-

 Summary: close method of RawKeyValueIterator is not called after use
 Key: MAPREDUCE-4958
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4958
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: trunk
Reporter: Jerry Chen


I observed that the close method of the RawKeyValueIterator returned from 
MergeManager is not called.

This causes resource leaks for RawKeyValueIterator implementations that depend 
on RawKeyValueIterator.close to perform cleanup when they are finished.

Some other places in MapTask also do not follow the convention of calling 
RawKeyValueIterator.close after using the iterator.
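
A minimal sketch of the convention being asked for, assuming the caller drives the iterator directly; the consume step below is a placeholder for whatever the merge/reduce code does with each record. Wrapping iteration in try/finally guarantees close() runs even if processing fails.

{code:java}
import java.io.IOException;

import org.apache.hadoop.io.DataInputBuffer;
import org.apache.hadoop.mapred.RawKeyValueIterator;

public final class IteratorCleanupSketch {

  /** Drains the iterator and always closes it, even if record handling fails. */
  static void drain(RawKeyValueIterator rIter) throws IOException {
    try {
      while (rIter.next()) {
        consume(rIter.getKey(), rIter.getValue());
      }
    } finally {
      // Guarantees cleanup for implementations that release resources in close().
      rIter.close();
    }
  }

  private static void consume(DataInputBuffer key, DataInputBuffer value) {
    // Placeholder for real record handling.
  }
}
{code}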



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4951) Container preemption interpreted as task failure

2013-01-23 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561449#comment-13561449
 ] 

Sandy Ryza commented on MAPREDUCE-4951:
---

It doesn't seem to me that either approach would conflict with this patch at 
the moment.  While this code might get rewritten in the future, under the 
current preemption mechanism, when MR is explicitly told that a container was 
preempted, it should not count it as failed.  Does anybody disagree?
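
For illustration only (not the attached patch), the check being discussed can be as small as the sketch below: treat a completed container whose exit status equals the preemption code (-100, per the issue description) as killed rather than failed, so it does not count toward the job's failure limit. The class and enum names here are hypothetical.

{code:java}
import org.apache.hadoop.yarn.api.records.ContainerStatus;

final class PreemptionAwareCompletionSketch {

  // Exit code the issue description says YARN uses for preempted containers.
  private static final int PREEMPTED_EXIT_STATUS = -100;

  enum ContainerOutcome { FAILED, KILLED }

  /** Classifies a completed container instead of unconditionally counting it as failed. */
  ContainerOutcome classify(ContainerStatus status) {
    if (status.getExitStatus() == PREEMPTED_EXIT_STATUS) {
      // Preempted by the scheduler: behaves like a kill, not a task failure.
      return ContainerOutcome.KILLED;
    }
    return ContainerOutcome.FAILED;
  }
}
{code}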

 Container preemption interpreted as task failure
 

 Key: MAPREDUCE-4951
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4951
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mr-am, mrv2
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-4951-1.patch, MAPREDUCE-4951-2.patch, 
 MAPREDUCE-4951.patch


 When YARN reports a completed container to the MR AM, it always interprets it 
 as a failure.  This can lead to a job failing because too many of its tasks 
 failed, when in fact they only failed because the scheduler preempted them.
 MR needs to recognize the special exit code value of -100 and interpret it as 
 a container being killed instead of a container failure.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and mapreduce.framework.name is local

2013-01-23 Thread yi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yi updated MAPREDUCE-4957:
--

Attachment: MAPREDUCE-4957.patch

 Throw FileNotFoundException when running in single node and 
 mapreduce.framework.name is local
 ---

 Key: MAPREDUCE-4957
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4957
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: yi
Assignee: yi
Priority: Minor
 Attachments: MAPREDUCE-4957.patch, MAPREDUCE-4957.patch


 When running on a single node with mapreduce.framework.name set to local, the following error occurs:
 java.io.FileNotFoundException: File does not exist: /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar
 at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
 at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
 at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
 at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
 at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
 at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
 at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
 at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215)
 at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617)
 at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
 at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612)
 at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446)
 at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira