[jira] [Commented] (MAPREDUCE-5984) native-task: upgrade lz4 to latest version

2014-07-22 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069939#comment-14069939
 ] 

Binglin Chang commented on MAPREDUCE-5984:
--

bq.  but I'm wondering if it's possible to reuse the lz4 source files that are 
already checked in for hadoop-common
Sure, I will update the patch to copy the lz4 files to the build path. And we 
can upgrade the version in hadoop-common in trunk. 


 native-task: upgrade lz4 to latest version
 ---

 Key: MAPREDUCE-5984
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5984
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: task
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Attachments: MAPREDUCE-5984.v1.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-2841) Task level native optimization

2014-07-22 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069945#comment-14069945
 ] 

Binglin Chang commented on MAPREDUCE-2841:
--

Hi Sean, the test succeeded on Mac OS X but failed on Ubuntu 12, so I updated 
the test a little in MAPREDUCE-5985.

 Task level native optimization
 --

 Key: MAPREDUCE-2841
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2841
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: task
 Environment: x86-64 Linux/Unix
Reporter: Binglin Chang
Assignee: Sean Zhong
 Attachments: DESIGN.html, MAPREDUCE-2841.v1.patch, 
 MAPREDUCE-2841.v2.patch, dualpivot-0.patch, dualpivotv20-0.patch, 
 fb-shuffle.patch, hadoop-3.0-mapreduce-2841-2014-7-17.patch


 I've recently been working on native optimization for MapTask based on JNI. 
 The basic idea is to add a NativeMapOutputCollector to handle k/v pairs 
 emitted by the mapper, so that sort, spill, and IFile serialization can all be 
 done in native code. A preliminary test (on Xeon E5410, jdk6u24) showed 
 promising results:
 1. Sort is about 3x-10x as fast as Java (only binary string compare is 
 supported).
 2. IFile serialization speed is about 3x that of Java, about 500MB/s; if 
 hardware CRC32C is used, things can get much faster(1G/
 3. Merge code is not completed yet, so the test uses enough io.sort.mb to 
 prevent mid-spill.
 This leads to a total speedup of 2x~3x for the whole MapTask if 
 IdentityMapper (a mapper that does nothing) is used.
 There are limitations, of course: currently only Text and BytesWritable are 
 supported, and I have not thought through many things yet, such as how to 
 support map-side combine. I had some discussion with somebody familiar with 
 Hive, and it seems these limitations won't be much of a problem for Hive to 
 benefit from these optimizations, at least. Advice or discussion about 
 improving compatibility is most welcome:) 
 Currently NativeMapOutputCollector has a static method called canEnable(), 
 which checks whether the key/value types, comparator type, and combiner are 
 all compatible; MapTask can then choose to enable NativeMapOutputCollector.
 This is only a preliminary test and more work needs to be done. I expect 
 better final results, and I believe similar optimizations can be applied to 
 the reduce task and shuffle too. 
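 As a rough illustration of the check described above, canEnable() might look 
 like the sketch below. This is a hypothetical sketch, not the code from the 
 attached patches; the supported-type, comparator, and combiner checks are 
 assumptions based on the limitations listed above.
 {code}
 // Hypothetical sketch only -- not the implementation in the patch.
 import org.apache.hadoop.io.BytesWritable;
 import org.apache.hadoop.io.Text;
 import org.apache.hadoop.mapred.JobConf;

 public class NativeMapOutputCollector {
   /**
    * Returns true only when the job's key/value types, comparator and
    * combiner are all things the native collector can handle, so MapTask
    * can safely switch to the native path.
    */
   public static boolean canEnable(JobConf job) {
     Class<?> keyClass = job.getMapOutputKeyClass();
     Class<?> valueClass = job.getMapOutputValueClass();
     boolean supportedTypes =
         (keyClass == Text.class || keyClass == BytesWritable.class)
         && (valueClass == Text.class || valueClass == BytesWritable.class);
     // Only the default byte-wise comparator is supported natively, so bail
     // out if a custom output key comparator class was configured.
     // (Config key shown for illustration.)
     boolean defaultComparator =
         job.getClass("mapred.output.key.comparator.class", null) == null;
     // Map-side combine is not handled natively yet.
     boolean noCombiner = job.getCombinerClass() == null;
     return supportedTypes && defaultComparator && noCombiner;
   }
 }
 {code}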



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-5988) Fix dead links to the javadocs of o.a.h.mapreduce.counters

2014-07-22 Thread Akira AJISAKA (JIRA)
Akira AJISAKA created MAPREDUCE-5988:


 Summary: Fix dead links to the javadocs of o.a.h.mapreduce.counters
 Key: MAPREDUCE-5988
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.1
Reporter: Akira AJISAKA
Priority: Minor


In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, 
AbstractCounters and CounterGroupBase are listed, but not linked.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5988) Fix dead links to the javadocs of o.a.h.mapreduce.counters

2014-07-22 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5988:
-

Attachment: MAPREDUCE-5988.patch

Removing {{@InterfaceAudience.Private}} from the package-info to generate the 
javadocs of {{CounterGroupBase}} and {{AbstractCounters}}.
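For context, a minimal sketch of what such a package-info.java looks like 
before the change is below (simplified; the real file may carry additional 
annotations such as {{@InterfaceStability}}). Removing the package-level 
{{@InterfaceAudience.Private}} line lets javadoc be generated for the package, 
so the entries in allclasses-frame.html get a target page to link to.

{code}
// Sketch of a package-info.java before the change (simplified):
@InterfaceAudience.Private          // <-- the line the patch removes
package org.apache.hadoop.mapreduce.counters;
import org.apache.hadoop.classification.InterfaceAudience;
{code}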

 Fix dead links to the javadocs of o.a.h.mapreduce.counters
 --

 Key: MAPREDUCE-5988
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.1
Reporter: Akira AJISAKA
Priority: Minor
 Attachments: MAPREDUCE-5988.patch


 In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, 
 AbstractCounters and CounterGroupBase are listed, but not linked.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project

2014-07-22 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5988:
-

Summary: Fix dead links to the javadocs in mapreduce project  (was: Fix 
dead links to the javadocs of o.a.h.mapreduce.counters)

 Fix dead links to the javadocs in mapreduce project
 ---

 Key: MAPREDUCE-5988
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.1
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Attachments: MAPREDUCE-5988.patch


 In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some 
 classes are listed, but not linked.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5988) Fix dead links to the javadocs of o.a.h.mapreduce.counters

2014-07-22 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5988:
-

Description: In 
http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some classes 
are listed, but not linked.  (was: In 
http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, 
AbstractCounters and CounterGroupBase are listed, but not linked.)
   Assignee: Akira AJISAKA

 Fix dead links to the javadocs of o.a.h.mapreduce.counters
 --

 Key: MAPREDUCE-5988
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.1
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Attachments: MAPREDUCE-5988.patch


 In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some 
 classes are listed, but not linked.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project

2014-07-22 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070062#comment-14070062
 ] 

Akira AJISAKA commented on MAPREDUCE-5988:
--

The classes below are linked, but undocumented:
- AbstractCounters
- CounterGroupBase
- CancelDelegationTokenRequest
- CancelDelegationTokenResponse
- GetDelegationTokenRequest
- RenewDelegationTokenRequest
- RenewDelegationTokenResponse
- HistoryFileManager
- HistoryStorage

 Fix dead links to the javadocs in mapreduce project
 ---

 Key: MAPREDUCE-5988
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.1
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Attachments: MAPREDUCE-5988.patch


 In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some 
 classes are listed, but not linked.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project

2014-07-22 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5988:
-

Description: In 
http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some classes 
are listed, but not documented.  (was: In 
http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some classes 
are listed, but not linked.)

 Fix dead links to the javadocs in mapreduce project
 ---

 Key: MAPREDUCE-5988
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.1
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Attachments: MAPREDUCE-5988.patch


 In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some 
 classes are listed, but not documented.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project

2014-07-22 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5988:
-

Attachment: MAPREDUCE-5988.2.patch

Removed {{@InterfaceAudience.Private}} from each package-info. I confirmed the 
javadocs of the above classes were generated.

 Fix dead links to the javadocs in mapreduce project
 ---

 Key: MAPREDUCE-5988
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.1
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Attachments: MAPREDUCE-5988.2.patch, MAPREDUCE-5988.patch


 In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some 
 classes are listed, but not documented.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project

2014-07-22 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5988:
-

Target Version/s: 2.6.0
  Status: Patch Available  (was: Open)

 Fix dead links to the javadocs in mapreduce project
 ---

 Key: MAPREDUCE-5988
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.1
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Attachments: MAPREDUCE-5988.2.patch, MAPREDUCE-5988.patch


 In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some 
 classes are listed, but not documented.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project

2014-07-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070091#comment-14070091
 ] 

Hadoop QA commented on MAPREDUCE-5988:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12657101/MAPREDUCE-5988.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs:

  org.apache.hadoop.mapreduce.v2.hs.TestJobHistoryParsing
  org.apache.hadoop.mapreduce.v2.hs.webapp.dao.TestJobInfo

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4760//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4760//console

This message is automatically generated.

 Fix dead links to the javadocs in mapreduce project
 ---

 Key: MAPREDUCE-5988
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.1
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Attachments: MAPREDUCE-5988.2.patch, MAPREDUCE-5988.patch


 In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some 
 classes are listed, but not documented.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used

2014-07-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070121#comment-14070121
 ] 

Hudson commented on MAPREDUCE-5957:
---

FAILURE: Integrated in Hadoop-Yarn-trunk #620 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/620/])
MAPREDUCE-5957. AM throws ClassNotFoundException with job classloader enabled 
if custom output format/committer is used. Contributed by Sangjin Lee (jlowe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612358)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/commit/CommitterEventHandler.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobs.java


 AM throws ClassNotFoundException with job classloader enabled if custom 
 output format/committer is used
 ---

 Key: MAPREDUCE-5957
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5957
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Fix For: 3.0.0, 2.6.0

 Attachments: MAPREDUCE-5957.branch-2.patch, MAPREDUCE-5957.patch, 
 MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, 
 MAPREDUCE-5957.patch, MAPREDUCE-5957.patch


 With the job classloader enabled, the MR AM throws ClassNotFoundException if 
 a custom output format class is specified.
 {noformat}
 org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
 java.lang.RuntimeException: java.lang.ClassNotFoundException: Class 
 com.foo.test.TestOutputFormat not found
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:473)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:374)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1459)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389)
 Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: 
 Class com.foo.test.TestOutputFormat not found
   at 
 org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895)
   at 
 org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:222)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:469)
   ... 8 more
 Caused by: java.lang.ClassNotFoundException: Class 
 com.foo.test.TestOutputFormat not found
   at 
 org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801)
   at 
 org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893)
   ... 10 more
 {noformat}
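 A minimal reproduction sketch is below; the class com.foo.test.TestOutputFormat 
 stands in for any user-supplied output format shipped in the job jar, and the 
 exact job setup here is an assumption, not taken from this report.
 {code}
 import com.foo.test.TestOutputFormat;           // hypothetical user class
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.mapreduce.Job;
 import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

 public class CustomOutputFormatRepro {
   public static void main(String[] args) throws Exception {
     Configuration conf = new Configuration();
     // Isolate user classes from the system classpath in the AM and tasks.
     conf.setBoolean("mapreduce.job.classloader", true);

     Job job = Job.getInstance(conf, "custom-output-format-repro");
     job.setJarByClass(CustomOutputFormatRepro.class);
     FileInputFormat.addInputPath(job, new Path(args[0]));
     // Custom output format from the job jar.
     job.setOutputFormatClass(TestOutputFormat.class);
     // The AM fails while creating the output committer because the custom
     // class is only visible to the job classloader (stack trace above).
     System.exit(job.waitForCompletion(true) ? 0 : 1);
   }
 }
 {code}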



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5756) CombineFileInputFormat.getSplits() including directories in its results

2014-07-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070123#comment-14070123
 ] 

Hudson commented on MAPREDUCE-5756:
---

FAILURE: Integrated in Hadoop-Yarn-trunk #620 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/620/])
MAPREDUCE-5756. CombineFileInputFormat.getSplits() including directories in its 
results. Contributed by Jason Dere (jlowe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612400)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestCombineFileInputFormat.java


 CombineFileInputFormat.getSplits() including directories in its results
 ---

 Key: MAPREDUCE-5756
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5756
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere
 Fix For: 3.0.0, 2.6.0

 Attachments: MAPREDUCE-5756.1.patch, MAPREDUCE-5756.2.patch


 Trying to track down HIVE-6401, where we see some "is not a file" errors 
 because getSplits() is giving us directories.  I believe the culprit is 
 FileInputFormat.listStatus():
 {code}
 if (recursive && stat.isDirectory()) {
   addInputPathRecursively(result, fs, stat.getPath(),
   inputFilter);
 } else {
   result.add(stat);
 }
 {code}
 This seems to allow directories to be added to the results if 
 recursive is false.  Is this meant to return directories? If not, I think it 
 should look like this:
 {code}
 if (stat.isDirectory()) {
  if (recursive) {
   addInputPathRecursively(result, fs, stat.getPath(),
   inputFilter);
  }
 } else {
   result.add(stat);
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-5989) Add DeletionService in AM

2014-07-22 Thread Varun Saxena (JIRA)
Varun Saxena created MAPREDUCE-5989:
---

 Summary: Add DeletionService in AM
 Key: MAPREDUCE-5989
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5989
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster
Reporter: Varun Saxena
Assignee: Varun Saxena


In the AM, for graceful cleanup, I propose adding a DeletionService which will 
do the following:
1. Cleanup of failed tasks (temporary data need not occupy space until the NM's 
Deletion Service is invoked)
2. Staging directory deletion (during AM shutdown, it's better to place staging 
dir cleanup in the Deletion Service; refer to MAPREDUCE-4841)




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5989) Add DeletionService in AM

2014-07-22 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070246#comment-14070246
 ] 

Jason Lowe commented on MAPREDUCE-5989:
---

Is this the same kind of DeletionService that the NM currently uses?  If so, I'm 
unclear on the tangible benefits, since all that service does is potentially 
postpone deletions.  As for the staging directory cleanup, implementing a 
deletion service is not needed to fix that issue.  Actually, I believe it's 
already fixed by MAPREDUCE-5476, which deletes the staging directory only after 
unregistering, so we know no other AM attempts will be launched after the 
staging directory is removed.

If you could walk through an example scenario where the deletion service is 
used and how it's useful, that would help me understand why adding such a 
service would be helpful.
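For illustration, the ordering described above can be sketched as follows; the 
class and method names are hypothetical, not the actual MRAppMaster code.

{code}
// Hypothetical sketch of the shutdown ordering from MAPREDUCE-5476;
// the method bodies are stubs, only the ordering matters.
public abstract class AmShutdownSketch {
  /** Tell the RM the app is done; no further AM attempt will be launched. */
  protected abstract void unregisterFromResourceManager() throws Exception;

  /** Remove the job's staging directory from HDFS. */
  protected abstract void cleanupStagingDir() throws Exception;

  public final void shutDownJob() throws Exception {
    // Unregister first...
    unregisterFromResourceManager();
    // ...and delete the staging directory only afterwards, so a late retry
    // can never be launched against a missing staging directory.
    cleanupStagingDir();
  }
}
{code}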

 Add DeletionService in AM
 -

 Key: MAPREDUCE-5989
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5989
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster
Reporter: Varun Saxena
Assignee: Varun Saxena

 In the AM, for graceful cleanup, I propose adding a DeletionService which 
 will do the following:
 1. Cleanup of failed tasks (temporary data need not occupy space until the 
 NM's Deletion Service is invoked)
 2. Staging directory deletion (during AM shutdown, it's better to place 
 staging dir cleanup in the Deletion Service; refer to MAPREDUCE-4841)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (MAPREDUCE-4841) Application Master Retries fail due to FileNotFoundException

2014-07-22 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe reassigned MAPREDUCE-4841:
-

Assignee: Jason Lowe  (was: Devaraj K)

 Application Master Retries fail due to FileNotFoundException
 

 Key: MAPREDUCE-4841
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4841
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 2.0.1-alpha
Reporter: Devaraj K
Assignee: Jason Lowe
Priority: Critical

 Application attempt1 is deleting the job related files and these are not 
 present in the HDFS for following retries.
 {code:xml}
 Application application_1353724754961_0001 failed 4 times due to AM Container 
 for appattempt_1353724754961_0001_04 exited with exitCode: -1000 due to: 
 RemoteTrace: java.io.FileNotFoundException: File does not exist: 
 hdfs://hacluster:8020/tmp/hadoop-yarn/staging/mapred/.staging/job_1353724754961_0001/appTokens
  at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:752)
  at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:88) at 
 org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:49) at 
 org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:157) at 
 org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:155) at 
 java.security.AccessController.doPrivileged(Native Method) at 
 javax.security.auth.Subject.doAs(Subject.java:396) at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
  at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:153) at 
 org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49) at 
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at 
 java.util.concurrent.FutureTask.run(FutureTask.java:138) at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at 
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at 
 java.util.concurrent.FutureTask.run(FutureTask.java:138) at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  at java.lang.Thread.run(Thread.java:662) at LocalTrace: 
 org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: File 
 does not exist: 
 hdfs://hacluster:8020/tmp/hadoop-yarn/staging/mapred/.staging/job_1353724754961_0001/appTokens
  at 
 org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217)
  at 
 org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:822)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:492)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:221)
  at 
 org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46)
  at 
 org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57)
  at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:924) at 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692) at 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688) at 
 java.security.AccessController.doPrivileged(Native Method) at 
 javax.security.auth.Subject.doAs(Subject.java:396) at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686) .Failing this 
 attempt.. Failing the application. 
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-4841) Application Master Retries fail due to FileNotFoundException

2014-07-22 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070247#comment-14070247
 ] 

Jason Lowe commented on MAPREDUCE-4841:
---

I believe this has been fixed by MAPREDUCE-5476.  [~devaraj.k] if you agree, 
then we can mark this as a duplicate of that JIRA.

 Application Master Retries fail due to FileNotFoundException
 

 Key: MAPREDUCE-4841
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4841
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 2.0.1-alpha
Reporter: Devaraj K
Assignee: Jason Lowe
Priority: Critical

 Application attempt1 is deleting the job related files and these are not 
 present in the HDFS for following retries.
 {code:xml}
 Application application_1353724754961_0001 failed 4 times due to AM Container 
 for appattempt_1353724754961_0001_04 exited with exitCode: -1000 due to: 
 RemoteTrace: java.io.FileNotFoundException: File does not exist: 
 hdfs://hacluster:8020/tmp/hadoop-yarn/staging/mapred/.staging/job_1353724754961_0001/appTokens
  at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:752)
  at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:88) at 
 org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:49) at 
 org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:157) at 
 org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:155) at 
 java.security.AccessController.doPrivileged(Native Method) at 
 javax.security.auth.Subject.doAs(Subject.java:396) at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
  at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:153) at 
 org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49) at 
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at 
 java.util.concurrent.FutureTask.run(FutureTask.java:138) at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at 
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at 
 java.util.concurrent.FutureTask.run(FutureTask.java:138) at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  at java.lang.Thread.run(Thread.java:662) at LocalTrace: 
 org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: File 
 does not exist: 
 hdfs://hacluster:8020/tmp/hadoop-yarn/staging/mapred/.staging/job_1353724754961_0001/appTokens
  at 
 org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217)
  at 
 org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:822)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:492)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:221)
  at 
 org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46)
  at 
 org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57)
  at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:924) at 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692) at 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688) at 
 java.security.AccessController.doPrivileged(Native Method) at 
 javax.security.auth.Subject.doAs(Subject.java:396) at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686) .Failing this 
 attempt.. Failing the application. 
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-4841) Application Master Retries fail due to FileNotFoundException

2014-07-22 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K resolved MAPREDUCE-4841.
--

Resolution: Fixed

It has been fixed by MAPREDUCE-5476; closing this as a duplicate of that JIRA.

 Application Master Retries fail due to FileNotFoundException
 

 Key: MAPREDUCE-4841
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4841
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 2.0.1-alpha
Reporter: Devaraj K
Assignee: Jason Lowe
Priority: Critical

 Application attempt1 is deleting the job related files and these are not 
 present in the HDFS for following retries.
 {code:xml}
 Application application_1353724754961_0001 failed 4 times due to AM Container 
 for appattempt_1353724754961_0001_04 exited with exitCode: -1000 due to: 
 RemoteTrace: java.io.FileNotFoundException: File does not exist: 
 hdfs://hacluster:8020/tmp/hadoop-yarn/staging/mapred/.staging/job_1353724754961_0001/appTokens
  at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:752)
  at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:88) at 
 org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:49) at 
 org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:157) at 
 org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:155) at 
 java.security.AccessController.doPrivileged(Native Method) at 
 javax.security.auth.Subject.doAs(Subject.java:396) at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
  at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:153) at 
 org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49) at 
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at 
 java.util.concurrent.FutureTask.run(FutureTask.java:138) at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at 
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at 
 java.util.concurrent.FutureTask.run(FutureTask.java:138) at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  at java.lang.Thread.run(Thread.java:662) at LocalTrace: 
 org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: File 
 does not exist: 
 hdfs://hacluster:8020/tmp/hadoop-yarn/staging/mapred/.staging/job_1353724754961_0001/appTokens
  at 
 org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217)
  at 
 org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:822)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:492)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:221)
  at 
 org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46)
  at 
 org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57)
  at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:924) at 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692) at 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688) at 
 java.security.AccessController.doPrivileged(Native Method) at 
 javax.security.auth.Subject.doAs(Subject.java:396) at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686) .Failing this 
 attempt.. Failing the application. 
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Reopened] (MAPREDUCE-4841) Application Master Retries fail due to FileNotFoundException

2014-07-22 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K reopened MAPREDUCE-4841:
--


 Application Master Retries fail due to FileNotFoundException
 

 Key: MAPREDUCE-4841
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4841
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 2.0.1-alpha
Reporter: Devaraj K
Assignee: Jason Lowe
Priority: Critical

 Application attempt1 is deleting the job related files and these are not 
 present in the HDFS for following retries.
 {code:xml}
 Application application_1353724754961_0001 failed 4 times due to AM Container 
 for appattempt_1353724754961_0001_04 exited with exitCode: -1000 due to: 
 RemoteTrace: java.io.FileNotFoundException: File does not exist: 
 hdfs://hacluster:8020/tmp/hadoop-yarn/staging/mapred/.staging/job_1353724754961_0001/appTokens
  at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:752)
  at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:88) at 
 org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:49) at 
 org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:157) at 
 org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:155) at 
 java.security.AccessController.doPrivileged(Native Method) at 
 javax.security.auth.Subject.doAs(Subject.java:396) at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
  at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:153) at 
 org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49) at 
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at 
 java.util.concurrent.FutureTask.run(FutureTask.java:138) at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at 
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at 
 java.util.concurrent.FutureTask.run(FutureTask.java:138) at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  at java.lang.Thread.run(Thread.java:662) at LocalTrace: 
 org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: File 
 does not exist: 
 hdfs://hacluster:8020/tmp/hadoop-yarn/staging/mapred/.staging/job_1353724754961_0001/appTokens
  at 
 org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217)
  at 
 org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:822)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:492)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:221)
  at 
 org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46)
  at 
 org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57)
  at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:924) at 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692) at 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688) at 
 java.security.AccessController.doPrivileged(Native Method) at 
 javax.security.auth.Subject.doAs(Subject.java:396) at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686) .Failing this 
 attempt.. Failing the application. 
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-4841) Application Master Retries fail due to FileNotFoundException

2014-07-22 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K resolved MAPREDUCE-4841.
--

Resolution: Duplicate

 Application Master Retries fail due to FileNotFoundException
 

 Key: MAPREDUCE-4841
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4841
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 2.0.1-alpha
Reporter: Devaraj K
Assignee: Jason Lowe
Priority: Critical

 Application attempt1 is deleting the job related files and these are not 
 present in the HDFS for following retries.
 {code:xml}
 Application application_1353724754961_0001 failed 4 times due to AM Container 
 for appattempt_1353724754961_0001_04 exited with exitCode: -1000 due to: 
 RemoteTrace: java.io.FileNotFoundException: File does not exist: 
 hdfs://hacluster:8020/tmp/hadoop-yarn/staging/mapred/.staging/job_1353724754961_0001/appTokens
  at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:752)
  at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:88) at 
 org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:49) at 
 org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:157) at 
 org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:155) at 
 java.security.AccessController.doPrivileged(Native Method) at 
 javax.security.auth.Subject.doAs(Subject.java:396) at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
  at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:153) at 
 org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49) at 
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at 
 java.util.concurrent.FutureTask.run(FutureTask.java:138) at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at 
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at 
 java.util.concurrent.FutureTask.run(FutureTask.java:138) at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  at java.lang.Thread.run(Thread.java:662) at LocalTrace: 
 org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: File 
 does not exist: 
 hdfs://hacluster:8020/tmp/hadoop-yarn/staging/mapred/.staging/job_1353724754961_0001/appTokens
  at 
 org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217)
  at 
 org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:822)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:492)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:221)
  at 
 org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46)
  at 
 org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57)
  at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:924) at 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692) at 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688) at 
 java.security.AccessController.doPrivileged(Native Method) at 
 javax.security.auth.Subject.doAs(Subject.java:396) at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686) .Failing this 
 attempt.. Failing the application. 
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5963) ShuffleHandler DB schema should be versioned with compatible/incompatible changes

2014-07-22 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated MAPREDUCE-5963:
--

Attachment: MAPREDUCE-5963-v2.1.patch

The latest patch fixes the findbugs warning.

 ShuffleHandler DB schema should be versioned with compatible/incompatible 
 changes
 -

 Key: MAPREDUCE-5963
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5963
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Affects Versions: 2.4.1
Reporter: Junping Du
Assignee: Junping Du
 Attachments: MAPREDUCE-5963-v2.1.patch, MAPREDUCE-5963-v2.patch, 
 MAPREDUCE-5963.patch


 ShuffleHandler persists job shuffle info into a DB schema, which should be 
 versioned with compatible/incompatible changes to support rolling upgrade.
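 A common pattern for this in other Hadoop state stores is a major.minor 
 version record: a minor-version bump stays compatible, while a different 
 major version is rejected. The sketch below illustrates that check with 
 hypothetical names; it is not taken from the attached patches.
 {code}
 import java.io.IOException;

 class ShuffleSchemaVersionCheck {
   static final int CURRENT_MAJOR = 1;
   static final int CURRENT_MINOR = 0;

   /** Only a differing major version is treated as incompatible. */
   static boolean isCompatible(int storedMajor) {
     return storedMajor == CURRENT_MAJOR;
   }

   static void checkVersion(Integer storedMajor, Integer storedMinor)
       throws IOException {
     if (storedMajor == null) {
       return; // fresh DB: caller writes the current version and continues
     }
     if (!isCompatible(storedMajor)) {
       throw new IOException("Incompatible shuffle state DB schema: found "
           + storedMajor + "." + storedMinor
           + ", expected major version " + CURRENT_MAJOR);
     }
     // Same major, possibly different minor: compatible, proceed.
   }
 }
 {code}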



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5756) CombineFileInputFormat.getSplits() including directories in its results

2014-07-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070284#comment-14070284
 ] 

Hudson commented on MAPREDUCE-5756:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk #1812 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1812/])
MAPREDUCE-5756. CombineFileInputFormat.getSplits() including directories in its 
results. Contributed by Jason Dere (jlowe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612400)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestCombineFileInputFormat.java


 CombineFileInputFormat.getSplits() including directories in its results
 ---

 Key: MAPREDUCE-5756
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5756
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere
 Fix For: 3.0.0, 2.6.0

 Attachments: MAPREDUCE-5756.1.patch, MAPREDUCE-5756.2.patch


 Trying to track down HIVE-6401, where we see some "is not a file" errors 
 because getSplits() is giving us directories.  I believe the culprit is 
 FileInputFormat.listStatus():
 {code}
 if (recursive && stat.isDirectory()) {
   addInputPathRecursively(result, fs, stat.getPath(),
   inputFilter);
 } else {
   result.add(stat);
 }
 {code}
 This seems to allow directories to be added to the results if 
 recursive is false.  Is this meant to return directories? If not, I think it 
 should look like this:
 {code}
 if (stat.isDirectory()) {
  if (recursive) {
   addInputPathRecursively(result, fs, stat.getPath(),
   inputFilter);
  }
 } else {
   result.add(stat);
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used

2014-07-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070282#comment-14070282
 ] 

Hudson commented on MAPREDUCE-5957:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk #1812 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1812/])
MAPREDUCE-5957. AM throws ClassNotFoundException with job classloader enabled 
if custom output format/committer is used. Contributed by Sangjin Lee (jlowe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612358)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/commit/CommitterEventHandler.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobs.java


 AM throws ClassNotFoundException with job classloader enabled if custom 
 output format/committer is used
 ---

 Key: MAPREDUCE-5957
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5957
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Fix For: 3.0.0, 2.6.0

 Attachments: MAPREDUCE-5957.branch-2.patch, MAPREDUCE-5957.patch, 
 MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, 
 MAPREDUCE-5957.patch, MAPREDUCE-5957.patch


 With the job classloader enabled, the MR AM throws ClassNotFoundException if 
 a custom output format class is specified.
 {noformat}
 org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
 java.lang.RuntimeException: java.lang.ClassNotFoundException: Class 
 com.foo.test.TestOutputFormat not found
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:473)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:374)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1459)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389)
 Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: 
 Class com.foo.test.TestOutputFormat not found
   at 
 org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895)
   at 
 org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:222)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:469)
   ... 8 more
 Caused by: java.lang.ClassNotFoundException: Class 
 com.foo.test.TestOutputFormat not found
   at 
 org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801)
   at 
 org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893)
   ... 10 more
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5963) ShuffleHandler DB schema should be versioned with compatible/incompatible changes

2014-07-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070309#comment-14070309
 ] 

Hadoop QA commented on MAPREDUCE-5963:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12657123/MAPREDUCE-5963-v2.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4761//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4761//console

This message is automatically generated.

 ShuffleHandler DB schema should be versioned with compatible/incompatible 
 changes
 -

 Key: MAPREDUCE-5963
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5963
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Affects Versions: 2.4.1
Reporter: Junping Du
Assignee: Junping Du
 Attachments: MAPREDUCE-5963-v2.1.patch, MAPREDUCE-5963-v2.patch, 
 MAPREDUCE-5963.patch


 ShuffleHandler persists job shuffle info into a DB schema, which should be 
 versioned with compatible/incompatible changes to support rolling upgrade.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5756) CombineFileInputFormat.getSplits() including directories in its results

2014-07-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070356#comment-14070356
 ] 

Hudson commented on MAPREDUCE-5756:
---

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1839 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1839/])
MAPREDUCE-5756. CombineFileInputFormat.getSplits() including directories in its 
results. Contributed by Jason Dere (jlowe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612400)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestCombineFileInputFormat.java


 CombineFileInputFormat.getSplits() including directories in its results
 ---

 Key: MAPREDUCE-5756
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5756
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere
 Fix For: 3.0.0, 2.6.0

 Attachments: MAPREDUCE-5756.1.patch, MAPREDUCE-5756.2.patch


 Trying to track down HIVE-6401, where we see some "is not a file" errors 
 because getSplits() is giving us directories.  I believe the culprit is 
 FileInputFormat.listStatus():
 {code}
 if (recursive && stat.isDirectory()) {
   addInputPathRecursively(result, fs, stat.getPath(),
   inputFilter);
 } else {
   result.add(stat);
 }
 {code}
 This seems to allow directories to be added to the results if 
 recursive is false.  Is this meant to return directories? If not, I think it 
 should look like this:
 {code}
 if (stat.isDirectory()) {
  if (recursive) {
   addInputPathRecursively(result, fs, stat.getPath(),
   inputFilter);
  }
 } else {
   result.add(stat);
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used

2014-07-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070354#comment-14070354
 ] 

Hudson commented on MAPREDUCE-5957:
---

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1839 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1839/])
MAPREDUCE-5957. AM throws ClassNotFoundException with job classloader enabled 
if custom output format/committer is used. Contributed by Sangjin Lee (jlowe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612358)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/commit/CommitterEventHandler.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobs.java


 AM throws ClassNotFoundException with job classloader enabled if custom 
 output format/committer is used
 ---

 Key: MAPREDUCE-5957
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5957
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Fix For: 3.0.0, 2.6.0

 Attachments: MAPREDUCE-5957.branch-2.patch, MAPREDUCE-5957.patch, 
 MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, 
 MAPREDUCE-5957.patch, MAPREDUCE-5957.patch


 With the job classloader enabled, the MR AM throws ClassNotFoundException if 
 a custom output format class is specified.
 {noformat}
 org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
 java.lang.RuntimeException: java.lang.ClassNotFoundException: Class 
 com.foo.test.TestOutputFormat not found
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:473)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:374)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1459)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389)
 Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: 
 Class com.foo.test.TestOutputFormat not found
   at 
 org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895)
   at 
 org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:222)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:469)
   ... 8 more
 Caused by: java.lang.ClassNotFoundException: Class 
 com.foo.test.TestOutputFormat not found
   at 
 org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801)
   at 
 org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893)
   ... 10 more
 {noformat}
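
 A minimal sketch of the general pattern for resolving user classes through a separate 
 job classloader; this is illustrative only (the helper name and the way the job 
 classloader is obtained are assumptions, not the committed MRAppMaster change):
{code:java}
import org.apache.hadoop.conf.Configuration;

public class JobClassLoaderSketch {
  // Illustrative only: Configuration resolves class names through the loader
  // attached to it, so a class that exists only in the job classloader must be
  // made visible to it before getClassByName() is called.
  static Class<?> loadUserClass(Configuration conf, ClassLoader jobClassLoader,
                                String className) throws ClassNotFoundException {
    ClassLoader previous = Thread.currentThread().getContextClassLoader();
    try {
      Thread.currentThread().setContextClassLoader(jobClassLoader);
      conf.setClassLoader(jobClassLoader);    // let getClassByName() see job classes
      return conf.getClassByName(className);  // e.g. "com.foo.test.TestOutputFormat"
    } finally {
      Thread.currentThread().setContextClassLoader(previous);
    }
  }
}
{code}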



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-250) JobTracker should log the scheduling of setup/cleanup task

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-250.


Resolution: Fixed

Fairly confident this has been fixed. Closing as stale.

 JobTracker should log the scheduling of setup/cleanup task
 --

 Key: MAPREDUCE-250
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-250
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Amar Kamat

 Setup/Cleanup is launched under (m+1)^th^ tip or (r+1)^th^ tip. It will be 
 nice if jobtracker logs this info.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-2811) Adding Multiple Reducers implementations.

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-2811:


Description: Like HADOOP-372, we have a multi format Reducer too. Someone 
suggested that if we need different reducers and map implementations(like what 
i need) I was better of by writing 2 jobs. I dont quite agree. I am calculating 
2 big matrices that must be calculated in the map step, summed in the reducers 
multiplied and then written to a file. The First mapper sums a matrix  based on 
the i,j th index(key) into the file and the second mapper adds the N*1  
dimension vector that uses a new line as key. These keys must be passed as such 
to the reduce process.  (was: Like the Patch released here 
https://issues.apache.org/jira/browse/HADOOP-372 can we have a multi format 
Reducer too. Someone suggested that if we need different reducers and map 
implementations(like what i need) I was better of by writing 2 jobs. I dont 
quite agree. I am calculating 2 big matrices that must be calculated in the map 
step, summed in the reducers multiplied and then written to a file. The First 
mapper sums a matrix  based on the i,j th index(key) into the file and the 
second mapper adds the N*1  dimension vector that uses a new line as key. These 
keys must be passed as such to the reduce process.)

 Adding Multiple Reducers implementations.
 -

 Key: MAPREDUCE-2811
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2811
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Sidharth Gupta

 Like HADOOP-372, we have a multi format Reducer too. Someone suggested that 
 if we need different reducers and map implementations (like what I need) I was 
 better off writing 2 jobs. I don't quite agree. I am calculating 2 big 
 matrices that must be calculated in the map step, summed in the reducers, 
 multiplied and then written to a file. The first mapper sums a matrix based 
 on the i,j-th index (key) into the file and the second mapper adds the N*1 
 dimension vector that uses a new line as key. These keys must be passed as 
 such to the reduce process.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-126) Job history analysis showing wrong job runtime

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-126:
---

Labels: newbie  (was: )

 Job history analysis showing wrong job runtime
 --

 Key: MAPREDUCE-126
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-126
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Amar Kamat
  Labels: newbie

 Analysis of completed jobs shows wrong runtime. Here is the faulty code
 {code:title=analysisjobhistory.jsp|borderStyle=solid}
 <b>Finished At : </b>  <%=StringUtils.getFormattedTimeWithDiff(dateFormat, 
 job.getLong(Keys.FINISH_TIME), job.getLong(Keys.LAUNCH_TIME)) %><br/>
 {code}
 I think it should be 
 {code:title=analysisjobhistory.jsp|borderStyle=solid}
 <b>Finished At : </b>  <%=StringUtils.getFormattedTimeWithDiff(dateFormat, 
 job.getLong(Keys.FINISH_TIME), job.getLong(Keys.SUBMIT_TIME)) %><br/>
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-126) Job history analysis showing wrong job runtime

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-126.


Resolution: Incomplete

This code is long gone in 2.x. Closing as stale.

 Job history analysis showing wrong job runtime
 --

 Key: MAPREDUCE-126
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-126
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Amar Kamat
  Labels: newbie

 Analysis of completed jobs shows wrong runtime. Here is the faulty code
 {code:title=analysisjobhistory.jsp|borderStyle=solid}
 <b>Finished At : </b>  <%=StringUtils.getFormattedTimeWithDiff(dateFormat, 
 job.getLong(Keys.FINISH_TIME), job.getLong(Keys.LAUNCH_TIME)) %><br/>
 {code}
 I think it should be 
 {code:title=analysisjobhistory.jsp|borderStyle=solid}
 <b>Finished At : </b>  <%=StringUtils.getFormattedTimeWithDiff(dateFormat, 
 job.getLong(Keys.FINISH_TIME), job.getLong(Keys.SUBMIT_TIME)) %><br/>
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-484) Logos for Hive and JobTracker

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-484.


Resolution: Fixed

Stale. Closing.

 Logos for Hive and JobTracker
 -

 Key: MAPREDUCE-484
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-484
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Aaron Newton
Priority: Trivial
 Attachments: hive  job tracker icons (font outlines).ai, hive  job 
 tracker icons (font outlines).pdf, hive  job tracker icons (font 
 outlines).pdf, hive.png, hive.png, jobtracker.png


 Greetings fine Hadoop peoples,
 While working on a few projects here at Cloudera we found ourselves wanting 
 for some sort of icon for both the JobTracker and for Hive. After checking on 
 the project page for Hive (the JobTracker doesn't really have one) and 
 finding that these items have no icons, we rolled up our sleeves and made 
 some. We'd like to contribute these to the project, so if you want 'em, 
 they're all yours. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-700) Too many copies of job-conf with the jobtracker

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-700.


Resolution: Fixed

Lots of changes here already. Closing this as stale.

 Too many copies of job-conf with the jobtracker
 ---

 Key: MAPREDUCE-700
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-700
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Amar Kamat
Assignee: Amar Kamat

 As of today the jobtracker has job-conf copies in
 # mapred.system.dir : created while job-submission 
 # jobtracker-subdir (created by JobInProgress upon creation)
 # log-dir : created upon job-init
 # history-dir : created upon job-init
 Its difficult to manage these conf files. The problem aggravates under 
 restart.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-173) JobConf should also load resources from hdfs (or other filesystems)

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-173.


Resolution: Fixed

This is almost certainly fixed by now.

 JobConf should also load resources from hdfs (or other filesystems)
 ---

 Key: MAPREDUCE-173
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-173
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Amar Kamat

 {{JobConf conf = new JobConf(path)}} doesn't load the configuration if _path_ 
 points to a resource on hdfs. 
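
 A possible workaround sketch, assuming the configuration file lives on HDFS and the 
 client has hadoop-common on its classpath; the path is illustrative:
{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

public class HdfsJobConfSketch {
  // Workaround sketch: read the XML through the FileSystem API (which
  // understands hdfs:// URIs) and hand the stream to addResource(), instead of
  // relying on new JobConf(path) to resolve the path itself.
  static JobConf loadFromHdfs(String confPath) throws IOException {
    Configuration base = new Configuration();
    Path path = new Path(confPath);              // e.g. "hdfs://namenode:8020/conf/job.xml"
    FileSystem fs = path.getFileSystem(base);
    JobConf conf = new JobConf(base);
    conf.addResource(fs.open(path));             // stream-based resource loading
    return conf;
  }
}
{code}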



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-322) TaskTracker should run user tasks nicely in the local machine

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-322.


Resolution: Fixed

This has been fixed with both cgroups and task level niceness.

 TaskTracker should run user tasks nicely in the local machine
 -

 Key: MAPREDUCE-322
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-322
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Tsz Wo Nicholas Sze

 If one task tries to use all CPUs on a local machine, all other tasks or 
 processes (including the tasktracker and datanode daemons) may hardly get a chance 
 to run.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-275) Display lost tracker information on the jobtracker webui and persist it across restarts

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-275.


Resolution: Won't Fix

I'm going to close this as Won't Fix.

 Display lost tracker information on the jobtracker webui and persist it 
 across restarts
 ---

 Key: MAPREDUCE-275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Amar Kamat
Assignee: Amar Kamat

 As of today it's difficult to distinguish between active trackers and lost 
 trackers (lost trackers are considered active). It would be nice if the 
 jobtracker could display which trackers are lost and maintain this information across 
 restarts. HADOOP-5643 does something similar for decommissioned trackers.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-156) ProcessTree.destroy() is sleeping for 5 seconds holding the task slot

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-156.


Resolution: Won't Fix

This is intentional to potentially give time for the process to clean up. 
Closing as won't fix.

 ProcessTree.destroy() is sleeping for 5 seconds holding the task slot
 -

 Key: MAPREDUCE-156
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-156
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Ravi Gummadi

 Currently, in ProcessTree.destroy(), after sending SIGTERM to the task JVM, 
 TT sleeps for 5 seconds(default value of 
 mapred.tasktracker.tasks.sleeptime-before-sigkill) before sending SIGKILL. 
 This seems to be blocking the task slot(not getting released) for 5 seconds. 
 We should avoid this so that another task could be launched in that slot 
 immediately.
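
 A rough sketch of the non-blocking alternative the report hints at, assuming a 
 SIGTERM/SIGKILL sequence driven from Java; the class and method names are made up 
 for illustration and this is not TaskTracker code:
{code:java}
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DeferredSigkillSketch {
  private static final ScheduledExecutorService KILLER =
      Executors.newSingleThreadScheduledExecutor();

  // Send SIGTERM now, schedule SIGKILL after the grace period on a background
  // thread, and return immediately so the caller does not hold the slot.
  static void destroyProcessGroup(final String pgrpId, long gracePeriodMs) throws IOException {
    Runtime.getRuntime().exec(new String[] {"kill", "-15", "-" + pgrpId});
    KILLER.schedule(new Runnable() {
      public void run() {
        try {
          Runtime.getRuntime().exec(new String[] {"kill", "-9", "-" + pgrpId});
        } catch (IOException ignored) {
          // the process group may already be gone
        }
      }
    }, gracePeriodMs, TimeUnit.MILLISECONDS);
  }
}
{code}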



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-327) Add explicit remote map count JobTracker metrics

2014-07-22 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070734#comment-14070734
 ] 

Allen Wittenauer commented on MAPREDUCE-327:


In real-world scenarios, we've discovered that task locality as reported by the 
system can effectively be a lie because of CFIF/MFIF. Given 4 input splits, if 
the first is local but the rest are not, the task will still be considered 
local even though 3/4'ths of the data came off rack!

 Add explicit remote map count JobTracker metrics
 

 Key: MAPREDUCE-327
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-327
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Hong Tang
  Labels: newbie

 I am proposing to add a counter REMOTE_MAPS in addition to the following 
 counters: TOTAL_MAPS, DATA_LOCAL_MAPS, RACK_LOCAL_MAPS. A Map Task is 
 considered a remote-map iff the input split returns a set of locations, but 
 none is chosen to execute the map task.
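
 A tiny sketch of the proposed accounting rule, with made-up method names (this is 
 not JobTracker code):
{code:java}
import java.util.Collection;

public class RemoteMapSketch {
  // A map task counts toward REMOTE_MAPS iff its split reported candidate
  // locations but the host the attempt ran on is not among them.
  static boolean isRemoteMap(Collection<String> splitLocations, String runHost) {
    return !splitLocations.isEmpty() && !splitLocations.contains(runHost);
  }
}
{code}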



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-327) Add explicit remote map count JobTracker metrics

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-327:
---

Labels: newbie  (was: )

 Add explicit remote map count JobTracker metrics
 

 Key: MAPREDUCE-327
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-327
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Hong Tang
  Labels: newbie

 I am proposing to add a counter REMOTE_MAPS in addition to the following 
 counters: TOTAL_MAPS, DATA_LOCAL_MAPS, RACK_LOCAL_MAPS. A Map Task is 
 considered a remote-map iff the input split returns a set of locations, but 
 none is chosen to execute the map task.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5990) If output directory can not be created, error message on stdout does not provide any clue.

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-5990:


Component/s: examples

 If output directory can not be created, error message on stdout does not 
 provide any clue.
 --

 Key: MAPREDUCE-5990
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5990
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: examples
Reporter: Suhas Gogate
  Labels: newbie

 In the following wordcount example the output directory path cannot be created 
 because /temp does not exist and the user has no privileges to create the output 
 path at /. 
 hadoop --config ./clustdir/ jar /homes/gogate/wordcount.jar 
 com..wordcount.WordCount /in-path/gogate/myfile /temp/mywc-gogate 
 09/04/28 23:00:32 WARN mapred.JobClient: Use GenericOptionsParser for parsing 
 the arguments. Applications should implement Tool for the same.
 09/04/28 23:00:32 INFO mapred.FileInputFormat: Total input paths to process : 
 1
 09/04/28 23:00:32 INFO mapred.FileInputFormat: Total input paths to process : 
 1
 09/04/28 23:00:33 INFO mapred.JobClient: Running job: job_200904282249_0004
 java.io.IOException: Job failed!
   at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1113)
   at com..wordcount.WordCount.main(WordCount.java:55)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
   at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
   at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
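
 A hedged sketch of a driver-side pre-flight check that would name the unusable path 
 instead of the bare "Job failed!" message; the class and method names are illustrative, 
 not a proposed patch:
{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

public class OutputDirPreflightSketch {
  // Check the output location up front so the user sees which path could not
  // be created instead of only a generic "Job failed!" from JobClient.runJob().
  static void checkOutputDir(JobConf conf, String outputDir) throws IOException {
    Path out = new Path(outputDir);                    // e.g. "/temp/mywc-gogate"
    FileSystem fs = out.getFileSystem(conf);
    Path parent = out.getParent();
    if (parent != null && !fs.exists(parent) && !fs.mkdirs(parent)) {
      throw new IOException("Cannot create parent of output directory: " + parent);
    }
  }
}
{code}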
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5963) ShuffleHandler DB schema should be versioned with compatible/incompatible changes

2014-07-22 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070737#comment-14070737
 ] 

Jason Lowe commented on MAPREDUCE-5963:
---

+1 lgtm.  Committing this.

 ShuffleHandler DB schema should be versioned with compatible/incompatible 
 changes
 -

 Key: MAPREDUCE-5963
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5963
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Affects Versions: 2.4.1
Reporter: Junping Du
Assignee: Junping Du
 Attachments: MAPREDUCE-5963-v2.1.patch, MAPREDUCE-5963-v2.patch, 
 MAPREDUCE-5963.patch


 ShuffleHandler persists job shuffle info into a DB schema, which should be 
 versioned with compatible/incompatible changes to support rolling upgrade.
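
 A generic sketch of the compatible/incompatible check such a version record usually 
 enables (the major version must match, minor differences are tolerated); the class and 
 method signature are assumptions, not the actual patch:
{code:java}
import java.io.IOException;

public class ShuffleSchemaVersionSketch {
  // Same major version => compatible, load the existing state in place.
  // Different major version => incompatible, refuse to load.
  static void checkVersion(int storedMajor, int storedMinor,
                           int currentMajor, int currentMinor) throws IOException {
    if (storedMajor != currentMajor) {
      throw new IOException("Incompatible shuffle state-store version " + storedMajor + "."
          + storedMinor + ", expecting " + currentMajor + "." + currentMinor);
    }
    // A differing minor version is treated as compatible; the stored version
    // record can simply be rewritten to the current one.
  }
}
{code}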



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Moved] (MAPREDUCE-5990) If output directory can not be created, error message on stdout does not provide any clue.

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer moved HADOOP-5756 to MAPREDUCE-5990:
-

Affects Version/s: (was: 0.18.3)
   Issue Type: Improvement  (was: Bug)
  Key: MAPREDUCE-5990  (was: HADOOP-5756)
  Project: Hadoop Map/Reduce  (was: Hadoop Common)

 If output directory can not be created, error message on stdout does not 
 provide any clue.
 --

 Key: MAPREDUCE-5990
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5990
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: examples
Reporter: Suhas Gogate
  Labels: newbie

 In the following wordcount example the output directory path cannot be created 
 because /temp does not exist and the user has no privileges to create the output 
 path at /. 
 hadoop --config ./clustdir/ jar /homes/gogate/wordcount.jar 
 com..wordcount.WordCount /in-path/gogate/myfile /temp/mywc-gogate 
 09/04/28 23:00:32 WARN mapred.JobClient: Use GenericOptionsParser for parsing 
 the arguments. Applications should implement Tool for the same.
 09/04/28 23:00:32 INFO mapred.FileInputFormat: Total input paths to process : 
 1
 09/04/28 23:00:32 INFO mapred.FileInputFormat: Total input paths to process : 
 1
 09/04/28 23:00:33 INFO mapred.JobClient: Running job: job_200904282249_0004
 java.io.IOException: Job failed!
   at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1113)
   at com..wordcount.WordCount.main(WordCount.java:55)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
   at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
   at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-197) add new options to mapred job -list-attempt-ids to dump counters and diagnostic messages

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-197:
---

Description: 
It would be very nice when tracking down tasks that have strange values for 
their counters, if there was a command line tool to print out the task attempts 
and their counters and diagnostic messages. I propose adding switches to 
-list-attempt-ids to accomplish that:

{quote}
mapred job -list-attempt-ids [-counters] [-diagnostics] <job> <type> <state>
{quote}

  was:
It would be very nice when tracking down tasks that have strange values for 
their counters, if there was a command line tool to print out the task attempts 
and their counters and diagnostic messages. I propose adding switches to 
-list-attempt-ids to accomplish that:

{quote}
hadoop job -list-attempt-ids [-counters] [-diagnostics] <job> <type> <state>
{quote}


 add new options to mapred job -list-attempt-ids to dump counters and 
 diagnostic messages
 

 Key: MAPREDUCE-197
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-197
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Owen O'Malley
Assignee: Owen O'Malley
  Labels: newbie

 It would be very nice when tracking down tasks that have strange values for 
 their counters, if there was a command line tool to print out the task 
 attempts and their counters and diagnostic messages. I propose adding 
 switches to -list-attempt-ids to accomplish that:
 {quote}
 mapred job -list-attempt-ids [-counters] [-diagnostics] <job> <type> <state>
 {quote}
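
 Roughly the information the proposed switches would surface, expressed against the 
 existing client API; the job id is illustrative and this is plain client-API code, 
 not the CLI patch itself:
{code:java}
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.TaskReport;

public class AttemptDumpSketch {
  public static void main(String[] args) throws Exception {
    JobClient client = new JobClient(new JobConf());
    JobID jobId = JobID.forName(args[0]);            // e.g. "job_200906151249_0001"
    for (TaskReport report : client.getMapTaskReports(jobId)) {
      System.out.println(report.getTaskID());
      System.out.println(report.getCounters());      // what a -counters switch might print
      for (String diag : report.getDiagnostics()) {  // what -diagnostics might print
        System.out.println("  " + diag);
      }
    }
  }
}
{code}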



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-197) add new options to mapred job -list-attempt-ids to dump counters and diagnostic messages

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-197:
---

Summary: add new options to mapred job -list-attempt-ids to dump counters 
and diagnostic messages  (was: add new options to hadoop job -list-attempt-ids 
to dump counters and diagnostic messages)

 add new options to mapred job -list-attempt-ids to dump counters and 
 diagnostic messages
 

 Key: MAPREDUCE-197
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-197
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Owen O'Malley
Assignee: Owen O'Malley
  Labels: newbie

 It would be very nice when tracking down tasks that have strange values for 
 their counters, if there was a command line tool to print out the task 
 attempts and their counters and diagnostic messages. I propose adding 
 switches to -list-attempt-ids to accomplish that:
 {quote}
 hadoop job -list-attempt-ids [-counters] [-diagnostics] <job> <type> <state>
 {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-168) hadoop job -list all should display the code for Killed also.

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-168:
---

Labels: newbie  (was: )

 hadoop job -list all should display the code for Killed also.
 -

 Key: MAPREDUCE-168
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-168
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hemanth Yamijala
  Labels: newbie

 hadoop job -list all shows a legend for the states: PREP, SUCCEEDED, FAILED 
 and RUNNING. It should also display the state for KILLED (as 5).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-168) mapred job -list all should display the code for Killed also.

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-168:
---

Description: mapred job -list all shows a legend for the states: PREP, 
SUCCEEDED, FAILED and RUNNING. It should also display the state for KILLED (as 
5).  (was: hadoop job -list all shows a legend for the states: PREP, SUCCEEDED, 
FAILED and RUNNING. It should also display the state for KILLED (as 5).)

 mapred job -list all should display the code for Killed also.
 -

 Key: MAPREDUCE-168
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-168
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hemanth Yamijala
  Labels: newbie

 mapred job -list all shows a legend for the states: PREP, SUCCEEDED, FAILED 
 and RUNNING. It should also display the state for KILLED (as 5).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-197) add new options to hadoop job -list-attempt-ids to dump counters and diagnostic messages

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-197:
---

Labels: newbie  (was: )

 add new options to hadoop job -list-attempt-ids to dump counters and 
 diagnostic messages
 

 Key: MAPREDUCE-197
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-197
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Owen O'Malley
Assignee: Owen O'Malley
  Labels: newbie

 It would be very nice when tracking down tasks that have strange values for 
 their counters, if there was a command line tool to print out the task 
 attempts and their counters and diagnostic messages. I propose adding 
 switches to -list-attempt-ids to accomplish that:
 {quote}
 hadoop job -list-attempt-ids [-counters] [-diagnostics] <job> <type> <state>
 {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-168) mapred job -list all should display the code for Killed also.

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-168:
---

Summary: mapred job -list all should display the code for Killed also.  
(was: hadoop job -list all should display the code for Killed also.)

 mapred job -list all should display the code for Killed also.
 -

 Key: MAPREDUCE-168
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-168
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hemanth Yamijala
  Labels: newbie

 hadoop job -list all shows a legend for the states: PREP, SUCCEEDED, FAILED 
 and RUNNING. It should also display the state for KILLED (as 5).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-403) ProcessTree can try and kill a null PID

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-403.


Resolution: Incomplete

I'm going to close this as stale.  If this is still an issue, probably better 
to open a new jira.

 ProcessTree can try and kill a null PID
 -

 Key: MAPREDUCE-403
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-403
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Steve Loughran
Priority: Minor

 Saw this in a test run, while trying to shut down a TaskTracker
 [sf-startdaemon-debug] 09/05/07 16:42:42 [Map-events fetcher for all reduce 
 tasks on tracker_morzine.hpl.hp.com:localhost/127.0.0.1:36239] INFO 
 mapred.TaskTracker : Shutting down: Map-events fetcher for all reduce tasks 
 on tracker_morzine.hpl.hp.com:localhost/127.0.0.1:36239
 [sf-startdaemon-debug] 09/05/07 16:42:42 [TerminatorThread] WARN 
 util.ProcessTree : Error executing shell command 
 org.apache.hadoop.util.Shell$ExitCodeException: ERROR: garbage process ID 
 -null.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-403) ProcessTree can try and kill a null PID

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-403:
---

Summary: ProcessTree can try and kill a null PID  (was: ProcessTree can 
try and kill a null POD)

 ProcessTree can try and kill a null PID
 -

 Key: MAPREDUCE-403
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-403
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Steve Loughran
Priority: Minor

 Saw this in a test run, while trying to shut down a TaskTracker
 [sf-startdaemon-debug] 09/05/07 16:42:42 [Map-events fetcher for all reduce 
 tasks on tracker_morzine.hpl.hp.com:localhost/127.0.0.1:36239] INFO 
 mapred.TaskTracker : Shutting down: Map-events fetcher for all reduce tasks 
 on tracker_morzine.hpl.hp.com:localhost/127.0.0.1:36239
 [sf-startdaemon-debug] 09/05/07 16:42:42 [TerminatorThread] WARN 
 util.ProcessTree : Error executing shell command 
 org.apache.hadoop.util.Shell$ExitCodeException: ERROR: garbage process ID 
 -null.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-526) Sometimes job does not get removed from scheduler queue after it is killed

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-526.


Resolution: Won't Fix

Closing this as won't fix.

 Sometimes job does not get removed from scheduler queue after it is killed
 --

 Key: MAPREDUCE-526
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-526
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Karam Singh

 Sometimes when we kill a job, it does not get removed from the waiting queue, while 
 the job status is Killed with Job Setup and Cleanup: Successful. 
 Also the JobTracker webui shows the job under the failed jobs list, and hadoop job -list 
 all, hadoop queue <queuename> -showJobs also show the job with state=5.
 Prior to killing, the job state was Running.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-635) IllegalArgumentException is thrown if mapred local dir is not writable.

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-635:
---

Labels: newbie  (was: )

 IllegalArgumentException is thrown if mapred local dir is not writable.
 ---

 Key: MAPREDUCE-635
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-635
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Reporter: Suman Sehgal
Priority: Minor
  Labels: newbie

 If specified mapred local directory doesn't have write permission or is 
 non-existent  then IllegalArgumentException is thrown. Following error 
 message was displayed while running a sleep job with non-writable mapred 
 local directory specified in mapred-site.xml. 
 sleep job command : $hadoop_home/bin/hadoop jar hadoop-examples.jar sleep -m 
 100 -r 10 
 2009-05-12 05:36:46,491 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
 from attempt_200905120525_0001_m_00_0: 
 java.lang.IllegalArgumentException: n must be positive
 at java.util.Random.nextInt(Random.java:250)
 at 
 org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:243)
 at 
 org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:289)
 at 
 org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
 at 
 org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
 at 
 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1115)
 at 
 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1028)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:357)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
 at org.apache.hadoop.mapred.Child.main(Child.java:170)
 This error message (i.e. IllegalArgumentException) somehow doesn't clearly 
 indicate that the problem is with the mapred local directory. The error message should be 
 more specific in this case.
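
 A hedged sketch of a friendlier up-front check that names the offending directory, 
 using the existing DiskChecker utility; the helper is illustrative, not the shipped fix:
{code:java}
import java.io.File;

import org.apache.hadoop.util.DiskChecker;
import org.apache.hadoop.util.DiskChecker.DiskErrorException;

public class LocalDirPreflightSketch {
  // Validate each configured mapred.local.dir entry and report the offending
  // path, rather than letting LocalDirAllocator fail with "n must be positive".
  static void validateLocalDirs(String[] localDirs) throws DiskErrorException {
    for (String dir : localDirs) {
      DiskChecker.checkDir(new File(dir));   // throws, naming the directory, if unusable
    }
  }
}
{code}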



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5985) native-task: Fix build on macosx

2014-07-22 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070757#comment-14070757
 ] 

Todd Lipcon commented on MAPREDUCE-5985:


+1, looks good to me. This also fixed my Ubuntu build issue with the unistd.h 
inclusion.

 native-task: Fix build on macosx
 

 Key: MAPREDUCE-5985
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5985
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: task
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Attachments: MAPREDUCE-5985.v1.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5963) ShuffleHandler DB schema should be versioned with compatible/incompatible changes

2014-07-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070760#comment-14070760
 ] 

Hudson commented on MAPREDUCE-5963:
---

FAILURE: Integrated in Hadoop-trunk-Commit #5941 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5941/])
MAPREDUCE-5963. ShuffleHandler DB schema should be versioned with 
compatible/incompatible changes. Contributed by Junping Du (jlowe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612652)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/java/org/apache/hadoop/mapred/TestShuffleHandler.java


 ShuffleHandler DB schema should be versioned with compatible/incompatible 
 changes
 -

 Key: MAPREDUCE-5963
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5963
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Affects Versions: 2.4.1
Reporter: Junping Du
Assignee: Junping Du
 Attachments: MAPREDUCE-5963-v2.1.patch, MAPREDUCE-5963-v2.patch, 
 MAPREDUCE-5963.patch


 ShuffleHandler persists job shuffle info into a DB schema, which should be 
 versioned with compatible/incompatible changes to support rolling upgrade.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-257) Preventing node from swapping

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-257.


Resolution: Fixed

prevent a node from swapping -> don't exhaust memory -> memory limits -> 
fixed

Closing.

 Preventing node from swapping
 -

 Key: MAPREDUCE-257
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-257
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Hong Tang

 When a node swaps, it slows everything: maps running on that node, reducers 
 fetching output from the node, and DFS clients reading from the DN. We should 
 just treat it the same way as if OS exhausts memory and kill some tasks to 
 free up memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5963) ShuffleHandler DB schema should be versioned with compatible/incompatible changes

2014-07-22 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5963:
--

   Resolution: Fixed
Fix Version/s: 2.6.0
   3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks, Junping!  I committed this to trunk and branch-2.

 ShuffleHandler DB schema should be versioned with compatible/incompatible 
 changes
 -

 Key: MAPREDUCE-5963
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5963
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Affects Versions: 2.4.1
Reporter: Junping Du
Assignee: Junping Du
 Fix For: 3.0.0, 2.6.0

 Attachments: MAPREDUCE-5963-v2.1.patch, MAPREDUCE-5963-v2.patch, 
 MAPREDUCE-5963.patch


 ShuffleHandler persists job shuffle info into a DB schema, which should be 
 versioned with compatible/incompatible changes to support rolling upgrade.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-1283) Support including 3rd party jars supplied in lib/ folder of eclipse project in hadoop jar

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-1283.
-

Resolution: Incomplete

Likely stale.

 Support including 3rd party jars supplied in lib/ folder of eclipse project 
 in hadoop jar
 -

 Key: MAPREDUCE-1283
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1283
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/eclipse-plugin
 Environment: Any
Reporter: Amit Nithian
Priority: Minor
 Attachments: jarmodule.patch


 Currently, the eclipse plugin only exports the generated class files to the 
 hadoop jar but if there are any 3rd party jars specified in the lib/ folder, 
 they should also get packaged in the jar for submission to the cluster. 
 Currently this has to be done manually which can slow down development. I am 
 working on a patch to the current plugin to support this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-401) du fails on Ubuntu in TestJobHistory

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-401.


Resolution: Fixed

Likely fixed forever ago.

 du fails on Ubuntu in TestJobHistory
 

 Key: MAPREDUCE-401
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-401
 Project: Hadoop Map/Reduce
  Issue Type: Bug
 Environment: Ubuntu 8.10 x86_64, lots of RAM and HDD spare, clean 
 SVN_HEAD of trunk
Reporter: Steve Loughran
Priority: Minor

 TestJobHistory.testJobHistoryUserLogLocation is failing, and there is an 
 error in the log related to du failing in the mini MR cluster



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-111) JobTracker.getSystemDir throws NPE if it is called during intialization

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-111.


Resolution: Not a Problem

 JobTracker.getSystemDir throws NPE if it is called during intialization
 ---

 Key: MAPREDUCE-111
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-111
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Amareshwari Sriramadasu

 JobTracker.getSystemDir throws NPE if it is called during intialization.
 It should check if fileSystem is null and throw IllegalStateException, as in 
 getFilesystemName method.
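
 A minimal sketch of the suggested guard, mirroring the getFilesystemName behaviour; 
 the class and field names are illustrative:
{code:java}
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SystemDirGuardSketch {
  private FileSystem fs;        // assigned during initialization
  private Path systemDir;       // assigned during initialization

  // Fail fast with a meaningful exception instead of an NPE while the
  // JobTracker is still initializing, mirroring getFilesystemName().
  public synchronized String getSystemDir() {
    if (fs == null) {
      throw new IllegalStateException("FileSystem is not ready yet!");
    }
    return fs.makeQualified(systemDir).toString();
  }
}
{code}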



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-5985) native-task: Fix build on macosx

2014-07-22 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved MAPREDUCE-5985.


  Resolution: Fixed
Hadoop Flags: Reviewed

 native-task: Fix build on macosx
 

 Key: MAPREDUCE-5985
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5985
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: task
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Attachments: MAPREDUCE-5985.v1.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (MAPREDUCE-311) JobClient should use multiple volumes as hadoop.tmp.dir

2014-07-22 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070802#comment-14070802
 ] 

Allen Wittenauer edited comment on MAPREDUCE-311 at 7/22/14 8:11 PM:
-

It's likely too late to change hadoop.tmp.dir.  But this is still an issue.  
Debating opening a new JIRA (under YARN) that states the problem but not a 
solution so that hadoop.tmp.dir is left alone.


was (Author: aw):
It's likely too late to change hadoop.tmp.dir.  But this is still an issue.  
Debating opening a new JIRA that states the problem but not a solution so that 
hadoop.tmp.dir is left alone.

 JobClient should use multiple volumes as hadoop.tmp.dir
 ---

 Key: MAPREDUCE-311
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-311
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
 Environment: All
Reporter: Milind Bhandarkar

 Currently, hadoop.tmp.dir configuration variable allows specification of only 
 a single directory to be used as scratch space. In particular, on the job 
 launcher nodes with multiple volumes, this fails the entire job if the 
 tmp.dir is somehow unusable. When the job launcher nodes have multiple 
 volumes, the tmp space availability can be improved by using multiple volumes 
 (either randomly or in round-robin.) The code for choosing a volume from a 
 comma-separated list of multiple volumes is already there for 
 mapred.local.dir etc. That needs to be used by job client as well.
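
 A small sketch of the requested behaviour, assuming hadoop.tmp.dir were allowed to 
 hold a comma-separated list (it is not today); the names are illustrative:
{code:java}
import java.util.Random;

public class TmpDirPickerSketch {
  private static final Random RANDOM = new Random();

  // Accept a comma-separated list of scratch directories and pick one entry,
  // in the same spirit as the existing mapred.local.dir handling.
  static String pickTmpDir(String commaSeparatedDirs) {
    String[] dirs = commaSeparatedDirs.split(",");
    return dirs[RANDOM.nextInt(dirs.length)].trim();
  }
}
{code}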



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-1775) Streaming should use hadoop.tmp.dir instead of stream.tmpdir

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-1775:


Labels: newbie  (was: )

 Streaming should use hadoop.tmp.dir instead of stream.tmpdir
 

 Key: MAPREDUCE-1775
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1775
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/streaming
 Environment: All
Reporter: Milind Bhandarkar
Priority: Minor
  Labels: newbie

 Hadoop streaming currently uses stream.tmpdir (on the job-client side) to 
 create jars to be submitted etc. This only adds complexity to site-specific 
 configuration files. Instead, it should use hadoop.tmp.dir configuration 
 variable.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-5991) native-task should not run unit tests if native profile is not enabled

2014-07-22 Thread Todd Lipcon (JIRA)
Todd Lipcon created MAPREDUCE-5991:
--

 Summary: native-task should not run unit tests if native profile 
is not enabled
 Key: MAPREDUCE-5991
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5991
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Todd Lipcon


Currently, running mvn test without the 'native' profile enabled causes all 
of the native-task tests to fail. In order to integrate to trunk, we need to 
fix this - either using JUnit Assume commands in each test that depends on 
native code, or disabling the tests from the pom unless -Pnative is specified
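
A minimal sketch of the JUnit Assume option, assuming NativeCodeLoader is a reasonable
proxy for the native library having been built with -Pnative; the test class is illustrative:
{code:java}
import static org.junit.Assume.assumeTrue;

import org.apache.hadoop.util.NativeCodeLoader;
import org.junit.Before;
import org.junit.Test;

public class TestNativeTaskGuardSketch {
  // If the native library is not loaded (e.g. the build ran without -Pnative),
  // the tests below are reported as skipped instead of failed.
  @Before
  public void requireNativeCode() {
    assumeTrue(NativeCodeLoader.isNativeCodeLoaded());
  }

  @Test
  public void testSomethingNative() {
    // placeholder; real native-task assertions would go here
  }
}
{code}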



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-311) JobClient should use multiple volumes as hadoop.tmp.dir

2014-07-22 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070802#comment-14070802
 ] 

Allen Wittenauer commented on MAPREDUCE-311:


It's likely too late to change hadoop.tmp.dir.  But this is still an issue.  
Debating opening a new JIRA that states the problem but not a solution so that 
hadoop.tmp.dir is left alone.

 JobClient should use multiple volumes as hadoop.tmp.dir
 ---

 Key: MAPREDUCE-311
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-311
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
 Environment: All
Reporter: Milind Bhandarkar

 Currently, hadoop.tmp.dir configuration variable allows specification of only 
 a single directory to be used as scratch space. In particular, on the job 
 launcher nodes with multiple volumes, this fails the entire job if the 
 tmp.dir is somehow unusable. When the job launcher nodes have multiple 
 volumes, the tmp space availability can be improved by using multiple volumes 
 (either randomly or in round-robin.) The code for choosing a volume from a 
 comma-separated list of multiple volumes is already there for 
 mapred.local.dir etc. That needs to be used by job client as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-5992) native-task test logs should not write to console

2014-07-22 Thread Todd Lipcon (JIRA)
Todd Lipcon created MAPREDUCE-5992:
--

 Summary: native-task test logs should not write to console
 Key: MAPREDUCE-5992
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5992
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Todd Lipcon
Assignee: Todd Lipcon


Most of our unit tests are configured with a log4j.properties test resource so 
they don't spout a bunch of output to the console. We need to do the same for 
native-task.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-537) Instrument events in the capacity scheduler for collecting metrics information

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-537.


Resolution: Incomplete

I'm going to close this as stale.

 Instrument events in the capacity scheduler for collecting metrics information
 --

 Key: MAPREDUCE-537
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-537
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Hemanth Yamijala
 Attachments: metrics_implementation_With_time_window.patch


 We need to instrument various events in the capacity scheduler so that we can 
 collect metrics about them. This data will help us determine improvements to 
 scheduling strategies itself.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-528) NPE in jobqueue_details.jsp page if scheduler has not started

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-528.


Resolution: Won't Fix

 NPE in jobqueue_details.jsp page if scheduler has not started
 -

 Key: MAPREDUCE-528
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-528
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Ramya Sunil
Priority: Minor
 Attachments: screenshot-1.jpg


 NullPointerException is observed in jobqueue_details.jsp page if the 
 scheduler has not yet started



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-5993) native-task: simplify/remove dead code

2014-07-22 Thread Todd Lipcon (JIRA)
Todd Lipcon created MAPREDUCE-5993:
--

 Summary: native-task: simplify/remove dead code
 Key: MAPREDUCE-5993
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5993
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Todd Lipcon


The native task code has a bunch of code in it which isn't related to the map 
output collector. I suspect much of this is dead code. Let's remove it before 
we merge, so that the amount of code we have to maintain going forward is more 
limited.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-2166) map.input.file is not set

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-2166.
-

Resolution: Not a Problem

 map.input.file is not set
 ---

 Key: MAPREDUCE-2166
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2166
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: task
Reporter: Rares Vernica
Priority: Minor

 Hadoop does not set the map.input.file variable. I tried the following and 
 all I get is null.
 public class Map extends Mapper<Object, Text, LongWritable, Text> {
     public void map(Object key, Text value, Context context)
             throws IOException, InterruptedException {
         Configuration conf = context.getConfiguration();
         System.out.println(conf.get("map.input.file"));
     }
     protected void setup(Context context) throws IOException,
             InterruptedException {
         Configuration conf = context.getConfiguration();
         System.out.println(conf.get("map.input.file"));
     }
 }
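
 With the new (org.apache.hadoop.mapreduce) API, the usual way to get the current input 
 file is from the task's InputSplit rather than a configuration property; a small sketch, 
 with an illustrative class name:
{code:java}
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class InputFileAwareMapper extends Mapper<Object, Text, LongWritable, Text> {
  // With the new API the current input file is usually taken from the task's
  // InputSplit rather than the map.input.file property, which (as this report
  // shows) is not reliably set.
  @Override
  protected void setup(Context context) throws IOException, InterruptedException {
    if (context.getInputSplit() instanceof FileSplit) {
      System.out.println(((FileSplit) context.getInputSplit()).getPath());
    }
  }
}
{code}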



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-301) mapred.child.classpath.extension property

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-301.


Resolution: Fixed

Already fixed via other means.

 mapred.child.classpath.extension property
 -

 Key: MAPREDUCE-301
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-301
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Klaas Bosteels

 It would be useful to be able to extend the classpath for the task processes 
 on a job per job basis via a {{mapred.child.classpath.extension}} property.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-352) Avoid creating JobInProgress objects before Access checks and Queues checks are done in JobTracker submitJob

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-352.


Resolution: Incomplete

Stale with YAWN.  I mean YARN.

 Avoid creating JobInProgress objects before Access checks and Queues checks 
 are done in JobTracker submitJob 
 -

 Key: MAPREDUCE-352
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-352
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: rahul k singh

 In JobTracker.submitJob, a JobInProgress instance gets created; after this, 
 checks are done for access and queue state. If the checks fail, there 
 isn't any use for these JIP objects, so in the event of failure the only reason 
 these objects were created was to get conf data before being deleted.
 We need to fetch only the information required to do the checks instead of 
 creating a JobInProgress object.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5992) native-task test logs should not write to console

2014-07-22 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070843#comment-14070843
 ] 

Todd Lipcon commented on MAPREDUCE-5992:


I realized it's not a log4j issue at all. The native code logs directly to 
stderr without going through log4j. We should see if we can tie it into log4j 
via JNI.

 native-task test logs should not write to console
 -

 Key: MAPREDUCE-5992
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5992
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: task
Reporter: Todd Lipcon
Assignee: Todd Lipcon

 Most of our unit tests are configured with a log4j.properties test resource 
 so they don't spout a bunch of output to the console. We need to do the same 
 for native-task.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-562) A single slow (but not dead) map TaskTracker impedes MapReduce progress

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-562.


Resolution: Incomplete

This is still an interesting issue, but at this point, I feel the need to close 
this one.  The big reason being that this problem needs to be generalized for 
YARN and made much less MR specific.


 A single slow (but not dead) map TaskTracker impedes MapReduce progress
 ---

 Key: MAPREDUCE-562
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-562
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Aaron Kimball

 We see cases where there may be a large number of mapper nodes running many 
 tasks (e.g., a thousand). The reducers will pull 980 of the map task 
 intermediate files down, but will be unable to retrieve the final 
 intermediate shards from the last node. The TaskTracker on that node returns 
 data to reducers either slowly or not at all, but its heartbeat messages make 
 it back to the JobTracker -- so the JobTracker doesn't mark the tasks as 
 failed. Manually stopping the offending TaskTracker works to migrate the 
 tasks to other nodes, where the shuffling process finishes very quickly. Left 
 on its own, it can take hours to unjam itself otherwise.
 We need a mechanism for reducers to provide feedback to the JobTracker that 
 one of the mapper nodes should be regarded as lost.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-5994) native-task: TestBytesUtil fails

2014-07-22 Thread Todd Lipcon (JIRA)
Todd Lipcon created MAPREDUCE-5994:
--

 Summary: native-task: TestBytesUtil fails
 Key: MAPREDUCE-5994
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5994
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Todd Lipcon


This class appears to have some bugs. Two tests fail consistently on my system. 
BytesUtil itself appears to duplicate a lot of code from guava - we should 
probably just use the Guava functions.
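
A small example of the Guava primitives helpers the comment refers to; whether they can
replace BytesUtil directly depends on the byte order the native side expects, which is an
assumption to verify:
{code:java}
import com.google.common.primitives.Ints;
import com.google.common.primitives.Longs;

public class GuavaBytesSketch {
  // Guava's primitive helpers cover the common int/long <-> byte[] conversions;
  // note they are big-endian, which needs to match what the native side expects.
  public static void main(String[] args) {
    byte[] intBytes = Ints.toByteArray(42);        // 4-byte big-endian encoding
    byte[] longBytes = Longs.toByteArray(42L);     // 8-byte big-endian encoding
    System.out.println(Ints.fromByteArray(intBytes));
    System.out.println(Longs.fromByteArray(longBytes));
  }
}
{code}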



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-385) pipes does not allow jobconf values containing commas

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-385.


Resolution: Won't Fix

-jobconf be dead, yo.

 pipes does not allow jobconf values containing commas
 -

 Key: MAPREDUCE-385
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-385
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Christian Kunz
Assignee: Christian Kunz
 Attachments: patch.HADOOP-6006, patch.HADOOP-6006.0.18


 Currently hadoop pipes does not allow a
 -jobconf key=value,key=value...
 commandline parameter with one or more commas in one of the values of the 
 key-value pairs.
 One use case is key=mapred.join.expr, where the value is required to have 
 commas.
 And it is not always convenient to add this to a configuration file.
 Submitter.java could easily be changed to check for backslash in front of a 
 comma before using it as a delimiter.
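
A hedged sketch of the kind of change being suggested; the helper and regex below are illustrative, not the actual Submitter.java code. The idea is to split only on commas that are not preceded by a backslash, then drop the escape:

{code}
import java.util.ArrayList;
import java.util.List;

public class JobconfSplitSketch {
  // Split "key=value" pairs on commas, unless the comma is escaped as "\,".
  static List<String> splitUnescaped(String jobconf) {
    List<String> pairs = new ArrayList<>();
    for (String part : jobconf.split("(?<!\\\\),")) {
      pairs.add(part.replace("\\,", ","));  // remove the escape character
    }
    return pairs;
  }

  public static void main(String[] args) {
    // A mapred.join.expr value containing commas, escaped on the command line.
    System.out.println(splitUnescaped("a=1,mapred.join.expr=inner(x\\,y),b=2"));
    // prints [a=1, mapred.join.expr=inner(x,y), b=2]
  }
}
{code}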



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-3) Set mapred.child.ulimit automatically to the value of the RAM limits for a job, if they are set

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-3.
--

Resolution: Fixed

Who sets ulimit anymore? No one. Why? cgroups and /proc-based memory limits. 
Closing as stale.

 Set mapred.child.ulimit automatically to the value of the RAM limits for a 
 job, if they are set
 ---

 Key: MAPREDUCE-3
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Hemanth Yamijala

 Memory based monitoring and scheduling allow users to set memory limits for 
 the tasks of their jobs. This parameter is the total memory taken by the 
 task, and any children it may launch (e.g. in the case of streaming). A 
 related parameter is mapred.child.ulimit which is a hard limit on the memory 
 used by a single process of the entire task tree. For user convenience, it 
 would be sensible for the system to set the ulimit to at least the memory 
 required by the task, if the user has specified the latter.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-171) TestJobTrackerRestartWithLostTracker sometimes fails while validating history.

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-171.


Resolution: Fixed

I'm closing this as stale at this point.

 TestJobTrackerRestartWithLostTracker sometimes fails while validating history.
 --

 Key: MAPREDUCE-171
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-171
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker, test
Reporter: Amareshwari Sriramadasu
 Attachments: 
 TEST-org.apache.hadoop.mapred.TestJobTrackerRestartWithLostTracker.txt


 TestJobTrackerRestartWithLostTracker fails with following error
 Duplicate START_TIME seen for task task_200906151249_0001_m_01 in history 
 file at line 54
 junit.framework.AssertionFailedError: Duplicate START_TIME seen for task 
 task_200906151249_0001_m_01 in history file at line 54
   at 
 org.apache.hadoop.mapred.TestJobHistory$TestListener.handle(TestJobHistory.java:161)
   at org.apache.hadoop.mapred.JobHistory.parseLine(JobHistory.java:335)
   at 
 org.apache.hadoop.mapred.JobHistory.parseHistoryFromFS(JobHistory.java:299)
   at 
 org.apache.hadoop.mapred.TestJobHistory.validateJobHistoryFileFormat(TestJobHistory.java:478)
   at 
 org.apache.hadoop.mapred.TestJobTrackerRestartWithLostTracker.testRecoveryWithLostTracker(TestJobTrackerRestartWithLostTracker.java:116)
   at 
 org.apache.hadoop.mapred.TestJobTrackerRestartWithLostTracker.testRestartWithLostTracker(TestJobTrackerRestartWithLostTracker.java:162)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-103) The TaskTracker's shell environment should not be passed to the children.

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-103.


Resolution: Fixed

task-controller/container-executor should have fixed this. Closing.

 The TaskTracker's shell environment should not be passed to the children.
 -

 Key: MAPREDUCE-103
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-103
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security
Reporter: Owen O'Malley

 HADOOP-2838 and HADOOP-5981 added support to make the TaskTracker's shell 
 environment available to the tasks. This has two problems:
   1. It makes the task tracker's environment part of the interface to the 
 task, which is fairly brittle.
   2. Security code typically only passes along whitelisted environment 
 variables instead of everything to prevent accidental leakage from the 
 administrator's account.
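
A hedged sketch of the whitelisting approach described in point 2 (the whitelist contents are assumptions; a real task launcher would build the child environment itself rather than shell out to env):

{code}
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class EnvWhitelistSketch {
  // Only these parent variables are forwarded to the child task; everything
  // else inherited from the TaskTracker's shell environment is dropped.
  private static final List<String> WHITELIST =
      Arrays.asList("PATH", "JAVA_HOME", "HADOOP_HOME", "LANG", "TZ");

  public static void main(String[] args) throws Exception {
    ProcessBuilder pb = new ProcessBuilder("env");
    Map<String, String> childEnv = pb.environment();
    childEnv.keySet().retainAll(WHITELIST);  // strip non-whitelisted variables
    pb.inheritIO().start().waitFor();        // child now sees only the whitelist
  }
}
{code}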



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-459) Elegant decommission of lightly loaded tasktrackers from a map-reduce cluster

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-459.


Resolution: Incomplete

Me too. Closing.

 Elegant decommission of lightly loaded tasktrackers from a map-reduce cluster
 

 Key: MAPREDUCE-459
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-459
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: dhruba borthakur
Assignee: Namit Jain

 There is a need to elegantly move some machines from one map-reduce cluster 
 to another. This JIRA is to discuss how to find lightly loaded tasktrackers 
 that are candidates for decommissioning and then to elegantly decommission 
 them by waiting for existing tasks to finish.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-442) Ability to re-configure hadoop daemons online

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-442.


Resolution: Duplicate

I'm going to dupe this to HADOOP-7001, since it's closer to reality.  Other 
jiras tend to point to it as well.

 Ability to re-configure hadoop daemons online
 -

 Key: MAPREDUCE-442
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-442
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Amar Kamat

 Example : 
 Like we have _bin hadoop mradmin -refreshNodes_, we should also have _bin 
 hadoop mradmin -reconfigure_, which re-configures MR while the cluster is 
 online. A few parameters, like job-expiry-interval, can be changed in this 
 way without having to restart the whole cluster. 
 The master, once reconfigured, can ask the slaves to reconfigure (reload their 
 config) from a well-defined location on HDFS or via heartbeat. 
 We can have some whitelisted configs that carry a _reloadable_ property. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-388) pipes combiner has a large memory footprint

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-388.


Resolution: Incomplete

Closing this as stale.

 pipes combiner has a large memory footprint
 ---

 Key: MAPREDUCE-388
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-388
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Christian Kunz

 Pipes combiner implementation can have a huge memory overhead compared to the 
 spill size. How much depends on the record size. E.g., an application asks 
 for 2GB memory when io.sort.mb=500, key is 16 bytes, and value is 4 bytes.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-441) TestMapReduceJobControl.testJobControlWithKillJob timedout in of the hudson runs

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-441.


Resolution: Incomplete

Almost certainly stale.

 TestMapReduceJobControl.testJobControlWithKillJob timedout in of the hudson 
 runs
 

 Key: MAPREDUCE-441
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-441
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 0.21.0
Reporter: Amareshwari Sriramadasu

 TestMapReduceJobControl.testJobControlWithKillJob timedout in of the hudson 
 runs @
 http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/530/testReport/org.apache.hadoop.mapreduce.lib.jobcontrol/TestMapReduceJobControl/testJobControlWithKillJob/



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-653) distcp can support bandwidth limiting

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-653.


Resolution: Won't Fix

DistCp v2 does this now. Closing as won't fix.

 distcp can support bandwidth limiting
 -

 Key: MAPREDUCE-653
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-653
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: distcp
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
 Attachments: d_bw.patch, d_bw.v1.patch, d_bw.v2.patch


 distcp should support an option for the user to specify the bandwidth limit for 
 the distcp job.
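
For reference, a hedged usage sketch of the DistCp v2 option referred to above (the value is illustrative; check the distcp help output for the exact flag on your version):

{code}
# Limit each map to roughly 10 MB/s while copying
hadoop distcp -bandwidth 10 hdfs://nn1/src hdfs://nn2/dst
{code}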



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-533) Support task preemption in Capacity Scheduler

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-533.


Resolution: Duplicate

 Support task preemption in Capacity Scheduler
 -

 Key: MAPREDUCE-533
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-533
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: capacity-sched
Reporter: Tsz Wo Nicholas Sze

 Without preemption, it is not possible to guarantee capacity since long 
 running jobs may occupy task slots for an arbitrarily long time.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-563) Security features for Map/Reduce

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-563.


Resolution: Fixed

All your jobs are belong to someone who has a Kerberos principal.

 Security features for Map/Reduce
 

 Key: MAPREDUCE-563
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-563
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Owen O'Malley

 This is a top-level tracking JIRA for security work we are doing in 
 Map/reduce. Please add reference to this when opening new security related 
 JIRAs.
  
 Logically a subpiece of HADOOP-4487.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-564) Provide a way for the client to get the number of currently running maps/reduces

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-564.


Resolution: Incomplete

Probably stale.  See also comments about new API.

 Provide a way for the client to get the number of currently running 
 maps/reduces
 

 Key: MAPREDUCE-564
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-564
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker
Affects Versions: 0.21.0
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
 Attachments: MR-564.patch, MR-564.v1.patch, MR-564.v2.patch, 
 MR-564.v3.patch, MR-564.v4.1.patch, MR-564.v4.2.patch, MR-564.v4.patch


 Add counters for Number of Succeeded Maps and Number of Succeeded Reduces so 
 that the client can get these numbers without iterating through all the task 
 reports while the job is in progress.
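
A hedged sketch of how a client reads job-level counters today; the succeeded-task counters this JIRA asks for are not assumed to exist, so the example only uses JobCounter values that are already part of the new API:

{code}
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobCounter;

public class JobCountersSketch {
  // Assumes 'job' is an already-submitted org.apache.hadoop.mapreduce.Job.
  static void printMapCounters(Job job) throws Exception {
    Counters counters = job.getCounters();
    long launched = counters.findCounter(JobCounter.TOTAL_LAUNCHED_MAPS).getValue();
    long failed = counters.findCounter(JobCounter.NUM_FAILED_MAPS).getValue();
    System.out.println("launched maps = " + launched + ", failed maps = " + failed);
  }
}
{code}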



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-736) Undefined variable is treated as string.

2014-07-22 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-736.


Resolution: Incomplete

Stale?

 Undefined variable is treated as string.
 

 Key: MAPREDUCE-736
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-736
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Suman Sehgal
Priority: Minor
 Attachments: hadoop_env.txt


 This issue is related to HADOOP-2838.
 For X=$X:Y (append Y to X, where X should be taken from the tasktracker): if 
 we append to an undefined variable, then the value of the undefined variable 
 should be treated as blank. 
 e.g. NEW_PATH=$NEW_PATH2:/tmp should be displayed as 
 :/tmp in the child's environment, 
 but the variable is instead displayed as the literal string ($NEW_PATH2:/tmp) in the 
 environment.
  This happens only with the default task-controller. The same scenario 
 works fine with the linux task-controller.
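
A hedged sketch (illustrative only, not the task-controller code) of the intended expansion: an undefined parent variable should expand to the empty string rather than survive as a literal $NAME token:

{code}
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EnvExpandSketch {
  // Replace every $NAME with its value from the parent environment, or "" if undefined.
  static String expand(String value, Map<String, String> parentEnv) {
    Matcher m = Pattern.compile("\\$(\\w+)").matcher(value);
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
      String replacement = parentEnv.getOrDefault(m.group(1), "");
      m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
    }
    m.appendTail(sb);
    return sb.toString();
  }

  public static void main(String[] args) {
    // NEW_PATH2 is undefined, so the expected output is ":/tmp".
    System.out.println(expand("$NEW_PATH2:/tmp", Collections.<String, String>emptyMap()));
  }
}
{code}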



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-1767) Streaming infrastructure should provide statistics about jobs

2014-07-22 Thread Antonio Piccolboni (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071061#comment-14071061
 ] 

Antonio Piccolboni commented on MAPREDUCE-1767:
---

Such as?

 Streaming infrastructure should provide statistics about jobs
 ---

 Key: MAPREDUCE-1767
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1767
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/streaming
Reporter: arkady borkovsky

 This should include
 -- the commands (mapper and reducer commands) executed
 -- time information (e.g. min, max, and avg start time, end time, elapsed 
 time for tasks, total elapsed time )
 -- sizes -- bytes and records, min, max, avg per task and total, input and 
 output
 -- information about input and output data sets (all output data sets, if 
 there are several)
 -- all user counters (when they are implemented for streaming)
 the information should be stored in a file -- e.g. in the working directory 
 from where the job was launched, with a name derived from the job name



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5979) MR1 FairScheduler zero weight can cause sort failures

2014-07-22 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071082#comment-14071082
 ] 

Karthik Kambatla commented on MAPREDUCE-5979:
-

+1

 MR1 FairScheduler zero weight can cause sort failures
 -

 Key: MAPREDUCE-5979
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5979
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: scheduler
Affects Versions: 1.2.1
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: MAPREDUCE-5979.001.patch, MAPREDUCE-5979.002.patch


 When the weight is set to zero (which is possible with a custom weight 
 adjuster) we can get failures in comparing schedulables.
 This is because calculating the running-tasks-to-weight ratio can result 
 in 0.0/0.0, which ends up as NaN. Comparisons with NaN are undefined, such 
 that (int) Math.signum(NaN - anyNumber) will be 0, causing different criteria 
 to be used in the comparison, which may not be consistent. This results in 
 {{IllegalArgumentException: Comparison method violates its general contract!}}
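
A hedged, self-contained illustration of the failure mode (not FairScheduler code):

{code}
public class NanSignumSketch {
  public static void main(String[] args) {
    double ratio = 0.0 / 0.0;  // runningTasks / weight when the weight is 0.0
    System.out.println(ratio);                            // NaN
    System.out.println((int) Math.signum(ratio - 1.0));   // 0, i.e. "equal"
    System.out.println((int) Math.signum(ratio - 99.0));  // 0 again, so the ordering
                                                          // is not a consistent total order
  }
}
{code}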
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5979) FairScheduler: zero weight can cause sort failures

2014-07-22 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-5979:


Summary: FairScheduler: zero weight can cause sort failures  (was: MR1 
FairScheduler zero weight can cause sort failures)

 FairScheduler: zero weight can cause sort failures
 --

 Key: MAPREDUCE-5979
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5979
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: scheduler
Affects Versions: 1.2.1
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: MAPREDUCE-5979.001.patch, MAPREDUCE-5979.002.patch


 When the weight is set to zero (which is possible with a custom weight 
 adjuster) we can get failures in comparing schedulables.
 This is because calculating the running-tasks-to-weight ratio can result 
 in 0.0/0.0, which ends up as NaN. Comparisons with NaN are undefined, such 
 that (int) Math.signum(NaN - anyNumber) will be 0, causing different criteria 
 to be used in the comparison, which may not be consistent. This results in 
 {{IllegalArgumentException: Comparison method violates its general contract!}}
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5979) FairScheduler: zero weight can cause sort failures

2014-07-22 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-5979:


   Resolution: Fixed
Fix Version/s: 1.3.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks Anubhav. Just committed this to branch-1. 

 FairScheduler: zero weight can cause sort failures
 --

 Key: MAPREDUCE-5979
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5979
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: scheduler
Affects Versions: 1.2.1
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Fix For: 1.3.0

 Attachments: MAPREDUCE-5979.001.patch, MAPREDUCE-5979.002.patch


 When the weight is set to zero (which is possible with a custom weight 
 adjuster) we can get failures in comparing schedulables.
 This is because calculating the running-tasks-to-weight ratio can result 
 in 0.0/0.0, which ends up as NaN. Comparisons with NaN are undefined, such 
 that (int) Math.signum(NaN - anyNumber) will be 0, causing different criteria 
 to be used in the comparison, which may not be consistent. This results in 
 {{IllegalArgumentException: Comparison method violates its general contract!}}
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5966) MR1 FairScheduler use of custom weight adjuster is not thread safe for comparisons

2014-07-22 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071106#comment-14071106
 ] 

Karthik Kambatla commented on MAPREDUCE-5966:
-

Looks like this patch doesn't apply anymore, possibly due to MAPREDUCE-5979. Can 
you please update it? 

Also, I have the following minor comments:
# Reword the following comment to say "Update demands and weights of jobs and 
pools":
{code}
+  // Update demands of jobs and pools and update weights
{code}
# In the test case, I don't think Math.max is required anymore.
{code}
+
+// Until MAPREDUCE-5966 gets fixed we cannot have zero weight set
+return Math.max(curWeight * random, 0.001);
{code}
# We should be able to fit the following in two lines, with {{throws}} on the line 
after the method name:
{code}
+  public void testJobSchedulableSortingWithCustomWeightAdjuster() throws
+  IOException,
+  InterruptedException {
{code}
# Can we make all these variables final and give them capitalized, constant-style 
names? Also, I don't see the need for numRacks and numNodesPerRack.
{code}
+final int iterations = 100;
+int jobCount = 100;
+int numRacks = 100;
+int numNodesPerRack = 2;
+final int totalTaskTrackers = numNodesPerRack * numRacks;
{code}
# We should probably use plain camel case for this variable - {{randomTtid}}

 MR1 FairScheduler use of custom weight adjuster is not thread safe for 
 comparisons
 --

 Key: MAPREDUCE-5966
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5966
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: scheduler
Affects Versions: 1.2.1
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: MAPREDUCE-5966.001.patch


 When comparing JobSchedulables one of the factors is the weight. If someone 
 uses a custom weight adjuster, that may be called multiple times during a 
 sort causing different values to return. That causes a failure in sorting 
 because the weight may change during the sort.
 This reproduces as 
 {code}
 java.io.IOException: java.lang.IllegalArgumentException: Comparison method 
 violates its general contract!
 at java.util.TimSort.mergeHi(TimSort.java:868)
 at java.util.TimSort.mergeAt(TimSort.java:485)
 at java.util.TimSort.mergeCollapse(TimSort.java:410)
 at java.util.TimSort.sort(TimSort.java:214)
 at java.util.TimSort.sort(TimSort.java:173)
 at java.util.Arrays.sort(Arrays.java:659)
 at java.util.Collections.sort(Collections.java:217)
 at 
 org.apache.hadoop.mapred.PoolSchedulable.assignTask(PoolSchedulable.java:163)
 at org.apache.hadoop.mapred.FairScheduler.assignTasks(FairScheduler.java:499)
 at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2961)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5974) Allow map output collector fallback

2014-07-22 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071174#comment-14071174
 ] 

Chris Douglas commented on MAPREDUCE-5974:
--

bq. Doing fallback as the records are emitted would be pretty neat, but may 
also be somewhat difficult. [snip]

*nod* Fair enough, though if each MapTask is making independent decisions about 
the collector, they still need to agree on the format for the shuffle. Spilling 
one collector to disk and changing strategies should be compatible, assuming 
there isn't a different format for intermediate spills. But yeah, this is very 
abstract, given the use cases we have.

If the goal is to support a fallback collector when native libs aren't 
available then, given the dependency on the intermediate format, should the swap be 
internal to the native collector, even in init? If the interface were like the 
serialization, then one might use the key type, etc. to pick the 
most-appropriate collector. As for failover, I'm struggling to come up with a case 
that's not covered by making this an internal detail of the native collector.

 Allow map output collector fallback
 ---

 Key: MAPREDUCE-5974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5974
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: task
Affects Versions: 2.6.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: mapreduce-5974.txt


 Currently we only allow specifying a single MapOutputCollector implementation 
 class in a job. It would be nice to allow a comma-separated list of classes: 
 we should try each collector implementation in the user-specified order until 
 we find one that can be successfully instantiated and initted.
 This is useful for cases where a particular optimized collector 
 implementation cannot operate on all key/value types, or requires native 
 code. The cluster administrator can configure the cluster to try to use the 
 optimized collector and fall back to the default collector.
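
A hedged sketch of the fallback loop described above (the config key, default class name, and error handling are illustrative, not the attached patch):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ReflectionUtils;

public class CollectorFallbackSketch {
  // Try each configured collector class in order; keep the first one that can
  // be instantiated. A real implementation would also call init() and treat an
  // init failure the same way as an instantiation failure.
  static Object createCollector(Configuration conf) {
    String[] classNames = conf.getStrings(
        "mapreduce.job.map.output.collector.class",  // hypothetical: allow a comma-separated list here
        "org.apache.hadoop.mapred.MapTask$MapOutputBuffer");
    for (String name : classNames) {
      try {
        Class<?> clazz = conf.getClassByName(name);
        return ReflectionUtils.newInstance(clazz, conf);
      } catch (Exception e) {
        // log and fall through to the next candidate
      }
    }
    throw new RuntimeException("No usable map output collector could be created");
  }
}
{code}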



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-5995) native-task: revert changes which expose Text internals

2014-07-22 Thread Todd Lipcon (JIRA)
Todd Lipcon created MAPREDUCE-5995:
--

 Summary: native-task: revert changes which expose Text internals
 Key: MAPREDUCE-5995
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5995
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor


The current branch has some changes to the Text writable which allow it to 
manually set the backing array, capacity, etc. Rather than exposing these 
internals, we should use the newly-committed facility from HADOOP-10855 to 
implement this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (MAPREDUCE-5994) native-task: TestBytesUtil fails

2014-07-22 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned MAPREDUCE-5994:
--

Assignee: Todd Lipcon

Working on removing the redundant functions here

 native-task: TestBytesUtil fails
 

 Key: MAPREDUCE-5994
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5994
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: task
Reporter: Todd Lipcon
Assignee: Todd Lipcon

 This class appears to have some bugs. Two tests fail consistently on my 
 system. BytesUtil itself appears to duplicate a lot of code from guava - we 
 should probably just use the Guava functions.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-5996) native-task: Rename system tests into standard directory layout

2014-07-22 Thread Todd Lipcon (JIRA)
Todd Lipcon created MAPREDUCE-5996:
--

 Summary: native-task: Rename system tests into standard directory 
layout
 Key: MAPREDUCE-5996
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5996
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Todd Lipcon
Assignee: Todd Lipcon


Currently there are a number of tests in src/java/system. This confuses IDEs 
which think that the package should then be system.org.apache.hadoop instead of 
just org.apache.hadoop.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5996) native-task: Rename system tests into standard directory layout

2014-07-22 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071224#comment-14071224
 ] 

Todd Lipcon commented on MAPREDUCE-5996:


There's also a random file called testGlibcBugSpill.out which appears to be 
unused by any tests. I'll remove it in this patch as well.

 native-task: Rename system tests into standard directory layout
 ---

 Key: MAPREDUCE-5996
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5996
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: task
Reporter: Todd Lipcon
Assignee: Todd Lipcon

 Currently there are a number of tests in src/java/system. This confuses IDEs 
 which think that the package should then be system.org.apache.hadoop instead 
 of just org.apache.hadoop.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5994) native-task: TestBytesUtil fails

2014-07-22 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated MAPREDUCE-5994:
---

Attachment: mapreduce-5994.txt

 native-task: TestBytesUtil fails
 

 Key: MAPREDUCE-5994
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5994
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: task
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: mapreduce-5994.txt


 This class appears to have some bugs. Two tests fail consistently on my 
 system. BytesUtil itself appears to duplicate a lot of code from guava - we 
 should probably just use the Guava functions.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5996) native-task: Rename system tests into standard directory layout

2014-07-22 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated MAPREDUCE-5996:
---

Attachment: mapreduce-5996.txt

 native-task: Rename system tests into standard directory layout
 ---

 Key: MAPREDUCE-5996
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5996
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: task
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: mapreduce-5996.txt


 Currently there are a number of tests in src/java/system. This confuses IDEs 
 which think that the package should then be system.org.apache.hadoop instead 
 of just org.apache.hadoop.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-5997) native-task: Use DirectBufferPool from Hadoop Common

2014-07-22 Thread Todd Lipcon (JIRA)
Todd Lipcon created MAPREDUCE-5997:
--

 Summary: native-task: Use DirectBufferPool from Hadoop Common
 Key: MAPREDUCE-5997
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5997
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor


The native task code has its own direct buffer pool, but Hadoop already has an 
implementation. HADOOP-10882 will move that implementation into Common, and 
this JIRA is to remove the duplicate code and use that one instead.
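
A hedged usage sketch of the Common implementation (the package and method names reflect my reading of HADOOP-10882; treat them as assumptions):

{code}
import java.nio.ByteBuffer;
import org.apache.hadoop.util.DirectBufferPool;

public class DirectBufferPoolSketch {
  public static void main(String[] args) {
    DirectBufferPool pool = new DirectBufferPool();
    ByteBuffer buf = pool.getBuffer(64 * 1024);  // borrow a 64 KB direct buffer
    try {
      buf.putInt(42);  // ... fill and drain the buffer ...
    } finally {
      pool.returnBuffer(buf);  // recycle it instead of allocating a new one
    }
  }
}
{code}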



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5974) Allow map output collector fallback

2014-07-22 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071247#comment-14071247
 ] 

Todd Lipcon commented on MAPREDUCE-5974:


In the case of the native collector, it's still the same IFile format on disk, 
and the same reducer. I'm not sure whether that's the case with other map 
collectors out there (e.g. from vendors), but I seem to recall some folks working 
on a specialized collector for memcmp-able keys. In that case, it might be nice 
to have a priority list like 
MemcmpableKeyCollector,NativeCollector,DefaultCollector, and each one would 
just throw an exception if it didn't support the types involved.

Implementing this inside the native collector init() method itself might be 
messy -- you'd have to essentially write a wrapper collector and have every 
method delegate to the real implementation. I would hope that the delegation 
would get devirtualized and inlined, but I'm not certain about that. If you're -0 
or -1 on the current approach though, I'm willing to give it a go.

 Allow map output collector fallback
 ---

 Key: MAPREDUCE-5974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5974
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: task
Affects Versions: 2.6.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: mapreduce-5974.txt


 Currently we only allow specifying a single MapOutputCollector implementation 
 class in a job. It would be nice to allow a comma-separated list of classes: 
 we should try each collector implementation in the user-specified order until 
 we find one that can be successfully instantiated and initted.
 This is useful for cases where a particular optimized collector 
 implementation cannot operate on all key/value types, or requires native 
 code. The cluster administrator can configure the cluster to try to use the 
 optimized collector and fall back to the default collector.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

