[jira] [Commented] (MAPREDUCE-2911) Hamster: Hadoop And Mpi on the same cluSTER

2013-03-07 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595680#comment-13595680
 ] 

Harsh J commented on MAPREDUCE-2911:


Where exactly did all the name-calling you refer to happen? I've not noticed it on 
the lists, and there have been a few asks there as well (IIRC), but no negativity 
ever came up in the responses. I do not see any 'bile-spewing' on this very 
ticket either. So which community are you pointing this at?

Thanks for still working on getting this available, though; there are several 
people interested in this!

 Hamster: Hadoop And Mpi on the same cluSTER
 ---

 Key: MAPREDUCE-2911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2911
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv2
Affects Versions: 0.23.0
 Environment: All Unix-Environments
Reporter: Milind Bhandarkar
Assignee: Ralph H Castain
   Original Estimate: 336h
  Remaining Estimate: 336h

 MPI is commonly used for many machine-learning applications. OpenMPI 
 (http://www.open-mpi.org/) is a popular BSD-licensed implementation of MPI. In the 
 past, running MPI applications on a Hadoop cluster was achieved using Hadoop 
 Streaming (http://videolectures.net/nipsworkshops2010_ye_gbd/), but it was 
 kludgy. After the resource-manager separation from the JobTracker in Hadoop, we 
 have all the tools needed to make MPI a first-class citizen on a Hadoop 
 cluster. I am currently working on a patch to make MPI an 
 application master. An initial version of this patch will be available soon 
 (hopefully before September 10). This JIRA will track the development of 
 Hamster: the application master for MPI.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-3685) There are some bugs in implementation of MergeManager

2013-03-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595767#comment-13595767
 ] 

Hudson commented on MAPREDUCE-3685:
---

Integrated in Hadoop-Yarn-trunk #148 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/148/])
MAPREDUCE-3685. Fix bugs in MergeManager to ensure compression codec is 
appropriately used and that on-disk segments are correctly sorted on file-size. 
Contributed by Anty Rao and Ravi Prakash. (Revision 1453365)

 Result = SUCCESS
acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453365
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Merger.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManagerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestMergeManager.java


 There are some bugs in implementation of MergeManager
 -

 Key: MAPREDUCE-3685
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3685
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1
Reporter: anty.rao
Assignee: anty
Priority: Critical
 Fix For: 0.23.7, 2.0.4-beta

 Attachments: MAPREDUCE-3685-branch-0.23.1.patch, 
 MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685-branch-0.23.1.patch, 
 MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, 
 MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, 
 MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, 
 MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, 
 MAPREDUCE-3685.patch, MAPREDUCE-3685.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-3685) There are some bugs in implementation of MergeManager

2013-03-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595832#comment-13595832
 ] 

Hudson commented on MAPREDUCE-3685:
---

Integrated in Hadoop-Hdfs-0.23-Build #546 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/546/])
MAPREDUCE-3685. Fix bugs in MergeManager to ensure compression codec is 
appropriately used and that on-disk segments are correctly sorted on file-size. 
Contributed by Anty Rao and Ravi Prakash. (Revision 1453373)

 Result = UNSTABLE
acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453373
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Merger.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MapOutput.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestMergeManager.java


 There are some bugs in implementation of MergeManager
 -

 Key: MAPREDUCE-3685
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3685
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1
Reporter: anty.rao
Assignee: anty
Priority: Critical
 Fix For: 0.23.7, 2.0.4-beta

 Attachments: MAPREDUCE-3685-branch-0.23.1.patch, 
 MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685-branch-0.23.1.patch, 
 MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, 
 MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, 
 MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, 
 MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, 
 MAPREDUCE-3685.patch, MAPREDUCE-3685.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-3685) There are some bugs in implementation of MergeManager

2013-03-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595854#comment-13595854
 ] 

Hudson commented on MAPREDUCE-3685:
---

Integrated in Hadoop-Hdfs-trunk #1337 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1337/])
MAPREDUCE-3685. Fix bugs in MergeManager to ensure compression codec is 
appropriately used and that on-disk segments are correctly sorted on file-size. 
Contributed by Anty Rao and Ravi Prakash. (Revision 1453365)

 Result = SUCCESS
acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453365
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Merger.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManagerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestMergeManager.java


 There are some bugs in implementation of MergeManager
 -

 Key: MAPREDUCE-3685
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3685
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1
Reporter: anty.rao
Assignee: anty
Priority: Critical
 Fix For: 0.23.7, 2.0.4-beta

 Attachments: MAPREDUCE-3685-branch-0.23.1.patch, 
 MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685-branch-0.23.1.patch, 
 MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, 
 MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, 
 MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, 
 MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, 
 MAPREDUCE-3685.patch, MAPREDUCE-3685.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-2911) Hamster: Hadoop And Mpi on the same cluSTER

2013-03-07 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595881#comment-13595881
 ] 

Arun C Murthy commented on MAPREDUCE-2911:
--

I have the same questions as Harsh.

On May 17, 2012 Ralph said he was close to committing this to OpenMPI, as 
mentioned on this jira:  http://s.apache.org/uY

Where is this 'bile-spewing' and when did it start? 

I'm still looking forward to playing with this.

 Hamster: Hadoop And Mpi on the same cluSTER
 ---

 Key: MAPREDUCE-2911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2911
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv2
Affects Versions: 0.23.0
 Environment: All Unix-Environments
Reporter: Milind Bhandarkar
Assignee: Ralph H Castain
   Original Estimate: 336h
  Remaining Estimate: 336h

 MPI is commonly used for many machine-learning applications. OpenMPI 
 (http://www.open-mpi.org/) is a popular BSD-licensed implementation of MPI. In the 
 past, running MPI applications on a Hadoop cluster was achieved using Hadoop 
 Streaming (http://videolectures.net/nipsworkshops2010_ye_gbd/), but it was 
 kludgy. After the resource-manager separation from the JobTracker in Hadoop, we 
 have all the tools needed to make MPI a first-class citizen on a Hadoop 
 cluster. I am currently working on a patch to make MPI an 
 application master. An initial version of this patch will be available soon 
 (hopefully before September 10). This JIRA will track the development of 
 Hamster: the application master for MPI.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-3685) There are some bugs in implementation of MergeManager

2013-03-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595902#comment-13595902
 ] 

Hudson commented on MAPREDUCE-3685:
---

Integrated in Hadoop-Mapreduce-trunk #1365 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1365/])
MAPREDUCE-3685. Fix bugs in MergeManager to ensure compression codec is 
appropriately used and that on-disk segments are correctly sorted on file-size. 
Contributed by Anty Rao and Ravi Prakash. (Revision 1453365)

 Result = SUCCESS
acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453365
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Merger.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManagerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestMergeManager.java


 There are some bugs in implementation of MergeManager
 -

 Key: MAPREDUCE-3685
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3685
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1
Reporter: anty.rao
Assignee: anty
Priority: Critical
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-3685-branch-0.23.1.patch, 
 MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685-branch-0.23.1.patch, 
 MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, 
 MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, 
 MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, 
 MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, 
 MAPREDUCE-3685.patch, MAPREDUCE-3685.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4993) AM thinks it was killed when an error occurs setting up a task container launch context

2013-03-07 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595972#comment-13595972
 ] 

Jason Lowe commented on MAPREDUCE-4993:
---

Yes, the AM cannot succeed if it cannot create the common task container launch 
context.  However, the whole point of this JIRA is that it should mark the job 
as FAILED or ERROR with an appropriate diagnostic message for the application 
rather than marking the job as KILLED.  The latter status leads users to 
believe someone or something killed the job, which is not the case.

 AM thinks it was killed when an error occurs setting up a task container 
 launch context
 ---

 Key: MAPREDUCE-4993
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4993
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.0.3-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Abhishek Kapoor

 If an IOException occurs while setting up a container launch context for a 
 task then the AM exits with a KILLED status and no diagnostics.  The job 
 should be marked as FAILED (or maybe ERROR) with a useful diagnostics message 
 indicating the nature of the error.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-5050) Cannot find partition.lst in Terasort on Hadoop/Local File System

2013-03-07 Thread Matt Parker (JIRA)
Matt Parker created MAPREDUCE-5050:
--

 Summary: Cannot find partition.lst in Terasort on Hadoop/Local 
File System
 Key: MAPREDUCE-5050
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5050
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Affects Versions: 0.20.2
 Environment: Cloudera VM CDH3u4, VMWare, Linux, Java SE 1.6.0_31-b04
Reporter: Matt Parker
Priority: Minor


I'm trying to simulate running Hadoop on Lustre by configuring it to use the 
local file system on a single Cloudera VM (CDH3u4).

I can generate the data just fine, but when running the sorting portion of the 
program, I get an error about not being able to find the _partition.lst file, 
even though it exists in the generated data directory.

Perusing the TeraSort code, I see that the run method has a Path reference to 
the partition file, which is created with the parent directory.

  public int run(String[] args) throws Exception {
    LOG.info("starting");
    JobConf job = (JobConf) getConf();
    Path inputDir = new Path(args[0]);
    inputDir = inputDir.makeQualified(inputDir.getFileSystem(job));
    Path partitionFile = new Path(inputDir, TeraInputFormat.PARTITION_FILENAME);
    URI partitionUri = new URI(partitionFile.toString() +
        "#" + TeraInputFormat.PARTITION_FILENAME);
    TeraInputFormat.setInputPaths(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.setJobName("TeraSort");
    job.setJarByClass(TeraSort.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    job.setInputFormat(TeraInputFormat.class);
    job.setOutputFormat(TeraOutputFormat.class);
    job.setPartitionerClass(TotalOrderPartitioner.class);
    TeraInputFormat.writePartitionFile(job, partitionFile);
    DistributedCache.addCacheFile(partitionUri, job);
    DistributedCache.createSymlink(job);
    job.setInt("dfs.replication", 1);
    TeraOutputFormat.setFinalSync(job, true);
    JobClient.runJob(job);
    LOG.info("done");
    return 0;
  }

But in the configure method, the Path isn't created with the parent directory 
reference.

public void configure(JobConf job) {
  try {
    FileSystem fs = FileSystem.getLocal(job);
    Path partFile = new Path(TeraInputFormat.PARTITION_FILENAME);
    splitPoints = readPartitions(fs, partFile, job);
    trie = buildTrie(splitPoints, 0, splitPoints.length, new Text(), 2);
  } catch (IOException ie) {
    throw new IllegalArgumentException("can't read paritions file", ie);
  }
}

I modified the code as follows, and now the sorting portion of the TeraSort test 
works using the general file system. I think the above code is a bug.

public void configure(JobConf job) {
  try {
    FileSystem fs = FileSystem.getLocal(job);

    Path[] inputPaths = TeraInputFormat.getInputPaths(job);
    Path partFile = new Path(inputPaths[0],
        TeraInputFormat.PARTITION_FILENAME);

    splitPoints = readPartitions(fs, partFile, job);
    trie = buildTrie(splitPoints, 0, splitPoints.length, new Text(), 2);
  } catch (IOException ie) {
    throw new IllegalArgumentException("can't read paritions file", ie);
  }
}


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-5051) Combiner not used when NUM_REDUCES=0

2013-03-07 Thread Damien Hardy (JIRA)
Damien Hardy created MAPREDUCE-5051:
---

 Summary: Combiner not used when NUM_REDUCES=0
 Key: MAPREDUCE-5051
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5051
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 2.0.2-alpha
 Environment: CDH4.1.2 MR1
Reporter: Damien Hardy


We have an M/R job that uses a Mapper + Combiner but has nothing to do in the Reducer:
bulk indexing of HBase data into ElasticSearch, where the map output
is K / V : #bulk / json_data_to_be_indexed.

So the job is launched, the maps work, the combiners index, and a reducer is created for 
nothing (sometimes waiting for another M/R job to free a tasktracker slot for the 
reducer, cf. MAPREDUCE-5019).

When we put ```job.setNumReduceTasks(0);``` in our job's .run(), the mappers are 
started but the combiner is not used.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-5052) Job History UI and web services confusing job start time and job submit time

2013-03-07 Thread Kendall Thrapp (JIRA)
Kendall Thrapp created MAPREDUCE-5052:
-

 Summary: Job History UI and web services confusing job start time 
and job submit time
 Key: MAPREDUCE-5052
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5052
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.6
Reporter: Kendall Thrapp


The "Start Time" column shown on the Job History server's main webpage 
(http://host:port/jobhistory) is actually showing the *submit* time for 
jobs.  However, when you drill down to an individual job's page, the 
"Start Time" there really does refer to when the job actually started.

This is also true for the web services REST API, where the jobs listing returns 
the submit time as startTime, but the single-job API returns the actual start 
time as startTime.

Two different times being referred to by the same name is confusing.  
However, it is useful to have both times, as the difference between the submit 
time and the start time can show how long a job was stuck waiting in a queue.  The 
column on the main job history page should be changed to "Submit Time", and the 
individual job's page should show both the submit time and the start time.  The web 
services REST API should be updated with these changes as well.
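
For a quick look at the discrepancy, the listing can be pulled straight from the history 
server's REST endpoint. A minimal sketch, assuming the documented 
/ws/v1/history/mapreduce/jobs path, the default web port 19888, and a placeholder 
hostname:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Dump the jobs listing so the submitTime/startTime fields can be compared
// with the per-job resource.  The host name below is a placeholder.
public class JobHistoryListingFetch {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://historyserver.example.com:19888/ws/v1/history/mapreduce/jobs");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestProperty("Accept", "application/json");
    BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream(), "UTF-8"));
    try {
      String line;
      while ((line = in.readLine()) != null) {
        // Per this report, startTime in this listing currently carries the submit time.
        System.out.println(line);
      }
    } finally {
      in.close();
      conn.disconnect();
    }
  }
}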

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (MAPREDUCE-5023) History Server Web Services missing Job Counters

2013-03-07 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash reassigned MAPREDUCE-5023:
---

Assignee: Ravi Prakash

 History Server Web Services missing Job Counters
 

 Key: MAPREDUCE-5023
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.6
Reporter: Kendall Thrapp
Assignee: Ravi Prakash
Priority: Critical

 The History Server's Job Counters API is not returning all the counters seen 
 on the Job's Counters webpage.  Specifically, I'm not seeing any of the 
 counters in the org.apache.hadoop.mapreduce.JobCounter group:
 TOTAL_LAUNCHED_MAPS
 TOTAL_LAUNCHED_REDUCES
 OTHER_LOCAL_MAPS
 SLOTS_MILLIS_MAPS
 SLOTS_MILLIS_REDUCES

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API

2013-03-07 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596115#comment-13596115
 ] 

Tom White commented on MAPREDUCE-5038:
--

+1

 old API CombineFileInputFormat missing fixes that are in new API 
 -

 Key: MAPREDUCE-5038
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch


 The following changes patched the CombineFileInputFormat in mapreduce, but 
 neglected the one in mapred
 MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files
 MAPREDUCE-2021 solved returning duplicate hostnames in split locations
 MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default 
 FS
 In trunk this is not an issue as the one in mapred extends the one in 
 mapreduce.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5049) CombineFileInputFormat counts all compressed files non-splitable

2013-03-07 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596117#comment-13596117
 ] 

Tom White commented on MAPREDUCE-5049:
--

+1

 CombineFileInputFormat counts all compressed files non-splitable
 

 Key: MAPREDUCE-5049
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5049
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-5049.patch


 In branch-1, CombineFileInputFormat doesn't take SplittableCompressionCodec 
 into account and treats all compressed input files as non-splittable.  
 This is a regression from when handling for non-splittable compression codecs 
 was originally added in MAPREDUCE-1597, and it seems to have crept in 
 when the code was pulled from 0.22 to branch-1.
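
For illustration only, a minimal sketch of the kind of check branch-1 needs (not the 
committed patch): a file counts as splittable when it is uncompressed or its codec 
implements SplittableCompressionCodec.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.io.compress.SplittableCompressionCodec;

// Sketch of a per-file splittability test.
public class SplittabilityCheckSketch {
  public static boolean isSplitable(Configuration conf, Path file) {
    CompressionCodec codec = new CompressionCodecFactory(conf).getCodec(file);
    if (codec == null) {
      return true;  // uncompressed files are always splittable
    }
    return codec instanceof SplittableCompressionCodec;  // e.g. bzip2
  }
}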

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5023) History Server Web Services missing Job Counters

2013-03-07 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated MAPREDUCE-5023:


Attachment: MAPREDUCE-5023.patch

Simple enough patch. There's some code duplication between getCounters() in 
CountersBlock.java and JobCounterInfo.java, but I don't know if it can be helped.
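
As a cross-check while testing, the same framework counters the web service should 
expose are reachable through the Java client API. A minimal sketch, with a placeholder 
job id:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Cluster;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobCounter;
import org.apache.hadoop.mapreduce.JobID;

// Print a few org.apache.hadoop.mapreduce.JobCounter values for a finished job.
public class JobCounterCheck {
  public static void main(String[] args) throws Exception {
    Cluster cluster = new Cluster(new Configuration());
    Job job = cluster.getJob(JobID.forName("job_1362000000000_0001"));  // placeholder id
    if (job == null) {
      System.err.println("job not found");
      return;
    }
    Counters counters = job.getCounters();
    System.out.println("TOTAL_LAUNCHED_MAPS    = "
        + counters.findCounter(JobCounter.TOTAL_LAUNCHED_MAPS).getValue());
    System.out.println("TOTAL_LAUNCHED_REDUCES = "
        + counters.findCounter(JobCounter.TOTAL_LAUNCHED_REDUCES).getValue());
    System.out.println("OTHER_LOCAL_MAPS       = "
        + counters.findCounter(JobCounter.OTHER_LOCAL_MAPS).getValue());
  }
}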

 History Server Web Services missing Job Counters
 

 Key: MAPREDUCE-5023
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.6
Reporter: Kendall Thrapp
Assignee: Ravi Prakash
Priority: Critical
 Attachments: MAPREDUCE-5023.patch


 The History Server's Job Counters API is not returning all the counters seen 
 on the Job's Counters webpage.  Specifically, I'm not seeing any of the 
 counters in the org.apache.hadoop.mapreduce.JobCounter group:
 TOTAL_LAUNCHED_MAPS
 TOTAL_LAUNCHED_REDUCES
 OTHER_LOCAL_MAPS
 SLOTS_MILLIS_MAPS
 SLOTS_MILLIS_REDUCES

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-3916) various issues with running yarn proxyserver

2013-03-07 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596136#comment-13596136
 ] 

Suresh Srinivas commented on MAPREDUCE-3916:


Alejandro, all I see in this patch is just a change in the description in 
yarn-default.xml. How does this solve the problem? If it is just a doc update, 
should the title of this jira be updated?

 various issues with running yarn proxyserver
 

 Key: MAPREDUCE-3916
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3916
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, resourcemanager, webapps
Affects Versions: 0.23.1, 2.0.0-alpha, 3.0.0
Reporter: Roman Shaposhnik
Assignee: Devaraj K
Priority: Critical
  Labels: mrv2
 Fix For: 2.0.0-alpha

 Attachments: MAPREDUCE-3916.patch


 Seems like the yarn proxyserver is not operational when running out of the 0.23.1 
 RC2 tarball.
 # Setting yarn.web-proxy.address to match yarn.resourcemanager.address 
 doesn't disable the proxyserver (although not setting yarn.web-proxy.address 
 at all correctly disables it and produces the message 
 org.apache.hadoop.yarn.YarnException: yarn.web-proxy.address is not set so 
 the proxy will not run). This contradicts the documentation provided for 
 yarn.web-proxy.address in yarn-default.xml.
 # Setting yarn.web-proxy.address and running the service results in the 
 following:
 {noformat}
 $ ./sbin/yarn-daemon.sh start proxyserver 
 starting proxyserver, logging to 
 /tmp/hadoop-0.23.1/logs/yarn-rvs-proxyserver-ahmed-laptop.out
 /usr/java/64/jdk1.6.0_22/bin/java -Dproc_proxyserver -Xmx1000m 
 -Dhadoop.log.dir=/tmp/hadoop-0.23.1/logs 
 -Dyarn.log.dir=/tmp/hadoop-0.23.1/logs 
 -Dhadoop.log.file=yarn-rvs-proxyserver-ahmed-laptop.log 
 -Dyarn.log.file=yarn-rvs-proxyserver-ahmed-laptop.log -Dyarn.home.dir= 
 -Dyarn.id.str=rvs -Dhadoop.root.logger=INFO,DRFA -Dyarn.root.logger=INFO,DRFA 
 -Djava.library.path=/tmp/hadoop-0.23.1/lib/native 
 -Dyarn.policy.file=hadoop-policy.xml -Dhadoop.log.dir=/tmp/hadoop-0.23.1/logs 
 -Dyarn.log.dir=/tmp/hadoop-0.23.1/logs 
 -Dhadoop.log.file=yarn-rvs-proxyserver-ahmed-laptop.log 
 -Dyarn.log.file=yarn-rvs-proxyserver-ahmed-laptop.log 
 -Dyarn.home.dir=/tmp/hadoop-0.23.1 -Dhadoop.root.logger=INFO,DRFA 
 -Dyarn.root.logger=INFO,DRFA 
 -Djava.library.path=/tmp/hadoop-0.23.1/lib/native -classpath 
 /tmp/hadoop-0.23.1/etc/hadoop:/tmp/hadoop-0.23.1/etc/hadoop:/tmp/hadoop-0.23.1/etc/hadoop:/tmp/hadoop-0.23.1/share/hadoop/common/lib/*:/tmp/hadoop-0.23.1/share/hadoop/common/*:/tmp/hadoop-0.23.1/share/hadoop/hdfs:/tmp/hadoop-0.23.1/share/hadoop/hdfs/lib/*:/tmp/hadoop-0.23.1/share/hadoop/hdfs/*:/tmp/hadoop-0.23.1/share/hadoop/mapreduce/lib/*:/tmp/hadoop-0.23.1/share/hadoop/mapreduce/*:/tmp/hadoop-0.23.1/share/hadoop/mapreduce/*:/tmp/hadoop-0.23.1/share/hadoop/mapreduce/lib/*
  org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer
 {noformat}
 with the following message found in the logs:
 {noformat}
 2012-02-24 09:26:31,099 FATAL 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxy: Could not start proxy web 
 server
 java.io.FileNotFoundException: webapps/proxy not found in CLASSPATH
 at 
 org.apache.hadoop.http.HttpServer.getWebAppsPath(HttpServer.java:532)
 at org.apache.hadoop.http.HttpServer.<init>(HttpServer.java:224)
 at org.apache.hadoop.http.HttpServer.<init>(HttpServer.java:164)
 at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxy.start(WebAppProxy.java:85)
 at 
 org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
 at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer.main(WebAppProxyServer.java:76)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-3900) mr-jobhistory-daemon.sh should rely on MAPREDUCE env. variables instead of the YARN ones

2013-03-07 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated MAPREDUCE-3900:
---

Issue Type: Improvement  (was: Bug)

 mr-jobhistory-daemon.sh should rely on MAPREDUCE env. variables instead of 
 the YARN ones
 

 Key: MAPREDUCE-3900
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3900
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 0.23.0
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik

 It's nice to see yarn-daemon.sh split into a separate script for managing the MR 
 service(s), but once that has happened we should go all the way and make it 
 configurable as an MR entity.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (MAPREDUCE-3980) mr-jobhistory-daemon.sh should look for mapred script in HADOOP_MAPRED_HOME

2013-03-07 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas resolved MAPREDUCE-3980.


   Resolution: Duplicate
Fix Version/s: 2.0.2-alpha

Resolving it as duplicate as this has been addressed by MAPREDUCE-4649 for 
2.0.2-alpha.

 mr-jobhistory-daemon.sh should look for mapred script in HADOOP_MAPRED_HOME
 ---

 Key: MAPREDUCE-3980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3980
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 0.23.1
Reporter: Roman Shaposhnik
 Fix For: 2.0.2-alpha


 The following:
 {noformat}
 nohup nice -n $YARN_NICENESS $YARN_HOME/bin/mapred --config $YARN_CONF_DIR 
 $command "$@" > "$log" 2>&1 < /dev/null &
 {noformat}
 should be this instead:
 {noformat}
 nohup nice -n $YARN_NICENESS $HADOOP_MAPRED_HOME/bin/mapred --config 
 $YARN_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null &
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5023) History Server Web Services missing Job Counters

2013-03-07 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated MAPREDUCE-5023:


Status: Patch Available  (was: Open)

 History Server Web Services missing Job Counters
 

 Key: MAPREDUCE-5023
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.6
Reporter: Kendall Thrapp
Assignee: Ravi Prakash
Priority: Critical
 Attachments: MAPREDUCE-5023.patch


 The History Server's Job Counters API is not returning all the counters seen 
 on the Job's Counters webpage.  Specifically, I'm not seeing any of the 
 counters in the org.apache.hadoop.mapreduce.JobCounter group:
 TOTAL_LAUNCHED_MAPS
 TOTAL_LAUNCHED_REDUCES
 OTHER_LOCAL_MAPS
 SLOTS_MILLIS_MAPS
 SLOTS_MILLIS_REDUCES

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5051) Combiner not used when NUM_REDUCES=0

2013-03-07 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596218#comment-13596218
 ] 

Robert Joseph Evans commented on MAPREDUCE-5051:


Damien, the combiner only runs as part of the shuffle phase, and the shuffle 
phase only runs when there is a reducer that needs the data to be shuffled.  So 
does your indexing work just fine even if all of the indexes for a given key are 
not in the same file?

If you want just a combiner to run with no reducers configured, you are going 
to have to write something for that yourself.
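
For anyone hitting the same limitation, a minimal sketch of the do-it-yourself route: 
aggregate inside the mapper and emit in cleanup(), so a map-only job (0 reduces) still 
gets combiner-like behaviour. The class name and the per-record key extraction are 
illustrative, not from any patch.

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// In-mapper combining: buffer per-key counts and emit them once per task.
public class InMapperAggregatingMapper
    extends Mapper<LongWritable, Text, Text, LongWritable> {

  private final Map<String, Long> buffer = new HashMap<String, Long>();

  @Override
  protected void map(LongWritable key, Text value, Context context) {
    String k = value.toString();  // replace with the real key extraction
    Long current = buffer.get(k);
    buffer.put(k, current == null ? 1L : current + 1L);
  }

  @Override
  protected void cleanup(Context context)
      throws IOException, InterruptedException {
    // With setNumReduceTasks(0) these records go straight to the output format.
    for (Map.Entry<String, Long> e : buffer.entrySet()) {
      context.write(new Text(e.getKey()), new LongWritable(e.getValue()));
    }
  }
}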

 Combiner not used when NUM_REDUCES=0
 

 Key: MAPREDUCE-5051
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5051
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 2.0.2-alpha
 Environment: CDH4.1.2 MR1
Reporter: Damien Hardy

 We have an M/R job that uses a Mapper + Combiner but has nothing to do in the 
 Reducer: bulk indexing of HBase data into ElasticSearch, where the map output 
 is K / V : #bulk / json_data_to_be_indexed.
 So the job is launched, the maps work, the combiners index, and a reducer is 
 created for nothing (sometimes waiting for another M/R job to free a 
 tasktracker slot for the reducer, cf. MAPREDUCE-5019).
 When we put ```job.setNumReduceTasks(0);``` in our job's .run(), the mappers 
 are started but the combiner is not used.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (MAPREDUCE-5051) Combiner not used when NUM_REDUCES=0

2013-03-07 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans resolved MAPREDUCE-5051.


Resolution: Won't Fix

If you feel strongly that this should be supported you can reopen this JIRA as 
new feature work.

 Combiner not used when NUM_REDUCES=0
 

 Key: MAPREDUCE-5051
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5051
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 2.0.2-alpha
 Environment: CDH4.1.2 MR1
Reporter: Damien Hardy

 We have an M/R job that uses a Mapper + Combiner but has nothing to do in the 
 Reducer: bulk indexing of HBase data into ElasticSearch, where the map output 
 is K / V : #bulk / json_data_to_be_indexed.
 So the job is launched, the maps work, the combiners index, and a reducer is 
 created for nothing (sometimes waiting for another M/R job to free a 
 tasktracker slot for the reducer, cf. MAPREDUCE-5019).
 When we put ```job.setNumReduceTasks(0);``` in our job's .run(), the mappers 
 are started but the combiner is not used.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5023) History Server Web Services missing Job Counters

2013-03-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596228#comment-13596228
 ] 

Hadoop QA commented on MAPREDUCE-5023:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12572564/MAPREDUCE-5023.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3390//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3390//console

This message is automatically generated.

 History Server Web Services missing Job Counters
 

 Key: MAPREDUCE-5023
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.6
Reporter: Kendall Thrapp
Assignee: Ravi Prakash
Priority: Critical
 Attachments: MAPREDUCE-5023.patch


 The History Server's Job Counters API is not returning all the counters seen 
 on the Job's Counters webpage.  Specifically, I'm not seeing any of the 
 counters in the org.apache.hadoop.mapreduce.JobCounter group:
 TOTAL_LAUNCHED_MAPS
 TOTAL_LAUNCHED_REDUCES
 OTHER_LOCAL_MAPS
 SLOTS_MILLIS_MAPS
 SLOTS_MILLIS_REDUCES

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered

2013-03-07 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596289#comment-13596289
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-5042:


In my prelim security work, I once had the JobClient generate the secret and 
then later had the MR AM generate the tokens and reupload the tokens file into 
the submit directory. That was another hop to DFS and we changed that since, 
but this recovery code bug fell through. So there are multiple solutions:
 - Have a single secret but let the client generate it
 - Have a single secret but upload the tokens file for future app-attempts
 - Have multiple tokens

It's future-proof to separate the task and shuffle security secrets, but I'm not 
sure that is tied directly to this one if we consider the reupload solution.

I don't feel strongly about any solution, but one thing we should keep in mind 
is to move as much stuff into the AM so that the client is thinner and enables 
us to do submits via web services.

 Reducer unable to fetch for a map task that was recovered
 -

 Key: MAPREDUCE-5042
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5042
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, security
Affects Versions: 0.23.7, 2.0.5-beta
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
 Attachments: MAPREDUCE-5042.patch, MAPREDUCE-5042.patch


 If an application attempt fails and is relaunched the AM will try to recover 
 previously completed tasks.  If a reducer needs to fetch the output of a map 
 task attempt that was recovered then it will fail with a 401 error like this:
 {noformat}
 java.io.IOException: Server returned HTTP response code: 401 for URL: 
 http://xx:xx/mapOutput?job=job_1361569180491_21845&reduce=0&map=attempt_1361569180491_21845_m_16_0
   at 
 sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1615)
   at 
 org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:231)
   at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:156)
 {noformat}
 Looking at the corresponding NM's logs, we see the shuffle failed due to 
 "Verification of the hashReply failed".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered

2013-03-07 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596332#comment-13596332
 ] 

Jason Lowe commented on MAPREDUCE-5042:
---

I thought about the upload-to-staging-for-future-attempts solution but it 
seemed passing the secret in the job credentials was a bit cleaner and avoided 
the extra HDFS operations.

As for splitting the job token into shuffle and task, I didn't want to change 
the current task authentication behavior.  Allowing an old task attempt to 
authenticate with a new app attempt seemed like it would be a problem waiting 
to happen.  But we need the shuffle secret to persist across app attempts, 
hence the push to split them as part of this change.
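
A rough sketch of the credentials-based idea, purely illustrative (the alias, key 
algorithm, and helper names are made up here, not taken from the attached patch): 
generate the shuffle secret once at submission and carry it in the job Credentials so a 
relaunched attempt can read the same value back.

import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.Credentials;

// Store and retrieve a shuffle secret via the job's Credentials.
public class ShuffleSecretSketch {
  private static final Text SHUFFLE_SECRET_ALIAS = new Text("example.shuffle.secret");

  public static void addShuffleSecret(Credentials credentials) throws Exception {
    SecretKey key = KeyGenerator.getInstance("HmacSHA1").generateKey();
    credentials.addSecretKey(SHUFFLE_SECRET_ALIAS, key.getEncoded());
  }

  public static byte[] getShuffleSecret(Credentials credentials) {
    // Returns null if the secret was never stored (e.g. an older client submitted the job).
    return credentials.getSecretKey(SHUFFLE_SECRET_ALIAS);
  }
}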

 Reducer unable to fetch for a map task that was recovered
 -

 Key: MAPREDUCE-5042
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5042
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, security
Affects Versions: 0.23.7, 2.0.5-beta
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
 Attachments: MAPREDUCE-5042.patch, MAPREDUCE-5042.patch


 If an application attempt fails and is relaunched the AM will try to recover 
 previously completed tasks.  If a reducer needs to fetch the output of a map 
 task attempt that was recovered then it will fail with a 401 error like this:
 {noformat}
 java.io.IOException: Server returned HTTP response code: 401 for URL: 
 http://xx:xx/mapOutput?job=job_1361569180491_21845&reduce=0&map=attempt_1361569180491_21845_m_16_0
   at 
 sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1615)
   at 
 org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:231)
   at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:156)
 {noformat}
 Looking at the corresponding NM's logs, we see the shuffle failed due to 
 "Verification of the hashReply failed".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5023) History Server Web Services missing Job Counters

2013-03-07 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596342#comment-13596342
 ] 

Thomas Graves commented on MAPREDUCE-5023:
--

Ravi can you please make sure this works for the AM also.

 History Server Web Services missing Job Counters
 

 Key: MAPREDUCE-5023
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.6
Reporter: Kendall Thrapp
Assignee: Ravi Prakash
Priority: Critical
 Attachments: MAPREDUCE-5023.patch


 The History Server's Job Counters API is not returning all the counters seen 
 on the Job's Counters webpage.  Specifically, I'm not seeing any of the 
 counters in the org.apache.hadoop.mapreduce.JobCounter group:
 TOTAL_LAUNCHED_MAPS
 TOTAL_LAUNCHED_REDUCES
 OTHER_LOCAL_MAPS
 SLOTS_MILLIS_MAPS
 SLOTS_MILLIS_REDUCES

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5023) History Server Web Services missing Job Counters

2013-03-07 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated MAPREDUCE-5023:


Attachment: MAPREDUCE-5023.patch

Thanks Tom. This new patch makes it work for both the HS and the AM.

 History Server Web Services missing Job Counters
 

 Key: MAPREDUCE-5023
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.6
Reporter: Kendall Thrapp
Assignee: Ravi Prakash
Priority: Critical
 Attachments: MAPREDUCE-5023.patch, MAPREDUCE-5023.patch


 The History Server's Job Counters API is not returning all the counters seen 
 on the Job's Counters webpage.  Specifically, I'm not seeing any of the 
 counters in the org.apache.hadoop.mapreduce.JobCounter group:
 TOTAL_LAUNCHED_MAPS
 TOTAL_LAUNCHED_REDUCES
 OTHER_LOCAL_MAPS
 SLOTS_MILLIS_MAPS
 SLOTS_MILLIS_REDUCES

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4885) streaming tests have multiple failures on Windows

2013-03-07 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated MAPREDUCE-4885:
-

 Target Version/s: 3.0.0, trunk-win  (was: trunk-win)
Affects Version/s: 3.0.0

 streaming tests have multiple failures on Windows
 -

 Key: MAPREDUCE-4885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming, test
Affects Versions: 3.0.0, trunk-win
Reporter: Chris Nauroth
Assignee: Chris Nauroth

 There are multiple test failures due to Queue configuration missing child 
 queue names for root.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4885) streaming tests have multiple failures on Windows

2013-03-07 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated MAPREDUCE-4885:
-

Attachment: MAPREDUCE-4885.1.patch

With this patch, all of the streaming tests pass consistently on Windows.  Note 
that to see the tests pass, you'll also need the patch for MAPREDUCE-5006, 
which hasn't been committed yet.

The problems were:

# The now-infamous problem of attempting to use paths rooted on test.build.data 
with HDFS, which rejects paths containing ':', such as the Windows drive spec.  
The patch implements our standard work-around to allow overriding the test path 
to /tmp/<test name>.
# There was an assumption of Unix-style commands being available for use as 
streaming mapper and reducer functions.  To work around this, I introduced some 
cmd scripts that roughly approximate Unix cat and xargs cat.
# There was one actual bug in {{StreamJob}}.  It was attempting to pass a 
string file path into the {{URI}} constructor.  On Windows, this would contain a 
drive spec, and {{URI}} would consider it invalid and throw an error.  The only 
reason we needed the {{URI}} was to pass it in to the constructor of {{Path}}.  
Fortunately, we already have the logic in the {{Path}} constructor now to 
handle this case correctly cross-platform, so the simple fix is just to call 
the {{Path}} constructor with the string file path directly (see the sketch at 
the end of this comment).
# I've increased a few test timeouts.  The old timeout values were borderline 
in my environment, sometimes causing the tests to fail sporadically on 
timeouts.  This was not a Windows-specific problem.

I've tested this patch on Mac and Windows.
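
A tiny sketch of the change described in item 3, with a made-up file spec (not the 
actual StreamJob diff):

import org.apache.hadoop.fs.Path;

// Before: new Path(new URI(fileSpec)) -- on Windows the string contains a drive
// spec such as "C:\dir\file", which is not a valid URI, so construction fails.
// After: let the Path constructor normalize the platform-specific string itself.
public class PathFromStringSketch {
  public static Path toPath(String fileSpec) {
    return new Path(fileSpec);
  }

  public static void main(String[] args) {
    System.out.println(toPath("C:\\streamjob\\cachefile.txt"));  // made-up example
  }
}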


 streaming tests have multiple failures on Windows
 -

 Key: MAPREDUCE-4885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming, test
Affects Versions: 3.0.0, trunk-win
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: MAPREDUCE-4885.1.patch


 There are multiple test failures due to Queue configuration missing child 
 queue names for root.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4885) streaming tests have multiple failures on Windows

2013-03-07 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated MAPREDUCE-4885:
-

Status: Patch Available  (was: Open)

 streaming tests have multiple failures on Windows
 -

 Key: MAPREDUCE-4885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming, test
Affects Versions: 3.0.0, trunk-win
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: MAPREDUCE-4885.1.patch


 There are multiple test failures due to Queue configuration missing child 
 queue names for root.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5023) History Server Web Services missing Job Counters

2013-03-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596412#comment-13596412
 ] 

Hadoop QA commented on MAPREDUCE-5023:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12572613/MAPREDUCE-5023.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3391//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3391//console

This message is automatically generated.

 History Server Web Services missing Job Counters
 

 Key: MAPREDUCE-5023
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.6
Reporter: Kendall Thrapp
Assignee: Ravi Prakash
Priority: Critical
 Attachments: MAPREDUCE-5023.patch, MAPREDUCE-5023.patch


 The History Server's Job Counters API is not returning all the counters seen 
 on the Job's Counters webpage.  Specifically, I'm not seeing any of the 
 counters in the org.apache.hadoop.mapreduce.JobCounter group:
 TOTAL_LAUNCHED_MAPS
 TOTAL_LAUNCHED_REDUCES
 OTHER_LOCAL_MAPS
 SLOTS_MILLIS_MAPS
 SLOTS_MILLIS_REDUCES

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4885) streaming tests have multiple failures on Windows

2013-03-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596438#comment-13596438
 ] 

Hadoop QA commented on MAPREDUCE-4885:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12572615/MAPREDUCE-4885.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 11 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 2 
release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-tools/hadoop-streaming:

  org.apache.hadoop.streaming.TestStreamReduceNone
  org.apache.hadoop.streaming.TestStreamXmlRecordReader

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3392//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3392//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3392//console

This message is automatically generated.

 streaming tests have multiple failures on Windows
 -

 Key: MAPREDUCE-4885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming, test
Affects Versions: 3.0.0, trunk-win
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: MAPREDUCE-4885.1.patch


 There are multiple test failures due to Queue configuration missing child 
 queue names for root.
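
 As a rough idea of the kind of configuration the failing tests are missing,
 the snippet below declares a single child queue under root. It is a hedged
 sketch for a test setup, not the committed fix, and the CapacityScheduler
 property names are an assumption about which scheduler the tests exercise.

{code:java}
import org.apache.hadoop.conf.Configuration;

/**
 * Hedged sketch of a test-setup workaround, not the committed fix: give root
 * at least one child queue so queue initialization does not fail. The
 * CapacityScheduler property names are an assumption about which scheduler
 * the failing tests exercise.
 */
public class MinimalQueueConf {
  public static Configuration withDefaultQueue() {
    Configuration conf = new Configuration();
    conf.set("yarn.scheduler.capacity.root.queues", "default");       // child queue under root
    conf.set("yarn.scheduler.capacity.root.default.capacity", "100"); // give it all capacity
    return conf;
  }
}
{code}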

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4885) streaming tests have multiple failures on Windows

2013-03-07 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596469#comment-13596469
 ] 

Chris Nauroth commented on MAPREDUCE-4885:
--

The test failures reported by Hudson are unrelated and will be resolved by the 
patch on MAPREDUCE-5006.

 streaming tests have multiple failures on Windows
 -

 Key: MAPREDUCE-4885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming, test
Affects Versions: 3.0.0, trunk-win
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: MAPREDUCE-4885.1.patch


 There are multiple test failures due to Queue configuration missing child 
 queue names for root.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API

2013-03-07 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated MAPREDUCE-5038:
--

   Resolution: Fixed
Fix Version/s: 1.2.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks Sandy. Thanks Tom for reviewing. Committed to branch-1.

 old API CombineFileInputFormat missing fixes that are in new API 
 -

 Key: MAPREDUCE-5038
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 1.2.0

 Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch


 The following changes patched the CombineFileInputFormat in mapreduce but 
 neglected the one in mapred:
 MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files
 MAPREDUCE-2021 fixed duplicate hostnames being returned in split locations
 MAPREDUCE-1806 fixed CombineFileInputFormat not working with paths that are 
 not on the default FS
 In trunk this is not an issue because the class in mapred extends the one in 
 mapreduce.
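
 The last sentence is the key point: in trunk the old-API class inherits fixes
 because it extends the new-API class. The sketch below illustrates that
 pattern with generic names (NewApiInputFormat and OldApiInputFormat are
 stand-ins, not the real Hadoop classes).

{code:java}
/**
 * Hedged illustration of the delegation pattern: when the old-API class is a
 * thin subclass of the new-API class, a fix made once in the new API is
 * inherited by the old API instead of having to be duplicated.
 */
class NewApiInputFormat {
  protected boolean isSplitable(String fileName) {
    // Fixed once here (think of the splittable-compression handling from
    // MAPREDUCE-1597); gzip stays non-splittable in this toy version.
    return !fileName.endsWith(".gz");
  }
}

class OldApiInputFormat extends NewApiInputFormat {
  // No override needed: the old API picks up the fix automatically.
}
{code}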

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5049) CombineFileInputFormat counts all compressed files non-splitable

2013-03-07 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated MAPREDUCE-5049:
--

   Resolution: Fixed
Fix Version/s: 1.2.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks Sandy. Thanks Tom for reviewing. Committed to branch-1.

 CombineFileInputFormat counts all compressed files non-splitable
 

 Key: MAPREDUCE-5049
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5049
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 1.2.0

 Attachments: MAPREDUCE-5049.patch


 In branch-1, CombineFileInputFormat does not take SplittableCompressionCodec 
 into account and treats all compressed input files as non-splittable. This is 
 a regression from when handling for non-splittable compression codecs was 
 originally added in MAPREDUCE-1597, and it appears to have been introduced 
 when the code was pulled from 0.22 into branch-1.
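
 A minimal sketch of the check being asked for, written as a standalone helper
 rather than the patched branch-1 class: a compressed file should only be
 treated as non-splittable when its codec does not implement
 SplittableCompressionCodec (bzip2, for example, does implement it).

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.io.compress.SplittableCompressionCodec;

/**
 * Hedged sketch of the splittability check the description asks for, as a
 * standalone helper rather than the patched branch-1 class.
 */
public class SplittabilityCheck {
  public static boolean isSplittable(Configuration conf, Path file) {
    CompressionCodec codec = new CompressionCodecFactory(conf).getCodec(file);
    if (codec == null) {
      return true; // uncompressed input is always splittable
    }
    // Only treat the file as non-splittable if the codec cannot split it.
    return codec instanceof SplittableCompressionCodec;
  }
}
{code}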

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API

2013-03-07 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596501#comment-13596501
 ] 

Sangjin Lee commented on MAPREDUCE-5038:


I think MAPREDUCE-5046 can be closed, as it is a subset of this patch.

 old API CombineFileInputFormat missing fixes that are in new API 
 -

 Key: MAPREDUCE-5038
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 1.2.0

 Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch


 The following changes patched the CombineFileInputFormat in mapreduce, but 
 neglected the one in mapred
 MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files
 MAPREDUCE-2021 solved returning duplicate hostnames in split locations
 MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default 
 FS
 In trunk this is not an issue as the one in mapred extends the one in 
 mapreduce.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (MAPREDUCE-5046) backport MAPREDUCE-1423 to mapred.lib.CombineFileInputFormat

2013-03-07 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee resolved MAPREDUCE-5046.


   Resolution: Fixed
Fix Version/s: 1.2.0

It was fixed as part of MAPREDUCE-5038.

 backport MAPREDUCE-1423 to mapred.lib.CombineFileInputFormat
 

 Key: MAPREDUCE-5046
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5046
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 1.1.1
Reporter: Sangjin Lee
 Fix For: 1.2.0


 The CombineFileInputFormat class in org.apache.hadoop.mapred.lib (the old 
 API) has a couple of issues. These issues were addressed in the new API 
 (MAPREDUCE-1423), but the old class was not fixed.
 The main issue that JIRA refers to is a performance problem. However, IMO 
 there is a more serious problem: a thread-safety issue (rackToNodes), which 
 was fixed alongside it.
 What is the policy on addressing issues in the old API? Can we backport this 
 to the old class?
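
 For readers unfamiliar with the rackToNodes issue, the sketch below
 illustrates the general kind of thread-safety fix being referred to; the
 class and method names are generic stand-ins, not the actual
 CombineFileInputFormat code.

{code:java}
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hedged illustration of a thread-safe rack-to-nodes map: a shared map that
 * several threads populate must either be confined to one thread or use
 * concurrent collections, as below.
 */
class RackToNodes {
  private final ConcurrentHashMap<String, Set<String>> rackToNodes =
      new ConcurrentHashMap<String, Set<String>>();

  void addNode(String rack, String node) {
    Set<String> nodes = rackToNodes.get(rack);
    if (nodes == null) {
      // putIfAbsent keeps two racing threads from losing each other's set.
      Set<String> fresh =
          Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
      Set<String> existing = rackToNodes.putIfAbsent(rack, fresh);
      nodes = (existing != null) ? existing : fresh;
    }
    nodes.add(node);
  }

  Set<String> nodesFor(String rack) {
    Set<String> nodes = rackToNodes.get(rack);
    return (nodes != null) ? nodes : Collections.<String>emptySet();
  }
}
{code}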

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5023) History Server Web Services missing Job Counters

2013-03-07 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596522#comment-13596522
 ] 

Thomas Graves commented on MAPREDUCE-5023:
--

Thanks Ravi!  +1 on the first patch.  It works for me on the AM and the HS.  
I'll commit it shortly.

 History Server Web Services missing Job Counters
 

 Key: MAPREDUCE-5023
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.6
Reporter: Kendall Thrapp
Assignee: Ravi Prakash
Priority: Critical
 Attachments: MAPREDUCE-5023.patch, MAPREDUCE-5023.patch


 The History Server's Job Counters API is not returning all the counters seen 
 on the Job's Counters webpage.  Specifically, I'm not seeing any of the 
 counters in the org.apache.hadoop.mapreduce.JobCounter group:
 TOTAL_LAUNCHED_MAPS
 TOTAL_LAUNCHED_REDUCES
 OTHER_LOCAL_MAPS
 SLOTS_MILLIS_MAPS
 SLOTS_MILLIS_REDUCES

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5023) History Server Web Services missing Job Counters

2013-03-07 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-5023:
-

   Resolution: Fixed
Fix Version/s: 2.0.4-alpha
   0.23.7
   3.0.0
   Status: Resolved  (was: Patch Available)

 History Server Web Services missing Job Counters
 

 Key: MAPREDUCE-5023
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.6
Reporter: Kendall Thrapp
Assignee: Ravi Prakash
Priority: Critical
 Fix For: 3.0.0, 0.23.7, 2.0.4-alpha

 Attachments: MAPREDUCE-5023.patch, MAPREDUCE-5023.patch


 The History Server's Job Counters API is not returning all the counters seen 
 on the Job's Counters webpage.  Specifically, I'm not seeing any of the 
 counters in the org.apache.hadoop.mapreduce.JobCounter group:
 TOTAL_LAUNCHED_MAPS
 TOTAL_LAUNCHED_REDUCES
 OTHER_LOCAL_MAPS
 SLOTS_MILLIS_MAPS
 SLOTS_MILLIS_REDUCES

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5023) History Server Web Services missing Job Counters

2013-03-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596543#comment-13596543
 ] 

Hudson commented on MAPREDUCE-5023:
---

Integrated in Hadoop-trunk-Commit #3437 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3437/])
MAPREDUCE-5023. History Server Web Services missing Job Counters (Ravi 
Prakash via tgraves) (Revision 1454156)

 Result = SUCCESS
tgraves : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1454156
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/dao/JobCounterInfo.java


 History Server Web Services missing Job Counters
 

 Key: MAPREDUCE-5023
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.6
Reporter: Kendall Thrapp
Assignee: Ravi Prakash
Priority: Critical
 Fix For: 3.0.0, 0.23.7, 2.0.4-alpha

 Attachments: MAPREDUCE-5023.patch, MAPREDUCE-5023.patch


 The History Server's Job Counters API is not returning all the counters seen 
 on the Job's Counters webpage.  Specifically, I'm not seeing any of the 
 counters in the org.apache.hadoop.mapreduce.JobCounter group:
 TOTAL_LAUNCHED_MAPS
 TOTAL_LAUNCHED_REDUCES
 OTHER_LOCAL_MAPS
 SLOTS_MILLIS_MAPS
 SLOTS_MILLIS_REDUCES

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4885) streaming tests have multiple failures on Windows

2013-03-07 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated MAPREDUCE-4885:
-

 Target Version/s: 3.0.0  (was: 3.0.0, trunk-win)
Affects Version/s: (was: trunk-win)

 streaming tests have multiple failures on Windows
 -

 Key: MAPREDUCE-4885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming, test
Affects Versions: 3.0.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: MAPREDUCE-4885.1.patch


 There are multiple test failures due to Queue configuration missing child 
 queue names for root.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-5053) java.lang.InternalError from decompression codec cause reducer to fail

2013-03-07 Thread Robert Parker (JIRA)
Robert Parker created MAPREDUCE-5053:


 Summary: java.lang.InternalError from decompression codec cause 
reducer to fail
 Key: MAPREDUCE-5053
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5053
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.5, 2.0.3-alpha, trunk
Reporter: Robert Parker
Assignee: Robert Parker


The lz4, snappy, zlib, and lzo Decompressors throw only java.lang.InternalError. 
This exception causes the reducer to fail and bypasses all of the fetch-failure 
logic. These decompression errors should instead be treated as fetch failures.
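
The proposed behaviour is conceptually small: an InternalError raised by a
native decompressor during the shuffle should surface through the same path as
any other failed copy. A hedged sketch of that conversion, written as a
standalone helper rather than the actual patch:

{code:java}
import java.io.IOException;
import java.util.concurrent.Callable;

/**
 * Hedged sketch, not the actual patch: run a decompression step and convert a
 * codec-level java.lang.InternalError into an IOException so the shuffle's
 * existing fetch-failure handling can retry the copy instead of failing the
 * whole reducer.
 */
public final class DecompressionErrorGuard {

  private DecompressionErrorGuard() {}

  public static <T> T asFetchFailure(Callable<T> decompressStep, String mapId)
      throws IOException {
    try {
      return decompressStep.call();
    } catch (InternalError ie) {
      // Native lz4/snappy/zlib/lzo decompressors surface corruption this way.
      throw new IOException("Decompression failed for map output " + mapId, ie);
    } catch (IOException ioe) {
      throw ioe;
    } catch (Exception e) {
      throw new IOException(e);
    }
  }
}
{code}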

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-207) Computing Input Splits on the MR Cluster

2013-03-07 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596692#comment-13596692
 ] 

Sandy Ryza commented on MAPREDUCE-207:
--

Arun, are you still planning on working on this?  If not, do you mind if I pick 
it up?

 Computing Input Splits on the MR Cluster
 

 Key: MAPREDUCE-207
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-207
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: applicationmaster, mrv2
Reporter: Philip Zeyliger
Assignee: Arun C Murthy
 Attachments: MAPREDUCE-207.patch


 Instead of computing the input splits as part of job submission, Hadoop could 
 have a separate job task type that computes the input splits, therefore 
 allowing that computation to happen on the cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5053) java.lang.InternalError from decompression codec cause reducer to fail

2013-03-07 Thread Robert Parker (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Parker updated MAPREDUCE-5053:
-

Status: Patch Available  (was: Open)

 java.lang.InternalError from decompression codec cause reducer to fail
 --

 Key: MAPREDUCE-5053
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5053
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.5, 2.0.3-alpha, trunk
Reporter: Robert Parker
Assignee: Robert Parker
 Attachments: MAPREDUCE-5053-1.patch


 The lz4, snappy, zlib, and lzo Decompressors throw only java.lang.InternalError. 
 This exception causes the reducer to fail and bypasses all of the fetch-failure 
 logic. These decompression errors should instead be treated as fetch failures.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5053) java.lang.InternalError from decompression codec cause reducer to fail

2013-03-07 Thread Robert Parker (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Parker updated MAPREDUCE-5053:
-

Attachment: MAPREDUCE-5053-1.patch

 java.lang.InternalError from decompression codec cause reducer to fail
 --

 Key: MAPREDUCE-5053
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5053
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: trunk, 2.0.3-alpha, 0.23.5
Reporter: Robert Parker
Assignee: Robert Parker
 Attachments: MAPREDUCE-5053-1.patch


 The lz4, snappy, zlib, and lzo Decompressors throw only java.lang.InternalError. 
 This exception causes the reducer to fail and bypasses all of the fetch-failure 
 logic. These decompression errors should instead be treated as fetch failures.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5053) java.lang.InternalError from decompression codec cause reducer to fail

2013-03-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596724#comment-13596724
 ] 

Hadoop QA commented on MAPREDUCE-5053:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12572678/MAPREDUCE-5053-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3393//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3393//console

This message is automatically generated.

 java.lang.InternalError from decompression codec cause reducer to fail
 --

 Key: MAPREDUCE-5053
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5053
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: trunk, 2.0.3-alpha, 0.23.5
Reporter: Robert Parker
Assignee: Robert Parker
 Attachments: MAPREDUCE-5053-1.patch


 The lz4, snappy, zlib, and lzo Decompressors throw only java.lang.InternalError. 
 This exception causes the reducer to fail and bypasses all of the fetch-failure 
 logic. These decompression errors should instead be treated as fetch failures.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4961) Map reduce running local should also go through ShuffleConsumerPlugin for enabling different MergeManager implementations

2013-03-07 Thread Jerry Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry Chen updated MAPREDUCE-4961:
--

Attachment: MAPREDUCE-4961.patch

 Map reduce running local should also go through ShuffleConsumerPlugin for 
 enabling different MergeManager implementations
 -

 Key: MAPREDUCE-4961
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4961
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: trunk
Reporter: Jerry Chen
Assignee: Jerry Chen
 Attachments: MAPREDUCE-4961.patch, MAPREDUCE-4961.patch

   Original Estimate: 72h
  Remaining Estimate: 72h

 MAPREDUCE-4049 provided a pluggable Shuffle, and MAPREDUCE-4080 extended 
 Shuffle so that it can supply different MergeManager implementations. 
 While using these pluggable features, I found that when a map-reduce job runs 
 locally, a RawKeyValueIterator is returned directly from a static call to 
 Merger.merge, which breaks the assumption that the Shuffle may provide 
 different merge methods even though there is no copy phase in this situation.
 For example, when implementing a hash-based MergeManager we do not need a 
 sort on the map side, yet when the job runs locally the hash-based 
 MergeManager never gets used because the code goes directly to Merger.merge. 
 This makes the pluggable Shuffle and MergeManager incomplete.
 We should therefore move the code that calls Merger.merge from ReduceTask 
 into the ShuffleConsumerPlugin implementation, so that the Shuffle 
 implementation can decide how to do the merge and return the corresponding 
 iterator.
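
 To make the proposal concrete, the sketch below uses simplified stand-in
 types (MergeStrategy, LocalShufflePlugin, and plain String key/values are
 illustrative, not the real Hadoop interfaces). The point is that even the
 local, no-copy case asks the plugin for its iterator, so an alternative merge
 strategy such as a hash-based, unsorted one is honoured instead of a
 hard-coded Merger.merge.

{code:java}
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

/**
 * Hedged sketch with simplified stand-in types; not the real Hadoop
 * ShuffleConsumerPlugin or MergeManager interfaces.
 */
interface MergeStrategy {
  Iterator<Map.Entry<String, String>> merge(Map<String, String> mapOutput);
}

class SortedMerge implements MergeStrategy {
  // Classic behaviour: a fully sorted merge of the map output.
  public Iterator<Map.Entry<String, String>> merge(Map<String, String> mapOutput) {
    return new TreeMap<String, String>(mapOutput).entrySet().iterator();
  }
}

class HashMerge implements MergeStrategy {
  // Hash-based behaviour: no sort on the map side at all.
  public Iterator<Map.Entry<String, String>> merge(Map<String, String> mapOutput) {
    return mapOutput.entrySet().iterator();
  }
}

class LocalShufflePlugin {
  private final MergeStrategy strategy;

  LocalShufflePlugin(MergeStrategy strategy) {
    this.strategy = strategy;
  }

  /** Local mode: there is no copy phase, but the merge still goes through the plugin. */
  Iterator<Map.Entry<String, String>> run(Map<String, String> localMapOutput) {
    return strategy.merge(localMapOutput);
  }
}
{code}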

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4961) Map reduce running local should also go through ShuffleConsumerPlugin for enabling different MergeManager implementations

2013-03-07 Thread Jerry Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry Chen updated MAPREDUCE-4961:
--

Status: Open  (was: Patch Available)

Cancelling the current patch to take Asokan's advice.

 Map reduce running local should also go through ShuffleConsumerPlugin for 
 enabling different MergeManager implementations
 -

 Key: MAPREDUCE-4961
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4961
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: trunk
Reporter: Jerry Chen
Assignee: Jerry Chen
 Attachments: MAPREDUCE-4961.patch, MAPREDUCE-4961.patch

   Original Estimate: 72h
  Remaining Estimate: 72h

 MAPREDUCE-4049 provided a pluggable Shuffle, and MAPREDUCE-4080 extended 
 Shuffle so that it can supply different MergeManager implementations. 
 While using these pluggable features, I found that when a map-reduce job runs 
 locally, a RawKeyValueIterator is returned directly from a static call to 
 Merger.merge, which breaks the assumption that the Shuffle may provide 
 different merge methods even though there is no copy phase in this situation.
 For example, when implementing a hash-based MergeManager we do not need a 
 sort on the map side, yet when the job runs locally the hash-based 
 MergeManager never gets used because the code goes directly to Merger.merge. 
 This makes the pluggable Shuffle and MergeManager incomplete.
 We should therefore move the code that calls Merger.merge from ReduceTask 
 into the ShuffleConsumerPlugin implementation, so that the Shuffle 
 implementation can decide how to do the merge and return the corresponding 
 iterator.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4961) Map reduce running local should also go through ShuffleConsumerPlugin for enabling different MergeManager implementations

2013-03-07 Thread Jerry Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry Chen updated MAPREDUCE-4961:
--

Status: Patch Available  (was: Open)

The new patch has been submitted. It removes the MergeManager-related changes 
and keeps the merge logic in the ShuffleConsumerPlugin interface. Please kindly 
help review.

 Map reduce running local should also go through ShuffleConsumerPlugin for 
 enabling different MergeManager implementations
 -

 Key: MAPREDUCE-4961
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4961
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: trunk
Reporter: Jerry Chen
Assignee: Jerry Chen
 Attachments: MAPREDUCE-4961.patch, MAPREDUCE-4961.patch

   Original Estimate: 72h
  Remaining Estimate: 72h

 MAPREDUCE-4049 provided a pluggable Shuffle, and MAPREDUCE-4080 extended 
 Shuffle so that it can supply different MergeManager implementations. 
 While using these pluggable features, I found that when a map-reduce job runs 
 locally, a RawKeyValueIterator is returned directly from a static call to 
 Merger.merge, which breaks the assumption that the Shuffle may provide 
 different merge methods even though there is no copy phase in this situation.
 For example, when implementing a hash-based MergeManager we do not need a 
 sort on the map side, yet when the job runs locally the hash-based 
 MergeManager never gets used because the code goes directly to Merger.merge. 
 This makes the pluggable Shuffle and MergeManager incomplete.
 We should therefore move the code that calls Merger.merge from ReduceTask 
 into the ShuffleConsumerPlugin implementation, so that the Shuffle 
 implementation can decide how to do the merge and return the corresponding 
 iterator.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4961) Map reduce running local should also go through ShuffleConsumerPlugin for enabling different MergeManager implementations

2013-03-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596916#comment-13596916
 ] 

Hadoop QA commented on MAPREDUCE-4961:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12572711/MAPREDUCE-4961.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3394//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3394//console

This message is automatically generated.

 Map reduce running local should also go through ShuffleConsumerPlugin for 
 enabling different MergeManager implementations
 -

 Key: MAPREDUCE-4961
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4961
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: trunk
Reporter: Jerry Chen
Assignee: Jerry Chen
 Attachments: MAPREDUCE-4961.patch, MAPREDUCE-4961.patch

   Original Estimate: 72h
  Remaining Estimate: 72h

 MAPREDUCE-4049 provided a pluggable Shuffle, and MAPREDUCE-4080 extended 
 Shuffle so that it can supply different MergeManager implementations. 
 While using these pluggable features, I found that when a map-reduce job runs 
 locally, a RawKeyValueIterator is returned directly from a static call to 
 Merger.merge, which breaks the assumption that the Shuffle may provide 
 different merge methods even though there is no copy phase in this situation.
 For example, when implementing a hash-based MergeManager we do not need a 
 sort on the map side, yet when the job runs locally the hash-based 
 MergeManager never gets used because the code goes directly to Merger.merge. 
 This makes the pluggable Shuffle and MergeManager incomplete.
 We should therefore move the code that calls Merger.merge from ReduceTask 
 into the ShuffleConsumerPlugin implementation, so that the Shuffle 
 implementation can decide how to do the merge and return the corresponding 
 iterator.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira