[jira] [Created] (MAPREDUCE-5216) While using TextSplitter in DataDrivenDBInputformat, the lower limit (split start) always remains the same, for all splits.

2013-05-07 Thread Gelesh (JIRA)
Gelesh created MAPREDUCE-5216:
-

 Summary: While using TextSplitter in DataDrivenDBInputformat, the 
lower limit (split start) always remains the same, for all splits.
 Key: MAPREDUCE-5216
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5216
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Gelesh




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-5217) DistCp fails when launched by Oozie in a secure cluster

2013-05-07 Thread Venkat Ranganathan (JIRA)
Venkat Ranganathan created MAPREDUCE-5217:
-

 Summary: DistCp fails when launched by Oozie in a secure cluster
 Key: MAPREDUCE-5217
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5217
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distcp
Affects Versions: 1.1.2
 Environment: Hadoop secure cluster
Reporter: Venkat Ranganathan
Assignee: Venkat Ranganathan
 Attachments: MAPREDUCE-5217.patch

As mentioned in MAPREDUCE-4324, Oozie has the following boilerplate code in
the main launcher for Pig, Hive, MR and Sqoop actions.

if (System.getenv("HADOOP_TOKEN_FILE_LOCATION") != null) {
    jobConf.set("mapreduce.job.credentials.binary",
                System.getenv("HADOOP_TOKEN_FILE_LOCATION"));
}

For the Java action, which does not have a main launcher in Oozie, the above 
code can be added by the user, as the user presumably owns the code that is 
launched.

But for the DistCp action, the user has no such luxury.  The solution attempted in
MAPREDUCE-4324 would have helped DistCp, but it was not implemented as it would 
break MAPREDUCE-3727.  So, we have to fix DistCp itself and
add the same boilerplate code so that the DistCp action can be launched by Oozie
in a secure cluster.

The added code checks for a system environment variable that is not
typically set in normal command-line execution of DistCp; DistCp runs fine
with command-line usage in both secure and non-secure clusters.
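
The guard described above can be sketched in isolation. This is only an illustration, not the attached patch: a plain Map stands in for the Hadoop job configuration, and the class and method names are hypothetical. The environment variable and configuration key are the ones quoted in the report.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; a Map stands in for the Hadoop job configuration.
class TokenFilePropagation {
    static final String TOKEN_ENV = "HADOOP_TOKEN_FILE_LOCATION";
    static final String CREDENTIALS_KEY = "mapreduce.job.credentials.binary";

    // Copy the token file location into the configuration only when the
    // launcher has exported it; otherwise leave the configuration untouched.
    static void propagateTokenFile(Map<String, String> env, Map<String, String> conf) {
        String tokenFile = env.get(TOKEN_ENV);
        if (tokenFile != null) {
            conf.put(CREDENTIALS_KEY, tokenFile);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        propagateTokenFile(System.getenv(), conf);
        // Empty on a plain command-line run where the variable is not set.
        System.out.println(conf);
    }
}
```

Because the key is written only when the variable is present, a run without the variable behaves exactly as before.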



[jira] [Created] (MAPREDUCE-5218) Annotate (comment) internal classes as Private

2013-05-07 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created MAPREDUCE-5218:
---

 Summary: Annotate (comment) internal classes as Private
 Key: MAPREDUCE-5218
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5218
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.2
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Fix For: 1.3.0


The following classes are intended for internal use, and it would be nice to 
state that explicitly in comments/annotations.

# TaskUmbilicalProtocol
# TaskInProgress
# MapReducePolicyProvider
# MRAdmin?



[jira] [Created] (MAPREDUCE-5219) JobStatus#getJobPriority changed to JobStatus#getPriority in MR2

2013-05-07 Thread Sandy Ryza (JIRA)
Sandy Ryza created MAPREDUCE-5219:
-

 Summary: JobStatus#getJobPriority changed to JobStatus#getPriority 
in MR2
 Key: MAPREDUCE-5219
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5219
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.4-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza


We should change it back for compatibility
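
One way to keep both method names available can be sketched as follows. This is a hypothetical stand-in class, not the real JobStatus, and the enum values are placeholders; the actual fix may simply rename the method back.

```java
// Hypothetical stand-in for illustration; not the real JobStatus class.
class JobStatusCompatSketch {
    enum JobPriority { VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW }

    private final JobPriority priority;

    JobStatusCompatSketch(JobPriority priority) {
        this.priority = priority;
    }

    // Name used in MR2, per the summary.
    public JobPriority getPriority() {
        return priority;
    }

    // MR1 name retained so that existing callers keep compiling and linking.
    public JobPriority getJobPriority() {
        return getPriority();
    }

    public static void main(String[] args) {
        JobStatusCompatSketch status = new JobStatusCompatSketch(JobPriority.HIGH);
        System.out.println(status.getJobPriority() == status.getPriority());
    }
}
```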



[jira] [Created] (MAPREDUCE-5220) Setter methods in TaskCompletionEvent are public in MR1 and protected in MR2

2013-05-07 Thread Sandy Ryza (JIRA)
Sandy Ryza created MAPREDUCE-5220:
-

 Summary: Setter methods in TaskCompletionEvent are public in MR1 
and protected in MR2
 Key: MAPREDUCE-5220
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5220
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: client
Affects Versions: 2.0.4-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza






[jira] [Created] (MAPREDUCE-5221) Reduce side Combiner is not used when using the new API

2013-05-07 Thread Siddharth Seth (JIRA)
Siddharth Seth created MAPREDUCE-5221:
-

 Summary: Reduce side Combiner is not used when using the new API
 Key: MAPREDUCE-5221
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5221
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.4-alpha
Reporter: Siddharth Seth


If a combiner is specified using o.a.h.mapreduce.Job.setCombinerClass, it 
will be silently ignored on the reduce side, since the reduce-side usage is only 
aware of the old API combiner.
This doesn't fail the job, since the new combiner key does not deprecate the 
old key.
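
The behaviour described above can be sketched with a plain Map standing in for the job configuration. The two key names below are assumptions made for illustration and should be checked against the Hadoop version in use; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; key names are assumed, not taken from the report.
class CombinerLookupSketch {
    static final String OLD_COMBINER_KEY = "mapred.combiner.class";       // assumed old API key
    static final String NEW_COMBINER_KEY = "mapreduce.job.combine.class"; // assumed new API key

    // Mirrors the reported behaviour: only the old key is consulted.
    static String oldApiOnlyLookup(Map<String, String> conf) {
        return conf.get(OLD_COMBINER_KEY);
    }

    // One possible remedy: fall back to the new key when the old one is unset.
    static String bothApisLookup(Map<String, String> conf) {
        String combiner = conf.get(OLD_COMBINER_KEY);
        return combiner != null ? combiner : conf.get(NEW_COMBINER_KEY);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(NEW_COMBINER_KEY, "com.example.MyCombiner"); // hypothetical class name
        System.out.println(oldApiOnlyLookup(conf)); // prints "null": combiner silently skipped
        System.out.println(bothApisLookup(conf));   // prints "com.example.MyCombiner"
    }
}
```

Since the first lookup returns null rather than raising an error, nothing fails; the combiner is simply never run, which matches the silent behaviour reported.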




[jira] [Created] (MAPREDUCE-5222) Add missing methods to JobClient

2013-05-07 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created MAPREDUCE-5222:
---

 Summary: Add missing methods to JobClient 
 Key: MAPREDUCE-5222
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5222
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Affects Versions: 2.0.4-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Fix For: 2.0.5-beta


JobClient is missing the following two public methods we need to add for binary 
compatibility:

# static isJobDirValid(Path, FileSystem)
# Path getStagingAreaDir()



[jira] [Reopened] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API

2013-05-07 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner reopened MAPREDUCE-5038:
---


Can someone please re-investigate? This is causing Hive to fail its test with 
the har filesystem. Here's the stack trace:

{noformat}
java.io.IOException: URI: har://pfile-:/grid/0/jenkins/workspace/UnitTest-Hive-condor-0.11.0/label/centos5/hdp-BUILDS/hive-0.11.0.1.3.0.0/build/ql/test/data/warehouse/tstsrcpart/ds=2008-04-08/data.har/hr=11/00_0 is an invalid Har URI since host==null. Expecting har://scheme-host/path.
at org.apache.hadoop.fs.HarFileSystem.decodeHarURI(HarFileSystem.java:191)
at org.apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:100)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1482)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:251)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getMoreSplits(CombineFileInputFormat.java:270)
at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:226)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:385)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:351)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:389)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1081)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1073)
at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:687)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Job Submission failed with exception 'java.io.IOException(URI: har://pfile-:/grid/0/jenkins/workspace/UnitTest-Hive-condor-0.11.0/label/centos5/hdp-BUILDS/hive-0.11.0.1.3.0.0/build/ql/test/data/warehouse/tstsrcpart/ds=2008-04-08/data.har/hr=11/00_0 is an invalid Har URI since host==null. Expecting har://scheme-host/path.)'
{noformat}
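
The host==null condition in the message can be reproduced with java.net.URI alone. The sketch below only illustrates how such an authority parses; it does not show the root cause in the patch. The class name, the shortened paths and the "namenode" host are hypothetical.

```java
import java.net.URI;

// Hypothetical sketch: how java.net.URI reports the host of har-style URIs.
class HarUriHostCheck {
    static String hostOf(String uri) {
        return URI.create(uri).getHost();
    }

    public static void main(String[] args) {
        // Authority of the form scheme-host:port parses as a server-based authority.
        System.out.println(hostOf("har://hdfs-namenode:8020/user/data.har"));
        // "pfile-" ends in a dash, so it is not a valid hostname and no host is reported.
        System.out.println(hostOf("har://pfile-:/grid/0/data.har"));
    }
}
```

Under these assumptions, any har URI built on an underlying filesystem that has no host would fail a check that requires a non-null host.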



Steps to reproduce:
{noformat}
$ ant test -Dtestcase=TestCliDriver -Dqfile=archive_multi.q
{noformat}

Also, I verified the same test passes when I run with a local build after 
reverting this patch.

Thanks.

 old API CombineFileInputFormat missing fixes that are in new API 
 -

 Key: MAPREDUCE-5038
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 1.3.0

 Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch, 
 MAPREDUCE-5038-revised-1.patch, MAPREDUCE-5038-revised-1.patch, 
 MAPREDUCE-5038-revised.patch


 The following changes patched the CombineFileInputFormat in mapreduce, but 
 neglected the one in mapred:
 MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files
 MAPREDUCE-2021 solved returning duplicate hostnames in split locations
 MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default FS
 In trunk this is not an issue as the one in mapred extends the one in 
 mapreduce.
