[jira] [Created] (MAPREDUCE-5216) While using TextSplitter in DataDrivenDBInputformat, the lower limit (split start) always remains the same, for all splits.
Gelesh created MAPREDUCE-5216:
Summary: While using TextSplitter in DataDrivenDBInputformat, the lower limit (split start) always remains the same, for all splits.
Key: MAPREDUCE-5216
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5216
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Gelesh

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-5217) DistCp fails when launched by Oozie in a secure cluster
Venkat Ranganathan created MAPREDUCE-5217:
Summary: DistCp fails when launched by Oozie in a secure cluster
Key: MAPREDUCE-5217
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5217
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: distcp
Affects Versions: 1.1.2
Environment: Hadoop secure cluster
Reporter: Venkat Ranganathan
Assignee: Venkat Ranganathan
Attachments: MAPREDUCE-5217.patch

As mentioned in MAPREDUCE-4324, Oozie has the following boilerplate code in the main launcher for Pig, Hive, MR and Sqoop actions:

    if (System.getenv("HADOOP_TOKEN_FILE_LOCATION") != null) {
      jobConf.set("mapreduce.job.credentials.binary", System.getenv("HADOOP_TOKEN_FILE_LOCATION"));
    }

For the Java action, which does not have a main launcher in Oozie, the user can add the above code, since the user presumably owns the code that is launched. But for the DistCp action, the user has no such luxury. The solution attempted in MAPREDUCE-4324 would have helped DistCp, but it was not implemented because it would break MAPREDUCE-3727.

So we have to fix DistCp and add the same boilerplate code so that the DistCp action can be launched by Oozie in a secure cluster. The added code checks for a system environment variable that is not typically set in normal command-line execution of DistCp, so DistCp continues to run fine from the command line in both secure and non-secure clusters.
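The check described above can be sketched without any Hadoop dependency. This is only an illustration under stated assumptions: the class and method names (TokenFilePropagation, propagateTokenFile) are hypothetical, and plain maps stand in for the process environment and for the job configuration that the real patch would modify. Only the environment variable name and the configuration key come from the report.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the boilerplate quoted in the report.
// A Map replaces the real job configuration so the sketch runs on its own.
final class TokenFilePropagation {
    // Names taken from the report.
    static final String TOKEN_FILE_ENV = "HADOOP_TOKEN_FILE_LOCATION";
    static final String CREDENTIALS_BINARY_KEY = "mapreduce.job.credentials.binary";

    /**
     * Copies the token file location from the environment into the
     * configuration when the variable is present. Returns true if the
     * key was set, false if the environment does not define it.
     */
    static boolean propagateTokenFile(Map<String, String> env, Map<String, String> conf) {
        String tokenFile = env.get(TOKEN_FILE_ENV);
        if (tokenFile == null) {
            // Normal command-line run: leave the configuration untouched.
            return false;
        }
        conf.put(CREDENTIALS_BINARY_KEY, tokenFile);
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        boolean set = propagateTokenFile(System.getenv(), conf);
        System.out.println("credentials key set: " + set);
    }
}
```

Because the configuration is only changed when the variable is defined, a run outside a launcher that exports it behaves exactly as before, which matches the compatibility argument in the report.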
[jira] [Created] (MAPREDUCE-5218) Annotate (comment) internal classes as Private
Karthik Kambatla created MAPREDUCE-5218:
Summary: Annotate (comment) internal classes as Private
Key: MAPREDUCE-5218
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5218
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1.1.2
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Fix For: 1.3.0

The following classes are intended for internal use and it would be nice to explicitly state that in comments/annotation.
# TaskUmbilicalProtocol
# TaskInProgress
# MapReducePolicyProvider
# MRAdmin?
[jira] [Created] (MAPREDUCE-5219) JobStatus#getJobPriority changed to JobStatus#getPriority in MR2
Sandy Ryza created MAPREDUCE-5219:
Summary: JobStatus#getJobPriority changed to JobStatus#getPriority in MR2
Key: MAPREDUCE-5219
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5219
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.0.4-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza

We should change it back for compatibility.
[jira] [Created] (MAPREDUCE-5220) Setter methods in TaskCompletionEvent are public in MR1 and protected in MR2
Sandy Ryza created MAPREDUCE-5220:
Summary: Setter methods in TaskCompletionEvent are public in MR1 and protected in MR2
Key: MAPREDUCE-5220
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5220
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: client
Affects Versions: 2.0.4-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
[jira] [Created] (MAPREDUCE-5221) Reduce side Combiner is not used when using the new API
Siddharth Seth created MAPREDUCE-5221:
Summary: Reduce side Combiner is not used when using the new API
Key: MAPREDUCE-5221
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5221
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.0.4-alpha
Reporter: Siddharth Seth

If a combiner is specified using o.a.h.mapreduce.Job.setCombinerClass, it will be silently ignored on the reduce side, since the reduce-side code is only aware of the old API combiner. The job does not fail, because the new combiner key does not deprecate the old key.
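A minimal sketch of the lookup problem described above, using a plain map in place of the job configuration. The two key names are recalled from the Hadoop 1.x and 2.x code bases and should be verified against the version in use. The class and method names are hypothetical, and the fallback lookup is only one possible remedy, not necessarily the fix adopted for this issue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: a Map stands in for the job configuration.
final class CombinerLookupSketch {
    // Assumed key written by the old (mapred) API.
    static final String OLD_API_KEY = "mapred.combiner.class";
    // Assumed key written by the new (mapreduce) API.
    static final String NEW_API_KEY = "mapreduce.job.combine.class";

    /** Mimics reduce-side code that only knows the old-API key. */
    static String oldApiOnlyLookup(Map<String, String> conf) {
        return conf.get(OLD_API_KEY);
    }

    /** One possible remedy: fall back to the new-API key when the old one is unset. */
    static String lookupWithFallback(Map<String, String> conf) {
        String combiner = conf.get(OLD_API_KEY);
        return combiner != null ? combiner : conf.get(NEW_API_KEY);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // A job configured through the new API sets only the new key.
        conf.put(NEW_API_KEY, "com.example.MyCombiner");

        // No combiner is found, and nothing fails: the combiner is silently skipped.
        System.out.println("old-API-only lookup: " + oldApiOnlyLookup(conf));
        System.out.println("lookup with fallback: " + lookupWithFallback(conf));
    }
}
```

The first lookup returns null rather than raising an error, which is why the job keeps running without a reduce-side combiner.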
[jira] [Created] (MAPREDUCE-5222) Add missing methods to JobClient
Karthik Kambatla created MAPREDUCE-5222:
Summary: Add missing methods to JobClient
Key: MAPREDUCE-5222
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5222
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Affects Versions: 2.0.4-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Fix For: 2.0.5-beta

JobClient is missing the following two public methods we need to add for binary compatibility:
# static isJobDirValid(Path, FileSystem)
# Path getStagingAreaDir()
[jira] [Reopened] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API
[ https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner reopened MAPREDUCE-5038:

Can someone please re-investigate? This is causing Hive to fail its test with the har filesystem. Here's the stack trace:

{noformat}
java.io.IOException: URI: har://pfile-:/grid/0/jenkins/workspace/UnitTest-Hive-condor-0.11.0/label/centos5/hdp-BUILDS/hive-0.11.0.1.3.0.0/build/ql/test/data/warehouse/tstsrcpart/ds=2008-04-08/data.har/hr=11/00_0 is an invalid Har URI since host==null. Expecting har://scheme-host/path.
	at org.apache.hadoop.fs.HarFileSystem.decodeHarURI(HarFileSystem.java:191)
	at org.apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:100)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1482)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:251)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
	at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getMoreSplits(CombineFileInputFormat.java:270)
	at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:226)
	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:385)
	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:351)
	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:389)
	at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1081)
	at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1073)
	at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
	at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)
	at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:687)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Job Submission failed with exception 'java.io.IOException(URI: har://pfile-:/grid/0/jenkins/workspace/UnitTest-Hive-condor-0.11.0/label/centos5/hdp-BUILDS/hive-0.11.0.1.3.0.0/build/ql/test/data/warehouse/tstsrcpart/ds=2008-04-08/data.har/hr=11/00_0 is an invalid Har URI since host==null. Expecting har://scheme-host/path.)'
{noformat}

Steps to reproduce:

{noformat}
$ ant test -Dtestcase=TestCliDriver -Dqfile=archive_multi.q
{noformat}

Also, I verified that the same test passes when I run with a local build after reverting this patch. Thanks.
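The "host==null" part of the message can be reproduced with java.net.URI alone. The sketch below assumes that the har filesystem obtains the host through java.net.URI.getHost(), which the error text suggests but the report does not state outright. The class name, the namenode host and the shortened paths are hypothetical. Only the "pfile-:" authority is taken from the stack trace.

```java
import java.net.URI;

// Hypothetical illustration of how java.net.URI treats a har authority.
final class HarAuthoritySketch {
    /**
     * Returns the host component java.net.URI extracts, or null when the
     * authority cannot be parsed as a server-based authority.
     */
    static String hostOf(String uri) {
        return URI.create(uri).getHost();
    }

    public static void main(String[] args) {
        // Well-formed "scheme-host" authority: the whole label is a valid host name.
        System.out.println(hostOf("har://hdfs-namenode:8020/tmp/data.har"));

        // Authority from the report: "pfile-" ends with a hyphen and is followed
        // by an empty port, so it is not a valid host name and getHost() is null.
        System.out.println(hostOf("har://pfile-:/tmp/data.har"));
    }
}
```

Under that assumption, any har URI built on top of a scheme with no host of its own (such as a local-file scheme) would hit the same check, which is consistent with the failure appearing only in the har-based Hive test.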
old API CombineFileInputFormat missing fixes that are in new API

Key: MAPREDUCE-5038
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Fix For: 1.3.0
Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch, MAPREDUCE-5038-revised-1.patch, MAPREDUCE-5038-revised-1.patch, MAPREDUCE-5038-revised.patch

The following changes patched the CombineFileInputFormat in mapreduce, but neglected the one in mapred:
MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files
MAPREDUCE-2021 solved returning duplicate hostnames in split locations
MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default FS

In trunk this is not an issue, as the one in mapred extends the one in mapreduce.