[jira] [Reopened] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API

2013-05-07 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner reopened MAPREDUCE-5038:
---


Can someone please re-investigate? This is causing Hive to fail its test with 
the har filesystem. Here's the stack trace:

{noformat}
java.io.IOException: URI: har://pfile-:/grid/0/jenkins/workspace/UnitTest-Hive-condor-0.11.0/label/centos5/hdp-BUILDS/hive-0.11.0.1.3.0.0/build/ql/test/data/warehouse/tstsrcpart/ds=2008-04-08/data.har/hr=11/00_0 is an invalid Har URI since host==null. Expecting har://scheme-host/path.
at org.apache.hadoop.fs.HarFileSystem.decodeHarURI(HarFileSystem.java:191)
at org.apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:100)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1482)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:251)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getMoreSplits(CombineFileInputFormat.java:270)
at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:226)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:385)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:351)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:389)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1081)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1073)
at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:687)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Job Submission failed with exception 'java.io.IOException(URI: har://pfile-:/grid/0/jenkins/workspace/UnitTest-Hive-condor-0.11.0/label/centos5/hdp-BUILDS/hive-0.11.0.1.3.0.0/build/ql/test/data/warehouse/tstsrcpart/ds=2008-04-08/data.har/hr=11/00_0 is an invalid Har URI since host==null. Expecting har://scheme-host/path.)'
{noformat}
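
For anyone reading the trace, here is a minimal sketch of why a URI of this shape may end up with a null host. It uses only java.net.URI and does not touch Hadoop; that HarFileSystem relies on URI.getHost() is an assumption inferred from the "host==null" wording in the message. The class name, the helper methods, the shortened path, and the "hdfs-namenode:8020" URI are all made up for illustration. The authority "pfile-:" has a host label ending in a hyphen, which java.net.URI does not accept as a server-based hostname, so it keeps the authority string but reports no host.

```java
import java.net.URI;

public class HarUriHostCheck {

    // Host component as seen by java.net.URI; null when the authority
    // cannot be parsed as [userinfo@]host[:port].
    public static String hostOf(String uri) {
        return URI.create(uri).getHost();
    }

    // Raw authority component, which is kept even when no host is extracted.
    public static String authorityOf(String uri) {
        return URI.create(uri).getAuthority();
    }

    public static void main(String[] args) {
        // Same authority as in the stack trace, with a shortened (hypothetical) path.
        String failing = "har://pfile-:/grid/0/data.har/hr=11";
        // Hypothetical URI in the har://scheme-host/path form the message asks for.
        String wellFormed = "har://hdfs-namenode:8020/user/data.har";

        System.out.println("authority=" + authorityOf(failing)
                + " host=" + hostOf(failing));
        System.out.println("authority=" + authorityOf(wellFormed)
                + " host=" + hostOf(wellFormed));
    }
}
```

If this reading is right, any code path that builds a har:// URI on top of a host-less underlying filesystem (such as the pfile scheme used by the Hive tests) and then parses the host would hit the same check.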



Steps to reproduce:
{noformat}
$ ant test -Dtestcase=TestCliDriver -Dqfile=archive_multi.q
{noformat}

Also, I verified the same test passes when I run with a local build after 
reverting this patch.

Thanks.

 old API CombineFileInputFormat missing fixes that are in new API 
 -

 Key: MAPREDUCE-5038
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 1.3.0

 Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch, 
 MAPREDUCE-5038-revised-1.patch, MAPREDUCE-5038-revised-1.patch, 
 MAPREDUCE-5038-revised.patch


 The following changes patched the CombineFileInputFormat in mapreduce, but 
 neglected the one in mapred:
 MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files
 MAPREDUCE-2021 solved returning duplicate hostnames in split locations
 MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default FS
 In trunk this is not an issue as the one in mapred extends the one in 
 mapreduce.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Reopened] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API

2013-03-16 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza reopened MAPREDUCE-5038:
---


 old API CombineFileInputFormat missing fixes that are in new API 
 -

 Key: MAPREDUCE-5038
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 1.2.0

 Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch


 The following changes patched the CombineFileInputFormat in mapreduce, but 
 neglected the one in mapred:
 MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files
 MAPREDUCE-2021 solved returning duplicate hostnames in split locations
 MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default FS
 In trunk this is not an issue as the one in mapred extends the one in 
 mapreduce.
