[
https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gunther Hagleitner reopened MAPREDUCE-5038:
-------------------------------------------
Can someone please re-investigate? This is causing hive to fail it's test with
"har" filesystem. Here's the stack trace:
{noformat}
java.io.IOException: URI:
har://pfile-:/grid/0/jenkins/workspace/UnitTest-Hive-condor-0.11.0/label/centos5/hdp-BUILDS/hive-0.11.0.1.3.0.0/build/ql/test/data/warehouse/tstsrcpart/ds=2008-04-08/data.har/hr=11/000000_0
is an invalid Har URI since host==null. Expecting har://<scheme>-<host>/<path>.
at org.apache.hadoop.fs.HarFileSystem.decodeHarURI(HarFileSystem.java:191)
at org.apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:100)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1482)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:251)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
at
org.apache.hadoop.mapred.lib.CombineFileInputFormat.getMoreSplits(CombineFileInputFormat.java:270)
at
org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:226)
at
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:385)
at
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:351)
at
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:389)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1081)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1073)
at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:687)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Job Submission failed with exception 'java.io.IOException(URI:
har://pfile-:/grid/0/jenkins/workspace/UnitTest-Hive-condor-0.11.0/label/centos5/hdp-BUILDS/hive-0.11.0.1.3.0.0/build/ql/test/data/warehouse/tstsrcpart/ds=2008-04-08/data.har/hr=11/000000_0
is an invalid Har URI since host==null. Expecting
har://<scheme>-<host>/<path>.)'
{noformat}
----
Steps to reproduce:
{noformat}
$ ant test -Dtestcase=TestCliDriver -Dqfile=archive_multi.q
{noformat}
Also, I verified the same test passes when I run with a local build after
reverting this patch.
Thanks.
> old API CombineFileInputFormat missing fixes that are in new API
> -----------------------------------------------------------------
>
> Key: MAPREDUCE-5038
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 1.1.1
> Reporter: Sandy Ryza
> Assignee: Sandy Ryza
> Fix For: 1.3.0
>
> Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch,
> MAPREDUCE-5038-revised-1.patch, MAPREDUCE-5038-revised-1.patch,
> MAPREDUCE-5038-revised.patch
>
>
> The following changes patched the CombineFileInputFormat in mapreduce, but
> neglected the one in mapred
> MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files
> MAPREDUCE-2021 solved returning duplicate hostnames in split locations
> MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default
> FS
> In trunk this is not an issue as the one in mapred extends the one in
> mapreduce.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira