[
https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gunther Hagleitner reopened MAPREDUCE-5038:
---
Can someone please re-investigate? This is causing Hive to fail its test with
the har filesystem. Here's the stack trace:
{noformat}
java.io.IOException: URI:
har://pfile-:/grid/0/jenkins/workspace/UnitTest-Hive-condor-0.11.0/label/centos5/hdp-BUILDS/hive-0.11.0.1.3.0.0/build/ql/test/data/warehouse/tstsrcpart/ds=2008-04-08/data.har/hr=11/00_0
is an invalid Har URI since host==null. Expecting har://scheme-host/path.
at org.apache.hadoop.fs.HarFileSystem.decodeHarURI(HarFileSystem.java:191)
at org.apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:100)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1482)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:251)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getMoreSplits(CombineFileInputFormat.java:270)
at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:226)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:385)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:351)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:389)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1081)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1073)
at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:687)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Job Submission failed with exception 'java.io.IOException(URI:
har://pfile-:/grid/0/jenkins/workspace/UnitTest-Hive-condor-0.11.0/label/centos5/hdp-BUILDS/hive-0.11.0.1.3.0.0/build/ql/test/data/warehouse/tstsrcpart/ds=2008-04-08/data.har/hr=11/00_0
is an invalid Har URI since host==null. Expecting
har://scheme-host/path.)'
{noformat}
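For illustration, the host==null condition in the trace can be reproduced with plain java.net.URI, without Hadoop on the classpath. This is only a minimal sketch, not the HarFileSystem code: it assumes that decodeHarURI obtains the host through URI.getHost(), which the error message suggests but the trace does not prove. The class name HarUriHostDemo and the hdfs-namenode:8020 URI are hypothetical; the failing URI is copied from the trace above.

```java
import java.net.URI;

public class HarUriHostDemo {
    // Host as reported by java.net.URI; null when the authority
    // cannot be parsed as a server-based (host[:port]) authority.
    public static String hostOf(String uri) {
        return URI.create(uri).getHost();
    }

    // Raw authority component, available even when the host is null.
    public static String authorityOf(String uri) {
        return URI.create(uri).getAuthority();
    }

    public static void main(String[] args) {
        // URI from the stack trace: the authority is "pfile-:".
        String failing = "har://pfile-:/grid/0/jenkins/workspace/"
            + "UnitTest-Hive-condor-0.11.0/label/centos5/hdp-BUILDS/"
            + "hive-0.11.0.1.3.0.0/build/ql/test/data/warehouse/"
            + "tstsrcpart/ds=2008-04-08/data.har/hr=11/00_0";
        // Hypothetical URI in the har://scheme-host/path form.
        String wellFormed = "har://hdfs-namenode:8020/user/data.har/part-0";

        for (String u : new String[] {failing, wellFormed}) {
            System.out.println("authority=" + authorityOf(u)
                + " host=" + hostOf(u));
        }
    }
}
```

With the URI from the trace, the authority "pfile-:" ends its only label with a hyphen, so java.net.URI does not accept it as a hostname and falls back to a registry-based authority: getAuthority() still returns "pfile-:" while getHost() returns null. Code that reads the authority would therefore still see the underlying scheme, whereas code that reads the host would not.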
Steps to reproduce:
{noformat}
$ ant test -Dtestcase=TestCliDriver -Dqfile=archive_multi.q
{noformat}
I also verified that the same test passes on a local build with this patch
reverted.
Thanks.
old API CombineFileInputFormat missing fixes that are in new API
-
Key: MAPREDUCE-5038
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Fix For: 1.3.0
Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch,
MAPREDUCE-5038-revised-1.patch, MAPREDUCE-5038-revised-1.patch,
MAPREDUCE-5038-revised.patch
The following changes patched the CombineFileInputFormat in mapreduce, but
neglected the one in mapred:
MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files
MAPREDUCE-2021 solved returning duplicate hostnames in split locations
MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default FS
In trunk this is not an issue, as the one in mapred extends the one in
mapreduce.