Sergey Shelukhin created HADOOP-15403:
-----------------------------------------
Summary: FileInputFormat recursive=false fails instead of ignoring
the directories.
Key: HADOOP-15403
URL: https://issues.apache.org/jira/browse/HADOOP-15403
Project: Hadoop Common
Issue Type: Bug
Reporter: Sergey Shelukhin
We are trying to create a split in Hive that reads only the files directly
in a directory and not those in its subdirectories.
With recursion disabled, split generation fails with the error below.
Given how this error comes about (two pieces of code interact: one explicitly
adds directories to the results without failing, and the other fails on any
directory in the results), this looks like a bug.
{noformat}
Caused by: java.io.IOException: Not a file: file:/,...warehouse/simple_to_mm_text/delta_0000001_0000001_0000
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:329) ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:553) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:754) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:203) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
{noformat}
When recursion is disabled, this code adds directories to the results:
{noformat}
if (recursive && stat.isDirectory()) {
  result.dirsNeedingRecursiveCalls.add(stat);
} else {
  result.locatedFileStatuses.add(stat);
}
{noformat}
However, the getSplits code that runs afterwards computes the total size like this:
{noformat}
long totalSize = 0;                           // compute total size
for (FileStatus file: files) {                // check we have valid files
  if (file.isDirectory()) {
    throw new IOException("Not a file: "+ file.getPath());
  }
  totalSize +=
{noformat}
which always fails when combined with the code above whenever the listing
contains a directory and recursion is disabled.
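
The interaction can be reproduced without Hadoop on the classpath. The following is only a minimal sketch under stated assumptions: Stat is a hypothetical stand-in for FileStatus, the file name 000000_0 is made up for illustration, the summed quantity is assumed to be the file length, and totalSizeIgnoringDirs is one possible reading of "ignoring the directories" from the summary, not the actual or proposed Hadoop fix.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NonRecursiveListingSketch {

    /** Hypothetical stand-in for Hadoop's FileStatus (not the real class). */
    public static final class Stat {
        final String path;
        final long len;
        final boolean directory;

        public Stat(String path, long len, boolean directory) {
            this.path = path;
            this.len = len;
            this.directory = directory;
        }

        boolean isDirectory() { return directory; }
    }

    /** Mirrors the listing branch: with recursive=false, directories take the else path. */
    static List<Stat> list(List<Stat> children, boolean recursive) {
        List<Stat> files = new ArrayList<>();
        for (Stat stat : children) {
            if (recursive && stat.isDirectory()) {
                // The real code queues the directory for a recursive call;
                // this sketch simply leaves it out of the result.
                continue;
            }
            files.add(stat);
        }
        return files;
    }

    /** Mirrors the size check: any directory in the list aborts with an IOException. */
    static long totalSizeStrict(List<Stat> files) throws IOException {
        long totalSize = 0;
        for (Stat file : files) {
            if (file.isDirectory()) {
                throw new IOException("Not a file: " + file.path);
            }
            totalSize += file.len; // assumption: the total is the sum of file lengths
        }
        return totalSize;
    }

    /** Assumed alternative: skip directories instead of failing on them. */
    static long totalSizeIgnoringDirs(List<Stat> files) {
        long totalSize = 0;
        for (Stat file : files) {
            if (!file.isDirectory()) {
                totalSize += file.len;
            }
        }
        return totalSize;
    }

    public static void main(String[] args) {
        List<Stat> children = Arrays.asList(
                new Stat("000000_0", 10L, false),                  // hypothetical data file
                new Stat("delta_0000001_0000001_0000", 0L, true)); // subdirectory
        List<Stat> files = list(children, false);
        try {
            totalSizeStrict(files);
        } catch (IOException e) {
            System.out.println("strict check failed: " + e.getMessage());
        }
        System.out.println("size ignoring directories: " + totalSizeIgnoringDirs(files));
    }
}
```

With recursive=false the subdirectory ends up in the list returned by list(), so the strict size check throws on it, while the tolerant variant counts only the plain file.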