Jason Dere created HADOOP-10340:
-----------------------------------

             Summary: FileInputFormat.listStatus() including directories in its results
                 Key: HADOOP-10340
                 URL: https://issues.apache.org/jira/browse/HADOOP-10340
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Jason Dere

Trying to track down HIVE-6401, where we see some "is not a file" errors because getSplits() is giving us directories. I believe the culprit is FileInputFormat.listStatus():

{code}
if (recursive && stat.isDirectory()) {
  addInputPathRecursively(result, fs, stat.getPath(), inputFilter);
} else {
  result.add(stat);
}
{code}

When recursive is false, a directory falls through to the else branch and is added to the results. Is this meant to return directories? If not, I think it should look like this:

{code}
if (stat.isDirectory()) {
  if (recursive) {
    addInputPathRecursively(result, fs, stat.getPath(), inputFilter);
  }
} else {
  result.add(stat);
}
{code}
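
To make the difference between the two branches concrete, here is a minimal standalone sketch that does not use Hadoop at all. Everything in it is a hypothetical stand-in: the Entry class plays the role of FileStatus, addRecursively is an assumed approximation of addInputPathRecursively (collect every file under a directory), and the paths are invented for illustration. It only models the control flow quoted above, not the real FileInputFormat.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListStatusSketch {

    // Hypothetical stand-in for Hadoop's FileStatus: a path, a directory flag,
    // and (for directories) the entries directly beneath it.
    static final class Entry {
        final String path;
        final boolean directory;
        final List<Entry> children;

        Entry(String path, boolean directory, List<Entry> children) {
            this.path = path;
            this.directory = directory;
            this.children = children;
        }

        boolean isDirectory() {
            return directory;
        }
    }

    static Entry file(String path) {
        return new Entry(path, false, new ArrayList<Entry>());
    }

    static Entry dir(String path, Entry... children) {
        return new Entry(path, true, Arrays.asList(children));
    }

    // Assumed approximation of addInputPathRecursively: collect every file under dir.
    static void addRecursively(List<String> result, Entry dir) {
        for (Entry child : dir.children) {
            if (child.isDirectory()) {
                addRecursively(result, child);
            } else {
                result.add(child.path);
            }
        }
    }

    // Branch structure as quoted in the report.
    static List<String> listAsReported(List<Entry> stats, boolean recursive) {
        List<String> result = new ArrayList<String>();
        for (Entry stat : stats) {
            if (recursive && stat.isDirectory()) {
                addRecursively(result, stat);
            } else {
                // Also reached for directories when recursive is false.
                result.add(stat.path);
            }
        }
        return result;
    }

    // Branch structure as proposed in the report.
    static List<String> listAsProposed(List<Entry> stats, boolean recursive) {
        List<String> result = new ArrayList<String>();
        for (Entry stat : stats) {
            if (stat.isDirectory()) {
                if (recursive) {
                    addRecursively(result, stat);
                }
                // Non-recursive: the directory is skipped entirely.
            } else {
                result.add(stat.path);
            }
        }
        return result;
    }

    // Invented sample input: one file and one directory containing one file.
    static List<Entry> sample() {
        return Arrays.asList(file("/in/a.txt"), dir("/in/sub", file("/in/sub/b.txt")));
    }

    public static void main(String[] args) {
        System.out.println("reported, recursive=false: " + listAsReported(sample(), false));
        System.out.println("proposed, recursive=false: " + listAsProposed(sample(), false));
        System.out.println("reported, recursive=true:  " + listAsReported(sample(), true));
        System.out.println("proposed, recursive=true:  " + listAsProposed(sample(), true));
    }
}
```

With this sample input, the reported branch returns [/in/a.txt, /in/sub] when recursive is false, while the proposed branch returns [/in/a.txt]. Both return [/in/a.txt, /in/sub/b.txt] when recursive is true, so the two versions differ only in the non-recursive case.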