[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13899653#comment-13899653
 ] 

Jason Lowe commented on MAPREDUCE-5756:
---------------------------------------

Yes, the recursive handling was added later.

I can't speak for what downstream consumers of FileInputFormat are expecting.  
Some may expect and handle directories while others do not.  The difficult part 
is we don't want to break the former to fix the latter, as that leads to the 
worst of all outcomes: silent data loss.


> FileInputFormat.listStatus() including directories in its results
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-5756
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5756
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Jason Dere
>
> Trying to track down HIVE-6401, where we see some "is not a file" errors 
> because getSplits() is giving us directories.  I believe the culprit is 
> FileInputFormat.listStatus():
> {code}
>                 if (recursive && stat.isDirectory()) {
>                   addInputPathRecursively(result, fs, stat.getPath(),
>                       inputFilter);
>                 } else {
>                   result.add(stat);
>                 }
> {code}
> This seems to allow directories to be added to the results when 
> recursive is false.  Is this meant to return directories? If not, I think it 
> should look like this:
> {code}
>                 if (stat.isDirectory()) {
>                   if (recursive) {
>                     addInputPathRecursively(result, fs, stat.getPath(),
>                         inputFilter);
>                   }
>                 } else {
>                   result.add(stat);
>                 }
> {code}
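
To make the difference between the two conditionals concrete, here is a minimal, self-contained sketch that does not use Hadoop at all. Everything in it is hypothetical and for illustration only: `Entry` stands in for `FileStatus`, `addRecursively` assumes that `addInputPathRecursively` descends into a directory and collects only files, and `listCurrent` / `listProposed` are made-up names wrapping the two quoted snippets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListStatusSketch {

    // Hypothetical stand-in for Hadoop's FileStatus: a path, a directory
    // flag, and (for directories) the entries one level below.
    static final class Entry {
        final String path;
        final boolean directory;
        final List<Entry> children;

        Entry(String path, boolean directory, List<Entry> children) {
            this.path = path;
            this.directory = directory;
            this.children = children;
        }

        boolean isDirectory() {
            return directory;
        }
    }

    static Entry file(String path) {
        return new Entry(path, false, new ArrayList<Entry>());
    }

    static Entry dir(String path, Entry... children) {
        return new Entry(path, true, Arrays.asList(children));
    }

    // Assumption: mirrors addInputPathRecursively by descending into the
    // directory and collecting files only.
    static void addRecursively(List<String> result, Entry directory) {
        for (Entry child : directory.children) {
            if (child.isDirectory()) {
                addRecursively(result, child);
            } else {
                result.add(child.path);
            }
        }
    }

    // Conditional as quoted from the existing code.
    static List<String> listCurrent(List<Entry> listing, boolean recursive) {
        List<String> result = new ArrayList<String>();
        for (Entry stat : listing) {
            if (recursive && stat.isDirectory()) {
                addRecursively(result, stat);
            } else {
                result.add(stat.path); // directories land here when !recursive
            }
        }
        return result;
    }

    // Conditional as proposed in the issue description.
    static List<String> listProposed(List<Entry> listing, boolean recursive) {
        List<String> result = new ArrayList<String>();
        for (Entry stat : listing) {
            if (stat.isDirectory()) {
                if (recursive) {
                    addRecursively(result, stat);
                }
                // Non-recursive: the directory is dropped without any error.
            } else {
                result.add(stat.path);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Entry> listing = Arrays.asList(
            file("in/a.txt"),
            dir("in/sub", file("in/sub/b.txt")));

        System.out.println(listCurrent(listing, false));  // [in/a.txt, in/sub]
        System.out.println(listProposed(listing, false)); // [in/a.txt]
        System.out.println(listCurrent(listing, true));   // [in/a.txt, in/sub/b.txt]
        System.out.println(listProposed(listing, true));  // [in/a.txt, in/sub/b.txt]
    }
}
```

In this model the two variants agree when recursive is true and differ only when it is false: the proposed form drops `in/sub` without reporting anything. That is the case raised in the comment above, since a consumer that currently receives the directory and expands it itself would stop seeing the data underneath it.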



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
