Steve Loughran created MAPREDUCE-6800:
-----------------------------------------
Summary: FileInputFormat.singleThreadedListStatus to use listFiles(recursive)
Key: MAPREDUCE-6800
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6800
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Affects Versions: 2.7.3
Reporter: Steve Loughran
Priority: Minor
{{FileInputFormat.singleThreadedListStatus}} performs a recursive directory walk
to pick the files to scan. This is very inefficient on object stores, and can be
avoided by using {{listFiles(recursive=true)}} instead.
Based on the experience of SPARK-2984, the listing should also be resilient to a
source file being deleted during the iteration, downgrading a
{{FileNotFoundException}} (FNFE) to "skip that nonexistent path".
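
The resilience half of the proposal could look roughly like the sketch below. It is not the Hadoop code or a patch for this issue: {{RemoteListing}}, {{ResilientListing}}, {{collect}} and {{fake}} are hypothetical names invented for illustration, and the interface is only a stand-in for Hadoop's {{RemoteIterator}}, so that the example compiles without Hadoop on the classpath. It also assumes the FNFE surfaces from {{next()}}; where a real filesystem raises it may differ.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for Hadoop's RemoteIterator: both calls may throw IOException. */
interface RemoteListing<T> {
  boolean hasNext() throws IOException;
  T next() throws IOException;
}

final class ResilientListing {
  private ResilientListing() {}

  /**
   * Drains a flat (already recursive) listing into a list.
   * An entry that disappears mid-iteration is skipped rather than failing the whole scan.
   * Assumption: the FileNotFoundException is raised by next().
   */
  static List<String> collect(RemoteListing<String> listing) throws IOException {
    List<String> result = new ArrayList<>();
    while (listing.hasNext()) {
      try {
        result.add(listing.next());
      } catch (FileNotFoundException e) {
        // Downgrade: the path went away between listing and access, so skip it.
      }
    }
    return result;
  }

  /** Test double: yields the paths in order, throwing FNFE for any path in 'missing'. */
  static RemoteListing<String> fake(List<String> paths, List<String> missing) {
    final Iterator<String> it = paths.iterator();
    return new RemoteListing<String>() {
      @Override
      public boolean hasNext() {
        return it.hasNext();
      }

      @Override
      public String next() throws IOException {
        String path = it.next();
        if (missing.contains(path)) {
          throw new FileNotFoundException(path);
        }
        return path;
      }
    };
  }
}
```

In Hadoop itself the flat listing would come from {{FileSystem.listFiles(path, true)}}, which returns a {{RemoteIterator<LocatedFileStatus>}}; the loop above would then wrap that iterator instead of the fake one. Only {{FileNotFoundException}} is swallowed, so any other {{IOException}} still propagates to the caller.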