Jason Lowe created MAPREDUCE-6219:
-------------------------------------

             Summary: Reduce memory required for FileInputFormat located status optimization
                 Key: MAPREDUCE-6219
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6219
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 2.1.1-beta
            Reporter: Jason Lowe
            Priority: Minor


MAPREDUCE-1981 introduced an optimization that drastically reduces the number 
of namenode operations required to compute input splits when processing a 
directory.  However, the optimization requires more memory, because it retains 
the full LocatedFileStatus object for every input file while computing the 
splits.  This can lead to odd situations where passing a directory as the input 
runs the job client out of heap space, while passing directory/* as the input 
spec lets the same job run within the original heap size.
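
To illustrate the memory pattern described above, here is a minimal,
self-contained Java sketch. It does not use the Hadoop API and it is not the
fix for this issue: LocatedStatus, Split, fakeListing, splitsEager and
splitsStreaming are hypothetical stand-ins invented for illustration. The
eager variant keeps every located status alive until all splits are computed,
which is the behavior the report attributes to the optimization. The
streaming variant is one assumed alternative that lets each status become
garbage as soon as its splits have been emitted.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class SplitMemorySketch {

    /** Hypothetical stand-in for a located status: metadata plus per-block hosts. */
    public static final class LocatedStatus {
        public final String path;
        public final long length;
        public final String[][] blockHosts; // one host list per block; the bulky part

        public LocatedStatus(String path, long length, String[][] blockHosts) {
            this.path = path;
            this.length = length;
            this.blockHosts = blockHosts;
        }
    }

    /** Hypothetical stand-in for an input split: a byte range within one file. */
    public static final class Split {
        public final String path;
        public final long start;
        public final long length;

        public Split(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }
    }

    /** Eager: materialize every status first, so all of them stay reachable at once. */
    public static List<Split> splitsEager(Iterator<LocatedStatus> listing, long splitSize) {
        List<LocatedStatus> all = new ArrayList<>();
        while (listing.hasNext()) {
            all.add(listing.next());
        }
        List<Split> splits = new ArrayList<>();
        for (LocatedStatus status : all) {
            addSplits(status, splitSize, splits);
        }
        return splits;
    }

    /** Streaming: each status is unreachable once its splits have been added. */
    public static List<Split> splitsStreaming(Iterator<LocatedStatus> listing, long splitSize) {
        List<Split> splits = new ArrayList<>();
        while (listing.hasNext()) {
            addSplits(listing.next(), splitSize, splits);
        }
        return splits;
    }

    // Cuts one file into fixed-size ranges; zero-length files yield no split here.
    private static void addSplits(LocatedStatus status, long splitSize, List<Split> out) {
        for (long offset = 0; offset < status.length; offset += splitSize) {
            long len = Math.min(splitSize, status.length - offset);
            out.add(new Split(status.path, offset, len));
        }
    }

    /** Generates a fake directory listing lazily, one status per call to next(). */
    public static Iterator<LocatedStatus> fakeListing(final int files, final long fileLength) {
        return new Iterator<LocatedStatus>() {
            private int index = 0;

            @Override
            public boolean hasNext() {
                return index < files;
            }

            @Override
            public LocatedStatus next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                String[][] hosts = {{"host-a", "host-b", "host-c"}};
                return new LocatedStatus("/data/input/part-" + index++, fileLength, hosts);
            }
        };
    }

    public static void main(String[] args) {
        long splitSize = 100L;
        List<Split> eager = splitsEager(fakeListing(3, 250L), splitSize);
        List<Split> streaming = splitsStreaming(fakeListing(3, 250L), splitSize);
        System.out.println("eager splits: " + eager.size());
        System.out.println("streaming splits: " + streaming.size());
    }
}
```

Both variants produce the same splits. They differ only in how many status
objects are reachable at the peak, which is the quantity that matters for the
job client's heap when a directory holds a very large number of files.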



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
