[jira] [Updated] (MAPREDUCE-7241) FileInputFormat listStatus oom caused by lots of unwanted block infos

Zhihua Deng (Jira) Tue, 24 Sep 2019 01:21:26 -0700


     [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Zhihua Deng updated MAPREDUCE-7241:
-----------------------------------
    Attachment:     (was: MAPREDUCE-7241.01.patch)

> FileInputFormat listStatus oom caused by lots of unwanted block infos
> ---------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7241
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7241
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 2.6.1
>            Reporter: Zhihua Deng
>            Priority: Major
>         Attachments: MAPREDUCE-7241.trunk.02.patch, 
> MAPREDUCE-7241.trunk.patch, filestatus.png
>
>
> This case is sometimes seen in Hive when a user mistakenly issues a query 
> over all partitions. The file statuses cached while listing can accumulate 
> to over 3 GB. After digging into the memory dump, we found that LocatedBlock 
> occupies about 50% (sometimes over 60%) of the memory retained by 
> LocatedFileStatus, as shown below:
> !filestatus.png!
> Right now we only extract the block location info from LocatedFileStatus; 
> the datanode infos (types) and block tokens are not taken into account. So 
> there is no need to cache LocatedBlock. We can copy the block locations 
> instead, like this:
> BlockLocation[] blockLocations = dup(stat.getBlockLocations());
> LocatedFileStatus shrink = new LocatedFileStatus(stat, blockLocations);
>
> // Copy each entry into a plain BlockLocation.
> private static BlockLocation[] dup(BlockLocation[] blockLocations) {
>     BlockLocation[] copyLocs = new BlockLocation[blockLocations.length];
>     int i = 0;
>     for (BlockLocation location : blockLocations) {
>         copyLocs[i++] = new BlockLocation(location);
>     }
>     return copyLocs;
> }
>  
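
The copy-to-shrink idea in the description can be tried without a Hadoop
classpath. The sketch below is only an illustration under stated assumptions:
ShrinkSketch, Location and HeavyLocation are hypothetical stand-ins, not the
Hadoop classes. Location plays the role of BlockLocation, and HeavyLocation
plays the role of a location object that also pins extra per-block data. It
shows that copying through the base-class copy constructor keeps the fields the
caller reads and leaves the extra data unreferenced.

```java
import java.util.Arrays;

// Stand-in types only; none of these are Hadoop classes.
class ShrinkSketch {

    /** Plays the role of BlockLocation: just the fields the caller reads. */
    static class Location {
        final String[] hosts;
        final long offset;
        final long length;

        Location(String[] hosts, long offset, long length) {
            this.hosts = hosts;
            this.offset = offset;
            this.length = length;
        }

        /** Copy constructor: takes the plain fields and nothing else. */
        Location(Location that) {
            this(that.hosts, that.offset, that.length);
        }
    }

    /** Plays the role of a location that also retains heavy per-block data. */
    static class HeavyLocation extends Location {
        final byte[] payload; // stands in for the unwanted block details

        HeavyLocation(String[] hosts, long offset, long length, byte[] payload) {
            super(hosts, offset, length);
            this.payload = payload;
        }
    }

    /** Copies every entry into a plain Location, dropping subclass state. */
    static Location[] dup(Location[] locations) {
        Location[] copy = new Location[locations.length];
        for (int i = 0; i < locations.length; i++) {
            copy[i] = new Location(locations[i]);
        }
        return copy;
    }

    public static void main(String[] args) {
        Location[] heavy = {
            new HeavyLocation(new String[] {"dn1", "dn2"}, 0L, 128L, new byte[1 << 20])
        };
        Location[] light = dup(heavy);
        // The copies are plain Location objects with the same hosts and range.
        System.out.println(light[0].getClass().getSimpleName());
        System.out.println(Arrays.toString(light[0].hosts)
            + " " + light[0].offset + " " + light[0].length);
    }
}
```

Once the original array is no longer referenced, the payload held by each
HeavyLocation becomes eligible for garbage collection, while the copied array
still carries the host and range information.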



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
