[
https://issues.apache.org/jira/browse/HADOOP-3638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630497#action_12630497
]
Jothi Padmanabhan commented on HADOOP-3638:
-------------------------------------------
What would be a reasonable amount of memory that can be set aside for the Index
Cache at the Map side?
Each individual record is 24 bytes (3 longs)
Let num reducers = R
Let num map slots = S
Let total number of Spill Files = M
Total number of entries per Map task = M*R, consuming M*R*24 bytes
For a *node* running S Map tasks (slots) at a time, total memory
consumed = S*M*R*24 bytes
If M = 100, S = 6, R = 100, then
Total Memory Consumed = 6*100*100*24 ~= 1.4 MB
I think 1.4 MB is a very reasonable amount for the index cache, at the node
level.
So, do we need to worry about the memory limit at the map side at all? We could
just do LRU eviction at the task tracker level alone.
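The estimate above can be sketched as a quick back-of-envelope calculation (the function name and parameters here are illustrative only, not identifiers from the Hadoop codebase):

```python
RECORD_BYTES = 24  # each index record is 3 longs = 24 bytes

def index_cache_bytes(map_slots, spill_files, reducers, record_bytes=RECORD_BYTES):
    """Estimated memory consumed by cached index entries for all map slots on one node."""
    return map_slots * spill_files * reducers * record_bytes

# The numbers used in the comment: S = 6, M = 100, R = 100
total = index_cache_bytes(map_slots=6, spill_files=100, reducers=100)
print(total)  # 1440000 bytes, i.e. ~1.4 MB
```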
> Cache the iFile index files in memory to reduce seeks during map output
> serving
> -------------------------------------------------------------------------------
>
> Key: HADOOP-3638
> URL: https://issues.apache.org/jira/browse/HADOOP-3638
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.17.0
> Reporter: Devaraj Das
> Assignee: Jothi Padmanabhan
> Fix For: 0.19.0
>
> Attachments: hadoop-3638-v1.patch, hadoop-3638-v2.patch
>
>
> The iFile index files can be cached in memory to reduce seeks during map
> output serving.