[ 
https://issues.apache.org/jira/browse/HADOOP-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596502#action_12596502
 ] 

Devaraj Das commented on HADOOP-3366:
-------------------------------------

Sameer, today's ramfs serves both as a memory manager and as a filesystem. So 
if we were to implement a new memory manager, I am guessing that it'd be close 
to what we already have in the ramfs (e.g., it already does byte array 
allocations, keeps track of memory usage, etc.). We can get to an optimal 
memory manager by reducing the complexity (if any) in the ramfs memory manager.
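
To make the "memory manager" role concrete, here is a minimal sketch of the two 
duties mentioned above: handing out byte arrays and tracking memory usage 
against a cap. The class name SimpleMemoryManager and its methods are 
hypothetical and are not taken from the ramfs code; this only illustrates the 
idea, under the assumption that callers release each array exactly once.

```java
/**
 * Hypothetical sketch of a byte-array memory manager with a usage cap.
 * This is NOT the ramfs implementation; names and behavior are assumptions.
 */
class SimpleMemoryManager {
    private final long capacityBytes; // upper bound on outstanding allocations
    private long usedBytes = 0;       // bytes currently handed out

    SimpleMemoryManager(long capacityBytes) {
        if (capacityBytes < 0) {
            throw new IllegalArgumentException("capacity must be >= 0");
        }
        this.capacityBytes = capacityBytes;
    }

    /** Returns a new array, or null if the request would exceed the cap. */
    synchronized byte[] allocate(int size) {
        if (size < 0) {
            throw new IllegalArgumentException("size must be >= 0");
        }
        if (usedBytes + size > capacityBytes) {
            return null; // caller decides what to do, e.g. fall back to disk
        }
        usedBytes += size;
        return new byte[size];
    }

    /** Gives the array's bytes back to the budget; call once per array. */
    synchronized void release(byte[] buf) {
        usedBytes -= buf.length;
    }

    /** Bytes currently accounted as in use. */
    synchronized long used() {
        return usedBytes;
    }
}
```

Returning null on an over-budget request is just one possible policy; blocking 
until memory is released would be another.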

Regarding using the ramfs as a FileSystem, I think if we remove the ChecksumFS 
layer, we'd have removed a good amount of complexity. Other than that, if we 
ensure that the APIs that read from the ramfs do not allocate buffers but 
instead reset internal pointers on the byte arrays for the keys and values, we 
should be good. The two classes that are used as the destination of data read 
from files are DataOutputBuffer and ValueBytes. Both of these internally 
allocate byte arrays. I am suggesting that we implement these two classes 
specially for the ramfs files, wherein we'd just update the 
pointers/offsets/lengths in these classes instead of copying from the files.
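
The "update pointers instead of copying" idea could look roughly like the 
sketch below. InMemoryBytesView is a hypothetical class, not Hadoop's 
DataOutputBuffer or ValueBytes, and its method names are assumptions chosen 
for illustration. The point is that reset() only records a reference, an 
offset, and a length into an existing in-memory array, so no new buffer is 
allocated per key or value.

```java
/**
 * Hypothetical zero-copy view over a region of an existing byte array.
 * Not a Hadoop class; it only illustrates resetting offsets and lengths
 * rather than copying bytes into a freshly allocated buffer.
 */
class InMemoryBytesView {
    private byte[] data; // shared backing array, never copied
    private int offset;  // start of the current record within data
    private int length;  // number of valid bytes from offset

    /** Points this view at backing[offset .. offset+length) without copying. */
    void reset(byte[] backing, int offset, int length) {
        if (offset < 0 || length < 0 || offset > backing.length - length) {
            throw new IndexOutOfBoundsException(
                "offset=" + offset + ", length=" + length);
        }
        this.data = backing;
        this.offset = offset;
        this.length = length;
    }

    byte[] getData() { return data; }
    int getOffset() { return offset; }
    int getLength() { return length; }

    /** Reads the i-th byte of the current record. */
    byte byteAt(int i) {
        if (i < 0 || i >= length) {
            throw new IndexOutOfBoundsException("index=" + i);
        }
        return data[offset + i];
    }
}
```

A reader would call reset() once per key or value with the in-memory file's 
array and the record's position. The trade-off is that the view is only valid 
while the backing array is neither freed nor overwritten.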

> Shuffle/Merge improvements
> --------------------------
>
>                 Key: HADOOP-3366
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3366
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>             Fix For: 0.18.0
>
>
> This is intended to be a meta-issue to track various improvements to 
> shuffle/merge in the reducer.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
