[
https://issues.apache.org/jira/browse/HADOOP-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596502#action_12596502
]
Devaraj Das commented on HADOOP-3366:
-------------------------------------
Sameer, today's ramfs serves both as a memory manager and as a filesystem. So
if we were to implement a new memory manager, I am guessing that it'd be close
to what we already have in the ramfs (e.g. it already does byte array
allocations, keeps track of memory usage, etc.). We can get to an optimal memory
manager by reducing the complexity (if any) in the ramfs memory manager.
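
To make the memory-manager role concrete, here is a minimal sketch of the two duties mentioned above: handing out byte arrays and keeping track of how much memory is in use. The class name BudgetedAllocator and its methods are hypothetical and are not taken from the ramfs code; the real implementation may differ substantially.

```java
/** Hypothetical byte-array allocator with a fixed budget; not the actual ramfs code. */
final class BudgetedAllocator {
    private final long capacity; // total bytes this allocator may hand out
    private long used;           // bytes currently handed out

    BudgetedAllocator(long capacity) {
        this.capacity = capacity;
    }

    /** Returns a new array of the requested size, or null if it would exceed the budget. */
    synchronized byte[] allocate(int size) {
        if (size < 0) {
            throw new IllegalArgumentException("negative size: " + size);
        }
        if (size > capacity - used) {
            return null; // caller has to fall back, e.g. wait or spill to disk
        }
        used += size;
        return new byte[size];
    }

    /** Returns the array's bytes to the budget; the caller must not release an array twice. */
    synchronized void release(byte[] buf) {
        used -= buf.length;
    }

    /** Bytes currently accounted as in use. */
    synchronized long used() {
        return used;
    }
}
```

Returning null instead of throwing when the budget is exhausted is just one possible choice for this sketch; it leaves the fallback decision to the caller.
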
Regarding using the ramfs as a FileSystem, I think if we remove the ChecksumFS
layer, we'd have removed a good amount of complexity. Other than that, if we
ensure that the APIs that read from the ramfs do not allocate buffers but
instead reset internal pointers on the byte arrays for the keys and values, we
should be good. The two classes that are used as the destinations of data read
from files are the DataOutputBuffer and the ValueBytes. Both of these
internally allocate byte arrays. I am suggesting that we implement these two
classes specially for the ramfs files, wherein we'd just update the
pointers/offsets/lengths in these classes instead of copying from the files.
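
A rough sketch of the "update pointers instead of copying" idea, assuming the in-memory file is backed by a single byte array. ByteArrayView and its methods are hypothetical names used only for illustration; they are not the DataOutputBuffer or ValueBytes APIs, and the record layout in main is made up.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical zero-copy view over a region of a byte array; not a Hadoop class. */
final class ByteArrayView {
    private byte[] data = new byte[0];
    private int offset;
    private int length;

    /** Points this view at data[offset, offset + length) without copying any bytes. */
    void reset(byte[] data, int offset, int length) {
        if (offset < 0 || length < 0 || offset > data.length - length) {
            throw new IndexOutOfBoundsException(
                "offset=" + offset + ", length=" + length + ", size=" + data.length);
        }
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    byte[] getData() { return data; }   // the shared backing array, not a copy
    int getOffset() { return offset; }
    int getLength() { return length; }

    /** Decodes the viewed region as UTF-8; this does copy, so it is for inspection only. */
    String asString() {
        return new String(data, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for an in-memory file holding two key/value records back to back.
        byte[] file = "key1value1key2value2".getBytes(StandardCharsets.UTF_8);
        ByteArrayView key = new ByteArrayView();
        ByteArrayView value = new ByteArrayView();

        // First record: only offsets and lengths change, no buffer is allocated.
        key.reset(file, 0, 4);
        value.reset(file, 4, 6);
        System.out.println(key.asString() + " -> " + value.asString());

        // Second record: the same two view objects are reused.
        key.reset(file, 10, 4);
        value.reset(file, 14, 6);
        System.out.println(key.asString() + " -> " + value.asString());
    }
}
```

The trade-off in a design like this is that a view is only valid while the backing array is unchanged, so the in-memory file must not be freed or overwritten while a key or value still points into it.
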
> Shuffle/Merge improvements
> --------------------------
>
> Key: HADOOP-3366
> URL: https://issues.apache.org/jira/browse/HADOOP-3366
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Fix For: 0.18.0
>
>
> This is intended to be a meta-issue to track various improvements to
> shuffle/merge in the reducer.