[
https://issues.apache.org/jira/browse/HDFS-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106025#comment-14106025
]
Colin Patrick McCabe commented on HDFS-6581:
--------------------------------------------
Looks good overall. It's good to see progress on this.
Some comments about the design doc:
* Why not use ramfs instead of tmpfs? ramfs can't swap.
** The problem with using tmpfs is that the system could move the data to swap
at any time. In addition to performance problems, this could cause correctness
problems later when we read back the data from swap (i.e. from the hard disk).
Since we don't want to verify checksums here, we should use a storage method
that we know never touches the disk. Tachyon uses ramfs instead of tmpfs for
this reason.
* An LRU replacement policy isn't a good choice. Under LRU, it's very easy for
a batch job to kick everything else out of memory before any of it can be used
again (thrashing). An LFU (least frequently used) policy would be much better.
We'd have to keep usage statistics to implement this, but that doesn't seem too
bad.
* How is the maximum tmpfs/ramfs size per datanode configured? I think we
should use the existing {{dfs.datanode.max.locked.memory}} property to
configure this, for consistency. System administrators should not need to
configure separate pools of memory for HDFS-4949 and this feature. It should
be one memory size.
** I also think that cache directives from HDFS-4949 should take precedence
over this opportunistic write caching. If we need to evict some HDFS-5851
cache items to finish our HDFS-4949 caching, we should do that.
* Related to that, we might want to rename {{dfs.datanode.max.locked.memory}}
to {{dfs.data.node.max.cache.memory}} or something.
* You can effectively revoke access to a block file stored in ramfs or tmpfs by
truncating that file to 0 bytes. The client can hang on to the file
descriptor, but this doesn't keep any data bytes in memory. So we can move
things out of the cache even if the clients are unresponsive. Also see
HDFS-6750 and HDFS-6036 for examples of how we can ask the clients to stop
using a short-circuit replica before tearing it down.
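To make the LFU suggestion concrete, here's a rough sketch of counter-based eviction (plain Java with hypothetical names, not tied to any DataNode class): each cached block keeps an access count, and eviction drops the lowest count, so a one-pass batch scan can no longer flush a frequently-read block the way it can under LRU.

```java
import java.util.HashMap;
import java.util.Map;

public class LfuSketch {
    // Minimal LFU cache; the O(n) eviction scan is for illustration only.
    static class LfuCache<K, V> {
        private final int capacity;
        private final Map<K, V> values = new HashMap<>();
        private final Map<K, Long> counts = new HashMap<>();

        LfuCache(int capacity) { this.capacity = capacity; }

        V get(K key) {
            V v = values.get(key);
            if (v != null) counts.merge(key, 1L, Long::sum); // record the access
            return v;
        }

        void put(K key, V value) {
            if (!values.containsKey(key) && values.size() >= capacity) {
                // Evict the least frequently used entry.
                K victim = null;
                long fewest = Long.MAX_VALUE;
                for (Map.Entry<K, Long> e : counts.entrySet()) {
                    if (e.getValue() < fewest) {
                        fewest = e.getValue();
                        victim = e.getKey();
                    }
                }
                values.remove(victim);
                counts.remove(victim);
            }
            values.put(key, value);
            counts.merge(key, 1L, Long::sum);
        }
    }

    public static void main(String[] args) {
        LfuCache<String, String> cache = new LfuCache<>(2);
        cache.put("hot-block", "data");
        cache.get("hot-block");
        cache.get("hot-block");
        cache.put("scan-1", "data");  // cache now full
        cache.put("scan-2", "data");  // evicts scan-1 (count 1), not hot-block (count 3)
        System.out.println(cache.get("hot-block") != null);  // true
        System.out.println(cache.get("scan-1") == null);     // true
    }
}
```

A real implementation would want O(1) eviction (e.g. frequency buckets) plus some aging of the counters, so that blocks that were hot long ago don't stay pinned forever.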
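If the single-budget approach is adopted, the admin-facing configuration stays one hdfs-site.xml entry. The property already exists for HDFS-4949 and is specified in bytes (it also has to fit under the DataNode's locked-memory ulimit); the value below is purely illustrative:

```xml
<property>
  <name>dfs.datanode.max.locked.memory</name>
  <!-- Single memory budget, in bytes, shared by HDFS-4949 cache
       directives and (under this proposal) in-memory replicas. -->
  <value>2147483648</value>
</property>
```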
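The truncate-to-revoke behavior is easy to demonstrate. The sketch below uses an ordinary temp file rather than a real ramfs block file, but the descriptor semantics are the same: the "client" keeps its fd open across the truncation, yet its reads hit EOF immediately afterwards, so holding the fd doesn't pin any data in memory.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

public class TruncateRevoke {
    // Returns what the "client" reads after the "datanode" truncates the file.
    static int readAfterTruncate() throws IOException {
        File blockFile = File.createTempFile("replica", ".blk"); // stand-in for a ramfs block file
        try (FileOutputStream out = new FileOutputStream(blockFile)) {
            out.write(new byte[]{1, 2, 3, 4});
        }

        // The client opens the block and hangs on to the descriptor.
        try (FileInputStream client = new FileInputStream(blockFile)) {
            // The datanode truncates the file to 0 bytes: the client's fd stays
            // valid, but it no longer keeps any data pages alive.
            try (RandomAccessFile dn = new RandomAccessFile(blockFile, "rw")) {
                dn.setLength(0);
            }
            return client.read(); // -1: immediate EOF, the data is gone
        } finally {
            blockFile.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAfterTruncate()); // prints -1
    }
}
```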
> Write to single replica in memory
> ---------------------------------
>
> Key: HDFS-6581
> URL: https://issues.apache.org/jira/browse/HDFS-6581
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: Arpit Agarwal
> Assignee: Arpit Agarwal
> Attachments: HDFSWriteableReplicasInMemory.pdf
>
>
> Per discussion with the community on HDFS-5851, we will implement writing to
> a single replica in DN memory via DataTransferProtocol.
> This avoids some of the issues with short-circuit writes, which we can
> revisit at a later time.
--
This message was sent by Atlassian JIRA
(v6.2#6252)