[
https://issues.apache.org/jira/browse/HDFS-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891257#comment-13891257
]
Haohui Mai commented on HDFS-5182:
----------------------------------
I'm curious why a shared memory segment is necessary -- given the ability to
pass file descriptors around, the client can read the data using the file
descriptor directly.
I see a couple of potential issues with using a shared memory segment to
implement zero-copy I/O:
# No lazy reads. It seems that you're calling mlock() on the datanode side to
pin the data in physical memory. The whole block has to be read into memory
even if the client is only interested in some parts of the file (e.g., the
index of a database).
# SIGBUS. The client avoids SIGBUS, but at a cost: (1) the data is pinned in
physical memory, and (2) the datanode can get a SIGBUS when there is an I/O
error. If the client uses the file descriptor directly, the OS manages the
data through its buffer cache, and there are no SIGBUS errors on either side.
# VM space. Indeed, the mappings won't exhaust a 64-bit virtual address space,
but a process running inside a container could have limited VM space (e.g.,
1 GB).
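To make the lazy-read point concrete, here is a minimal Java sketch of reading
a single range of a block file through an already-open descriptor. The class
name, the readRange helper and its parameters are hypothetical and not taken
from BlockReaderLocal; the sketch only assumes the client already holds a
FileChannel for the block file. Nothing is mapped or pinned, only the requested
range goes through the page cache, and an I/O error shows up as an IOException
from read() rather than as a signal.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper, for illustration only.
final class PositionalReadSketch {
    /**
     * Reads up to `length` bytes starting at `offset` using positional reads.
     * Returns fewer bytes if the end of the file is reached first.
     */
    static byte[] readRange(FileChannel channel, long offset, int length)
            throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(length);
        long pos = offset;
        while (buf.hasRemaining()) {
            // Positional read: does not change the channel's own position.
            int n = channel.read(buf, pos);
            if (n < 0) {
                break; // end of file
            }
            pos += n;
        }
        buf.flip();
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }
}
```

A client interested only in an index region would call something like
readRange(channel, indexOffset, indexLength), where indexOffset and indexLength
are again placeholder names.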
I'm wondering what the downsides of passing the file descriptor directly
would be. Can you comment on this?
> BlockReaderLocal must allow zero-copy reads only when the DN believes it's
> valid
> ---------------------------------------------------------------------------------
>
> Key: HDFS-5182
> URL: https://issues.apache.org/jira/browse/HDFS-5182
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Affects Versions: 3.0.0
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
>
> BlockReaderLocal must allow zero-copy reads only when the DN believes it's
> valid. This implies adding a new field to the response to
> REQUEST_SHORT_CIRCUIT_FDS. We also need some kind of heartbeat from the
> client to the DN, so that the DN can inform the client when the mapped region
> is no longer locked into memory.