[
https://issues.apache.org/jira/browse/HDFS-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867110#comment-13867110
]
Colin Patrick McCabe commented on HDFS-5182:
--------------------------------------------
bq. That seems much longer than necessary – don't we want clients to be able to
keep mmaps around in their cache for very long periods of time? And then, when
the user requests the read, we can "anchor" the mmap only for the duration of
time for which the user holds onto the zero-copy buffer? Once the user returns
the zero-copy buffer, we can decrement the count and allow the DN to evict the
block from the cache.
Sorry, I was unclear. When I said "closed," I meant that the user had returned
the zero-copy buffer, which is the same thing you suggested.
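To make the anchoring idea concrete, here is a minimal sketch of the counting scheme discussed above. It is illustrative only: the class and method names (AnchorableSlot, anchor, unanchor, try_evict) are hypothetical, and it uses an in-process lock where the real design would keep the flag in memory shared between the client and the DN. The point is only that the mmap can stay cached indefinitely, while the anchor is held just for as long as the user holds a zero-copy buffer.

```python
import threading


class AnchorableSlot:
    """Hypothetical per-block slot; a stand-in for state shared with the DN."""

    def __init__(self):
        self._lock = threading.Lock()
        self._anchors = 0    # number of zero-copy buffers currently handed out
        self._valid = True   # becomes False once the DN has evicted the block

    def anchor(self):
        """Client side: called when a zero-copy buffer is handed to the user."""
        with self._lock:
            if not self._valid:
                return False  # caller should fall back to a normal read
            self._anchors += 1
            return True

    def unanchor(self):
        """Client side: called when the user returns the zero-copy buffer."""
        with self._lock:
            if self._anchors == 0:
                raise RuntimeError("unanchor without matching anchor")
            self._anchors -= 1

    def try_evict(self):
        """DN side: evict only if no zero-copy buffer is outstanding."""
        with self._lock:
            if self._anchors > 0:
                return False
            self._valid = False
            return True
```

In this sketch a cached mmap never blocks eviction by itself; only an outstanding buffer does, and once the slot has been invalidated any later anchor() call fails.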
bq. I disagree on this. Just because you want to skip checksumming doesn't mean
you can tolerate SIGBUS. For example, many file formats have their own
checksums, so we can safely skip HDFS checksumming, but we still want to ensure
that we're only reading locked (i.e., safe) memory via mmap.
What I was referring to here is where a client has specifically requested an
mmap region using the zero-copy API and the SKIP_CHECKSUMS option. In that
case, the user is clearly going to be reading without any guarantees from us.
If the user just uses the normal (non-zero-copy, non-mmap) read path, SIGBUS
will not be an issue.
(There have been some proposals to improve the SIGBUS situation for zero-copy
reads without mlock, but they're certainly out of scope for this JIRA.)
bq. Maybe this can be put into a separate JIRA, and first implement just a very
simple timeout-based mechanism? The DN could change the anchor flag to a magic
value which invalidates the segment and then close it after some amount of
time. Then if the client looks at it again it will know to invalidate.
Timeouts and two-way protocols get complex. I already have the code for
closing the shared memory segment when the DN sees that the remote socket has
been closed. As for where the socket comes from: we simply don't put the
socket we used to get the FDs in the first place back into the peer cache.
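As a rough illustration of the close-on-disconnect approach, the sketch below releases a segment when the peer's end of a socket goes away. It is not the actual patch: SharedSegment and watch_peer are hypothetical names, the segment is a plain Python object rather than real shared memory, and a local socket pair stands in for the connection between the client and the DN.

```python
import socket


class SharedSegment:
    """Hypothetical stand-in for a shared memory segment held by the DN."""

    def __init__(self):
        self.open = True

    def close(self):
        self.open = False


def watch_peer(sock, segment):
    """Block until the remote end closes the socket, then release the segment."""
    try:
        while True:
            data = sock.recv(4096)
            if not data:  # an empty read means the remote end has closed
                break
    except ConnectionResetError:
        pass  # an abrupt disconnect is treated the same as a clean close
    segment.close()
    sock.close()


def demo():
    """Simulate a client going away and return whether the segment is still open."""
    dn_end, client_end = socket.socketpair()
    segment = SharedSegment()
    client_end.close()  # the client exits or drops the socket
    watch_peer(dn_end, segment)
    return segment.open
```

No timeout or extra message is needed in this arrangement: the operating system reports the closed socket even if the client process dies without cleaning up.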
> BlockReaderLocal must allow zero-copy reads only when the DN believes it's
> valid
> ---------------------------------------------------------------------------------
>
> Key: HDFS-5182
> URL: https://issues.apache.org/jira/browse/HDFS-5182
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client
> Affects Versions: 3.0.0
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
>
> BlockReaderLocal must allow zero-copy reads only when the DN believes it's
> valid. This implies adding a new field to the response to
> REQUEST_SHORT_CIRCUIT_FDS. We also need some kind of heartbeat from the
> client to the DN, so that the DN can inform the client when the mapped region
> is no longer locked into memory.