[jira] [Commented] (HDFS-5182) BlockReaderLocal must allow zero-copy reads only when the DN believes it's valid

Colin Patrick McCabe (JIRA) Thu, 09 Jan 2014 13:44:30 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867110#comment-13867110
 ]


Colin Patrick McCabe commented on HDFS-5182:
--------------------------------------------

bq. That seems much longer than necessary – don't we want clients to be able to 
keep mmaps around in their cache for very long periods of time? And then, when 
the user requests the read, we can "anchor" the mmap only for the duration of 
time for which the user holds onto the zero-copy buffer? Once the user returns 
the zero-copy buffer, we can decrement the count and allow the DN to evict the 
block from the cache.

Sorry, I was unclear.  When I said "closed" I meant that the user had returned 
the zero-copy buffer.  So this is the same thing you suggested.
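
To make the anchoring idea concrete, here is a minimal sketch of a per-block anchor count.  It is only an illustration: the class name AnchorSlot, its method names, and the use of -1 as the "invalid" marker are assumptions made for this example, not the code in the patch.  The client anchors when it hands a zero-copy buffer to the user and unanchors when the user returns it; the DN can invalidate the slot only while no buffers are outstanding.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-block slot; all names here are illustrative. */
final class AnchorSlot {
    // Assumed marker value meaning "the DN no longer vouches for this mapping".
    private static final int INVALID = -1;

    // Number of zero-copy buffers currently held by users, or INVALID.
    private final AtomicInteger anchors = new AtomicInteger(0);

    /** Client side: call before handing a zero-copy buffer to the user. */
    boolean tryAnchor() {
        while (true) {
            int cur = anchors.get();
            if (cur == INVALID) {
                return false; // caller must fall back to a normal read
            }
            if (anchors.compareAndSet(cur, cur + 1)) {
                return true;
            }
        }
    }

    /** Client side: call when the user returns the zero-copy buffer. */
    void unanchor() {
        while (true) {
            int cur = anchors.get();
            if (cur <= 0) {
                throw new IllegalStateException("unanchor without matching anchor");
            }
            if (anchors.compareAndSet(cur, cur - 1)) {
                return;
            }
        }
    }

    /**
     * DN side: succeeds only if no buffers are outstanding. Returns false
     * if the slot is anchored or was already invalidated.
     */
    boolean tryInvalidate() {
        return anchors.compareAndSet(0, INVALID);
    }
}
```

With this shape the client can keep the mmap in its cache for as long as it likes; only the window between tryAnchor() and unanchor() blocks the DN from evicting the block.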

bq. I disagree on this. Just because you want to skip checksumming doesn't mean 
you can tolerate SIGBUS. For example, many file formats have their own 
checksums, so we can safely skip HDFS checksumming, but we still want to ensure 
that we're only reading locked (i.e safe) memory via mmap.

What I was referring to here is where a client has specifically requested an 
mmap region using the zero-copy API and the SKIP_CHECKSUMS option.  In that 
case, the user is clearly going to be reading without any guarantees from us.  
If the user just uses the normal (non-zero-copy, non-mmap) read path, SIGBUS 
will not be an issue.

(There have been some proposals to improve the SIGBUS situation for zero-copy 
reads without mlock, but they're certainly out of scope for this JIRA.)

bq. Maybe this can be put into a separate JIRA, and first implement just a very 
simple timeout-based mechanism? The DN could change the anchor flag to a magic 
value which invalidates the segment and then close it after some amount of 
time. Then if the client looks at it again it will know to invalidate.

Timeouts and two-way protocols get complex.  I already have the code for 
closing the shared memory segment based on listening for the remote socket 
getting closed.  As for where the socket comes from: we just don't put the 
socket we used to get the FDs in the first place back into the peer cache.
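
As a rough illustration of the close-detection approach, the sketch below releases a segment once the client's end of the connection reaches end-of-stream.  It is a simplification under stated assumptions: the class name SegmentCloser and the releaseSegment callback are hypothetical, and a blocking read on a plain InputStream stands in for whatever mechanism the DN actually uses to watch the socket.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical DN-side watcher; names are illustrative only. */
final class SegmentCloser implements Runnable {
    private final InputStream fromClient;   // stream of the socket kept out of the peer cache
    private final Runnable releaseSegment;  // assumed callback that frees the shared memory segment

    SegmentCloser(InputStream fromClient, Runnable releaseSegment) {
        this.fromClient = fromClient;
        this.releaseSegment = releaseSegment;
    }

    @Override
    public void run() {
        try {
            // Block until the remote end closes; read() returns -1 at end-of-stream.
            while (fromClient.read() != -1) {
                // Any bytes that do arrive are ignored in this sketch.
            }
        } catch (IOException e) {
            // A broken connection is treated the same as a clean close.
        } finally {
            releaseSegment.run();
        }
    }
}
```

No timeout or acknowledgement is involved: the segment's lifetime is tied to the lifetime of the socket, so a client that exits or crashes releases it implicitly.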

> BlockReaderLocal must allow zero-copy reads only when the DN believes it's 
> valid
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-5182
>                 URL: https://issues.apache.org/jira/browse/HDFS-5182
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>    Affects Versions: 3.0.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>
> BlockReaderLocal must allow zero-copy reads only when the DN believes it's 
> valid.  This implies adding a new field to the response to 
> REQUEST_SHORT_CIRCUIT_FDS.  We also need some kind of heartbeat from the 
> client to the DN, so that the DN can inform the client when the mapped region 
> is no longer locked into memory.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
