[jira] [Commented] (HDFS-6036) Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that extend too long

Andrew Wang (JIRA) Wed, 27 Aug 2014 17:57:16 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113137#comment-14113137
 ]


Andrew Wang commented on HDFS-6036:
-----------------------------------

Hi Colin, nice work here. A few review comments:

Nitty logging comments:
* The slf4j style uses {} as a template to avoid string concatenation. Let's 
make sure that's used for all the LOG calls.
* In shouldDefer, for the {{!anchored}} case, could we lower the LOG to debug?
* In UncachingTask#run, there's a little ternary that prepends "Deferred" to 
the message. We could have it switch between "Deferred u" and "U" so the 
capitalization of "Uncaching" is always correct.

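To illustrate the capitalization point, here is a minimal sketch. The class name, helper name, and message text are hypothetical and not taken from the patch. In a real LOG call the prefix would be passed as a {} argument rather than formatted by hand.

```java
final class UncachingMessage {
    private UncachingMessage() {}

    // Hypothetical helper: picks the prefix so that "Uncaching" is
    // capitalized only when it starts the message.
    static String describe(boolean deferred, long blockId) {
        String prefix = deferred ? "Deferred u" : "U";
        // With slf4j this would look like:
        //   LOG.debug("{}ncaching block {}", prefix, blockId);
        return String.format("%sncaching block %d", prefix, blockId);
    }

    public static void main(String[] args) {
        System.out.println(describe(true, 42L));   // Deferred uncaching block 42
        System.out.println(describe(false, 42L));  // Uncaching block 42
    }
}
```
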
Rest:
* The default is set to 15 hours. Isn't this a really long time? I expected 
something like a few minutes.
* New keys should be added to hdfs-default.xml as well.
* Regarding the minimum polling rate, I'd prefer to abort if it's not 
configured correctly. Silently correcting it lets bad conf values live on, 
and confs get copy-pasted around.
* Having the min be revocation/2 is also somewhat arbitrary, but I'll go along 
with it. Nyquist-ish?
* I wondered if we needed any changes in ShortCircuitRegistry / 
DfsClientShmManager, but it seems like it's already handled correctly by the 
UNCACHING state. Nice!
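
A rough sketch of the "abort rather than silently correct" check, assuming the polling period must be no longer than half the revocation timeout. The class, method, and parameter names are hypothetical, and the real conf keys are not shown.

```java
final class PollingConfigCheck {
    private PollingConfigCheck() {}

    // Hypothetical validation: fail fast on a bad polling period instead of
    // clamping it to a valid value behind the operator's back.
    static long checkPollingMs(long pollingMs, long revocationTimeoutMs) {
        long maxPollingMs = revocationTimeoutMs / 2;
        if (pollingMs <= 0 || pollingMs > maxPollingMs) {
            throw new IllegalArgumentException(
                "Polling period " + pollingMs + " ms must be between 1 and "
                + maxPollingMs + " ms (half of the revocation timeout of "
                + revocationTimeoutMs + " ms)");
        }
        return pollingMs;
    }

    public static void main(String[] args) {
        System.out.println(checkPollingMs(500L, 60_000L));
        try {
            checkPollingMs(45_000L, 60_000L);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Throwing at startup surfaces the bad value to whoever wrote the conf, so it is less likely to be copied into other deployments.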

Debugging aids:
* A metric for the # of forcibly uncached blocks would be a nice health check.
* It'd also be nice to print which client is holding on to anchors for too long.
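
Something along these lines would cover both debugging aids. This is only a sketch using a plain AtomicLong. The class and method names are hypothetical, and in the DataNode the counter would presumably be wired into the existing metrics system instead.

```java
import java.util.concurrent.atomic.AtomicLong;

final class ForcedUncacheStats {
    // Hypothetical counter for blocks uncached after the revocation timeout.
    private final AtomicLong forciblyUncached = new AtomicLong();

    // Bumps the counter and returns a message naming the client that was
    // still holding the anchor, suitable for passing to a LOG call.
    String recordForcedUncache(long blockId, String clientName) {
        forciblyUncached.incrementAndGet();
        return "Forcibly uncaching block " + blockId
            + " still anchored by client " + clientName;
    }

    long getForciblyUncachedCount() {
        return forciblyUncached.get();
    }

    public static void main(String[] args) {
        ForcedUncacheStats stats = new ForcedUncacheStats();
        System.out.println(stats.recordForcedUncache(42L, "client-A"));
        System.out.println(stats.getForciblyUncachedCount());
    }
}
```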

Again, nice work here. None of these are major comments, so I'd be happy to +1 
soon.

> Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that 
> extend too long
> ---------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6036
>                 URL: https://issues.apache.org/jira/browse/HDFS-6036
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: caching, datanode
>    Affects Versions: 2.5.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-6036.001.patch
>
>
> We should forcibly timeout misbehaving DFSClients that try to do no-checksum 
> reads that extend too long.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
