[
https://issues.apache.org/jira/browse/HDFS-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113137#comment-14113137
]
Andrew Wang commented on HDFS-6036:
-----------------------------------
Hi Colin, nice work here. A few review comments:
Nitty logging comments:
* slf4j uses {} placeholders to avoid string concatenation. Let's make sure
that style is used for all the LOG calls.
* In shouldDefer, for the {{!anchored}} case, could we lower the LOG to debug?
* In UncachingTask#run, there's a little ternary that prepends "Deferred". We
could have it switch between "Deferred u" and "U" so the capitalization of
"Uncaching" is always correct.
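A minimal sketch of the ternary suggestion above. The class name, helper name,
and message text are hypothetical and not taken from the patch. The slf4j call
appears only as a comment so the snippet has no dependencies.

```java
class UncachingLogExample {
    // Hypothetical helper: choose the whole leading fragment so the first
    // letter is capitalized correctly in both cases
    // ("Deferred uncaching ..." vs. "Uncaching ...").
    static String uncachingPrefix(boolean deferred) {
        return deferred ? "Deferred u" : "U";
    }

    public static void main(String[] args) {
        // With slf4j, the prefix would be passed as a {} argument instead of
        // being concatenated, for example:
        //   LOG.info("{}ncaching of {} completed.", uncachingPrefix(deferred), key);
        System.out.println(uncachingPrefix(true) + "ncaching of block 1 completed.");
        System.out.println(uncachingPrefix(false) + "ncaching of block 1 completed.");
    }
}
```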
Rest:
* The default is set to 15 hours. Isn't this a really long time? I expected
something like a few minutes.
* New keys should be added to hdfs-default.xml as well.
* Regarding the minimum polling rate, I'd prefer to abort if it's not
configured correctly. Silent correction means bad conf values live on, and
confs get copy-pasted around.
* Having the min be revocation/2 is also somewhat arbitrary, but I'll go along
with it. Nyquist-ish?
* I wondered if we needed any changes in ShortCircuitRegistry /
DfsClientShmManager, but it seems like it's already handled correctly by the
UNCACHING state. Nice!
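A rough sketch of the fail-fast validation suggested above, assuming the rule is
that the polling period may be at most half the revocation timeout (my reading
of "revocation/2"). The class, method, and parameter names are hypothetical. The
real configuration keys are not shown here.

```java
class PollingConfigCheck {
    // Hypothetical validation: throw on a bad value instead of silently
    // clamping it, so a misconfiguration is noticed at startup.
    static long checkedPollingMs(long pollingMs, long revocationMs) {
        // Assumed rule: poll at least twice per revocation timeout.
        long maxPollingMs = revocationMs / 2;
        if (pollingMs <= 0 || pollingMs > maxPollingMs) {
            throw new IllegalArgumentException(
                "Polling period " + pollingMs + " ms must be in (0, "
                + maxPollingMs + "] ms, i.e. at most half the revocation "
                + "timeout of " + revocationMs + " ms");
        }
        return pollingMs;
    }

    public static void main(String[] args) {
        // Accepted: 1 s polling with a 10 s revocation timeout.
        System.out.println(checkedPollingMs(1_000, 10_000));
        try {
            // Rejected: 8 s polling is more than half of 10 s.
            checkedPollingMs(8_000, 10_000);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```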
Debugging aids:
* A metric for the # of forcibly uncached blocks would be a nice health check.
* It'd also be nice to print which client is holding on to anchors for too long.
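To illustrate the two debugging aids above, here is a small dependency-free
sketch of a counter for forcibly uncached blocks that also tracks the offending
client. All names are hypothetical. In the datanode this would presumably hook
into the existing metrics and logging rather than a standalone class.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class ForcedUncacheStats {
    // Total number of blocks uncached by force after the revocation timeout.
    private final AtomicLong total = new AtomicLong();
    // Per-client counts, so the log can name who held anchors for too long.
    private final Map<String, AtomicLong> byClient = new ConcurrentHashMap<>();

    void recordForcedUncache(String clientName) {
        total.incrementAndGet();
        byClient.computeIfAbsent(clientName, k -> new AtomicLong()).incrementAndGet();
    }

    long total() {
        return total.get();
    }

    long forClient(String clientName) {
        AtomicLong n = byClient.get(clientName);
        return n == null ? 0 : n.get();
    }

    public static void main(String[] args) {
        ForcedUncacheStats stats = new ForcedUncacheStats();
        stats.recordForcedUncache("client-A");
        stats.recordForcedUncache("client-A");
        stats.recordForcedUncache("client-B");
        System.out.println("forcibly uncached blocks: " + stats.total());
        System.out.println("client-A: " + stats.forClient("client-A"));
    }
}
```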
Again, nice work here. None of these are major comments, so I'd be happy to +1
soon.
> Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that
> extend too long
> ---------------------------------------------------------------------------------------------
>
> Key: HDFS-6036
> URL: https://issues.apache.org/jira/browse/HDFS-6036
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: caching, datanode
> Affects Versions: 2.5.0
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Attachments: HDFS-6036.001.patch
>
>
> We should forcibly timeout misbehaving DFSClients that try to do no-checksum
> reads that extend too long.