[
https://issues.apache.org/jira/browse/HDFS-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Colin Patrick McCabe updated HDFS-6107:
---------------------------------------
Attachment: HDFS-6107.001.patch
I fixed the error handling case and added a unit test.
I noticed that we were incrementing the DN metrics for BlocksCached and
BlocksUncached as soon as we received the DNA_CACHE and DNA_UNCACHE
commands. This is wrong: if caching takes a while, the NN may send those
commands more than once. The commands themselves are idempotent, but
counting them this way is not. I fixed it so that FsDatasetCache updates
those stats instead.
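To make the idea concrete, here is a minimal sketch of moving the metric
update into the cache itself (class, method, and state names here are
illustrative, not taken from the patch): the metric is bumped only when the
caching work completes, so a repeated DNA_CACHE for a block that is already
caching or cached is a no-op.
{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public class CacheMetricsSketch {
  enum State { CACHING, CACHED }

  private final Map<Long, State> blocks = new HashMap<>();
  private final AtomicLong blocksCached = new AtomicLong(0);

  // Entry point for a DNA_CACHE command. May be called more than once
  // for the same block, since the command is idempotent.
  public synchronized void handleCacheCommand(long blockId) {
    if (blocks.containsKey(blockId)) {
      return;  // already caching or cached: do not touch the metric
    }
    blocks.put(blockId, State.CACHING);
    // ... the real code would kick off the caching work here ...
  }

  // Called by the caching task on success. The metric is incremented
  // here, exactly once per block, no matter how many commands arrived.
  public synchronized void markCached(long blockId) {
    if (blocks.get(blockId) == State.CACHING) {
      blocks.put(blockId, State.CACHED);
      blocksCached.incrementAndGet();
    }
  }

  public long getBlocksCached() {
    return blocksCached.get();
  }
}
{code}
Since the state map already records whether a block is caching or cached,
the repeated command falls out as a natural no-op.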
I think this might fix some flaky unit tests we had, since we'll no longer
double-count a block if the NN happens to send a DNA_CACHE for it twice.
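Similarly, a rough sketch of the error handling fix mentioned above (again
with made-up names, not the actual patch): on failure, the block's state is
rolled back so a later DNA_CACHE can retry, rather than leaving the block
stuck and uncacheable.
{code:java}
import java.util.HashMap;
import java.util.Map;

public class CachingTaskSketch {
  enum State { CACHING, CACHED }

  private final Map<Long, State> blocks = new HashMap<>();

  // Runs the caching work for one block; blocks must already map
  // blockId to CACHING when this is invoked.
  void runCachingTask(long blockId) {
    boolean success = false;
    try {
      success = tryToMmapAndMlock(blockId);
    } finally {
      synchronized (blocks) {
        if (success) {
          blocks.put(blockId, State.CACHED);
        } else {
          // The crucial reset: without this, the entry is left in
          // CACHING state forever and the block becomes uncacheable.
          blocks.remove(blockId);
        }
      }
    }
  }

  private boolean tryToMmapAndMlock(long blockId) {
    // Stand-in for the real mmap/mlock work; pretend it failed, e.g.
    // because the cache capacity on the DataNode was exceeded.
    return false;
  }
}
{code}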
> When a block can't be cached due to limited space on the DataNode, that block
> becomes uncacheable
> -------------------------------------------------------------------------------------------------
>
> Key: HDFS-6107
> URL: https://issues.apache.org/jira/browse/HDFS-6107
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Affects Versions: 2.4.0
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Attachments: HDFS-6107.001.patch
>
>
> When a block can't be cached due to limited space on the DataNode, that block
> becomes uncacheable. This is because the CachingTask fails to reset the
> block state in this error handling case.
--
This message was sent by Atlassian JIRA
(v6.2#6252)