[ https://issues.apache.org/jira/browse/HDFS-6604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Colin Patrick McCabe updated HDFS-6604:
---------------------------------------
Attachment: HDFS-6604.002.patch
Thanks, Chris. There is an existing unit test for timing out replicas, but it
was broken. Basically, "staleness" was kicking in, even though the cache
cleaner was not working.
"Staleness" is different from the LRU timeout: the LRU timeout is based on the
time since the replica was last used, whereas staleness is based on the time
since the replica was created (unless shared memory is enabled).
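To make the distinction concrete, here is a minimal sketch of the two checks. This is not the actual client cache code: the class {{ReplicaTimes}} and its method names are hypothetical, and the sketch ignores the shared memory case mentioned above.

```java
// Hypothetical illustration only; not the real HDFS client classes.
final class ReplicaTimes {
    final long createdMs;  // time the replica entry was created
    long lastUsedMs;       // time the replica was last used by a reader

    ReplicaTimes(long createdMs) {
        this.createdMs = createdMs;
        this.lastUsedMs = createdMs;
    }

    // Record a use of the replica. This resets the LRU clock only.
    void touch(long nowMs) {
        lastUsedMs = nowMs;
    }

    // Staleness is measured from creation, so using the replica never resets it.
    boolean isStale(long nowMs, long staleThresholdMs) {
        return nowMs - createdMs > staleThresholdMs;
    }

    // The LRU timeout is measured from the last use, so every use resets it.
    boolean isLruExpired(long nowMs, long expiryMs) {
        return nowMs - lastUsedMs > expiryMs;
    }
}
```

With a 1000 ms limit for both checks, a replica created at time 0 and last used at time 900 is stale at time 1001 but not yet LRU-expired. That is how a test can see replicas go away through staleness even when LRU-based cleaning never runs.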
This patch fixes that, and also fixes a case where we were using the configured
value of {{dfs.client.short.circuit.replica.stale.threshold.ms}} instead of the
value of {{dfs.client.mmap.cache.timeout.ms}} to deal with mmap timeouts.
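A rough sketch of the second fix, assuming the settings are held in a plain {{Map}} rather than a Hadoop configuration object. The class {{MmapTimeoutSketch}} and its method are hypothetical, and the values in the usage note are made up for illustration. Only the two key names come from this issue.

```java
import java.util.Map;

// Hypothetical illustration only; not the real HDFS client classes.
final class MmapTimeoutSketch {
    static final String STALE_THRESHOLD_KEY =
        "dfs.client.short.circuit.replica.stale.threshold.ms";
    static final String MMAP_TIMEOUT_KEY =
        "dfs.client.mmap.cache.timeout.ms";

    // Return the timeout that should govern mmap expiry.
    // The bug described above amounts to reading STALE_THRESHOLD_KEY here.
    static long mmapTimeoutMs(Map<String, Long> conf, long defaultMs) {
        return conf.getOrDefault(MMAP_TIMEOUT_KEY, defaultMs);
    }
}
```

For example, with the stale threshold set to 10 and the mmap timeout set to 20, {{mmapTimeoutMs}} returns 20, so changing the stale threshold no longer affects mmap timeouts.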
> Disk space leak with shortcircuit
> ---------------------------------
>
> Key: HDFS-6604
> URL: https://issues.apache.org/jira/browse/HDFS-6604
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Affects Versions: 2.4.0
> Environment: Centos 6.5 and distribution Hortonworks Data Platform
> v2.1
> Reporter: Giuseppe Reina
> Assignee: Colin Patrick McCabe
> Priority: Critical
> Attachments: HDFS-6604.001.patch, HDFS-6604.002.patch
>
>
> When HDFS shortcircuit is enabled, the file descriptors of the deleted HDFS
> blocks are kept open until the cache is full. This prevents the operating
> system from freeing the space on disk.
> More details on the [mailing list
> thread|http://mail-archives.apache.org/mod_mbox/hbase-user/201406.mbox/%3CCAPjB-CA3RV=slhuhwue5cv3pc4+rffz10-tkydbfs9rt2de...@mail.gmail.com%3E]