[
https://issues.apache.org/jira/browse/HDFS-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918834#comment-13918834
]
Colin Patrick McCabe commented on HDFS-5957:
--------------------------------------------
bq. I will cycle back to this once I've got more Hive-specific stuff worked
out. I think setting the size to 1 will probably work itself out for me;
setting it to zero doesn't seem to work at all (i.e., only cache it within the
open InputStream, not within a cross-thread cache).
There seems to be a bit of confusion here. In Hadoop 2.4, the short-circuit
cache is cross-thread; it doesn't exist in individual InputStreams. This
allows us to make the best use of our limited file descriptor resources.
In Hadoop 2.4, setting {{dfs.client.mmap.cache.size}} to 0 should be
essentially the same as setting it to 1 in your case. In both cases, the
chance of reusing an mmap from the cache should be nil. In older versions, we
used to treat {{dfs.client.mmap.cache.size}} = 0 as a special value meaning
"don't ever mmap." However, we don't do this anymore.
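To make the "0 versus 1" point concrete, here is a rough toy model, not the actual {{ShortCircuitCache}} code. The class {{MmapCacheSketch}}, its methods, and the block ids are all hypothetical names made up for illustration, and plain objects stand in for mmap regions. It assumes a workload that never reads the same block twice in a row; under that assumption a shared cache of size 0 and one of size 1 both see no reuse.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: this is NOT the real ShortCircuitCache implementation.
final class MmapCacheSketch {
    // Least-recently-used map from a block id to a stand-in for an mmap region.
    private final LinkedHashMap<Long, Object> regions;
    private int hits;

    MmapCacheSketch(final int capacity) {
        // accessOrder=true keeps entries in least-recently-used order.
        this.regions = new LinkedHashMap<Long, Object>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, Object> eldest) {
                // Evict as soon as the map grows past the configured size.
                return size() > capacity;
            }
        };
    }

    // Synchronized because one cache instance is shared by all reader threads.
    synchronized Object fetch(long blockId) {
        Object region = regions.get(blockId);
        if (region != null) {
            hits++;
            return region;
        }
        region = new Object(); // placeholder for a freshly created mmap
        regions.put(blockId, region);
        return region;
    }

    synchronized int hits() {
        return hits;
    }

    // Replays a sequence of block reads and reports how many reused a cached region.
    static int hitsFor(int capacity, long[] blockIds) {
        MmapCacheSketch cache = new MmapCacheSketch(capacity);
        for (long id : blockIds) {
            cache.fetch(id);
        }
        return cache.hits();
    }

    public static void main(String[] args) {
        // Hypothetical access pattern: a repeated scan over three distinct blocks.
        long[] scan = {1L, 2L, 3L, 1L, 2L, 3L};
        for (int capacity : new int[] {0, 1, 3}) {
            System.out.println("size=" + capacity + " hits=" + hitsFor(capacity, scan));
        }
    }
}
```

In this model the two small sizes only diverge when the same block is read back to back, which is why the difference should not matter for a scan-style access pattern.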
> Provide support for different mmap cache retention policies in
> ShortCircuitCache.
> ---------------------------------------------------------------------------------
>
> Key: HDFS-5957
> URL: https://issues.apache.org/jira/browse/HDFS-5957
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Affects Versions: 2.3.0
> Reporter: Chris Nauroth
>
> Currently, the {{ShortCircuitCache}} retains {{mmap}} regions for reuse by
> multiple reads of the same block or by multiple threads. The eventual
> {{munmap}} executes on a background thread after an expiration period. Some
> client usage patterns would prefer strict bounds on this cache and
> deterministic cleanup by calling {{munmap}}. This issue proposes additional
> support for different caching policies that better fit these usage patterns.