[ https://issues.apache.org/jira/browse/HDFS-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907681#comment-13907681 ]

Chris Nauroth commented on HDFS-5957:
-------------------------------------

bq. mmap regions don't consume physical memory. They do consume virtual memory.

YARN has checks on both physical and virtual memory.  I reviewed the logs from 
the application, and it is in fact the physical memory threshold that was 
exceeded.  YARN calculates physical memory usage by reading the RSS field from 
/proc/<pid>/stat and multiplying by the page size.  The process was well within 
the virtual memory threshold, so virtual address space was not the problem.

{code}
containerID=container_1392067467498_0193_01_000282] is running beyond physical 
memory limits. Current usage: 4.5 GB of 4 GB physical memory used; 9.4 GB of 40 
GB virtual memory used. Killing container.

Dump of the process-tree for container_1392067467498_0193_01_000282 :

        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE

        |- 27095 27015 27015 27015 (java) 8640 1190 9959014400 1189585 
/grid/0/jdk/bin/java -Djava.net.preferIPv4Stack=true 
-Dhadoop.metrics.log.level=WARN -server -Xmx3584m 
-Djava.net.preferIPv4Stack=true -XX:+UseNUMA -XX:+UseParallelGC 
-Dlog4j.configuration=tez-container-log4j.properties 
-Dyarn.app.container.log.dir=/grid/4/cluster/yarn/logs/application_1392067467498_0193/container_1392067467498_0193_01_000282
 -Dtez.root.logger=INFO,CLA 
-Djava.io.tmpdir=/grid/4/cluster/yarn/local/usercache/gopal/appcache/application_1392067467498_0193/container_1392067467498_0193_01_000282/tmp
 org.apache.hadoop.mapred.YarnTezDagChild 172.19.0.45 38627 
container_1392067467498_0193_01_000282 application_1392067467498_0193 1 
{code}
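
For reference, the RSS calculation described above can be sketched roughly as 
follows.  This is illustrative code, not the actual YARN monitor: the field 
positions assume the standard Linux /proc/<pid>/stat layout, and the hard-coded 
4096-byte page size stands in for querying the OS.

{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ProcStatRss {
  // Illustrative only: a real monitor would query the actual page size.
  private static final long PAGE_SIZE_BYTES = 4096L;

  public static long rssBytes(int pid) throws IOException {
    String stat = new String(
        Files.readAllBytes(Paths.get("/proc/" + pid + "/stat")),
        StandardCharsets.UTF_8);
    // The comm field (field 2) is wrapped in parentheses and may contain
    // spaces, so only split the text after the closing parenthesis.
    String afterComm = stat.substring(stat.lastIndexOf(')') + 2);
    String[] fields = afterComm.split("\\s+");
    // afterComm starts at field 3 (state), so rss (field 24, reported in
    // pages) lands at index 21.
    long rssPages = Long.parseLong(fields[21]);
    return rssPages * PAGE_SIZE_BYTES;
  }

  public static void main(String[] args) throws IOException {
    System.out.println(rssBytes(Integer.parseInt(args[0])));
  }
}
{code}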

bq. I don't think YARN should limit the consumption of virtual memory. virtual 
memory imposes almost no cost on the system and limiting it leads to problems 
like this one.

I don't know the full history behind the virtual memory threshold.  I've always 
assumed that it was in place to guard against virtual address space exhaustion 
and possible intervention by the OOM killer.  So far, the virtual memory 
threshold doesn't appear to be a factor in this case.

bq. It should be possible to limit the consumption of actual memory (not 
virtual address space) and solve this problem that way. What do you think?

Yes, I agree that the issue here is physical memory, based on the logs.  What we 
know at this point is that the mmap regions retained for short-circuit zero-copy 
reads were counted against the process's RSS, eventually triggering YARN's 
physical memory check.  Then, down-tuning {{dfs.client.mmap.cache.timeout.ms}} 
made the problem go away.  I think we can come up with a minimal repro that 
demonstrates it.  Gopal might even already have one.
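
For anyone reproducing this, the down-tuning is just a client-side configuration 
change.  A minimal sketch (the 1000 ms value below is only an example, not a 
recommended setting):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class MmapCacheTimeoutTuning {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Expire cached mmap regions quickly so munmap runs sooner and the
    // mapped pages stop accumulating against the container's RSS.
    conf.setLong("dfs.client.mmap.cache.timeout.ms", 1000L);
    FileSystem fs = FileSystem.get(conf);  // client picks up the setting
    System.out.println("Connected to " + fs.getUri());
  }
}
{code}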

bq. In our tests, mmap provided no performance advantage unless it was reused. 
If Gopal needs to purge mmaps immediately after using them, the correct thing 
is simply not to use zero-copy reads.

Yes, something doesn't quite jibe here.  [~gopalv], can you comment on whether 
or not you're seeing a performance benefit from zero-copy reads after 
down-tuning {{dfs.client.mmap.cache.timeout.ms}} as I advised?  If so, then 
did I miss something in the description of your application's access pattern?
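
For context, the zero-copy read pattern under discussion looks roughly like this 
on the client side.  This is only an illustration: the 4 MB request size and the 
command-line path are arbitrary, and error handling is omitted.

{code}
import java.nio.ByteBuffer;
import java.util.EnumSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.ReadOption;
import org.apache.hadoop.io.ElasticByteBufferPool;

public class ZeroCopyReadExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    ElasticByteBufferPool pool = new ElasticByteBufferPool();
    try (FSDataInputStream in = fs.open(new Path(args[0]))) {
      ByteBuffer buf;
      // Each read may hand back an mmap-backed buffer.  releaseBuffer gives
      // the buffer back to the client library; the underlying mmap stays in
      // the ShortCircuitCache until it is evicted and munmap'ed later.
      while ((buf = in.read(pool, 4 * 1024 * 1024,
          EnumSet.of(ReadOption.SKIP_CHECKSUMS))) != null) {
        // ... process buf ...
        in.releaseBuffer(buf);
      }
    }
  }
}
{code}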

> Provide support for different mmap cache retention policies in 
> ShortCircuitCache.
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-5957
>                 URL: https://issues.apache.org/jira/browse/HDFS-5957
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 2.3.0
>            Reporter: Chris Nauroth
>
> Currently, the {{ShortCircuitCache}} retains {{mmap}} regions for reuse by 
> multiple reads of the same block or by multiple threads.  The eventual 
> {{munmap}} executes on a background thread after an expiration period.  Some 
> client usage patterns would prefer strict bounds on this cache and 
> deterministic cleanup by calling {{munmap}}.  This issue proposes additional 
> support for different caching policies that better fit these usage patterns.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
