[ 
https://issues.apache.org/jira/browse/IMPALA-7239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16530610#comment-16530610
 ] 

Tim Armstrong commented on IMPALA-7239:
---------------------------------------

Here's an example from a system that's running a heavy workload of concurrent 
queries, which results in 20k+ VM maps in the impalad process.

On a RHEL7 system with kernel 3.10.0-327.36.3.el7.x86_64:
{noformat}
$ time sudo bash -c 'cat /proc/$(pgrep impalad)/smaps | grep 'Size:' | wc -l'
55752

real    0m13.216s
user    0m0.082s
sys     0m12.415s
{noformat}

It takes 12s of system time just to walk the maps, which means that with the 
default memory maintenance time that thread will be running non-stop iterating 
over smaps, burning CPU and potentially interfering with other memory operations.
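
To show why the cost scales with the number of maps, here is a minimal Python sketch of an smaps walk. It is not Impala's ParseSmaps() (which lives in the C++ backend); the function name and regexes are hypothetical and only illustrate that every mapping contributes a header line plus a block of per-mapping fields that all have to be read and parsed.

```python
import re
from collections import defaultdict

# Header lines look like "7f2c4a000000-7f2c4a021000 rw-p 00000000 00:00 0 [heap]".
_HEADER_RE = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s")
# Field lines look like "Rss:                  4 kB".
_FIELD_RE = re.compile(r"^(\w+):\s+(\d+) kB$")


def summarize_smaps(text):
    """Return (number_of_mappings, {field_name: total_kB}) for smaps-format text.

    Hypothetical helper for illustration only.
    """
    num_mappings = 0
    totals = defaultdict(int)
    for line in text.splitlines():
        if _HEADER_RE.match(line):
            # One header line per mapping.
            num_mappings += 1
            continue
        m = _FIELD_RE.match(line)
        if m:
            # Sum each kB-valued field across all mappings.
            totals[m.group(1)] += int(m.group(2))
    return num_mappings, dict(totals)


if __name__ == "__main__":
    # Linux only: summarize the current process.
    with open("/proc/self/smaps") as f:
        count, totals = summarize_smaps(f.read())
    print(count, totals.get("Rss"))
```

Counting header lines gives one count per mapping, whereas a pattern such as 'Size:' can also match other per-mapping fields like KernelPageSize: and MMUPageSize: on kernels whose smaps output includes them.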

> Mitigate ParseSmaps() overhead
> ------------------------------
>
>                 Key: IMPALA-7239
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7239
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Critical
>              Labels: perf, resource-management
>
> I've heard anecdotes of high system time spent in functions related to the 
> smaps parsing. It appears that this can be expensive on systems once the 
> impalad virtual memory gets fragmented and there are tens of thousands of 
> maps.
> We can try to mitigate by reducing frequency of the parsing or disabling it 
> entirely. I'm not sure if there are cheaper ways to get all of the same 
> metrics.
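
One way to picture the "reduce frequency or disable" mitigation from the description above is a refresh that is gated by a minimum interval. This is only a sketch under stated assumptions: the class name, the min_interval_s knob and the "0 disables" convention are all hypothetical and are not Impala flags or code.

```python
import time


class ThrottledRefresher:
    """Run an expensive refresh at most once per min_interval_s.

    Hypothetical illustration: min_interval_s <= 0 disables the refresh.
    """

    def __init__(self, refresh_fn, min_interval_s, clock=time.monotonic):
        self._refresh_fn = refresh_fn
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._last_run = None
        self._cached = None

    def get(self):
        if self._min_interval_s <= 0:
            return None  # Refresh disabled entirely.
        now = self._clock()
        if self._last_run is None or now - self._last_run >= self._min_interval_s:
            self._cached = self._refresh_fn()
            # Stamp the time after the refresh finishes, so a slow refresh
            # (e.g. a 12s smaps walk) does not immediately trigger another.
            self._last_run = self._clock()
        return self._cached
```

A caller would wrap the expensive parse, for example ThrottledRefresher(parse_fn, 60.0), and call get() from the maintenance loop; between refreshes it returns the cached result instead of walking smaps again.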



