[
https://issues.apache.org/jira/browse/KUDU-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yingchun Lai updated KUDU-2836:
-------------------------------
Description:
On one of my tservers, memory usage is about 95%. The "top" output looks like:
{code:java}
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8359 work 20 0 0.326t 0.116t 81780 S 727.9 94.6 230228:10 kudu_tablet_ser
{code}
That is, the kudu_tablet_server process uses about 116G of memory.
On the mem-trackers page, the "Total consumption" value is about 65G, much
lower than 116G.
I then logged in to the server to check whether the MM operations that free
memory were working correctly. Unfortunately, the memory pressure detection
function (process_memory::UnderMemoryPressure) doesn't report that the process
is under pressure, because the tcmalloc function GetNumericProperty(const char*
property, size_t* value), called with the property
"generic.current_allocated_bytes", doesn't return the memory use reported by
the OS.
[https://gperftools.github.io/gperftools/tcmalloc.html]
{quote}
|{{generic.current_allocated_bytes}}|Number of bytes used by the application.
This will not typically match the memory use reported by the OS, because it
does not include TCMalloc overhead or memory fragmentation.|
{quote}
As a result, ops that would free memory may not be scheduled promptly, the OS
may run out of memory, and the tserver may then be killed by the OOM killer.
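To make the gap concrete, here is a minimal sketch of the comparison described
above. It is not Kudu's actual UnderMemoryPressure implementation: the helper
names, the 100 GiB limit, and the 80% threshold are hypothetical values chosen
for illustration, and the 65G mem-trackers figure is used as a stand-in for
what the allocator reports. It only shows that the same check gives different
answers depending on whether it is fed the allocator's number or the RSS that
"top" reports.

```cpp
#include <cassert>
#include <cstdint>

// All names and thresholds below are hypothetical, for illustration only.
constexpr int64_t kGiB = int64_t{1} << 30;

// Returns true when used_bytes exceeds pressure_pct percent of limit_bytes.
bool IsUnderPressure(int64_t used_bytes, int64_t limit_bytes, int pressure_pct) {
  assert(limit_bytes > 0);
  assert(pressure_pct > 0 && pressure_pct <= 100);
  // Compare in floating point so large byte counts cannot overflow.
  return static_cast<double>(used_bytes) * 100.0 >
         static_cast<double>(limit_bytes) * pressure_pct;
}

// Memory the OS attributes to the process that the allocator's
// "bytes used by the application" figure does not account for
// (allocator overhead, fragmentation, and so on).
int64_t UnaccountedBytes(int64_t rss_bytes, int64_t allocated_bytes) {
  return rss_bytes > allocated_bytes ? rss_bytes - allocated_bytes : 0;
}
```

With the figures from this report, IsUnderPressure(65 * kGiB, 100 * kGiB, 80)
is false while IsUnderPressure(116 * kGiB, 100 * kGiB, 80) is true, and
UnaccountedBytes(116 * kGiB, 65 * kGiB) is 51 GiB that the pressure check
never sees.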
was:
On one of my tservers, memory usage is about 95%. The "top" output looks like:
{code:java}
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8359 work 20 0 0.326t 0.116t 81780 S 727.9 94.6 230228:10 kudu_tablet_ser
{code}
That is, the kudu_tablet_server process uses about 116G of memory. I logged in
to the server to check whether the MM operations that free memory were working
correctly. Unfortunately, the memory pressure detection function
(process_memory::UnderMemoryPressure) doesn't report that the process is under
pressure, because the tcmalloc function GetNumericProperty(const char*
property, size_t* value), called with the property
"generic.current_allocated_bytes", doesn't return the memory use reported by
the OS.
https://gperftools.github.io/gperftools/tcmalloc.html
{quote}
|{{generic.current_allocated_bytes}}|Number of bytes used by the application.
This will not typically match the memory use reported by the OS, because it
does not include TCMalloc overhead or memory fragmentation.|
{quote}
> Wrong memory used detection
> ---------------------------
>
> Key: KUDU-2836
> URL: https://issues.apache.org/jira/browse/KUDU-2836
> Project: Kudu
> Issue Type: Improvement
> Components: tserver
> Reporter: Yingchun Lai
> Assignee: Yingchun Lai
> Priority: Critical