Cyl created ZOOKEEPER-5009:
------------------------------
Summary: Memory Leak in zoo_sasl_client_create
Key: ZOOKEEPER-5009
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-5009
Project: ZooKeeper
Issue Type: Bug
Components: c client
Affects Versions: 3.9.4
Reporter: Cyl
Attachments: fake_strdup_trigger.c, repro_sasl_leak_loop.c,
verify_sasl_leak.py
*Description*: In
{{zookeeper-client/zookeeper-client-c/src/zk_sasl.c}}, the function
{{zoo_sasl_client_create}} allocates memory for a {{zoo_sasl_client_t}}
structure ({{sc}}). It then attempts to duplicate several strings (service,
host, mechlist) using {{_zsasl_strdup}}.
If any of these string duplications fails (e.g., because the process is out of
memory), the function calls {{zoo_sasl_client_destroy(sc)}} and returns {{NULL}}.
However, {{zoo_sasl_client_destroy}} frees only the _members_ of the structure,
not the structure ({{sc}}) itself. As a result, the {{zoo_sasl_client_t}}
allocation is leaked every time initialization fails.
{code:java}
// Vulnerable pattern
zoo_sasl_client_t *sc = calloc(1, sizeof(*sc));
// ...
if (rc != ZOK) {
    zoo_sasl_client_destroy(sc);
    return NULL; // Leak: sc is never freed
}
{code}
*Impact* Memory allocation failures can be transient (e.g., during temporary
spikes in load). Robust applications, such as long-running ZooKeeper clients,
are designed to retry connections and initialization. If every failed attempt
leaks memory, a transient problem becomes a permanent degradation that can
eventually exhaust memory and crash the application.
*Reproduction* A proof of concept (PoC) was created to simulate an allocation
failure during the {{strdup}} calls.
# *Hook Library ({{fake_strdup_trigger.c}})*: A shared library loaded
via {{LD_PRELOAD}} that intercepts {{strdup}}. It returns {{NULL}} when a
specific trigger string ("TRIGGER_OOM") is passed, simulating an OOM condition.
# *Loop Trigger ({{repro_sasl_leak_loop.c}})*: A C program that
repeatedly calls {{zookeeper_init_sasl}} with the trigger string as the host.
This causes {{zoo_sasl_client_create}} to fail at the {{strdup(host)}} step.
# *Verification Script ({{verify_sasl_leak.py}})*: Compiles the tools,
runs the loop, and monitors the process RSS memory usage.
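For readers without access to the attachments, the hook in step 1 could look roughly like the sketch below. This is not the attached {{fake_strdup_trigger.c}}; it is a minimal illustration that assumes glibc on Linux, an exact match on the trigger string, and {{errno}} set to {{ENOMEM}} on the simulated failure. The helper name {{real_strdup}} is hypothetical.

```c
#define _GNU_SOURCE            /* needed for RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#undef strdup                  /* some older libcs define strdup as a macro */

/* Trigger string named in the report; exact matching is an assumption. */
static const char *const trigger = "TRIGGER_OOM";

/* Replacement strdup: fails for the trigger string and otherwise
 * forwards to the next strdup in the lookup order (normally libc's). */
char *strdup(const char *s)
{
    static char *(*real_strdup)(const char *) = NULL;

    if (real_strdup == NULL) {
        /* Resolve the original symbol lazily, on first use. */
        real_strdup = (char *(*)(const char *))dlsym(RTLD_NEXT, "strdup");
        if (real_strdup == NULL) {
            errno = ENOMEM;
            return NULL;
        }
    }

    if (s != NULL && strcmp(s, trigger) == 0) {
        errno = ENOMEM;        /* simulate an out-of-memory failure */
        return NULL;
    }

    return real_strdup(s);
}
```

A sketch like this would typically be built with something like {{gcc -shared -fPIC hook.c -o libhook.so -ldl}} and activated by setting {{LD_PRELOAD=./libhook.so}} before starting the loop program; the file and library names here are placeholders.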
*Fix* Add {{free(sc)}} to the error handling path.
{code:java}
if (rc != ZOK) {
    zoo_sasl_client_destroy(sc);
    free(sc); // Fix
    return NULL;
}
{code}
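The effect of the extra {{free(sc)}} can be checked in isolation with a small model. The sketch below does not use the ZooKeeper sources: {{model_client_t}}, {{model_client_create}}, {{model_client_destroy}}, the counting allocator, and the {{fail_host}} / {{apply_fix}} flags are all hypothetical stand-ins that only mirror the behavior described in this report (destroy releases the members but not the struct).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for zoo_sasl_client_t (the real struct differs). */
typedef struct {
    char *service;
    char *host;
    char *mechlist;
} model_client_t;

/* Number of allocations that have not been freed yet. */
static int live_allocs = 0;

static void *counted_calloc(size_t n, size_t size)
{
    void *p = calloc(n, size);
    if (p != NULL)
        live_allocs++;
    return p;
}

static void counted_free(void *p)
{
    if (p != NULL) {
        live_allocs--;
        free(p);
    }
}

/* Duplicates s, or returns NULL when fail is nonzero (simulated OOM). */
static char *counted_strdup(const char *s, int fail)
{
    size_t len;
    char *copy;

    if (fail)
        return NULL;
    len = strlen(s) + 1;
    copy = counted_calloc(1, len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Mirrors the reported behavior: frees the members, not the struct. */
static void model_client_destroy(model_client_t *sc)
{
    if (sc == NULL)
        return;
    counted_free(sc->service);
    counted_free(sc->host);
    counted_free(sc->mechlist);
    sc->service = sc->host = sc->mechlist = NULL;
}

/* fail_host simulates a failed strdup(host); apply_fix adds the free(sc). */
static model_client_t *model_client_create(const char *service,
                                           const char *host,
                                           const char *mechlist,
                                           int fail_host, int apply_fix)
{
    model_client_t *sc = counted_calloc(1, sizeof(*sc));
    if (sc == NULL)
        return NULL;

    sc->service = counted_strdup(service, 0);
    sc->host = counted_strdup(host, fail_host);
    sc->mechlist = counted_strdup(mechlist, 0);

    if (sc->service == NULL || sc->host == NULL || sc->mechlist == NULL) {
        model_client_destroy(sc);
        if (apply_fix)
            counted_free(sc);  /* the proposed fix */
        return NULL;
    }
    return sc;
}
```

In this model, a failed create with {{apply_fix}} set to 0 leaves {{live_allocs}} at 1 (the struct itself), while the same call with {{apply_fix}} set to 1 brings it back to 0.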