Cyl created ZOOKEEPER-5009:
------------------------------

             Summary: Memory Leak in zoo_sasl_client_create
                 Key: ZOOKEEPER-5009
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-5009
             Project: ZooKeeper
          Issue Type: Bug
          Components: c client
    Affects Versions: 3.9.4
            Reporter: Cyl
         Attachments: fake_strdup_trigger.c, repro_sasl_leak_loop.c, 
verify_sasl_leak.py

*Description*: In 
{{zookeeper-client/zookeeper-client-c/src/zk_sasl.c}}, the function 
{{zoo_sasl_client_create}} allocates memory for a {{zoo_sasl_client_t}} 
structure ({{sc}}). It then attempts to duplicate several strings (service, 
host, mechlist) using {{_zsasl_strdup}}.

If any of these string duplications fails (e.g., due to Out Of Memory), the 
function calls {{zoo_sasl_client_destroy(sc)}} and returns {{NULL}}.

However, {{zoo_sasl_client_destroy}} only frees the _members_ of the structure, 
not the structure pointer ({{sc}}) itself. As a result, every failed 
initialization leaks one {{zoo_sasl_client_t}} struct, i.e. 
{{sizeof(zoo_sasl_client_t)}} bytes.

 
{code:c}
// Vulnerable pattern
zoo_sasl_client_t *sc = calloc(1, sizeof(*sc));
// ...
if (rc != ZOK) {
    zoo_sasl_client_destroy(sc);
    return NULL; // Leak: sc is never freed
}
{code}
 

*Impact* Memory allocation failures can be transient (e.g., temporary spikes in 
load). Robust applications (like ZooKeeper clients) are designed to retry 
connections and initializations. If every failed attempt leaks memory, a 
temporary issue becomes a permanent degradation: the leaked structs accumulate 
across retries until the application itself runs out of memory and crashes.

*Reproduction* A Proof-of-Concept (PoC) was created to simulate an allocation 
failure during the {{strdup}} calls.
 # *Hook Library ({{fake_strdup_trigger.c}})*: A shared library loaded 
via {{LD_PRELOAD}} that intercepts {{strdup}}. It returns {{NULL}} when a 
specific trigger string ({{"TRIGGER_OOM"}}) is passed, simulating an OOM condition.
 # *Loop Trigger ({{repro_sasl_leak_loop.c}})*: A C program that 
repeatedly calls {{zookeeper_init_sasl}} with the trigger string as the host. 
This causes {{zoo_sasl_client_create}} to fail at the {{strdup(host)}} step.
 # *Verification Script ({{verify_sasl_leak.py}})*: Compiles the tools, 
runs the loop, and monitors the process RSS memory usage.
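
For readers without the attachments, the hook from step 1 can be approximated as below. This is a minimal sketch and not the attached {{fake_strdup_trigger.c}}: it assumes the platform lets a preloaded library interpose {{strdup}}, and, as a simplification, it copies the string itself with {{malloc}} instead of forwarding to the libc implementation. The {{TRIGGER}} macro name is made up for this example.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Trigger value taken from the report; TRIGGER is a hypothetical macro name. */
#define TRIGGER "TRIGGER_OOM"

/*
 * Replacement for strdup(). When this object is preloaded, the dynamic
 * linker is expected to resolve strdup() calls to this definition first.
 */
char *strdup(const char *s)
{
    size_t len;
    char *copy;

    if (strcmp(s, TRIGGER) == 0) {
        errno = ENOMEM;   /* simulate an out-of-memory failure */
        return NULL;
    }

    /* Assumption: copy locally rather than calling the real strdup(). */
    len = strlen(s) + 1;
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}
```

Built as a shared object (for example with {{gcc -shared -fPIC}}) and loaded through {{LD_PRELOAD}}, a hook of this shape should make any {{strdup}} of the trigger string fail while leaving other strings unaffected.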

 

*Fix* Add {{free(sc)}} to the error-handling path, after the call to 
{{zoo_sasl_client_destroy(sc)}}.
{code:c}
if (rc != ZOK) {
    zoo_sasl_client_destroy(sc);
    free(sc); // Fix: release the struct itself
    return NULL;
}
{code}
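
The effect of the fix can be checked in isolation with a small model of the pattern. The sketch below is not the code from {{zk_sasl.c}}: every {{demo_*}} name, the struct layout, and the {{live_allocations}} counter are hypothetical stand-ins, and the failure is injected with the same {{"TRIGGER_OOM"}} convention as the PoC. It only illustrates that a destroy function which frees members needs a separate {{free}} of the struct on the error path.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter of allocations that have not been freed yet. */
long live_allocations = 0;

void *demo_calloc(size_t n, size_t size)
{
    void *p = calloc(n, size);
    if (p != NULL)
        live_allocations++;
    return p;
}

void demo_free(void *p)
{
    if (p != NULL) {
        live_allocations--;
        free(p);
    }
}

/* Fails on the trigger string, mirroring the PoC's simulated OOM. */
char *demo_strdup(const char *s)
{
    size_t len;
    char *copy;

    if (strcmp(s, "TRIGGER_OOM") == 0)
        return NULL;
    len = strlen(s) + 1;
    copy = demo_calloc(1, len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Stand-in for zoo_sasl_client_t; the real struct has other fields. */
typedef struct {
    char *service;
    char *host;
    char *mechlist;
} demo_client_t;

/* Frees the members only, like the destroy function described above. */
void demo_client_destroy(demo_client_t *sc)
{
    if (sc == NULL)
        return;
    demo_free(sc->service);
    demo_free(sc->host);
    demo_free(sc->mechlist);
    sc->service = sc->host = sc->mechlist = NULL;
}

demo_client_t *demo_client_create(const char *service, const char *host,
                                  const char *mechlist)
{
    demo_client_t *sc = demo_calloc(1, sizeof(*sc));
    if (sc == NULL)
        return NULL;

    sc->service = demo_strdup(service);
    sc->host = demo_strdup(host);
    sc->mechlist = demo_strdup(mechlist);

    if (sc->service == NULL || sc->host == NULL || sc->mechlist == NULL) {
        demo_client_destroy(sc);  /* releases the duplicated strings */
        demo_free(sc);            /* the fix: release the struct too */
        return NULL;
    }
    return sc;
}
```

In this model, calling {{demo_client_create}} with {{"TRIGGER_OOM"}} as the host leaves {{live_allocations}} at zero; removing the {{demo_free(sc)}} line leaves it at one after each failed call, which corresponds to the per-attempt leak described in this report.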
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
