Issue 181761
Summary Clang: false positive CFI sanitizer diagnostic caused by a malloc+memset->calloc optimization
Labels clang, false-positive
Assignees
Reporter orlowd
I have a code example that triggers what seems to be a false positive CFI diagnostic for an indirect function call when linking statically using Clang 21.1.8. It also reproduces with Clang 18.1.3.

One file (`a.cpp`) defines a simple function, `alloc_zeroed_mem()`, which returns a pointer to memory allocated with `malloc()` and zeroed with `memset()`. Another file (`b.cpp`) initializes a global variable, `My_calloc`, with the address of `calloc()`. The main file (`main.cpp`) uses corresponding declarations.

`a.cpp`:
```cpp
#include <cstdlib>
#include <cstring>

void* alloc_zeroed_mem(unsigned long count)
{
    void* res = std::malloc(count);
    if (res)
    {
        std::memset(res, 0, count);
    }
    return res;
}
```

`b.cpp`:
```cpp
#include <cstdlib>

using my_calloc_callback = void*(*)(size_t nmemb, size_t size);
extern my_calloc_callback My_calloc = (my_calloc_callback)calloc;
```

`main.cpp`:
```cpp
#include <cstdlib>

void* alloc_zeroed_mem(unsigned long count);
using my_calloc_callback = void*(*)(size_t nmemb, size_t size);
extern my_calloc_callback My_calloc;

int main()
{
    My_calloc(1ul, 1ul);
    alloc_zeroed_mem(1ul);
}
```

Compiling and linking these files using this command and running the resulting binary produces a CFI diagnostic.
```shell
clang -std=c++17 -flto -O2 -fsanitize=cfi -fvisibility=hidden -fno-sanitize-trap=all -fsanitize-recover=all a.cpp b.cpp main.cpp && ./a.out
```
```plaintext
main.cpp:9:5: runtime error: control flow integrity check for type 'void *(unsigned long, unsigned long)' failed during indirect function call
malloc/malloc.c:3699:1: note: calloc defined here
main.cpp:9:5: note: check failed in /home/user/temp/clang_cfi_bug/a.out, destination function located in /lib/x86_64-linux-gnu/libc.so.6
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior main.cpp:9:5
```
Switching the order of `a.cpp` and `b.cpp` in the compiler invocation resolves the issue and eliminates the diagnostic.

The example is also available on Compiler Explorer: https://godbolt.org/z/q1GYWzMGf. Similarly, swapping `a.cpp` and `b.cpp` in `add_executable()` within `CMakeLists.txt` fixes the issue.

The issue appears to be caused by an optimization that replaces a `malloc()` and `memset()` sequence with a single `calloc()` call. In the generated LLVM IR, the resulting `calloc` declaration lacks the associated type metadata. Because the address of `calloc` is taken in another file and used for an indirect call via a function pointer, the CFI sanitizer inserts a runtime check that relies on this type information. When the two files are linked, however, the first `calloc` declaration encountered seems to be the one that is used, so depending on the link order the CFI check can end up invalid. In the problematic scenario, with everything inlined and heavily optimized, the runtime check is replaced by an unconditional call to `__ubsan_handle_cfi_check_fail()`, which should not be called in this case.
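If the analysis above is right, the check fails only because the indirect call targets a libc symbol whose declaration lost its type metadata. A possible way to sidestep that, sketched here as an untested assumption rather than a confirmed fix, is to point the function pointer at a small wrapper defined inside the program. The name `calloc_wrapper` is hypothetical and not part of the original example.

```cpp
#include <cstddef>
#include <cstdlib>

using my_calloc_callback = void* (*)(std::size_t nmemb, std::size_t size);

// Hypothetical wrapper (not in the original reproducer). It is defined in the
// program itself, so the indirect call no longer targets libc's calloc.
static void* calloc_wrapper(std::size_t nmemb, std::size_t size)
{
    // Direct call to calloc: only indirect calls get the CFI icall check.
    return std::calloc(nmemb, size);
}

// Same role as My_calloc in b.cpp, but initialized with the wrapper.
my_calloc_callback My_calloc = calloc_wrapper;
```

Whether this actually removes the diagnostic with the affected Clang versions has not been verified here; it only illustrates the idea of keeping the indirect call target inside the LTO unit.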

Looking at the generated IR for the example code, the `calloc` declaration for `b.cpp` (which references `calloc()` directly by taking its address) appears as follows.
```llvm
; Function Attrs: mustprogress nofree nounwind willreturn allockind("alloc,zeroed") allocsize(0,1) memory(inaccessiblemem: readwrite)
declare !type !8 !type !9 noalias noundef ptr @calloc(i64 noundef, i64 noundef) #0
...
!8 = !{i64 0, !"typeinfo name for void* (unsigned long, unsigned long)"}
!9 = !{i64 0, !"typeinfo name for void* (unsigned long, unsigned long) [clone .generalized]"}
```

And for `a.cpp` (which uses `malloc()` and `memset()`), it looks like this.
```llvm
; Function Attrs: nofree nounwind willreturn allockind("alloc,zeroed") allocsize(0,1) memory(inaccessiblemem: readwrite)
declare noalias noundef ptr @calloc(i64 noundef, i64 noundef) local_unnamed_addr #1
```

The problem was initially seen while compiling code with CFI checks enabled in a project that uses the `curl` library. `curl` exhibits this exact problematic pattern. It stores the address of `calloc()` in a function pointer (optionally allowing users to override it with a custom allocation function) and later invokes it indirectly from other functions (e.g. [this is the code initializing the function pointer](https://github.com/curl/curl/blob/970e59a82fab2ae16acbfe763aebf3e1d875fbb9/lib/easy.c#L110) and [this is its usage](https://github.com/curl/curl/blob/970e59a82fab2ae16acbfe763aebf3e1d875fbb9/lib/multi.c#L235) through [the macro](https://github.com/curl/curl/blob/970e59a82fab2ae16acbfe763aebf3e1d875fbb9/lib/curl_setup.h#L1474)).
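The pattern described above can be sketched roughly as follows. This is a simplified illustration, not curl's actual code: the names `calloc_callback`, `g_calloc`, `set_calloc_callback`, and `alloc_zeroed` are hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical names throughout; curl's real identifiers differ.
using calloc_callback = void* (*)(std::size_t nmemb, std::size_t size);

// Defaults to libc's calloc, so by default every indirect call through this
// pointer targets a function defined outside the program.
static calloc_callback g_calloc = std::calloc;

// Lets the user install a custom allocator; nullptr restores the default.
void set_calloc_callback(calloc_callback cb)
{
    if (cb)
    {
        g_calloc = cb;
    }
    else
    {
        g_calloc = std::calloc;
    }
}

// Indirect call site: this is the kind of call that the CFI indirect-call
// check instruments when building with -fsanitize=cfi.
void* alloc_zeroed(std::size_t nmemb, std::size_t size)
{
    return g_calloc(nmemb, size);
}
```

If some other translation unit in the same program contains a `malloc()` plus `memset()` sequence, it would presumably be exposed to the same link-order dependence as the reduced example.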