llvmorg-github-actions[bot] wrote:
@llvm/pr-subscribers-clang
Author: AutoJanitor (Scottcjn)
<details>
<summary>Changes</summary>
## Problem
`CXUnsavedFile::Length` in `clang-c/Index.h` is declared `unsigned long`:
```c
struct CXUnsavedFile {
const char *Filename;
const char *Contents;
unsigned long Length;
};
```
`unsigned long` is 4 bytes under LLP64 (Windows) and 8 bytes under LP64, so the
width of `Length`, and with it the struct layout, differs across data models.
FFI consumers that bind this C API with a fixed-size assumption read a
corrupted length or crash; see the downstream OpenJDK bug
[CODETOOLS-7904079](https://bugs.openjdk.org/browse/CODETOOLS-7904079).
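To make the failure mode concrete, here is a minimal sketch of how an 8-byte read can pick up garbage next to a 4-byte `Length`. The two view structs and `misread_length` are hypothetical names invented for this illustration; they model the layouts with fixed-width integers so the behavior is the same on any host, and they are not claimed to reproduce the exact downstream failure.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the old struct as a 64-bit LLP64 build lays it out:
 * two 8-byte pointers, a 4-byte Length, then 4 bytes of tail padding. */
struct Llp64View {
  uint64_t Filename;
  uint64_t Contents;
  uint32_t Length;
  uint32_t Padding; /* stands in for the uninitialized tail padding */
};

/* Hypothetical model of what a binding that assumes LP64 expects. */
struct Lp64View {
  uint64_t Filename;
  uint64_t Contents;
  uint64_t Length;
};

/* Reinterpret the LLP64-shaped bytes through the LP64-shaped view and return
 * the length the binding would see. Both views are 24 bytes, so the copy
 * stays in bounds. On a little-endian host the padding bytes land in the
 * upper half of the 64-bit value. */
uint64_t misread_length(uint32_t real_length, uint32_t padding_garbage) {
  struct Llp64View native = {0, 0, real_length, padding_garbage};
  struct Lp64View assumed;
  memcpy(&assumed, &native, sizeof assumed);
  return assumed.Length;
}
```

Whenever the padding bytes are nonzero, the value returned differs from `real_length`, which is the "corrupted length" case described above.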
## Fix
Make `Length` a fixed-width `uint64_t` (adding `<stdint.h>`), so the
field is consistently 8 bytes regardless of data model. On LP64 (the common
64-bit Unix case) `unsigned long` is already 8 bytes, so the layout does not
change there; it changes only where `unsigned long` was not 8 bytes
(Windows and 32-bit targets), which is the point.
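A downstream consumer could pin this guarantee at compile time. The sketch below uses a hypothetical stand-in struct, `UnsavedFileFixed`, rather than the real `CXUnsavedFile`, and the two helper functions are likewise invented for illustration; it assumes a C11 compiler for `static_assert`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring the new field types of CXUnsavedFile. */
struct UnsavedFileFixed {
  const char *Filename;
  const char *Contents;
  uint64_t Length;
};

/* Fails the build if the length field is ever not 8 bytes wide. */
static_assert(sizeof(((struct UnsavedFileFixed *)0)->Length) == 8,
              "Length must be 8 bytes on every data model");

/* Width of the Length field in bytes. */
size_t length_field_size(void) {
  return sizeof(((struct UnsavedFileFixed *)0)->Length);
}

/* Byte offset of Length. This still depends on pointer size, so it differs
 * between 32-bit and 64-bit targets even with the fixed-width field. */
size_t length_field_offset(void) {
  return offsetof(struct UnsavedFileFixed, Length);
}
```

The field width is now fixed, while the offset still follows the pointer size of the target, so bindings need to account for pointer width separately.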
The two internal debug `fprintf`s of `Length` are updated from `%ld` to `%llu`
(with an `unsigned long long` cast) to match the new type. `c-index-test.c`'s
uses are assignments / `malloc` / `fread` sizes and remain correct via implicit
conversion.
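For reference, two portable ways to print a `uint64_t` are sketched below: the cast to `unsigned long long` with `%llu`, which is the approach this patch takes, and the `PRIu64` macro from `<inttypes.h>`. The function names are hypothetical and exist only for this illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Cast-based formatting: unsigned long long is at least 64 bits wide,
 * so the cast never truncates a uint64_t. */
int format_length_cast(char *buf, size_t n, uint64_t length) {
  return snprintf(buf, n, "%llu", (unsigned long long)length);
}

/* Macro-based formatting: PRIu64 expands to the conversion specifier
 * that matches uint64_t on the current platform. */
int format_length_inttypes(char *buf, size_t n, uint64_t length) {
  return snprintf(buf, n, "%" PRIu64, length);
}
```

Both produce the same text; the cast avoids pulling `<inttypes.h>` into the libclang sources.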
As the issue notes, this **intentionally alters libclang's C API layout**;
the change is documented in the release notes.
Closes #160729.
(Picking this up since @nizarbenalla marked it up for grabs.)
---
Full diff: https://github.com/llvm/llvm-project/pull/204716.diff
4 Files Affected:
- (modified) clang/docs/ReleaseNotes.rst (+6)
- (modified) clang/include/clang-c/Index.h (+7-1)
- (modified) clang/tools/libclang/CIndex.cpp (+2-2)
- (modified) clang/tools/libclang/Indexing.cpp (+2-2)
``````````diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ed668ca6f207c..aecdd95f1e8fc 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -531,6 +531,12 @@ clang-format
libclang
--------
+- The ``Length`` field of ``CXUnsavedFile`` (in ``clang-c/Index.h``) is now a
+ fixed-width ``uint64_t`` instead of ``unsigned long``. This makes the struct
+ layout identical across data models; previously the field was 4 bytes under
+ LLP64 (Windows) and 8 bytes under LP64, which could corrupt the length or
+ crash FFI consumers. This is an intentional ABI change to the C API.
+ (#GH160729)
Code Completion
---------------
diff --git a/clang/include/clang-c/Index.h b/clang/include/clang-c/Index.h
index be038d9165fc6..1b04cc0303e48 100644
--- a/clang/include/clang-c/Index.h
+++ b/clang/include/clang-c/Index.h
@@ -16,6 +16,8 @@
#ifndef LLVM_CLANG_C_INDEX_H
#define LLVM_CLANG_C_INDEX_H
+#include <stdint.h>
+
#include "clang-c/BuildSystem.h"
#include "clang-c/CXDiagnostic.h"
#include "clang-c/CXErrorCode.h"
@@ -118,8 +120,12 @@ struct CXUnsavedFile {
/**
* The length of the unsaved contents of this buffer.
+ *
+ * A fixed-width type is used so the struct layout is identical across data
+ * models (e.g. LLP64 vs LP64); `unsigned long` previously made the field 4
+ * bytes on Windows and 8 bytes elsewhere, breaking FFI consumers.
*/
- unsigned long Length;
+ uint64_t Length;
};
/**
diff --git a/clang/tools/libclang/CIndex.cpp b/clang/tools/libclang/CIndex.cpp
index 5aab74348967d..778648ced755f 100644
--- a/clang/tools/libclang/CIndex.cpp
+++ b/clang/tools/libclang/CIndex.cpp
@@ -4445,8 +4445,8 @@ enum CXErrorCode clang_parseTranslationUnit2FullArgv(
for (unsigned i = 0; i != num_unsaved_files; ++i) {
if (i)
fprintf(stderr, ", ");
- fprintf(stderr, "('%s', '...', %ld)", unsaved_files[i].Filename,
- unsaved_files[i].Length);
+ fprintf(stderr, "('%s', '...', %llu)", unsaved_files[i].Filename,
+ (unsigned long long)unsaved_files[i].Length);
}
fprintf(stderr, "],\n");
fprintf(stderr, " 'options' : %d,\n", options);
diff --git a/clang/tools/libclang/Indexing.cpp b/clang/tools/libclang/Indexing.cpp
index c142f142d5071..9d893e4e21cd5 100644
--- a/clang/tools/libclang/Indexing.cpp
+++ b/clang/tools/libclang/Indexing.cpp
@@ -924,8 +924,8 @@ int clang_indexSourceFileFullArgv(
for (unsigned i = 0; i != num_unsaved_files; ++i) {
if (i)
fprintf(stderr, ", ");
- fprintf(stderr, "('%s', '...', %ld)", unsaved_files[i].Filename,
- unsaved_files[i].Length);
+ fprintf(stderr, "('%s', '...', %llu)", unsaved_files[i].Filename,
+ (unsigned long long)unsaved_files[i].Length);
}
fprintf(stderr, "],\n");
fprintf(stderr, " 'options' : %d,\n", TU_options);
``````````
</details>
https://github.com/llvm/llvm-project/pull/204716