https://github.com/Scottcjn created 
https://github.com/llvm/llvm-project/pull/204716

## Problem

`CXUnsavedFile::Length` in `clang-c/Index.h` is declared `unsigned long`:

```c
struct CXUnsavedFile {
  const char *Filename;
  const char *Contents;
  unsigned long Length;
};
```

`unsigned long` is 4 bytes under LLP64 (64-bit Windows) and 8 bytes under LP64, 
so the width of `Length`, and with it the struct layout, differs across data 
models. FFI consumers that bind this C API with a fixed-size assumption read a 
corrupted length or crash; see the downstream OpenJDK bug 
[CODETOOLS-7904079](https://bugs.openjdk.org/browse/CODETOOLS-7904079).
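To make the width difference concrete, here is a minimal sketch that reports the layout of a local mirror of the old declaration. `LegacyUnsavedFile` and the helper names are hypothetical and exist only for this illustration; they are not part of libclang.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical mirror of the pre-patch declaration (illustrative name). */
struct LegacyUnsavedFile {
  const char *Filename;
  const char *Contents;
  unsigned long Length; /* 4 bytes under LLP64, 8 bytes under LP64 */
};

/* Width in bytes of the Length field on the platform compiling this. */
static size_t legacy_length_width(void) {
  return sizeof(((struct LegacyUnsavedFile *)0)->Length);
}

/* Byte offset of Length within the struct. */
static size_t legacy_length_offset(void) {
  return offsetof(struct LegacyUnsavedFile, Length);
}

/* Print the layout so results from different platforms can be compared. */
static void print_legacy_layout(void) {
  printf("Length: offset %zu, width %zu; struct size %zu\n",
         legacy_length_offset(), legacy_length_width(),
         sizeof(struct LegacyUnsavedFile));
}
```

Calling `print_legacy_layout()` from builds for 64-bit Windows and 64-bit Linux should show the same offset but a different width, which is the mismatch a binding with a hard-coded 8-byte field runs into.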

## Fix

Make `Length` a fixed-width `uint64_t` (including `<stdint.h>`), so the field 
is consistently 8 bytes regardless of data model. On LP64 (the common 64-bit 
Unix case) `unsigned long` is already 8 bytes, so there's no layout change 
there; the layout only changes where `unsigned long` wasn't 8 bytes 
(Windows / 32-bit), which is the point.
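A binding author who wants to pin this assumption down could check it at compile time. The sketch below assumes a C11 compiler and uses a local mirror struct, `FixedUnsavedFile`, which is a hypothetical name for illustration rather than anything in `clang-c/Index.h`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the post-patch declaration (illustrative name). */
struct FixedUnsavedFile {
  const char *Filename;
  const char *Contents;
  uint64_t Length; /* 8 bytes under every data model */
};

/* Fail the build if the field is not the 8 bytes a binding expects. */
_Static_assert(sizeof(((struct FixedUnsavedFile *)0)->Length) == 8,
               "Length must be exactly 8 bytes");

/* Length is expected to sit directly after the two pointers. */
_Static_assert(offsetof(struct FixedUnsavedFile, Length) ==
                   2 * sizeof(const char *),
               "Length must follow the two pointer fields");

/* Runtime accessors for the same facts, handy in a test harness. */
static size_t fixed_length_width(void) {
  return sizeof(((struct FixedUnsavedFile *)0)->Length);
}

static size_t fixed_struct_size(void) {
  return sizeof(struct FixedUnsavedFile);
}
```

The offset check assumes the usual case where two pointers already end on a boundary suitable for `uint64_t`; an unusual ABI could need a different expression.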

The two internal debug `fprintf`s of `Length` are updated from `%ld` to `%llu` 
(with an `unsigned long long` cast) to match the new type. `c-index-test.c`'s 
uses are assignments / `malloc` / `fread` sizes and remain correct via implicit 
conversion.
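The formatting change and the implicit conversion can be sketched outside libclang as follows. `FixedUnsavedFile`, `make_unsaved`, and the two `format_*` helpers are hypothetical names for this example. The `PRIu64` variant is a standard C alternative shown for comparison; the patch itself uses the `%llu` cast.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of the post-patch declaration (illustrative name). */
struct FixedUnsavedFile {
  const char *Filename;
  const char *Contents;
  uint64_t Length;
};

/* Assigning a size_t to Length converts implicitly; no cast is needed. */
static struct FixedUnsavedFile make_unsaved(const char *name,
                                            const char *contents) {
  struct FixedUnsavedFile f;
  f.Filename = name;
  f.Contents = contents;
  f.Length = strlen(contents);
  return f;
}

/* Same shape as the patched debug print: %llu with an explicit cast. */
static int format_with_cast(char *buf, size_t n,
                            const struct FixedUnsavedFile *f) {
  return snprintf(buf, n, "('%s', '...', %llu)", f->Filename,
                  (unsigned long long)f->Length);
}

/* Alternative: the <inttypes.h> macro that matches uint64_t directly. */
static int format_with_priu64(char *buf, size_t n,
                              const struct FixedUnsavedFile *f) {
  return snprintf(buf, n, "('%s', '...', %" PRIu64 ")", f->Filename,
                  f->Length);
}
```

Both helpers should produce the same text for any length; the cast form avoids depending on `<inttypes.h>`, while the macro form avoids the cast.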

As the issue notes, this **intentionally alters libclang's C API layout**; the 
change is documented in the release notes.

Closes #160729.

(Picking this up since @nizarbenalla marked it up for grabs.)

From 6642a5cffa9543819e29f34f3e31753758c331e1 Mon Sep 17 00:00:00 2001
From: Scott Boudreaux <[email protected]>
Date: Thu, 18 Jun 2026 20:55:54 -0500
Subject: [PATCH] [libclang] Make CXUnsavedFile::Length a fixed-width uint64_t

CXUnsavedFile::Length was `unsigned long`, whose width depends on the data
model: 4 bytes under LLP64 (Windows) and 8 bytes under LP64. The struct
layout therefore differs across platforms, which corrupts the length or
crashes FFI consumers that bind the C API with a fixed-size assumption
(e.g. the OpenJDK CODETOOLS-7904079 downstream bug).

Use uint64_t so the field is consistently 8 bytes. The two internal debug
prints are switched from %ld to %llu to match. This is an intentional ABI
change to libclang's C API, documented in the release notes.

Fixes #160729.
---
 clang/docs/ReleaseNotes.rst       | 6 ++++++
 clang/include/clang-c/Index.h     | 8 +++++++-
 clang/tools/libclang/CIndex.cpp   | 4 ++--
 clang/tools/libclang/Indexing.cpp | 4 ++--
 4 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ed668ca6f207c..aecdd95f1e8fc 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -531,6 +531,12 @@ clang-format
 
 libclang
 --------
+- The ``Length`` field of ``CXUnsavedFile`` (in ``clang-c/Index.h``) is now a
+  fixed-width ``uint64_t`` instead of ``unsigned long``. This makes the struct
+  layout identical across data models; previously the field was 4 bytes under
+  LLP64 (Windows) and 8 bytes under LP64, which could corrupt the length or
+  crash FFI consumers. This is an intentional ABI change to the C API.
+  (#GH160729)
 
 Code Completion
 ---------------
diff --git a/clang/include/clang-c/Index.h b/clang/include/clang-c/Index.h
index be038d9165fc6..1b04cc0303e48 100644
--- a/clang/include/clang-c/Index.h
+++ b/clang/include/clang-c/Index.h
@@ -16,6 +16,8 @@
 #ifndef LLVM_CLANG_C_INDEX_H
 #define LLVM_CLANG_C_INDEX_H
 
+#include <stdint.h>
+
 #include "clang-c/BuildSystem.h"
 #include "clang-c/CXDiagnostic.h"
 #include "clang-c/CXErrorCode.h"
@@ -118,8 +120,12 @@ struct CXUnsavedFile {
 
   /**
    * The length of the unsaved contents of this buffer.
+   *
+   * A fixed-width type is used so the struct layout is identical across data
+   * models (e.g. LLP64 vs LP64); `unsigned long` previously made the field 4
+   * bytes on Windows and 8 bytes elsewhere, breaking FFI consumers.
    */
-  unsigned long Length;
+  uint64_t Length;
 };
 
 /**
diff --git a/clang/tools/libclang/CIndex.cpp b/clang/tools/libclang/CIndex.cpp
index 5aab74348967d..778648ced755f 100644
--- a/clang/tools/libclang/CIndex.cpp
+++ b/clang/tools/libclang/CIndex.cpp
@@ -4445,8 +4445,8 @@ enum CXErrorCode clang_parseTranslationUnit2FullArgv(
     for (unsigned i = 0; i != num_unsaved_files; ++i) {
       if (i)
         fprintf(stderr, ", ");
-      fprintf(stderr, "('%s', '...', %ld)", unsaved_files[i].Filename,
-              unsaved_files[i].Length);
+      fprintf(stderr, "('%s', '...', %llu)", unsaved_files[i].Filename,
+              (unsigned long long)unsaved_files[i].Length);
     }
     fprintf(stderr, "],\n");
     fprintf(stderr, "  'options' : %d,\n", options);
diff --git a/clang/tools/libclang/Indexing.cpp b/clang/tools/libclang/Indexing.cpp
index c142f142d5071..9d893e4e21cd5 100644
--- a/clang/tools/libclang/Indexing.cpp
+++ b/clang/tools/libclang/Indexing.cpp
@@ -924,8 +924,8 @@ int clang_indexSourceFileFullArgv(
     for (unsigned i = 0; i != num_unsaved_files; ++i) {
       if (i)
         fprintf(stderr, ", ");
-      fprintf(stderr, "('%s', '...', %ld)", unsaved_files[i].Filename,
-              unsaved_files[i].Length);
+      fprintf(stderr, "('%s', '...', %llu)", unsaved_files[i].Filename,
+              (unsigned long long)unsaved_files[i].Length);
     }
     fprintf(stderr, "],\n");
     fprintf(stderr, "  'options' : %d,\n", TU_options);

_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
