[Lldb-commits] [clang] [lldb] [Clang] Introduce OverflowBehaviorType for fine-grained overflow control (PR #148914)

Justin Stitt via lldb-commits Fri, 17 Oct 2025 16:05:32 -0700

JustinStitt wrote:

After all the review, it is clear that we have to choose some new semantics for
OBTs. It is also important that OBTs are useful in their design and purpose for
many projects. OBTs should provide type-level overflow behavior handling. With
this goal in mind, there are two customers: 1) large existing codebases and 2)
new projects.


To help all kinds of projects I met with @kees again and we talked about 
separating some behaviors into separate modes. We identified two modes that 
would give the best path forward for the Linux kernel, other existing projects, 
and new projects. We were thinking of a "compliant" mode and a "strict" mode. 
Name bike-shedding welcome. :)

@ojhunt We see your concerns with a strict mode and want to address them by 
ensuring code compatibility between all modes. Further below are some examples 
and design principles for the modes. First, let's make the distinction clear:

The "compliant" mode would use traditional C promotion rules with the exception 
that the OBT qualifier is persisted through implicit casts. This allows us to 
get truncation signal during storage of less-than-int arithmetic results and 
overflow signal on other results. This mode would be the introductory mode for 
large projects that cannot make the direct jump to strict mode. @kees has shown 
that this compliant mode would still provide useful signal in the Linux kernel, 
where truncation accounts for a large percentage of the integer overflow flaws.

An example of compliant mode:
```c
typedef unsigned short __ob_trap tu16;

tu16 a = 65535; // USHRT_MAX
int b = 1;
// a and b promoted to "__ob_trap int", traps on truncated assignment
tu16 c = a + b;

// a+b is implicitly promoted to __ob_trap int
// result of a+b is 65536
// 65536 doesn't fit within tu16's storage space, so we can trap on the
// assignment
```
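
Since OBTs are not available in released compilers, the check described above can be approximated in standard C. The following is a minimal sketch, not the proposed implementation: `compliant_add_store_u16` is a hypothetical helper that returns `false` wherever an OBT-enabled compiler would be expected to trap.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper, not part of the proposal: models `tu16 c = a + b;`
 * in compliant mode. Returns false where a trap would be expected. */
bool compliant_add_store_u16(unsigned short a, int b, unsigned short *c)
{
    /* Widen to long long so the sketch itself cannot overflow. */
    long long wide = (long long)a + (long long)b;

    /* Overflow signal: the sum would not fit the promoted int. */
    if (wide < INT_MIN || wide > INT_MAX)
        return false;

    /* Truncation signal: the sum does not fit the 16-bit destination. */
    if (wide < 0 || wide > USHRT_MAX)
        return false;

    *c = (unsigned short)wide;
    return true;
}
```

Calling it with `USHRT_MAX` and `1` reports a failure, mirroring the trap on the truncated assignment in the example.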

The "strict" mode would require matching bitwidths and OBT kinds, which leaves
no ambiguity and makes arithmetic results homogeneous, giving full
visibility into potential overflows. For example, this matches the semantics of
Rust. Nothing implicit happens in this mode. To make this mode most useful,
casts to strict OBT kinds would need to be instrumented for overflow (more
on this below).

Same example with strict mode:
```c
typedef unsigned short __ob_strict_trap tu16;

tu16 a = 65535; // USHORT_MAX
int b = some();
tu16 c = a + b; // error: a and b have different types

// solution 1:
// Make sure everything has the same type
tu16 a = 65535;
tu16 b = some(); // some() must also return tu16.
tu16 c = a + b; // OK, everything is the same type.

// solution 2:
// add explicit casts
tu16 a = 65535;
int b = some();
tu16 c = a + (tu16)b; // OK, everything is the same type...
// ... but we must avoid potential silent data loss during the C-style cast.
```

To make the "strict" mode usable, C-style casts would also need to be
instrumentable; otherwise we risk silent truncations. This cast
instrumentation would be used to get full signal across arithmetic
expressions being converted from compliant to strict mode.
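
To make the difference concrete, here is a rough sketch in standard C of what an uninstrumented cast does today versus what an instrumented cast is meant to report. Both function names are hypothetical and only stand in for the proposed behavior; the value 44 assumes an 8-bit `unsigned char`.

```c
#include <limits.h>
#include <stdbool.h>

/* Today's behavior: the cast silently keeps only the low bits. */
unsigned char plain_cast_u8(int value)
{
    return (unsigned char)value;    /* 300 becomes 44 with 8-bit chars */
}

/* Hypothetical stand-in for an instrumented `(tu8)value` cast.
 * Returns false where the instrumentation would be expected to trap. */
bool checked_cast_u8(int value, unsigned char *out)
{
    if (value < 0 || value > UCHAR_MAX)
        return false;               /* data would be lost */

    *out = (unsigned char)value;
    return true;
}
```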

Take a look at this toy example, which shows the usefulness of strict OBTs:
they catch the other major class of integer overflow that the Linux kernel
wants to catch:

```c
// vanilla C with -fwrapv (i.e. Linux kernel today), 32-bit int
int x = 4;
int y = (1 << 30) + 25; // x * y is 2^32 + 100 mathematically
u8 sz = x * y; // the int product wraps to 100, so nothing is truncated
... = malloc(sz); // buggy malloc of 100 bytes

// compliant trap mode (would not catch this kind of overflow)
typedef unsigned char __ob_trap tu8;

int x = 4;
int y = (1 << 30) + 25;
tu8 sz = x * y; // the int product wraps to 100, no truncation so no trap
... = malloc(sz); // buggy malloc of 100 bytes

// strict trap mode
typedef unsigned char __ob_strict_trap tu8;

int x = 4;
int y = (1 << 30) + 25;
tu8 sz = (tu8)x * (tu8)y; // strict mode forces same types, so add casts
// The casts will add instrumentation to catch data loss at runtime:
// here (tu8)y traps because y does not fit in 8 bits.
... = malloc(sz);

// When using strict obts the ultimate goal is to have code changed to all
// matching types instead of littering casts everywhere

// So, imagine a more opaque example with some() and other()
tu8 x = some(); // refactor these APIs to use tu8
tu8 y = other(); 
tu8 sz = x * y; // now all types are the same with no casts locally.
... = malloc(sz);
```
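
The wrap-around case can be reproduced in standard C without OBT support. This is only a sketch under stated assumptions: it assumes a 32-bit `int`, the operands 4 and `(1 << 30) + 25` are illustrative picks whose mathematical product is 2^32 + 100, and both function names are hypothetical. `strict_mul_u8` returns `false` where the strict-mode instrumentation would be expected to trap.

```c
#include <limits.h>
#include <stdbool.h>

/* int multiply with -fwrapv-style wrap-around, done in unsigned arithmetic
 * so the sketch is well defined without the flag. Converting an out-of-range
 * result back to int is implementation-defined before C23. */
int wrapping_mul(int x, int y)
{
    return (int)((unsigned int)x * (unsigned int)y);
}

/* Hypothetical model of `tu8 sz = (tu8)x * (tu8)y;` in strict mode. */
bool strict_mul_u8(int x, int y, unsigned char *sz)
{
    if (x < 0 || x > UCHAR_MAX || y < 0 || y > UCHAR_MAX)
        return false;                   /* a cast would lose data */

    long product = (long)x * (long)y;   /* at most UCHAR_MAX squared */
    if (product > UCHAR_MAX)
        return false;                   /* the 8-bit multiply would overflow */

    *sz = (unsigned char)product;
    return true;
}
```

With those operands `wrapping_mul` yields 100, a value that fits in 8 bits and so gives a truncation check nothing to report, while `strict_mul_u8` rejects the same operands at the cast.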

Refactoring to use __ob_strict_trap is still safe/stable when obts are 
unsupported, because the casts don't make anything worse. If, instead, we added 
optional bit-width/kind mismatch warnings to the "compliant" mode, we run the 
risk of bad casts (e.g. just "u8" above) being added, which would silently hide 
the overflow and silence the bit-width warning. 

It's also possible that we just need a stand-alone "strict" type qualifier that
prevents annotated types from participating in any implicit promotions. This
could then just be applied to the existing OBTs (or any other types).

So the process for an existing project would be to migrate some types (e.g. 
size_t) via the compliant obts, and in other places (e.g. new types, new code, 
APIs, etc), use the strict obts. This should provide the greatest flexibility 
without compromising on coverage. For example:

```c
#if __has_attribute(overflow_behavior)
// "compliant" obt for an "existing" type
typedef unsigned long __ob_trap size_t;
// "strict" obt for a "new" type
typedef unsigned char __strict __ob_trap tu8;
#else
typedef unsigned long size_t;
typedef unsigned char tu8;
#endif
```
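
One portability detail for a guard like this: `__has_attribute` is itself a compiler extension, so a preprocessor that does not provide it would reject `#if __has_attribute(...)`. A common idiom is to define a fallback first. The sketch below uses a hypothetical `HAVE_OBT` macro (not from the proposal) and shows only the plain fallback typedef; the OBT-qualified typedefs would go in the other branch, spelled however the final design settles.

```c
/* Provide a fallback so the check also preprocesses on compilers that
 * lack __has_attribute. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* HAVE_OBT is a hypothetical name used only in this sketch. */
#if __has_attribute(overflow_behavior)
#define HAVE_OBT 1
#else
#define HAVE_OBT 0
#endif

#if !HAVE_OBT
/* Plain fallback: same storage, no overflow instrumentation. */
typedef unsigned char tu8;
#endif
```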


It is important that strict mode doesn't carry different conversion semantics.
It may only enforce stricter type rules, requiring explicit casts or type changes.
```c
typedef unsigned char __strict __ob_trap tu8;

extern void some(int);

void foo(int x, int y) {
  // must get 'x' and 'y' to be of type 'tu8'
  // old compilers will build this code just fine (and trap on truncation)
  // obt-enabled compilers will fail to build as there are mismatching types
  tu8 a = x * y;
  some(a);
}
```

... Now convert the code to build with 'strict' mode.

```c
void foo(int x, int y) {
  // old compiler will still build this just fine
  // obt-enabled compilers will now build (and instrument explicit casts for
  // data loss)
  // no difference in result between compilers
  tu8 a = (tu8)x * (tu8)y;
  some(a);
}
```



https://github.com/llvm/llvm-project/pull/148914
