turran wrote:

Hello @dschuff
I'd like to continue discussing this PR and provide you some updates on the 
current status after rebasing with the current `main`.

Continuing with the `EMULATE_FUNCTION_POINTER_CASTS` approach, this solution is 
still not enough to have
a complete solution for the undefined behavior in function pointer 
type-mismatch cases. For complex programs, it simply does not work. Also, there 
are other correlated issues like:

* https://github.com/emscripten-core/emscripten/issues/15081
* https://github.com/emscripten-core/emscripten/issues/22285

Despite the performance penalty, I already commented on my previous post.

Getting back to the main concern of adding this to LLVM, I'd like to address 
each of the different points raised

### On C standard compliance

It is correct that the current thunk substitution breaks the round-trip 
identity guarantee, but there are several existing mechanisms on LLVM to 
deliberately deviate from the standard compliance, always through an explicit 
opt-in flag. For example, `-fwrapv`, `-ffast-math`, `-fno-strict-aliasing`, 
etc. I'd like to point out that a no-longer-working mechanism was already 
accepted based on the IR type information 
(`WebAssemblyFixFunctionBitcasts.cpp`). This PR restores that behavior at the 
Clang level, where type information is still available. Also, the behavior is 
gated behind `-fwasm-fix-function-bitcasts`, so no code is affected unless the 
developer explicitly opts in, fully aware of the implications. Keeping the 
overhead proportional to the actual problem.

### On compile-time vs. runtime casts

This is correct, and it is by design. The goal is not to handle every possible 
case of undefined behavior; it is to handle the specific, statically visible 
patterns.

Since the original `WebAssemblyFixFunctionBitcasts` pass, the scope was always 
limited to cases that are statically detectable. The opaque pointer change made 
those cases invisible at the IR level; this PR moves the detection back to the 
AST level in Clang, where the type information is preserved. 

### On coverage and the three code paths

I have extended the tests to include the `void *` cast too, making the 
implementation more generic now. I have created four test cases that might need 
to be extended for other scenarios, but so far, these cover the patterns that 
appear in real-world libraries like GLib/GObject.

As I mentioned before, I understand there might be other theoretical solutions, 
but no such solution exists today that covers the struct field case, for 
example, or limits the costs of doing an indirect call through a thunk. The 
choice is between an imperfect compile-time solution that covers the patterns 
found in GLib/GObject and similar C callback-heavy libraries, or no solution at 
all. The flag makes the trade-off explicit and opt-in.

Let me know your thoughts

https://github.com/llvm/llvm-project/pull/153168
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to