[clang] [Serialization] Fix source location data loss during decoding. (PR #145529)

Chuanqi Xu via cfe-commits Wed, 25 Jun 2025 00:33:50 -0700

ChuanqiXu9 wrote:

> > The design is, the higher 32 bits are used for module file index and the 
> > lower bits are used for offsets. Could you give a concrete example why the 
> > current implementation is problematic?
> 
> I don’t have a concrete failure case, but I noticed this while working on 
> 64-bit source locations.


Yeah, the current situation conflicts with 64-bit source location. We discussed 
this. The conclusion is, if we need large source location space, use 48 bits 
and leave upper 16 bits for module file index. 48 bits should be enough for 
most cases.

> 
> Currently, we encode offsets in two ways depending on the module file index:
> 
> 1. If the module file index is `0`, we use delta encoding (see [this code 
> path](https://github.com/llvm/llvm-project/blob/01b288fe6a1e627954329198ed5641f2bf55ee8d/clang/include/clang/Serialization/SourceLocationEncoding.h#L165-L168)).
> 2. otherwise, we use raw encoding (see [this code 
> path](https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Serialization/SourceLocationEncoding.h#L173-L183)).
> 
> The 2) case is fine, as the encoded value fits into a 32-bit integer. 
> However, the 1) case can produce a 33-bit value, which doesn’t fit in the 
> lower 32 bits.
> 
> It appears we only use 16 bits for the module file index, the fix here is to 
> preserve an additional bit for the offset to avoid this issue.



https://github.com/llvm/llvm-project/pull/145529
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Serialization] Fix source location data loss during decoding. (PR #145529)

Reply via email to