| Issue |
178530
|
| Summary |
[TableGen] Duplicate Enum Definitions in Register Class Intersections
|
| Labels |
new issue
|
| Assignees |
|
| Reporter |
dmpots
|
## Summary
TableGen's name-flattening optimization in register class intersection generation produces duplicate C++ enum definitions, causing compilation failures. When TableGen creates intersection register classes with nested inferred relationships, it can generate the same shortened name for two different register sets.
## Problem Description
While adding a new register class configuration, we encountered a C++ compilation error due to duplicate enum definitions:
```
error: redefinition of enumerator 'R128_Aligned4_with_sub0_in_R32_RestrictedRegClassID'
R128_Aligned4_with_sub0_in_R32_RestrictedRegClassID = 7,
^
note: previous definition is here
R128_Aligned4_with_sub0_in_R32_RestrictedRegClassID = 9,
^
```
The same enum name appears twice in the generated code with different enum values (7 and 9), representing two distinct register sets with different members.
## Steps to Reproduce
A minimal test case demonstrating the bug is available in this commit: https://github.com/dmpots/llvm-project/commit/30aa1e1172c37e7493f1c0e77ea82e4a2b8a89c7
- **Command**: `llvm-lit llvm/test/TableGen/intersection-class-duplicate-enum.td`
- **Test file**: `llvm/test/TableGen/intersection-class-duplicate-enum.td`
- **Status**: Marked XFAIL because the bug is currently present
### Manual Reproduction
```bash
# Generate register info for the test case and search for the duplicated enumerator
llvm-tblgen -gen-register-info -I llvm/include \
llvm/test/TableGen/intersection-class-duplicate-enum.td -o - | \
grep "R128_Aligned4_with_sub0_in_R32_RestrictedRegClassID ="
```
**Output while the bug is present:**
```
R128_Aligned4_with_sub0_in_R32_RestrictedRegClassID = 7,
R128_Aligned4_with_sub0_in_R32_RestrictedRegClassID = 9,
```
The duplicate name appears at enum IDs 7 and 9.
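The `grep` above only looks for one known name. As a rough sketch of a more general check, the helper below scans generated text for any `*RegClassID` enumerator that is defined more than once. The function name `findDuplicateRegClassIDs` is hypothetical and not part of LLVM, and the parser assumes one `Name = Value,` enumerator per line, as in the output shown above.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of LLVM): returns every enumerator name
// ending in "RegClassID" that is defined more than once in Text.
// Assumes one "Name = Value," enumerator per line.
std::vector<std::string> findDuplicateRegClassIDs(const std::string &Text) {
  const std::string Suffix = "RegClassID";
  std::map<std::string, int> Counts;
  std::istringstream In(Text);
  std::string Line;
  while (std::getline(In, Line)) {
    std::size_t Eq = Line.find(" = ");
    if (Eq == std::string::npos)
      continue; // Not an "X = Y" line.
    std::size_t Begin = Line.find_first_not_of(" \t");
    if (Begin >= Eq)
      continue; // Nothing but whitespace before " = ".
    std::string Name = Line.substr(Begin, Eq - Begin);
    if (Name.size() < Suffix.size() ||
        Name.compare(Name.size() - Suffix.size(), Suffix.size(), Suffix) != 0)
      continue; // Only register class ID enumerators are of interest.
    ++Counts[Name];
  }
  std::vector<std::string> Duplicates;
  for (const auto &Entry : Counts)
    if (Entry.second > 1)
      Duplicates.push_back(Entry.first);
  return Duplicates;
}
```

Fed the buggy output, this would report `R128_Aligned4_with_sub0_in_R32_RestrictedRegClassID`. An empty result means no register class ID name is repeated.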
### Test Case Structure
The minimal reproducer uses:
- 16 32-bit registers (R0-R15)
- 96-bit and 128-bit register tuples with various spacing
- A restricted register class that excludes R0 (creating intersection differences)
Key register classes:
- `R32_Restricted`: R1-R15 (excludes R0)
- `R96_WithAligned4`: 96-bit tuples at positions 0, 4, 8, 12
- `R96_Aligned8`: 96-bit tuples at positions 0, 8 (subset of above)
- `R128_Aligned4`: 128-bit tuples at positions 0, 4, 8, 12
## Expected Behavior
TableGen should generate distinct enum names for different register sets. When the name-shortening logic is disabled, the correct distinct names are generated:
```cpp
enum {
R128_Aligned4RegClassID = 6,
R128_Aligned4_with_sub0_sub1_sub2_in_R96_WithAligned4_with_sub0_in_R32_RestrictedRegClassID = 7,
R128_Aligned4_with_sub0_sub1_sub2_in_R96_Aligned8RegClassID = 8,
R128_Aligned4_with_sub0_sub1_sub2_in_R96_Aligned8_with_sub0_in_R32_RestrictedRegClassID = 9,
// ...
};
```
## Actual Behavior
With name-shortening enabled (current behavior), both IDs 7 and 9 get incorrectly flattened to the same name:
```cpp
enum {
R128_Aligned4RegClassID = 6,
R128_Aligned4_with_sub0_in_R32_RestrictedRegClassID = 7, // First occurrence
R128_Aligned4_with_sub0_sub1_sub2_in_R96_Aligned8RegClassID = 8,
R128_Aligned4_with_sub0_in_R32_RestrictedRegClassID = 9, // DUPLICATE!
// ...
};
```
This causes C++ compilation to fail with "redefinition of enumerator" errors.
## Root Cause Analysis
We believe the bug is related to PR #134865, which introduced the name-shortening optimization.
The issue is in TableGen's name-shortening logic in `inferMatchingSuperRegClass()`:
- **Location**: `llvm/utils/TableGen/Common/CodeGenRegisters.cpp:2528`
- **Code reference**: https://github.com/llvm/llvm-project/blob/09e59745fc7bc011e908b4e0298327de96ebffaa/llvm/utils/TableGen/Common/CodeGenRegisters.cpp#L2528
### Bug Mechanism
When processing `R128_Aligned4` with composite sub-register indices, TableGen infers intersection classes through different paths:
**Path A** (via R96_Aligned8):
- Full name: `R128_Aligned4_with_sub0_sub1_sub2_in_R96_Aligned8_with_sub0_in_R32_Restricted`
- Contains 1 register
- Flattened to: `R128_Aligned4_with_sub0_in_R32_Restricted`
**Path B** (via R96_WithAligned4):
- Full name: `R128_Aligned4_with_sub0_sub1_sub2_in_R96_WithAligned4_with_sub0_in_R32_Restricted`
- Contains 3 registers
- Flattened to: `R128_Aligned4_with_sub0_in_R32_Restricted` (SAME NAME!)
The name-flattening optimization shortens these long names but does not check whether the shortened name is already in use for a different register set (that is, a different Key, with different members and/or RSI).
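One possible shape for the missing check is sketched below. It is only an illustration of the idea and not the actual TableGen code: `ClassKey` and `NameRegistry` are hypothetical stand-ins, and the key is reduced to a set of member names, whereas the real key also carries RSI. The registry hands out the shortened name unless that name is already bound to a different key, in which case it falls back to the full name.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for the key that identifies a register class.
// Here it is only the set of member register names.
using ClassKey = std::set<std::string>;

// Hypothetical registry that remembers which key each emitted name denotes.
class NameRegistry {
public:
  // Returns ShortName unless it is already bound to a different key;
  // in that case the unshortened FullName is used instead.
  std::string pick(const std::string &ShortName, const std::string &FullName,
                   const ClassKey &Key) {
    auto It = Names.find(ShortName);
    if (It == Names.end() || It->second == Key) {
      Names.emplace(ShortName, Key); // No-op if the binding already exists.
      return ShortName;
    }
    Names.emplace(FullName, Key);
    return FullName;
  }

private:
  std::map<std::string, ClassKey> Names;
};
```

With this approach the first class to claim `R128_Aligned4_with_sub0_in_R32_Restricted` keeps the short name and the second keeps its long name, so the generated enumerators stay distinct. Which class ends up with the short name would still depend on processing order.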
### Name Sensitivity
Interestingly, the bug is sensitive to the exact class names used. In our testing:
- ✅ `R96_WithAligned4` → Bug reproduces
- ❌ `R96_Aligned4` → Bug does NOT reproduce
This suggests the bug depends on TableGen's internal naming heuristics and processing order, so whether it triggers can change with seemingly unrelated renames.
## Related
- PR #134865 (introduces the name-shortening optimization)
- Reproducer: https://github.com/dmpots/llvm-project/commit/30aa1e1172c37e7493f1c0e77ea82e4a2b8a89c7