Issue 69294
Summary Pointer Dereference Optimization Bug in Clang-18 on ARM64 Depending on Data Patterns at Different Optimization Levels
Labels new issue
Assignees
Reporter gyuminb
### **Description:**

Compiling the PoC below for ARM64 with Clang-18 produces output that depends on the optimization level, which points to a problem in how a store through a chain of pointers is optimized. The behavior also depends on the data pattern being stored and on the structure of the adjacent **`printf`** calls. For some data patterns, the issue is observed at optimization levels **`O1`** through **`O3`**. When the two identical **`printf`** calls before and after the problematic line are replaced with two distinct ones, the issue appears only at **`O3`**. This suggests that the optimization is influenced not just by the data pattern but also by the presence and structure of the adjacent **`printf`** calls.

### **Environment:**

- **Compiler**: Clang-18
- **Target Architecture**: ARM64
- **Optimization Level**: The issue is noticeable at **`O1`**, **`O2`**, and **`O3`**, depending on the data pattern used. For patterns like **`0x123456789abcdeff`**, it can be observed from **`O1`** to **`O3`**, but for patterns like **`0x1234567fffffffff`**, it appears only at **`O3`**.
- **OS**: Ubuntu 22.04.2

### **PoC:**

```c
#include <stdio.h>
#include <stdint.h>

struct StructA {
   uint32_t val1;
   const int8_t  val2;
   uint64_t  val3;
   uint16_t val4;
};

union UnionB {
   uint32_t  u_val1;
   struct StructA  s_val;
   uint32_t  u_val2;
   int32_t   u_val3;
   int32_t u_val4;
   uint64_t  u_val5;
};

static union UnionB main_union = {1UL};
static uint32_t *ptr_val1 = &main_union.s_val.val1;
static uint32_t **double_ptr = &ptr_val1;

static uint32_t ***triple_ptr = &double_ptr;

int main() {
    printf("Before main_union.u_val5: %lx\n", main_union.u_val5);

    uint32_t **local_double_ptr = &ptr_val1;
    uint64_t local_val = 0x123456789abcdeffLL;
    uint64_t *local_ptr = &main_union.u_val5;

    (*local_ptr) = local_val;
    (triple_ptr = &local_double_ptr);
    (***triple_ptr) = 0UL;

    printf("After main_union.u_val5: %lx\n", main_union.u_val5);

    return 0;
}

```
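
The store through **`ptr_val1`** is expected to be visible in **`u_val5`** because both members start at the beginning of the union. The following is a minimal sketch that checks this layout with `offsetof`. The type definitions are copied from the PoC; the helper names `val1_offset` and `u_val5_offset` are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Same definitions as in the PoC. */
struct StructA {
    uint32_t val1;
    const int8_t val2;
    uint64_t val3;
    uint16_t val4;
};

union UnionB {
    uint32_t u_val1;
    struct StructA s_val;
    uint32_t u_val2;
    int32_t u_val3;
    int32_t u_val4;
    uint64_t u_val5;
};

/* Hypothetical helper: byte offset of s_val.val1 from the start of the union. */
static size_t val1_offset(void)
{
    return offsetof(union UnionB, s_val) + offsetof(struct StructA, val1);
}

/* Hypothetical helper: byte offset of u_val5 from the start of the union. */
static size_t u_val5_offset(void)
{
    return offsetof(union UnionB, u_val5);
}
```

Both helpers return 0, so the four bytes written through **`ptr_val1`** overlap the first four bytes of **`u_val5`**.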

### **Expected Behavior:**

The value of **`main_union.u_val5`** should be consistent across different optimization levels after the pointer dereference operation.

### **Observed Behavior:**

The value of **`main_union.u_val5`** changes depending on the optimization level, data patterns, and the structure of adjacent **`printf`** calls.

### **Analysis:**

The optimization seems to overlook the **`(***triple_ptr) = 0UL;`** store. The discrepancy in output, depending on the structure of the **`printf`** calls and the data pattern, indicates a misoptimization during compilation. Notably, when the structure of the **`printf`** statements is changed or a data pattern with repeating digits is used, the issue appears only at the **`O3`** optimization level. The bug is therefore sensitive to both the data pattern and the surrounding code structure.
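
As a reference point for what that store should produce, the sketch below zeroes the first four bytes of a 64-bit value with `memcpy` instead of going through the pointer chain. The helper names `clear_first_u32` and `is_little_endian` are hypothetical and not part of the PoC. The sketch assumes a little-endian target, such as the ARM64 Linux system in this report, where the first four bytes hold the low half of the value.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of the report: zero the first four bytes
 * of a 64-bit value's object representation. memcpy is used so the result
 * does not depend on any pointer-based type punning. */
static uint64_t clear_first_u32(uint64_t value)
{
    const uint32_t zero = 0;
    memcpy(&value, &zero, sizeof zero); /* overwrite bytes 0..3 */
    return value;
}

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
static int is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}
```

On a little-endian host, `clear_first_u32(0x123456789abcdeffULL)` yields `0x1234567800000000`, which matches the O0 output shown under Evidence, and `clear_first_u32(0x1234567fffffffffULL)` yields `0x1234567f00000000`.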

### **Steps to Reproduce:**

1. Compile the PoC code using Clang-18 on ARM64 with various optimization levels (**`O1`**, **`O2`**, and **`O3`**).
2. Execute the compiled binary.
3. Observe the inconsistent behavior dependent on optimization level, data patterns, and **`printf`** structure.

### **Evidence:**

The following output showcases the behavior for various optimization levels:

```

O0 Output:
main_union.u_val5: 1
main_union.u_val5: 1234567800000000

O1 Output:
main_union.u_val5: 1
main_union.u_val5: 123456789abcdeff

O2 Output:
main_union.u_val5: 1
main_union.u_val5: 123456789abcdeff

O3 Output:
main_union.u_val5: 1
main_union.u_val5: 123456789abcdeff

```

What's intriguing is that when we replace two identical **`printf`** calls before and after the problematic line with two distinct **`printf`** calls, such as:

```c
printf("Before main_union.u_val5: %lx\n", main_union.u_val5);
```

and

```c
printf("After main_union.u_val5: %lx\n", main_union.u_val5);
```

the issue only manifests at **`O3`** optimization level.

### **Conclusion:**

Across optimization levels **`O1`** to **`O3`**, there is clear evidence of a bug, likely resulting from an incorrect compiler optimization. The specific scenarios under which it emerges, especially when the **`printf`** structure or the data pattern is altered, further underline how unpredictable the issue is. It requires attention to ensure consistent and correct behavior across all optimization levels.
_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
