https://bugs.llvm.org/show_bug.cgi?id=37469

            Bug ID: 37469
           Summary: A miscompilation bug with unsigned char
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Transformation Utilities
          Assignee: unassignedb...@nondot.org
          Reporter: juneyoung....@sf.snu.ac.kr
                CC: llvm-bugs@lists.llvm.org

Created attachment 20304
  --> https://bugs.llvm.org/attachment.cgi?id=20304&action=edit
A source file that raises the bug.

```
$ cat test-main.c
#include <string.h>
#include <stdio.h>
#include <stdint.h>

// If f(A, B + 4) is given, and integer representation of A and B + 4
// are the same, c1 == c2 in the loop becomes true,
// so arr3 = arr1. Hence r = A, and *A should be 10.
// However, if compiled with -O3, *A is still 1.
void store_10_to_p(int *p, int *q) {
  unsigned char arr1[8];
  unsigned char arr2[8];
  unsigned char arr3[8];
  // Type punning with char* is allowed.
  memcpy((unsigned char*)arr1, (unsigned char *)&p, sizeof(p));
  memcpy((unsigned char*)arr2, (unsigned char *)&q, sizeof(q));
  // Now arr1 is p, arr2 is q.

  for (int i = 0; i < sizeof(q); i++) {
    int c1 = (int)arr1[i], c2 = (int)arr2[i];
    // Note that c1 == c2 is a comparison between _integers_ (not pointers).
    if (c1 == c2)
      // Always true if p and q have same integer representation.
      arr3[i] = arr1[i];
    else
      arr3[i] = arr2[i];
  }
  // Now arr3 is equivalent to arr1, which is p.
  int *r;
  memcpy(&r, (unsigned char *)arr3, sizeof(r));
  // Now r is p.
  *p = 1;
  *r = 10;
}

int main() {
  int A[4], B[4];
  printf("%p %p\n", A, &B[4]);
  if ((uintptr_t)A == (uintptr_t)&B[4]) {
    store_10_to_p(A, &B[4]);
    printf("%d\n", A[0]);
  }
  return 0;
}

$ clang -O3 -o test-main test-main.c
$ ./test-main
0x7fffffffe580 0x7fffffffe580
1
$ clang -O0 -o test-main test-main.c
$ ./test-main
0x7fffffffe580 0x7fffffffe580
10
```

This is what is happening inside LLVM:
(1) Instcombine changes the loop body to "arr3[i] = arr2[i];"
(2) Loop idiom recognizer replaces the loop with a "memcpy(arr3, arr2, 8)"
(3) Instcombine does the store forwarding from the memcpy to the load


I think this is related with lowering 'unsigned char' in C into 'i8' in LLVM.

There are two choices:
(1) Lowering 'unsigned char' into 'i8' is correct.
(2) Lowering 'unsigned char' into 'i8' is incorrect.

If (1) is right, then one of the optimizations happening in this example should
be disabled, to stop the miscompilation.
If (2) is right, then it is clang which miscompiles this example. 'unsigned
char' should be lowered into something else.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs

Reply via email to