[Bug tree-optimization/65752] Too strong optimizations int - pointer casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752

--- Comment #35 from Chung-Kil Hur <gil.hur at sf dot snu.ac.kr> ---
(In reply to rguent...@suse.de from comment #34)
> On Sat, 23 May 2015, gil.hur at sf dot snu.ac.kr wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752
> >
> > --- Comment #33 from Chung-Kil Hur <gil.hur at sf dot snu.ac.kr> ---
> > Dear Richard,
> >
> > Thanks for the detailed response.
> >
> > I have a suggestion for a solution of the problem, which is based on
> > my paper to appear at PLDI 2015.
> >
> > * A Formal C Memory Model Supporting Integer-Pointer Casts.
> >   Jeehoon Kang, Chung-Kil Hur, William Mansky, Dmitri Garbuzov,
> >   Steve Zdancewic, Viktor Vafeiadis.
> >   http://sf.snu.ac.kr/gil.hur/publications/intptrcast.pdf
> >
> > The suggestion is simple.
> > You do not need to turn off the phiopt optimizations.
> > We propose to slightly change the following assumption.
> >
> > PTA considers that all pointers coming from integer constants point
> > to global memory only.
> >
> > Here, if you change this as follows, you can solve the problem.
> >
> > * All pointers coming from integer constants can point to only
> >   global memory and local variables whose addresses have been cast
> >   to integers.
>
> Ok, so you basically add a 2nd class of escaping. So in GCC PTA terms
> you'd add a new ESCAPE-like 'INTEGER' variable with
>
>   INTEGER = NONLOCAL
>
> and add
>
>   INTEGER = x
>
> constraints for each
>
>   .. = (integer-type) x
>
> conversion and for the reverse
>
>   ptr = (pointer-type) i
>
> add
>
>   ptr = INTEGER
>
> > Also, we expect that this would not decrease the optimization
> > performance of GCC very much because those variables whose addresses
> > have been cast to integers tend to be escaped (e.g. passed to a hash
> > function, or stored in the memory).
>
> Well - the above basically makes _all_ pointers converted from
> integers point to non-local memory, it also basically globs all
> pointers converted from integers into a single equivalence class.

Yes, this is right.

> So I think you underestimate the effect on optimization (but I may
> overestimate the effect on optimization of not simply making all
> pointers converted from integers point to all globals and all
> address-taken locals, aka ANYTHING in GCC PTA terms)

Just one minor correction:
  all address-taken locals -> all address-taken-and-cast-to-integer locals

Yes, I agree.
In order to understand the effect, we need some empirical evidence.
I am interested in this direction.
So, I wonder what benchmarks you usually use to check the effect of
compiler optimizations.
More specifically, are SPEC benchmarks enough? Or do you use some other
benchmarks too?

Thanks!
[Bug tree-optimization/65752] Too strong optimizations int - pointer casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752

--- Comment #37 from Chung-Kil Hur <gil.hur at sf dot snu.ac.kr> ---
(In reply to rguent...@suse.de from comment #36)
> On Tue, 26 May 2015, gil.hur at sf dot snu.ac.kr wrote:
>
> > In order to understand the effect, we need some empirical evidence.
> > I am interested in this direction.
> > So, I wonder what benchmarks you usually use to check the effect of
> > compiler optimizations.
> > More specifically, are SPEC benchmarks enough? or do you use some
> > other benchmarks too?
>
> SPEC CPU tends to capture most of this though we also periodically
> check other benchmarks such as firefox and its few performance tests
> or similar big C++ programs.

Thanks for the information!
[Bug tree-optimization/65752] Too strong optimizations int - pointer casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752

--- Comment #33 from Chung-Kil Hur <gil.hur at sf dot snu.ac.kr> ---
Dear Richard,

Thanks for the detailed response.

I have a suggestion for a solution of the problem, which is based on my
paper to appear at PLDI 2015.

* A Formal C Memory Model Supporting Integer-Pointer Casts.
  Jeehoon Kang, Chung-Kil Hur, William Mansky, Dmitri Garbuzov,
  Steve Zdancewic, Viktor Vafeiadis.
  http://sf.snu.ac.kr/gil.hur/publications/intptrcast.pdf

The suggestion is simple.
You do not need to turn off the phiopt optimizations.
We propose to slightly change the following assumption.

PTA considers that all pointers coming from integer constants point to
global memory only.

Here, if you change this as follows, you can solve the problem.

* All pointers coming from integer constants can point to only global
  memory and local variables whose addresses have been cast to integers.

Also, we expect that this would not decrease the optimization
performance of GCC very much because those variables whose addresses
have been cast to integers tend to be escaped (e.g. passed to a hash
function, or stored in the memory).
[Bug tree-optimization/65752] Too strong optimizations int - pointer casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752

--- Comment #29 from Chung-Kil Hur <gil.hur at sf dot snu.ac.kr> ---
Dear Richard,

This time, I think I constructed a real bug.
Please have a look and correct me if I am wrong.

    #include <stdio.h>
    #include <stdint.h>

    int main() {
      int x = 0;
      uintptr_t xp = (uintptr_t) &x;
      uintptr_t i;

      for (i = 0; i < xp; i++) { }

      *(int*)xp = 15;

      printf("%d\n", x);
    }

This program prints 15 and I do not think this raises UB.

Now I add an if-statement to the program.

    #include <stdio.h>
    #include <stdint.h>

    int main() {
      int x = 0;
      uintptr_t xp = (uintptr_t) &x;
      uintptr_t i;

      for (i = 0; i < xp; i++) { }

      /*** begin ***/
      if (xp != i) {
        printf("hello\n");
        xp = i;
      }
      /*** end ***/

      *(int*)xp = 15;

      printf("%d\n", x);
    }

This program just prints 0.

Since "hello" is not printed, the body of the if-statement is not
executed. However, the program prints a different result than before,
which I think is a bug.
[Bug tree-optimization/65752] Too strong optimizations int - pointer casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752

--- Comment #26 from Chung-Kil Hur <gil.hur at sf dot snu.ac.kr> ---
Thanks for the detailed explanations.

> The C standard only guarantees that you can convert a pointer to
> uintptr_t and back, it doesn't guarantee that you can convert a
> modified uintptr_t back to a pointer that is valid.
>
> Thus, doing (int *)((xp + i) - j) is invoking undefined behavior.

I didn't know about this rule.
I thought this cast is valid because "(xp+i)-j" is even "safely-derived".

Could you give a reference for that rule in the standard?

Thanks!
[Bug tree-optimization/65752] Too strong optimizations int - pointer casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752

--- Comment #27 from Chung-Kil Hur <gil.hur at sf dot snu.ac.kr> ---
(In reply to Chung-Kil Hur from comment #26)
> Thanks for the detailed explanations.
>
> > The C standard only guarantees that you can convert a pointer to
> > uintptr_t and back, it doesn't guarantee that you can convert a
> > modified uintptr_t back to a pointer that is valid.
> >
> > Thus, doing (int *)((xp + i) - j) is invoking undefined behavior.
>
> I didn't know about this rule.
> I thought this cast is valid because "(xp+i)-j" is even
> "safely-derived".
>
> Could you give a reference for that rule in the standard?
>
> Thanks!

It seems that the following rule might be the one.

  7.20.1.4 Integer types capable of holding object pointers

  The following type designates a signed integer type with the property
  that any valid pointer to void can be converted to this type, then
  converted back to pointer to void, and the result will compare equal
  to the original pointer:

      intptr_t

  The following type designates an unsigned integer type with the
  property that any valid pointer to void can be converted to this type,
  then converted back to pointer to void, and the result will compare
  equal to the original pointer:

      uintptr_t

  These types are optional.

Right. This does not say that we are allowed to cast a modified integer
back to a pointer.

What I remember might be from the C++ standard, where "safely derived"
integers are allowed to be cast back to pointers.
Umm. This might also be implementation-defined.

Anyway, thanks very much for taking your time to respond to my
questions!!

Best,
Gil
[Bug tree-optimization/65752] Too strong optimizations int - pointer casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752

--- Comment #15 from Chung-Kil Hur <gil.hur at sf dot snu.ac.kr> ---
Hi Richard,

Thanks for the explanation.
But what I wondered about was how to justify such an optimization,
rather than how it works.

I have a better example. This might be a real bug of GCC.

    #include <stdio.h>
    #include <stdint.h>

    int main() {
      int x = 0;
      uintptr_t pi = (uintptr_t) &x;
      uintptr_t i, j;

      for (i = 0; i < pi; i++) { }
      j = i;
      /* Note that the following if statement is never executed
         because j == pi. */
      if (j != pi) {
        j = pi;
      }

      *(int*)((pi+i)-j) = 15;

      printf("%d\n", x);
    }

This program prints out 0 instead of 15.
Here, "pi" contains the address of the variable x, and "i" and "j"
contain the same integer.
So, it seems that "(pi+i)-j" should have a proper provenance of "x" and
thus the variable "x" should be updated to 15.
However, GCC seems to think that "(pi+i)-j" has no provenance.

So, as a programmer, I wonder how I should calculate the provenance of
an integer in order to see whether casting it to a pointer is valid or
not.

Thanks.
[Bug tree-optimization/65752] Too strong optimizations int - pointer casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752

--- Comment #17 from Chung-Kil Hur <gil.hur at sf dot snu.ac.kr> ---
Hi Richard,

I modified the example further.

    #include <stdio.h>
    #include <stdint.h>

    int main() {
      int x = 0;
      uintptr_t xp = (uintptr_t) &x;
      uintptr_t i, j;

      for (i = 0; i < xp; i++) { }
      j = i;
      /* The following "if" statement is never executed
         because j == xp */
      if (j != xp) {
        printf("hello\n");
        j = xp;
      }

      *(int*)((xp+i)-j) = 15;

      printf("%d\n", x);
    }

The above example does not print "hello", so I can assume that "j = xp"
is not executed.
However, the program prints "0" instead of "15".
Can you explain this?
[Bug tree-optimization/65752] Too strong optimizations int - pointer casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752

--- Comment #21 from Chung-Kil Hur <gil.hur at sf dot snu.ac.kr> ---
(In reply to Marek Polacek from comment #20)
> (In reply to Chung-Kil Hur from comment #19)
> > (In reply to rguent...@suse.de from comment #18)
> > > On Tue, 19 May 2015, gil.hur at sf dot snu.ac.kr wrote:
> > >
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752
> > > >
> > > > --- Comment #17 from Chung-Kil Hur <gil.hur at sf dot snu.ac.kr> ---
> > > > Hi Richard,
> > > >
> > > > I modified the example further.
> > > >
> > > >     #include <stdio.h>
> > > >     #include <stdint.h>
> > > >
> > > >     int main() {
> > > >       int x = 0;
> > > >       uintptr_t xp = (uintptr_t) &x;
> > > >       uintptr_t i, j;
> > > >
> > > >       for (i = 0; i < xp; i++) { }
> > > >       j = i;
> > > >       /* The following "if" statement is never executed
> > > >          because j == xp */
> > > >       if (j != xp) {
> > > >         printf("hello\n");
> > > >         j = xp;
> > > >       }
> > >
> > > Here j is always xp and thus ...
> >
> > Why is j always xp?
> > Since "hello" is not printed, "j = xp;" is not executed.
>
> Because that "if (j != xp)" guarantees it.

OK. Here is another modification.

    #include <stdio.h>
    #include <stdint.h>

    int main() {
      int x = 0;
      uintptr_t xp = (uintptr_t) &x;
      uintptr_t i, j;

      for (i = 0; i < xp; i++) { }
      j = i;

      *(int*)j = 15;

      /* The following "if" statement is never executed
         because j == xp */
      if (j != xp) {
        printf("hello\n");
        j = xp;
      }

      *(int*)((xp+i)-j) = 15;

      printf("%d\n", x);
    }

This program just prints "0".

So we know that "*(int*)j = 15;" is not executed and thus "j == xp" is
not true.

Then, can the following statement change "j" even if the printf is not
executed?

    if (j != xp) {
      printf("hello\n");
      j = xp;
    }

If not, how can "j == xp" suddenly hold?
[Bug tree-optimization/65752] Too strong optimizations int - pointer casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752

--- Comment #22 from Chung-Kil Hur <gil.hur at sf dot snu.ac.kr> ---
(In reply to Chung-Kil Hur from comment #21)
> If not, how can "j == xp" suddenly hold?

One more thing.

If you remove the if-statement, then it prints "15" with GCC -O2.

Since "hello" is not printed, I think the if-statement is the same as a
no-op.
Thus, removing the if-statement should not change the behavior of the
program according to ISO C11.

But, they print different values.

Can you explain this?
[Bug tree-optimization/65752] Too strong optimizations int - pointer casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752

--- Comment #19 from Chung-Kil Hur <gil.hur at sf dot snu.ac.kr> ---
(In reply to rguent...@suse.de from comment #18)
> On Tue, 19 May 2015, gil.hur at sf dot snu.ac.kr wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752
> >
> > --- Comment #17 from Chung-Kil Hur <gil.hur at sf dot snu.ac.kr> ---
> > Hi Richard,
> >
> > I modified the example further.
> >
> >     #include <stdio.h>
> >     #include <stdint.h>
> >
> >     int main() {
> >       int x = 0;
> >       uintptr_t xp = (uintptr_t) &x;
> >       uintptr_t i, j;
> >
> >       for (i = 0; i < xp; i++) { }
> >       j = i;
> >       /* The following "if" statement is never executed
> >          because j == xp */
> >       if (j != xp) {
> >         printf("hello\n");
> >         j = xp;
> >       }
>
> Here j is always xp and thus ...

Why is j always xp?
Since "hello" is not printed, "j = xp;" is not executed.

Is there some special semantics of C?
If so, please let me know a reference.

Thanks!
[Bug tree-optimization/65752] Too strong optimizations int - pointer casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752

--- Comment #24 from Chung-Kil Hur <gil.hur at sf dot snu.ac.kr> ---
(In reply to schwab from comment #23)
> "gil.hur at sf dot snu.ac.kr" <gcc-bugzi...@gcc.gnu.org> writes:
>
> > Since "hello" is not printed, I think the if-statement is the same
> > as no-op.
> > Thus, removing the if-statement should not change the behavior of
> > the program according to ISO C11.
>
> Unless you are invoking undefined behaviour.
>
> Andreas.

    #include <stdio.h>
    #include <stdint.h>

    int main() {
      int x = 0;
      uintptr_t xp = (uintptr_t) &x;
      uintptr_t i, j;

      for (i = 0; i < xp; i++) { }
      j = i;

      *(int*)((xp+i)-j) = 15;

      printf("%d\n", x);
    }

This prints "15".
And I do not think there is any UB.
Please correct me if I am wrong.

Then, I add the if-statement.

    #include <stdio.h>
    #include <stdint.h>

    int main() {
      int x = 0;
      uintptr_t xp = (uintptr_t) &x;
      uintptr_t i, j;

      for (i = 0; i < xp; i++) { }
      j = i;

      /*** begin ***/
      if (j != xp) {
        printf("hello\n");
        j = xp;
      }
      /*** end ***/

      *(int*)((xp+i)-j) = 15;

      printf("%d\n", x);
    }

This prints "0" without printing "hello".

Thus, this raises no UB unless "j != xp" raises UB.

If you think "j != xp" raises UB, please explain why and give some
reference.

Otherwise, I think it is a bug of GCC.
[Bug tree-optimization/65752] Too strong optimizations int - pointer casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752

Chung-Kil Hur <gil.hur at sf dot snu.ac.kr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gil.hur at sf dot snu.ac.kr

--- Comment #13 from Chung-Kil Hur <gil.hur at sf dot snu.ac.kr> ---
Hi,

I have the following modified code.

    #include <stdio.h>
    #include <stdint.h>
    #include <limits.h>

    int main() {
      int x = 0, *p = 0;
      uintptr_t i;
      uintptr_t j = (uintptr_t) &x;
      uintptr_t k = j+j;
      uintptr_t l = 2*j - j - j;

      for (i = j+j-k+l; ; i++) {
        if (i == (uintptr_t)&x) { p = (int*)i; break; }
      }
      *p = 15;

      printf("%d\n", x);
    }

This example still prints out 0 instead of 15.
In this example, it seems that the integer "j+j-k+l" has no provenance.
It is unclear to me how the provenance is calculated.
Is there any concrete rule for calculating provenance?