Hi all,

We are working on making memory_order_consume not get promoted to memory_order_acquire. Some background on this work is here:
https://gcc.gnu.org/ml/gcc/2019-07/msg00038.html
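For anyone who has not followed the earlier thread, here is a minimal sketch of the kind of dependency chain we want the compiler to preserve. It is not the figure 20 test case, and the names node, shared_ptr_slot, publish and read_value_or are made up for illustration. It uses plain C11 atomics rather than the _Dependent_ptr qualifier, since the qualifier needs the patched front end.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical payload type, for illustration only. */
struct node {
    int value;
};

/* Hypothetical shared slot through which a pointer is published. */
static _Atomic(struct node *) shared_ptr_slot = NULL;

/* Writer side: initialise the node, then publish it with release ordering. */
void publish(struct node *n, int v)
{
    n->value = v;
    atomic_store_explicit(&shared_ptr_slot, n, memory_order_release);
}

/* Reader side: the load of p->value carries a dependency from the
 * consume load of the pointer. GCC currently promotes this consume
 * load to an acquire load. */
int read_value_or(int fallback)
{
    struct node *p =
        atomic_load_explicit(&shared_ptr_slot, memory_order_consume);

    /* The comparison against NULL and the dereference both use p. If an
     * optimisation replaced p here with some other value that merely
     * compares equal, the dependency chain would be lost. */
    if (p != NULL)
        return p->value;
    return fallback;
}
```

Calling read_value_or(-1) before any publish() returns the fallback. After publish(&n, 42) it returns 42.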

We are able to parse _Dependent_ptr in the C front end. The patches are here:
https://github.com/AKG001/gcc/commit/2accdd2b43100abae937c714eb4c8e385940b5c7
https://github.com/AKG001/gcc/commit/fb4187bc3872a50880159232cf336f0a03505fa8

Currently, we are working with pointers only. As discussed earlier, certain passes at the tree and RTL levels may break the dependencies specified by the user. We are interested in the problems that could arise during the RTL passes. To isolate those, we have tried to skip the tree passes by treating _Dependent_ptr as volatile. The patch for this is here:
https://github.com/AKG001/gcc/commit/e4ffd77f62ace986c124a70b90a662c083c570ba

We are trying to find all the passes where the dependencies could get broken. We have experimented with several examples and have some doubts about one of them. We hope the community can help.

The example is figure 20 from here:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf

The example: https://github.com/AKG001/rtl_opt/blob/master/p0190r4_fig20.c
The .optimized code: https://github.com/AKG001/rtl_opt/blob/master/p0190r4_fig20.c.231t.optimized
The .expand code: https://github.com/AKG001/rtl_opt/blob/master/p0190r4_fig20.c.233r.expand
The .cse1 code: https://github.com/AKG001/rtl_opt/blob/master/p0190r4_fig20.c.239r.cse1
The .final code: https://github.com/AKG001/rtl_opt/blob/master/p0190r4_fig20.c.317r.final

In the .expand code, I believe no dependencies get broken. I would appreciate it if someone could verify this. In the .cse1 code, however, the dependencies do get broken. The instruction at line 231 of that file is shown below:

1. (insn 10 9 11 2 (set (reg:CCZ 17 flags)
2.         (compare:CCZ (reg/f:DI 84 [ p.2_3 ])
3.             (const_int 0 [0]))) "p0190r4_fig20.c":44:6 8 {*cmpdi_ccno_1}
4.      (nil))

and the instruction at line 402 is shown below:

5. (insn 10 9 11 2 (set (reg:CCZ 17 flags)
6.         (compare:CCZ (reg:DI 82 [ _1 ])
7.             (const_int 0 [0]))) "p0190r4_fig20.c":44:6 8 {*cmpdi_ccno_1}
8.      (expr_list:REG_DEAD (reg/f:DI 84 [ p.2_3 ])
9.         (nil)))

The instruction starting at line 1 is changed into the instruction starting at line 5, which now refers to _1 instead of p.2_3. _1 is a temporary, declared as "long unsigned int _1;" in thread1() in the .optimized code. I believe this breaks the dependencies specified by the user, and that we need to add some code to cse.c to prevent it.

Also, many versions of the variable 'p' are created, as shown in the .optimized code. They should all have the _Dependent_ptr qualification, which they currently do not. I believe that simply bypassing the tree passes through volatile checks will not mark them as _Dependent_ptr qualified. For this, I believe we need to tweak the SSA generation pass (tree-ssa.c) somewhat.

Thank you all, and please let me know if I am wrong anywhere.

-Akshat