Hello, I have a problem with a transformation I'm working on and I would appreciate some help. The transformation I am working on removes fields in structs early during link-time. For the purposes of development and this example, my transformation deletes the field identified as "delete_me" from the struct identified as "astruct_s". These identifiers are hard coded in the transformation at the moment.
For example:

```c
int main()
{
  struct astruct_s { _Bool a; _Bool delete_me; _Bool c;};
  // more
}
```

should be equivalent to

```c
int main()
{
  struct astruct_s { _Bool a; _Bool c;};
  // more
}
```

as long as no instruction accesses field "delete_me".

I have succeeded in eliminating field "delete_me" from struct "astruct_s" while correctly calculating field offsets and array offsets for a subset of the C syntax. I am working on expanding the allowed syntax and, at the same time, creating tests to verify that my work still produces correct results. I was starting work on supporting arrays of multiple dimensions when I found an interesting edge case in my transformation: I was able to transform structs of size 2, 3, (but not 4), 5, 6, 7, (but not 8), 9, 10...

This was the stack trace when the error was triggered:

```
a.c: In function ‘main’:
a.c:11:19: internal compiler error: in convert_move, at expr.c:219
   11 |  struct astruct_s b = a[argc][argc];
      |                   ^
0xb8bac3 convert_move(rtx_def*, rtx_def*, int)
        /home/eochoa/code/gcc/gcc/expr.c:219
0xb9f5cf store_expr(tree_node*, rtx_def*, int, bool, bool)
        /home/eochoa/code/gcc/gcc/expr.c:5825
0xb9d913 expand_assignment(tree_node*, tree_node*, bool)
        /home/eochoa/code/gcc/gcc/expr.c:5509
0xa08bfb expand_gimple_stmt_1
        /home/eochoa/code/gcc/gcc/cfgexpand.c:3746
0xa09047 expand_gimple_stmt
        /home/eochoa/code/gcc/gcc/cfgexpand.c:3844
0xa1170f expand_gimple_basic_block
        /home/eochoa/code/gcc/gcc/cfgexpand.c:5884
0xa134b7 execute
        /home/eochoa/code/gcc/gcc/cfgexpand.c:6539
Please submit a full bug report,
```

Looking at expr.c:219 I found the following assertions:

```c
/* Copy data from FROM to TO, where the machine modes are not the same.
   Both modes may be integer, or both may be floating, or both may be
   fixed-point.
   UNSIGNEDP should be nonzero if FROM is an unsigned type.
   This causes zero-extension instead of sign-extension.
*/
void
convert_move (rtx to, rtx from, int unsignedp)
{
  machine_mode to_mode = GET_MODE (to);
  machine_mode from_mode = GET_MODE (from);

  gcc_assert (to_mode != BLKmode);
  gcc_assert (from_mode != BLKmode); /* <-- crashes here */
```

I started reading the GCC internals documentation on machine modes:

https://gcc.gnu.org/onlinedocs/gccint/Machine-Modes.html

and tried an experiment where I first compiled a struct of size 2 (deleting field "delete_me"), then one of size 3, and so on. I noticed that the TYPE_MODE of the struct matches the machine mode, and that it varies with the size of the struct. (This agrees with the definition of machine mode.)

I originally thought that I needed to set TYPE_MODE myself, but if layout_type is called after deleting the field (which it is), then TYPE_MODE is correctly set somewhere within layout_type:

https://github.com/gcc-mirror/gcc/blob/68697710fdd35077e8617f493044b0ea717fc01a/gcc/stor-layout.c#L2203

I verified that layout_type sets the correct TYPE_MODE when transforming struct "astruct_s" by comparing against the TYPE_MODE of structs of different sizes compiled without the transformation. When transforming structs, layout_type always returned a TYPE_MODE that matched the TYPE_MODE of an unmodified struct with the same size as the transformed struct (post transformation).

In other words, for variable "struct not_transformed b" without the transformation I obtain the following relationship.

Without transformation:

| size (bytes) | TYPE_MODE |
|--------------|-----------|
| 1            | 13        |
| 2            | 14        |
| 3            | 1         |
| 4            | 15        |
| 5            | 1         |
| 6            | 1         |
| 7            | 1         |
| 8            | 16        |
| 9            | 1         |

With transformation (i.e. "astruct_s b" with a field named "delete_me"):

| size before (bytes) | size after (bytes) | TYPE_MODE |
|---------------------|--------------------|-----------|
| 2                   | 1                  | 13        |
| 3                   | 2                  | 14        |
| 4                   | 3                  | 1         |
| 5                   | 4                  | 15        |
| 6                   | 5                  | 1         |
| 7                   | 6                  | 1         |
| 8                   | 7                  | 1         |
| 9                   | 8                  | 16        |

I have a similar result for array variables of type "struct astruct_s[]".
For the array variable, without modifications:

| size (bytes) | TYPE_MODE |
|--------------|-----------|
| 1            | 14        |
| 2            | 15        |
| 3            | 1         |
| 4            | 16        |
| 5            | 1         |
| 6            | 1         |

With deletion of a field:

| old size (bytes) | size (bytes) | TYPE_MODE |
|------------------|--------------|-----------|
| 2                | 1            | 14        |
| 3                | 2            | 15        |
| 4                | 3            | 1         |
| 5                | 4            | 16        |
| 6                | 5            | 1         |
| 8                | 7            | 1         |
| 9                | 8            | 17        |
| 10               | 9            | 1         |

So, going back to the error and the information that I had collected, I found out that for structs of size 3 (and arrays holding structs of size 3) the assigned TYPE_MODE on my machine should be BLKmode. E.g.

```c
int main()
{
  struct untransformed { _Bool a; _Bool c; _Bool d;};
  struct untransformed b;    // TYPE_MODE == BLKmode
  struct untransformed a[2]; // TYPE_MODE == BLKmode
  b = a[0];
}
```

So, when transforming structs of size 4, initially:

```c
int main()
{
  struct astruct_s { _Bool a; _Bool c; _Bool delete_me; _Bool d;};
  struct astruct_s b;    // TYPE_MODE != BLKmode
  struct astruct_s a[2]; // TYPE_MODE != BLKmode
  b = a[0];
}
```

However, after the struct is transformed, the TYPE_MODE becomes BLKmode.

This means that the assertion that gets triggered is behaving correctly: `from_mode` is indeed BLKmode, which is what I want and expect. What is wrong is the assertion on `to_mode`, which is not triggered: `to_mode` should be BLKmode as well, so that assertion should have fired first. To me this means that we are somehow taking a different execution path and hitting an assertion that we should not have encountered in the first place.

This leads me to believe that I have not changed a TYPE_MODE somewhere in the gimple code, maybe specifically that of variable "b" (since this is where the "to" of the expression `b = a[0]` should be).
However, printing the gimple code after the transformation shows that b has the new type with the correct TYPE_MODE.

Before transformation:

```
Executing structreorg
main (int argc, char * * argv)
{
  struct astruct_s a[2];
  struct astruct_s b;
  int D.10221;

  <bb 2> :
  b = a[0];
  b ={v} {CLOBBER};
  a ={v} {CLOBBER};
  _5 = 0;

  <bb 3> :
<L0>:
  return _5;

}
```

Some output of my pass:

```
modifying,astruct_s
offset,astruct_reorged,a,0
offset,astruct_reorged,c,1
offset,astruct_reorged,d,2
old type_mode 15
new type_mode 1 // This is BLKmode
new type,astruct_reorged
modifying,astruct_s[]
old type_mode 16
new type_mode 1 // This is BLKmode
new type,astruct_reorged[]
```

We can also look at the offending expression in more depth. The type_modes are unchanged on entry here, but they are changed by the end.

```
b = a[0];
<rewrite_expr "b">
< type = astruct_s type_mode = 15>
<rewrite_var_decl "b">
< type = astruct_s type_mode = 15>
</ type = astruct_s type_mode = 15>
</rewrite_var_decl "b">
</ type = astruct_s type_mode = 15>
</rewrite_expr "b">
<rewrite_expr "a[0]">
< type = astruct_s type_mode = 15>
<rewrite_array_ref "a[0]">
< type = astruct_s type_mode = 15>
<rewrite_expr "a">
< type = astruct_s[] type_mode = 16>
<rewrite_var_decl "a">
< type = astruct_s[] type_mode = 16>
</ type = astruct_s[] type_mode = 16>
</ type = astruct_s[] type_mode = 16>
</rewrite_expr "a">
<rewrite_expr "0">
< type = integer_cst type_mode = 15>
</ type = integer_cst type_mode = 15>
</rewrite_expr "0">
</ type = astruct_reorged type_mode = 1>
</rewrite_array_ref "a[0]">
</ type = astruct_reorged type_mode = 1>
</rewrite_expr "a[0]">
// ...SNIP...
<rewrite_expr "{CLOBBER}">
< type = astruct_s type_mode = 15>
<rewrite_constructor "{CLOBBER}">
< type = astruct_s type_mode = 15>
</ type = astruct_reorged type_mode = 1>
</rewrite_constructor "{CLOBBER}">
</ type = astruct_reorged type_mode = 1>
</rewrite_expr "{CLOBBER}">
// ...SNIP...
// Here is where the type modes are definitely modified for
// local variables
rewriting,local_decl struct astruct_s a[2];, struct astruct_reorged a[2];
rewriting,local_decl struct astruct_s b;, struct astruct_reorged b;
```

After the pass finishes, this is the gimple I see:

```
main (int argc, char * * argv)
{
  struct astruct_reorged a[2];
  struct astruct_reorged b;
  int D.10221;

  <bb 2> :
  b = a[0];
  b ={v} {CLOBBER};
  a ={v} {CLOBBER};
  _5 = 0;

  <bb 3> :
<L0>:
  return _5;

}
```

So, just to summarize, the things changed include:

* The type of variable b
* The type of variable a
* The type of expression a[0]
* The type of the {CLOBBER} expression

I have also tried using GDB to get a better grasp on how to fix the problem. I use the following command to explore gcc's run-time state in gdb:

`$HOME/code/gcc-inst/bin/gcc -flto -fipa-typelist -fdump-ipa-typelist a.c -wrapper gdb,--args`

I am able to see that the IPA passes are successfully executed; however, I am never able to trigger a breakpoint during RTL generation. This is how I use gdb:

* I go to the third gdb instance to look at the linker in gdb,
* set catchpoints for fork and vfork,
* and look at inferior process #5, which is where LTO is applied.
* I've tried to set a breakpoint for the symbol "execute" and I mostly just see all the IPA passes, but I do not see pass_expand::execute.
* I've also looked at other inferior processes, but I cannot set a breakpoint before the assertion is hit. GCC just exits normally.

Can anyone help me understand what could possibly be happening? Some possibilities:

* Another LTO pass uses summary information and changes the type back to non-BLKmode? (However, I also tried passing -flto-partition=none to avoid summaries.)
* I am missing setting something in gimple, and I do not know what that could be? (Printing gimple doesn't show all information, but I did try to set everything correctly.)
* I am failing to communicate this change to other link-time optimizations?
  (I am changing the definition of this function, as opposed to creating a clone and then dropping the previous definition.)
* Some other thing?

Any help would be appreciated!

Thanks
-Erick