Re: [RFC] Fixing expansion of misaligned MEM_REFs on strict-alignment targets
On Fri, 6 Jan 2012, Martin Jambor wrote:
> Hi,
>
> I'm trying to teach our expander how to deal with misaligned MEM_REFs on
> strict-alignment targets. We currently generate code which leads to bus
> error signals due to misaligned accesses. [...]
>
> I dug in expr.c and found two places which handle misaligned MEM_REF
> loads and stores respectively, but only if there is a special
> movmisalign_optab operation available for the given mode. My approach
> therefore was to append calls to extract_bit_field and store_bit_field,
> which seem to be the part of the expander capable of dealing with
> misaligned memory accesses. The patch is below; it fixes both testcases
> on sparc64-linux-gnu. Is this approach generally the right thing to do?
> [...]

One issue that I am running into now is that we need to robustify/change
expand_assignment quite a bit. I have a patch for SRA that makes sure to
create properly aligned MEM_REFs, but then we have, for example,

  MEM[p].m = ...

and in the handled_component_p path of expand_assignment we happily expand
MEM[p] via expand_normal (for multiple use). That of course breaks, as
doing so might go to the rvalue-producing movmisalign path, miscompiling
the store. Even if we handle this case specially in expand_normal, the
UNSPEC that x86 for example might produce most certainly isn't safe for
reuse. Similar for the path that would use extract_bit_field (which would
need to use store_bit_field instead). Which means that we need to avoid
calling expand_normal (tem) in those cases, and we probably need to fall
back to a stack temporary for non-trivial cases.

Note that the SRA plus SSE-mode aggregate case is probably a latent
pre-existing issue, as get_object_or_type_alignment might discover
misalignment if we happen to know about an explicit misalignment.

So, I am going to try to address this issue first.

Richard.
Re: [RFC] Fixing expansion of misaligned MEM_REFs on strict-alignment targets
On Tue, 24 Jan 2012, Richard Guenther wrote:
> One issue that I am running into now is that we need to robustify/change
> expand_assignment quite a bit. [...] Which means that we need to avoid
> calling expand_normal (tem) in those cases, and we probably need to fall
> back to a stack temporary for non-trivial cases.
>
> So, I am going to try to address this issue first.

Like with the following, currently testing on x86_64-linux. The idea is to
simply simulate a (piecewise) store into a pseudo (which we hopefully can
handle well enough - almost all cases can happen right now, as we have
MEM_REFs) and simply delay the misaligned move until the rvalue is ready.
That fixes my current issue, but it might not scale for a possible
store_bit_field path - I suppose that instead both
optimize_bitfield_assignment_op and store_field would have to handle the
misalignment themselves (to_rtx is already a MEM with MEM_ALIGN indicating
the non-mode alignment).

Richard.

2012-01-24  Richard Guenther  <rguent...@suse.de>

	PR tree-optimization/50444
	* expr.c (expand_assignment): Handle misaligned bases
	consistently, even when wrapped inside component references.
Index: gcc/expr.c
===================================================================
*** gcc/expr.c	(revision 183470)
--- gcc/expr.c	(working copy)
*************** expand_assignment (tree to, tree from, b
*** 4556,4564 ****
  {
    rtx to_rtx = 0;
    rtx result;
-   enum machine_mode mode;
-   unsigned int align;
-   enum insn_code icode;
  
    /* Don't crash if the lhs of the assignment was erroneous.  */
    if (TREE_CODE (to) == ERROR_MARK)
--- 4556,4561 ----
*************** expand_assignment (tree to, tree from, b
*** 4571,4647 ****
    if (operand_equal_p (to, from, 0))
      return;
  
-   mode = TYPE_MODE (TREE_TYPE (to));
-   if ((TREE_CODE (to) == MEM_REF
-        || TREE_CODE (to) == TARGET_MEM_REF)
-       && mode != BLKmode
-       && ((align = get_object_or_type_alignment (to))
- 	  < GET_MODE_ALIGNMENT (mode))
-       && ((icode = optab_handler (movmisalign_optab, mode))
- 	  != CODE_FOR_nothing))
-     {
-       struct expand_operand ops[2];
-       enum machine_mode address_mode;
-       rtx reg, op0, mem;
- 
-       reg = expand_expr (from, NULL_RTX, VOIDmode, EXPAND_NORMAL);
-       reg = force_not_mem (reg);
- 
-       if (TREE_CODE (to) == MEM_REF)
- 	{
- 	  addr_space_t as
- 	    = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (to, 0))));
- 	  tree base = TREE_OPERAND (to, 0);
- 	  address_mode = targetm.addr_space.address_mode (as);
- 	  op0 = expand_expr (base, NULL_RTX, VOIDmode, EXPAND_NORMAL);
- 	  op0 = convert_memory_address_addr_space (address_mode, op0, as);
- 	  if (!integer_zerop (TREE_OPERAND (to, 1)))
- 	    {
- 	      rtx off
- 		= immed_double_int_const (mem_ref_offset (to), address_mode);
- 	      op0 = simplify_gen_binary (PLUS, address_mode, op0, off);
- 	    }
- 	  op0 = memory_address_addr_space (mode, op0, as);
- 	  mem = gen_rtx_MEM (mode, op0);
- 	  set_mem_attributes (mem, to, 0);
- 	  set_mem_addr_space (mem, as);
- 	}
-       else if (TREE_CODE (to) == TARGET_MEM_REF)
- 	{
- 	  addr_space_t as
- 	    = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (to, 0))));
- 	  struct mem_address addr;
- 
- 	  get_address_description (to, &addr);
- 	  op0 = addr_for_mem_ref (&addr, as, true);
- 	  op0 = memory_address_addr_space (mode, op0, as);
- 	  mem = gen_rtx_MEM (mode, op0);
- 	  set_mem_attributes (mem, to, 0);
- 	  set_mem_addr_space (mem, as);
- 	}
-       else
- 	gcc_unreachable ();
- 
-       if (TREE_THIS_VOLATILE (to))
- 	MEM_VOLATILE_P (mem) = 1;
- 
-       create_fixed_operand (&ops[0], mem);
-       create_input_operand (&ops[1], reg, mode);
-       /* The movmisalign<mode> pattern cannot fail, else the assignment
- 	 would silently be omitted.  */
-       expand_insn (icode, 2, ops);
-       return;
-     }
- 
    /* Assignment of a structure component needs special treatment
       if the structure component's rtx is not simply a MEM.
Re: [RFC] Fixing expansion of misaligned MEM_REFs on strict-alignment targets
> Eric, do you have any objections in principle of handling
> get_object_or_type_alignment () < GET_MODE_ALIGNMENT (mode) stores/loads
> this way?

None, extract_bit_field/store_bit_field are the right devices for this
purpose. What I have objections against is to have types that are less
aligned than their modes, but I guess that I can look elsewhere when there
is one. :-)

-- 
Eric Botcazou
Re: [RFC] Fixing expansion of misaligned MEM_REFs on strict-alignment targets
On Tue, 17 Jan 2012, Eric Botcazou wrote:
> > Eric, do you have any objections in principle of handling
> > get_object_or_type_alignment () < GET_MODE_ALIGNMENT (mode) stores/loads
> > this way?
>
> None, extract_bit_field/store_bit_field are the right devices for this
> purpose. What I have objections against is to have types that are less
> aligned than their modes, but I guess that I can look elsewhere when
> there is one. :-)

:-) Not sure if it would be easier to have integer types with BLKmode in
GIMPLE though ... that surely will have more fallout. Well, another reason
to look back at the first mem-ref branch, where it had an explicit
alignment operand (and thus no need to encode alignment info in the type).
Thus, consider that type-with-alignment-less-than-its-mode an artifact of
that.

Richard.
Re: [RFC] Fixing expansion of misaligned MEM_REFs on strict-alignment targets
On Fri, 6 Jan 2012, Martin Jambor wrote:
> Hi,
>
> I'm trying to teach our expander how to deal with misaligned MEM_REFs on
> strict-alignment targets. We currently generate code which leads to bus
> error signals due to misaligned accesses. [...]
>
> I dug in expr.c and found two places which handle misaligned MEM_REF
> loads and stores respectively, but only if there is a special
> movmisalign_optab operation available for the given mode. My approach
> therefore was to append calls to extract_bit_field and store_bit_field,
> which seem to be the part of the expander capable of dealing with
> misaligned memory accesses. The patch is below; it fixes both testcases
> on sparc64-linux-gnu. Is this approach generally the right thing to do?
>
> And of course, since my knowledge of RTL and the expander is very
> limited, I expect that there will be quite many suggestions about its
> various particular aspects. [...]

The idea is good (well, I suppose I suggested it ... ;))

> One more question: I'd like to be able to handle misaligned loads and
> stores of SSE vectors this way too, but then of course I cannot use
> STRICT_ALIGNMENT as the guard but need a more elaborate predicate. I
> assume it must already exist, which one is it?

There is none :/ STRICT_ALIGNMENT would need to get a mode argument, but
reality is that non-STRICT_ALIGNMENT targets at the moment need to have
their movmisalign optab trigger for all cases that will not work when
misaligned.

> --- 4572,4653 ----
>         || TREE_CODE (to) == TARGET_MEM_REF)
>         && mode != BLKmode
>         && ((align = get_object_or_type_alignment (to))
> ! 	  < GET_MODE_ALIGNMENT (mode)))
>       {
>         enum machine_mode address_mode;
> !       rtx reg;
>  
>         reg = expand_expr (from, NULL_RTX, VOIDmode, EXPAND_NORMAL);
>         reg = force_not_mem (reg);
>  
> !       if ((icode = optab_handler (movmisalign_optab, mode))
> ! 	  != CODE_FOR_nothing)
>  	{
> + 	  struct expand_operand ops[2];
> + 	  rtx op0, mem;
> + 
> + 	  if (TREE_CODE (to) == MEM_REF)
> + 	    {
> + 	      addr_space_t as
> + 		= TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (to, 1))));
> + 	      tree base = TREE_OPERAND (to, 0);
> + 	      address_mode = targetm.addr_space.address_mode (as);
> + 	      op0 = expand_expr (base, NULL_RTX, VOIDmode, EXPAND_NORMAL);
> + 	      op0 = convert_memory_address_addr_space (address_mode, op0, as);
> + 	      if (!integer_zerop (TREE_OPERAND (to, 1)))
> + 		{
> + 		  rtx off
> + 		    = immed_double_int_const (mem_ref_offset (to), address_mode);
> + 		  op0 = simplify_gen_binary (PLUS, address_mode, op0, off);
> + 		}
> + 	      op0 = memory_address_addr_space (mode, op0, as);
> + 	      mem = gen_rtx_MEM (mode, op0);
> + 	      set_mem_attributes (mem, to, 0);
> + 	      set_mem_addr_space (mem, as);
> + 	    }
> + 	  else if (TREE_CODE (to) == TARGET_MEM_REF)
> + 	    {
> + 	      addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (to));
> + 	      struct mem_address addr;
> + 
> + 	      get_address_description (to, &addr);
> + 	      op0 = addr_for_mem_ref (&addr, as, true);
> + 	      op0 = memory_address_addr_space
Re: [RFC] Fixing expansion of misaligned MEM_REFs on strict-alignment targets
On Tue, 10 Jan 2012, Richard Guenther wrote:
> There is none :/ STRICT_ALIGNMENT would need to get a mode argument,

The version of STRICT_ALIGNMENT with a mode argument is
SLOW_UNALIGNED_ACCESS (from GCC's perspective, there isn't much difference
between "unaligned accesses don't work at all" and "unaligned accesses are
very slow because they trap" - we don't want to generate them in either
case). Probably we should migrate STRICT_ALIGNMENT uses to call
SLOW_UNALIGNED_ACCESS (or a version thereof converted to a target hook)
with appropriate arguments, with a view to poisoning STRICT_ALIGNMENT.

-- 
Joseph S. Myers
jos...@codesourcery.com
[RFC] Fixing expansion of misaligned MEM_REFs on strict-alignment targets
Hi,

I'm trying to teach our expander how to deal with misaligned MEM_REFs on
strict-alignment targets. We currently generate code which leads to bus
error signals due to misaligned accesses. I admit my motivation is not any
target in particular but simply being able to produce misaligned MEM_REFs
in SRA; currently we work around that by producing COMPONENT_REFs, which
causes quite a few headaches. Nevertheless, I started by following Richi's
advice and set out to fix the following two simple testcases on a
strict-alignment platform, a sparc64 in the compile farm. If I understood
him correctly, Richi claimed they have been failing since forever:

- test case 1: -

extern void abort ();
typedef unsigned int myint __attribute__((aligned(1)));
/* even without the attributes we get bus error */

unsigned int __attribute__((noinline, noclone))
foo (myint *p)
{
  return *p;
}

struct blah
{
  char c;
  myint i;
};

struct blah g;

#define cst 0xdeadbeef

int
main (int argc, char **argv)
{
  int i;

  g.i = cst;
  i = foo (&g.i);
  if (i != cst)
    abort ();
  return 0;
}

- test case 2: -

extern void abort ();
typedef unsigned int myint __attribute__((aligned(1)));

void __attribute__((noinline, noclone))
foo (myint *p, unsigned int i)
{
  *p = i;
}

struct blah
{
  char c;
  myint i;
};

struct blah g;

#define cst 0xdeadbeef

int
main (int argc, char **argv)
{
  foo (&g.i, cst);
  if (g.i != cst)
    abort ();
  return 0;
}

I dug in expr.c and found two places which handle misaligned MEM_REF loads
and stores respectively, but only if there is a special movmisalign_optab
operation available for the given mode. My approach therefore was to
append calls to extract_bit_field and store_bit_field, which seem to be
the part of the expander capable of dealing with misaligned memory
accesses. The patch is below; it fixes both testcases on
sparc64-linux-gnu. Is this approach generally the right thing to do?
And of course, since my knowledge of RTL and the expander is very limited,
I expect that there will be quite many suggestions about its various
particular aspects. I have run the C and C++ testsuites with the second
hunk in place without any problems; the same test of the whole patch is
under way right now, but it takes quite a lot of time, therefore most
probably I won't have the results today. Of course I plan to do a
bootstrap and at least Fortran checking on this platform too, but that is
really going to take some time and I'd like to hear any comments before
that.

One more question: I'd like to be able to handle misaligned loads and
stores of SSE vectors this way too, but then of course I cannot use
STRICT_ALIGNMENT as the guard but need a more elaborate predicate. I
assume it must already exist, which one is it?

Thanks a lot in advance for any feedback,

Martin

*** /gcc/trunk/src/gcc/expr.c	Mon Jan  2 11:47:54 2012
--- expr.c	Fri Jan  6 16:39:34 2012
*************** expand_assignment (tree to, tree from, b
*** 4572,4630 ****
        || TREE_CODE (to) == TARGET_MEM_REF)
        && mode != BLKmode
        && ((align = get_object_or_type_alignment (to))
! 	  < GET_MODE_ALIGNMENT (mode))
!       && ((icode = optab_handler (movmisalign_optab, mode))
! 	  != CODE_FOR_nothing))
      {
-       struct expand_operand ops[2];
        enum machine_mode address_mode;
!       rtx reg, op0, mem;
  
        reg = expand_expr (from, NULL_RTX, VOIDmode, EXPAND_NORMAL);
        reg = force_not_mem (reg);
  
!       if (TREE_CODE (to) == MEM_REF)
  	{
  	  addr_space_t as
! 	    = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (to, 1))));
  	  tree base = TREE_OPERAND (to, 0);
  	  address_mode = targetm.addr_space.address_mode (as);
  	  op0 = expand_expr (base, NULL_RTX, VOIDmode, EXPAND_NORMAL);
  	  op0 = convert_memory_address_addr_space (address_mode, op0, as);
- 	  if (!integer_zerop (TREE_OPERAND (to, 1)))
- 	    {
- 	      rtx off
- 		= immed_double_int_const (mem_ref_offset (to), address_mode);
- 	      op0 = simplify_gen_binary (PLUS, address_mode, op0, off);
- 	    }
  	  op0 = memory_address_addr_space (mode, op0, as);
  	  mem = gen_rtx_MEM (mode, op0);
  	  set_mem_attributes (mem, to, 0);
  	  set_mem_addr_space (mem, as);
- 	}
-       else if (TREE_CODE (to) == TARGET_MEM_REF)
- 	{
- 	  addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (to));
- 	  struct mem_address addr;
  
! 	  get_address_description (to, &addr);
! 	  op0 = addr_for_mem_ref (&addr, as, true);
! 	  op0 = memory_address_addr_space (mode, op0, as);
! 	  mem = gen_rtx_MEM (mode, op0);
! 	  set_mem_attributes (mem, to, 0);
! 	  set_mem_addr_space (mem, as);
  	}
-       else
- 	gcc_unreachable ();
- 
-       if (TREE_THIS_VOLATILE (to))
- 	MEM_VOLATILE_P (mem) = 1;
- 
-       create_fixed_operand (&ops[0], mem);
-       create_input_operand