RE: [PATCH][GCC][mid-end] Fix PR85123 incorrect copies
Patch has been reverted as r259169 as requested. Sorry for the breakage, Tamar. > -Original Message- > From: Richard Biener <rguent...@suse.de> > Sent: Friday, April 6, 2018 12:36 > To: Eric Botcazou <ebotca...@adacore.com> > Cc: gcc-patches@gcc.gnu.org; Alan Modra <amo...@gmail.com>; Tamar > Christina <tamar.christ...@arm.com>; nd <n...@arm.com>; > l...@redhat.com; i...@airs.com; berg...@vnet.ibm.com > Subject: Re: [PATCH][GCC][mid-end] Fix PR85123 incorrect copies > > On Fri, 6 Apr 2018, Eric Botcazou wrote: > > > > I wonder if it's best to revert the original regression causing > > > patch and look for a proper solution in the GCC 9 timeframe? > > > > Probably the best course of action indeed, as changes in this area > > need to be thoroughly tested on various targets to cover all the cases. > > So please go ahead with that now. If we wind up with an obviously correct > solution we can reconsider later, backporting to GCC 8.2+ is possible as well. > > Thanks, > Richard.
RE: [PATCH][GCC][mid-end] Fix PR85123 incorrect copies
> -Original Message- > From: Richard Biener <rguent...@suse.de> > Sent: Friday, April 6, 2018 12:36 > To: Eric Botcazou <ebotca...@adacore.com> > Cc: gcc-patches@gcc.gnu.org; Alan Modra <amo...@gmail.com>; Tamar > Christina <tamar.christ...@arm.com>; nd <n...@arm.com>; > l...@redhat.com; i...@airs.com; berg...@vnet.ibm.com > Subject: Re: [PATCH][GCC][mid-end] Fix PR85123 incorrect copies > > On Fri, 6 Apr 2018, Eric Botcazou wrote: > > > > I wonder if it's best to revert the original regression causing > > > patch and look for a proper solution in the GCC 9 timeframe? > > > > Probably the best course of action indeed, as changes in this area > > need to be thoroughly tested on various targets to cover all the cases. > > So please go ahead with that now. If we wind up with an obviously correct > solution we can reconsider later, backporting to GCC 8.2+ is possible as well. I was going to suggest a more conservative approach of only doing it if there is no Padding correction. Since it seems to me the only targets that have an issue are those which use padding corrections. But I'll revert the patch then. > > Thanks, > Richard.
Re: [PATCH][GCC][mid-end] Fix PR85123 incorrect copies
On Fri, 6 Apr 2018, Eric Botcazou wrote: > > I wonder if it's best to revert the original regression causing patch > > and look for a proper solution in the GCC 9 timeframe? > > Probably the best course of action indeed, as changes in this area need to be > thoroughly tested on various targets to cover all the cases. So please go ahead with that now. If we wind up with an obviously correct solution we can reconsider later, backporting to GCC 8.2+ is possible as well. Thanks, Richard.
Re: [PATCH][GCC][mid-end] Fix PR85123 incorrect copies
> I wonder if it's best to revert the original regression causing patch > and look for a proper solution in the GCC 9 timeframe? Probably the best course of action indeed, as changes in this area need to be thoroughly tested on various targets to cover all the cases. -- Eric Botcazou
Re: [PATCH][GCC][mid-end] Fix PR85123 incorrect copies
On Fri, 6 Apr 2018, Alan Modra wrote: > On Thu, Apr 05, 2018 at 01:29:06PM +0100, Tamar Christina wrote: > > diff --git a/gcc/expr.c b/gcc/expr.c > > index > > 00660293f72e5441a6421a280b04c57fca2922b8..7daeb8c91d758edf0b3dc37f6927380b6f3df877 > > 100644 > > --- a/gcc/expr.c > > +++ b/gcc/expr.c > > @@ -2749,7 +2749,7 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src) > > { > >int i, n_regs; > >unsigned HOST_WIDE_INT bitpos, xbitpos, padding_correction = 0, bytes; > > - unsigned int bitsize; > > + unsigned int bitsize = 0; > >rtx *dst_words, dst, x, src_word = NULL_RTX, dst_word = NULL_RTX; > >/* No current ABI uses variable-sized modes to pass a BLKmnode type. */ > >fixed_size_mode mode = as_a (mode_in); > > @@ -2782,7 +2782,7 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src) > > > >n_regs = (bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD; > >dst_words = XALLOCAVEC (rtx, n_regs); > > - bitsize = BITS_PER_WORD; > > + > >if (targetm.slow_unaligned_access (word_mode, TYPE_ALIGN (TREE_TYPE > > (src > > bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD); > > > > You calculate bitsize here, then override it in the loop? Doesn't > that mean strict align targets will use mis-aligned loads and stores? > > > @@ -2791,6 +2791,17 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src) > > bitpos < bytes * BITS_PER_UNIT; > > bitpos += bitsize, xbitpos += bitsize) > > { > > + /* Find the largest integer mode that can be used to copy all or as > > +many bits as possible of the structure. */ > > + opt_scalar_int_mode mode_iter; > > + FOR_EACH_MODE_IN_CLASS (mode_iter, MODE_INT) > > + if (GET_MODE_BITSIZE (mode_iter.require ()) > > + <= ((bytes * BITS_PER_UNIT) - bitpos) > > + && GET_MODE_BITSIZE (mode_iter.require ()) <= BITS_PER_WORD) > > + bitsize = GET_MODE_BITSIZE (mode_iter.require ()); > > + else > > + break; > > + > > This isn't correct. Consider a 6 byte struct on a 4 byte word, 8 bit > byte, big-endian target when targetm.calls.return_in_msb is false. > > In this scenario, copy_blkmode_to_reg should return two registers, set > as if they had been loaded from two words in memory laid out as > follows (left to right increasing byte addresses): > ___ ___ > | 0 | 0 | s0 | s1 | | s2 | s3 | s4 | s5 | > ~~~ ~~~ > > So we will have xbitpos=16 first time around the loop. That means > your new code will attempt to store 32 bits into a bit-field starting > at bit 16 in the first 32-bit register, and of course fail. > > This scenario used to be handled correctly, at least when the struct > wasn't over-aligned. I wonder if it's best to revert the original regression causing patch and look for a proper solution in the GCC 9 timeframe? Thanks, Richard.
Re: [PATCH][GCC][mid-end] Fix PR85123 incorrect copies
On Thu, Apr 05, 2018 at 01:29:06PM +0100, Tamar Christina wrote: > diff --git a/gcc/expr.c b/gcc/expr.c > index > 00660293f72e5441a6421a280b04c57fca2922b8..7daeb8c91d758edf0b3dc37f6927380b6f3df877 > 100644 > --- a/gcc/expr.c > +++ b/gcc/expr.c > @@ -2749,7 +2749,7 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src) > { >int i, n_regs; >unsigned HOST_WIDE_INT bitpos, xbitpos, padding_correction = 0, bytes; > - unsigned int bitsize; > + unsigned int bitsize = 0; >rtx *dst_words, dst, x, src_word = NULL_RTX, dst_word = NULL_RTX; >/* No current ABI uses variable-sized modes to pass a BLKmnode type. */ >fixed_size_mode mode = as_a (mode_in); > @@ -2782,7 +2782,7 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src) > >n_regs = (bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD; >dst_words = XALLOCAVEC (rtx, n_regs); > - bitsize = BITS_PER_WORD; > + >if (targetm.slow_unaligned_access (word_mode, TYPE_ALIGN (TREE_TYPE > (src > bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD); > You calculate bitsize here, then override it in the loop? Doesn't that mean strict align targets will use mis-aligned loads and stores? > @@ -2791,6 +2791,17 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src) > bitpos < bytes * BITS_PER_UNIT; > bitpos += bitsize, xbitpos += bitsize) > { > + /* Find the largest integer mode that can be used to copy all or as > + many bits as possible of the structure. */ > + opt_scalar_int_mode mode_iter; > + FOR_EACH_MODE_IN_CLASS (mode_iter, MODE_INT) > + if (GET_MODE_BITSIZE (mode_iter.require ()) > + <= ((bytes * BITS_PER_UNIT) - bitpos) > + && GET_MODE_BITSIZE (mode_iter.require ()) <= BITS_PER_WORD) > + bitsize = GET_MODE_BITSIZE (mode_iter.require ()); > + else > + break; > + This isn't correct. Consider a 6 byte struct on a 4 byte word, 8 bit byte, big-endian target when targetm.calls.return_in_msb is false. In this scenario, copy_blkmode_to_reg should return two registers, set as if they had been loaded from two words in memory laid out as follows (left to right increasing byte addresses): ___ ___ | 0 | 0 | s0 | s1 | | s2 | s3 | s4 | s5 | ~~~ ~~~ So we will have xbitpos=16 first time around the loop. That means your new code will attempt to store 32 bits into a bit-field starting at bit 16 in the first 32-bit register, and of course fail. This scenario used to be handled correctly, at least when the struct wasn't over-aligned. -- Alan Modra Australia Development Lab, IBM
[PATCH][GCC][mid-end] Fix PR85123 incorrect copies
Hi All, This patch fixes the code generation for copy_blkmode_to_reg by calculating the bitsize per iteration doing the maximum copy allowed that does not read more than the amount of bits left to copy. This fixes the bad code generation reported and also still produces better code in most cases. For targets that don't support fast unaligned access it defaults to the old 1 byte copy (MIN alignment). This produces for the copying of a 3 byte structure: fun3: adrpx1, .LANCHOR0 add x1, x1, :lo12:.LANCHOR0 mov x0, 0 sub sp, sp, #16 ldrhw2, [x1, 16] ldrbw1, [x1, 18] add sp, sp, 16 bfi x0, x2, 0, 16 bfi x0, x1, 16, 8 ret whereas before it was producing fun3: adrpx0, .LANCHOR0 add x2, x0, :lo12:.LANCHOR0 sub sp, sp, #16 ldrhw1, [x0, #:lo12:.LANCHOR0] ldrbw0, [x2, 2] strhw1, [sp, 8] strbw0, [sp, 10] ldr w0, [sp, 8] add sp, sp, 16 ret Cross compiled on aarch64-none-elf and no issues Bootstrapped powerpc64-unknown-linux-gnu, x86_64-pc-linux-gnu, arm-none-linux-gnueabihf, aarch64-none-linux-gnu with no issues. Regtested aarch64-none-elf, x86_64-pc-linux-gnu, powerpc64-unknown-linux-gnu and arm-none-linux-gnueabihf and found no issues. Regression on powerpc (pr63594-2.c) is fixed now. OK for trunk? Thanks, Tamar gcc/ 2018-04-05 Tamar ChristinaPR middle-end/85123 * expr.c (copy_blkmode_to_reg): Fix wrong code gen. -- diff --git a/gcc/expr.c b/gcc/expr.c index 00660293f72e5441a6421a280b04c57fca2922b8..7daeb8c91d758edf0b3dc37f6927380b6f3df877 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -2749,7 +2749,7 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src) { int i, n_regs; unsigned HOST_WIDE_INT bitpos, xbitpos, padding_correction = 0, bytes; - unsigned int bitsize; + unsigned int bitsize = 0; rtx *dst_words, dst, x, src_word = NULL_RTX, dst_word = NULL_RTX; /* No current ABI uses variable-sized modes to pass a BLKmnode type. */ fixed_size_mode mode = as_a (mode_in); @@ -2782,7 +2782,7 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src) n_regs = (bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD; dst_words = XALLOCAVEC (rtx, n_regs); - bitsize = BITS_PER_WORD; + if (targetm.slow_unaligned_access (word_mode, TYPE_ALIGN (TREE_TYPE (src bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD); @@ -2791,6 +2791,17 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src) bitpos < bytes * BITS_PER_UNIT; bitpos += bitsize, xbitpos += bitsize) { + /* Find the largest integer mode that can be used to copy all or as + many bits as possible of the structure. */ + opt_scalar_int_mode mode_iter; + FOR_EACH_MODE_IN_CLASS (mode_iter, MODE_INT) + if (GET_MODE_BITSIZE (mode_iter.require ()) + <= ((bytes * BITS_PER_UNIT) - bitpos) + && GET_MODE_BITSIZE (mode_iter.require ()) <= BITS_PER_WORD) + bitsize = GET_MODE_BITSIZE (mode_iter.require ()); + else + break; + /* We need a new destination pseudo each time xbitpos is on a word boundary and when xbitpos == padding_correction (the first time through). */