Re: [PATCH][GCC][AArch64] Simplify movmem code by always doing overlapping copies when larger than 8 bytes.

2018-07-06 Thread Christophe Lyon
Hi Tamar,

On Tue, 3 Jul 2018 at 19:13, James Greenhalgh wrote:
>
> On Tue, Jun 19, 2018 at 09:09:27AM -0500, Tamar Christina wrote:
> > Hi All,
> >
>
> OK.
>
> Thanks,
> James
>
> > Thanks,
> > Tamar
> >
> > gcc/
> > 2018-06-19 Tamar Christina
> >
> > * config/aarch64/aarch64.c

Re: [PATCH][GCC][AArch64] Simplify movmem code by always doing overlapping copies when larger than 8 bytes.

2018-07-03 Thread James Greenhalgh
On Tue, Jun 19, 2018 at 09:09:27AM -0500, Tamar Christina wrote:
> Hi All,

OK.

Thanks,
James

> Thanks,
> Tamar
>
> gcc/
> 2018-06-19 Tamar Christina
>
> * config/aarch64/aarch64.c (aarch64_expand_movmem): Fix mode size.
>
> gcc/testsuite/
> 2018-06-19 Tamar Christina
>

Re: [PATCH][GCC][AArch64] Simplify movmem code by always doing overlapping copies when larger than 8 bytes.

2018-06-20 Thread Tamar Christina
Hi James,

Many thanks for the review!

The 06/19/2018 22:23, James Greenhalgh wrote:
> On Tue, Jun 19, 2018 at 09:09:27AM -0500, Tamar Christina wrote:
> > Hi All,
> >
> > This changes the movmem code in AArch64 that does copy for data between 4 and 7
> > bytes to use the smallest possible

Re: [PATCH][GCC][AArch64] Simplify movmem code by always doing overlapping copies when larger than 8 bytes.

2018-06-19 Thread James Greenhalgh
On Tue, Jun 19, 2018 at 09:09:27AM -0500, Tamar Christina wrote:
> Hi All,
>
> This changes the movmem code in AArch64 that does copy for data between 4 and 7
> bytes to use the smallest possible mode capable of copying the remaining bytes.
>
> This means that if we're copying 5 bytes we

[PATCH][GCC][AArch64] Simplify movmem code by always doing overlapping copies when larger than 8 bytes.

2018-06-19 Thread Tamar Christina
Hi All,

This changes the movmem code in AArch64 that does copy for data between 4 and 7 bytes to use the smallest possible mode capable of copying the remaining bytes.

This means that if we're copying 5 bytes we would issue an SImode and QImode load instead of two SImode loads. This does
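
For readers who want to see the idea in the message above concretely, here is a rough sketch in plain C of choosing the smallest access width for the tail of a 4 to 7 byte copy. It is not the code from the patch and does not use GCC's internal APIs. The helper names smallest_width and copy_4_to_7 are hypothetical, the widths 1, 2 and 4 stand in for QImode, HImode and SImode, and the handling of a 3-byte tail (a 4-byte access moved back so that it overlaps the first one) is an assumption based on the subject line rather than something this snippet states.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: smallest power-of-two access width that covers
   `remaining` bytes (1 ~ QImode, 2 ~ HImode, 4 ~ SImode).  */
static size_t smallest_width(size_t remaining)
{
    size_t w = 1;
    while (w < remaining)
        w *= 2;
    return w;
}

/* Hypothetical sketch: copy n bytes, 4 <= n <= 7, as one 4-byte access
   followed by at most one tail access of the smallest covering width.
   Assumption: if the tail access is wider than the bytes left, it is
   moved back so that it ends at byte n and overlaps the first access.  */
static void copy_4_to_7(unsigned char *dst, const unsigned char *src, size_t n)
{
    assert(n >= 4 && n <= 7);

    memcpy(dst, src, 4);               /* first SImode-sized access */

    size_t remaining = n - 4;
    if (remaining == 0)
        return;

    size_t w = smallest_width(remaining);
    size_t off = n - w;                /* tail access ends exactly at byte n */
    memcpy(dst + off, src + off, w);
}
```

With this sketch a 5-byte copy becomes a 4-byte access plus a 1-byte access at offset 4, a 6-byte copy becomes 4 plus 2, and a 7-byte copy becomes two 4-byte accesses at offsets 0 and 3.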