Re: [RFC][PATCH] Remove a bad use of SLOW_UNALIGNED_ACCESS

Wilco Dijkstra Wed, 02 Nov 2016 06:44:53 -0700

Richard Biener wrote:
On Tue, Nov 1, 2016 at 10:39 PM, Wilco Dijkstra <wilco.dijks...@arm.com> wrote:


> > If bswap is false no byte swap is needed, so we found a native endian load
> > and it will always perform the optimization by inserting an unaligned load.
>
> Yes, the general agreement is that the expander can do best and thus we
> should canonicalize accesses to larger ones even for SLOW_UNALIGNED_ACCESS.
> The expander will generate the canonical best code (hopefully...).

Right, but there are cases where you have to choose between unaligned or aligned
accesses and you need to know whether the unaligned access is fast.

A good example is memcpy expansion: if you have fast unaligned accesses then you
should use them to deal with the last few bytes, but if they get expanded, using
several aligned accesses is much faster than a single unaligned access.
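
To make the memcpy case concrete, here is a rough sketch in plain C of the two
strategies for copying 8 to 16 bytes. This is not what GCC's expander emits;
the function names are made up for illustration, and the fixed-size memcpy
calls are only a portable way to express a possibly unaligned 8-byte access.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy n bytes (8 <= n <= 16) with two 8-byte
   accesses. The second access overlaps the first and is unaligned in
   general, so this only pays off when unaligned accesses are fast.  */
static void copy_8_to_16_overlap(unsigned char *dst, const unsigned char *src,
                                 size_t n)
{
    uint64_t head, tail;
    memcpy(&head, src, 8);          /* first 8 bytes */
    memcpy(&tail, src + n - 8, 8);  /* last 8 bytes, may overlap head */
    memcpy(dst, &head, 8);
    memcpy(dst + n - 8, &tail, 8);
}

/* Hypothetical alternative for targets where an unaligned access would
   be expanded: copy the tail with byte accesses, which are always
   aligned.  */
static void copy_8_to_16_bytewise(unsigned char *dst, const unsigned char *src,
                                  size_t n)
{
    uint64_t head;
    memcpy(&head, src, 8);
    memcpy(dst, &head, 8);
    for (size_t i = 8; i < n; i++)
        dst[i] = src[i];
}
```

Both produce the same result; which one is faster is exactly the question
the compiler has to be able to answer for the target.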

> > This apparently works on all targets, and doesn't cause alignment traps or
> > huge slowdowns via trap emulation claimed by SLOW_UNALIGNED_ACCESS.
> > So I'm at a loss what these macros are supposed to mean and how I can query
> > whether a backend supports fast unaligned access for a particular mode.
> >
> > What I actually want to write is something like:
> >
> >  if (!FAST_UNALIGNED_LOAD (mode, align)) return false;
> >
> > And know that it only accepts unaligned accesses that are efficient on the 
> > target.
> > Maybe we need a new hook like this and get rid of the old one?
>
> No, we don't need another hook.
> 
> Note there is another similar user in gimple-fold.c when folding small
> memcpy/memmove to single load/store pairs (patch posted but not applied by
> me -- I've asked for strict-align target maintainer feedback but got none).

I didn't find it, do you have a link?

> Now - for bswap I'm only 99% sure that unaligned load + bswap is
> better than piecewise loads plus manual swap.

It depends on whether unaligned loads and bswap are expanded or not. Even if we
assume the expansion is at least as efficient as doing it explicitly (definitely
true for modes larger than the native integer size - as we found out in PR77308!),
if both the unaligned load and bswap are expanded it seems better not to make the
transformation for modes up to the word size. But there is no way to find out as
SLOW_UNALIGNED_ACCESS must be true whenever STRICT_ALIGNMENT is true.
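
For reference, the two forms being compared look roughly like this at the
source level. This is only a sketch with made-up function names, not the code
the bswap pass produces; it assumes GCC or Clang for __builtin_bswap32 and the
__BYTE_ORDER__ macros.

```c
#include <stdint.h>
#include <string.h>

/* Piecewise form: four byte loads combined with shifts. Reads a
   big-endian 32-bit value regardless of host endianness.  */
static uint32_t load_be32_piecewise(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Transformed form: one possibly unaligned 32-bit load, followed by a
   byte swap when the host is little-endian.  */
static uint32_t load_be32_bswap(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);   /* expresses an unaligned load portably */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    v = __builtin_bswap32(v);
#endif
    return v;
}
```

If the target has to expand both the unaligned load and the bswap, the second
form ends up as byte loads plus shifts again, which is why the piecewise form
may be the better starting point for modes up to the word size.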

> But generally I'm always in favor of removing SLOW_UNALIGNED_ACCESS /
> STRICT_ALIGNMENT checks from the GIMPLE side of the compiler.

I sort of agree because the purpose of these macros is unclear - the documentation
is insufficient and out of date. I do believe however we need an accurate way to
find out whether a target supports fast unaligned accesses as that is required to
generate good target code.

Wilco
