On Thu, Jun 23, 2022 at 9:26 AM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Wed, Jun 22, 2022 at 11:03 PM Richard Biener
> <richard.guent...@gmail.com> wrote:
> >
> > On Wed, Jun 22, 2022 at 7:13 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > >
> > > On Wed, Jun 22, 2022 at 4:39 AM Richard Biener
> > > <richard.guent...@gmail.com> wrote:
> > > >
> > > > On Tue, Jun 21, 2022 at 11:03 PM H.J. Lu via Gcc-patches
> > > > <gcc-patches@gcc.gnu.org> wrote:
> > > > >
> > > > > > When memchr is applied on a constant string of no more than the bytes of
> > > > > > a word, inline memchr by checking each byte in the constant string.
> > > > >
> > > > > int f (int a)
> > > > > {
> > > > >    return  __builtin_memchr ("eE", a, 2) != 0;
> > > > > }
> > > > >
> > > > > is simplified to
> > > > >
> > > > > int f (int a)
> > > > > {
> > > > >   return (char) a == 'e' || (char) a == 'E';
> > > > > }
> > > > >
> > > > > gcc/
> > > > >
> > > > >         PR tree-optimization/103798
> > > > >         * match.pd (__builtin_memchr (const_str, a, N)): Inline memchr
> > > > >         with constant strings of no more than the bytes of a word.
> > > >
> > > > Please do this in strlenopt or so, with match.pd you will end up moving
> > > > the memchr loads across possible aliasing stores to the point of the
> > > > comparison.
> > >
> > > strlenopt is run after many other passes.  The code won't be well optimized.
> >
> > What followup optimizations do you expect?  That is, other builtins are only
>
> reassociation and dce turn
>
>   _5 = a_2(D) == 101;
>   _6 = a_2(D) == 69;
>   _1 = _5 | _6;
>   _4 = (int) _1;
>
> into
>
>   _7 = a_2(D) & -33;
>   _8 = _7 == 69;
>   _1 = _8;
>   _4 = (int) _1;
>
> > expanded inline at RTL expansion time?
>
> Some high level optimizations will be missed and
> TARGET_GIMPLE_FOLD_BUILTIN improves builtins
> codegen.
>
> > > Since we are only optimizing
> > >
> > > __builtin_memchr ("eE", a, 2) != 0;
> > >
> > > I don't see any aliasing store issues here.
> >
> > Ah, I failed to see the STRING_CST restriction.  Note that when optimizing for
> > size this doesn't look very good.
>
> True.
>
> > I would expect a target might produce some vector code for
> > memchr ("aAbBcCdDeE...", c, 9) != 0 by splatting 'c', doing
> > a v16qimode compare, masking off excess elements beyond length
> > and then comparing against zero or for == 0 against all-ones.
> >
> > The repetitive pattern result also suggests an implementation elsewhere,
> > if you think strlenopt is too late there would be forwprop as well.
>
> forwprop seems a good place.
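
As a rough illustration of the rewrite discussed above (a sketch only, not
code from the patch), the three forms of the "eE" test can be compared in
plain C.  The function names and check_forms_agree are hypothetical;
f_masked mirrors the reassociated GIMPLE, where -33 is ~32 and clears the
bit that distinguishes 'e' (101) from 'E' (69).

```c
#include <assert.h>
#include <string.h>

/* Original form: library call on a two-byte string constant.  */
int
f_memchr (int a)
{
  return memchr ("eE", a, 2) != 0;
}

/* Inlined form: one comparison per byte of the constant.  */
int
f_inlined (int a)
{
  return (char) a == 'e' || (char) a == 'E';
}

/* Reassociated form: 'e' and 'E' differ only in bit 5 (value 32),
   so clearing that bit leaves a single comparison against 'E'.  */
int
f_masked (int a)
{
  return ((unsigned char) a & ~32) == 'E';
}

/* Hypothetical self-check: memchr only looks at the low byte of its
   int argument, so values outside 0..255 are included as well.  */
void
check_forms_agree (void)
{
  for (int a = -512; a <= 512; a++)
    {
      assert (f_memchr (a) == f_inlined (a));
      assert (f_memchr (a) == f_masked (a));
    }
}
```

Calling check_forms_agree () runs through every value and aborts on the
first disagreement.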

The v2 patch is at

https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598022.html

Thanks.

-- 
H.J.
