On Fri, 2017-08-04 at 17:30 -0500, Segher Boessenkool wrote:
> On Fri, Aug 04, 2017 at 08:38:11PM +0000, Wilco Dijkstra wrote:
> > Richard Henderson wrote:    
> > > On 08/04/2017 05:59 AM, Prathamesh Kulkarni wrote:
> > > > For i386, it seems strcmp is expanded inline via cmpstr optab
> > > > by 
> > > > expand_builtin_strcmp if one of the strings is constant. Could
> > > > we similarly
> > > > define cmpstr pattern for AArch64?
> > > 
> > > Certainly that's possible.
> > 
> > I'd suggest to do it as a target independent way, this is not a
> > target specific
> > optimization and shouldn't be done in the target unless there are
> > special
> > strcmp instructions.
> 
> See rs6000-string.c; cc:ing Aaron.

I think I'd argue that even if there isn't an instruction that does
strcmp (i.e. repz cmpsb) you are still better off with target specific
code. For example, on power one needs to be aware of how the different
processors deal with unaligned loads and that power7 for example
doesn't like overlapping unaligned loads which are fine on power8. Also
instructions like cmpb are useful and I don't really see how a target
independent expansion would end up using that.

> > > For constant strings of small length (upto 3?), I was wondering
> > > if it'd be a
> > > good idea to manually unroll strcmp loop, similar to __strcmp_*
> > > macros in
> > > bits/string.h?>
> > > For eg in gimple-fold, transform
> > > x = __builtin_strcmp(s, "ab")
> > > to
> > > x = s[0] - 'a';
> > > if (x == 0)
> > > {
> > >    x = s[1] - 'b';
> > >    if (x == 0)
> > >      x = s[2];
> > > }

There was code to manually unroll strcmp/strncmp in
string/bits/string2.h in GLIBC but that was recently removed:

https://sourceware.org/git/?p=glibc.git;a=commit;h=f7db120f67d853e0cfa2
72fa39c8b9be67c0a935

Personally I'm glad to see it go not only because of the expansion into
individual comparisons of the first couple characters, but also because
it expanded strcmp/strncmp to __builtin_strcmp/strncmp which meant that
you could not disable the gcc expansions of strcmp/strncmp by using 
-fno-builtin-strcmp/strncmp.

> > 
> > If there is already code that does something similar (see comment
> > #1 in PR78809),
> > it could be easily adapted to handle more cases.
> > 
> > >   if (memcmp(s, "ab", 3) != 0)
> > > 
> > > to be implemented with cmp+ccmp+ccmp and one branch.
> > 
> > Even better would be wider loads if you either know the alignment
> > of s or it's max size
> > (although given the overhead of creating the return value that
> > works best for equality).
> 
> All those things are handled there, too.  Most things that can really
> help
> are quite target-specific, I think.

Same thing goes for memcmp, you really need to know how the target
handles aligned/unaligned and for example on power9 it is handy to be
able to use setb to produce the correct SImode result even if you just
did a DImode compare of 64 bits.

> 
> 
> Segher
> 

Aaron

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

Reply via email to