Jakub Jelinek wrote:
> I still don't like this transformation and would very much prefer to see
> using rawmemchr instead on targets that provide it, and also this is
> something that IMHO should be done in the tree-ssa-strlen.c pass together
> with the other optimizations in there. Similarly to stpcpy, which is also
> non-standard (in POSIX, but not in C), we should just look at headers if
> rawmemchr is defined with compatible prototype.

Can you quantify "don't like"? I benchmarked rawmemchr on a few targets and
it's slower than strlen, so it's hard to guess what you don't like about it.
Several targets don't even have an assembly implementation of rawmemchr, so
looking at the header would not be sufficient to determine that rawmemchr is
fast, let alone as fast as strlen.

The tree-ssa-strlen pass seems to optimize repeated calls to strlen, or
strcpy after a strlen, so I'm not sure how this is related - this is a local
transformation like the foldings in builtins.c/gimple-fold.c.

> Also, strrchr (s, 0) should be folded to strchr (s, 0) or handled the same
> like that one.

GCC converts strrchr (s, 0) to strchr (s, 0), which then gets optimized.
I checked that this happens as expected with both versions of my patch.

> And, while x = strchr (s, 0) to x = rawmemchr (s, 0) is a reasonable -Os
> transformation, x = s + strlen (s) is not, it makes code usually larger
> (especially because it increases register pressure across the call).

Indeed, that's why my transformation is disabled with -Os.

Wilco
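
For reference, the source-level equivalence this thread is about can be
sketched as below. This is only an illustration in plain C, not the patch
itself: the helper names (end_via_strchr, end_via_strrchr, end_via_strlen,
same_end) are made up for this example, and rawmemchr is left out because it
is a GNU extension that not every target provides.

```c
#include <assert.h>
#include <string.h>

/* Original form: search for the terminating NUL with strchr. */
const char *end_via_strchr(const char *s)
{
    return strchr(s, 0);
}

/* strrchr with a NUL needle finds the same terminator, since a C string
   has exactly one. */
const char *end_via_strrchr(const char *s)
{
    return strrchr(s, 0);
}

/* Transformed form: pointer arithmetic on the result of strlen. */
const char *end_via_strlen(const char *s)
{
    return s + strlen(s);
}

/* Returns 1 when all three forms yield the same pointer for s. */
int same_end(const char *s)
{
    const char *e = end_via_strlen(s);

    assert(*e == '\0');  /* every form must land on the terminator */
    return end_via_strchr(s) == e && end_via_strrchr(s) == e;
}
```

All three helpers return a pointer to the terminating NUL, so same_end (s)
should hold for any valid C string, including the empty one. Which form is
faster or smaller is target-dependent and is not shown by this sketch.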