https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88793

--- Comment #3 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
(In reply to Florian Weimer from comment #2)
> The startup overhead isn't the problem.  The asymptotic performance is
> really bad, too.  (I hope I didn't botch my test, though.  It's vaguely
> based on what's attached to the downstream bug.)
> 
> For len == 5000, I get a factor of 60 difference in favor of glibc 2.28's
> strlen.  For len == 30, it's still a factor of 11 in favor of strlen.  This
> is on a machine with a i7-8650U, so a fairly recent CPU with erms.

As noted in the referenced bug, erms does not accelerate scasb and cmpsb (only
movs and stos), so strlen and memcmp/strcmp are among the most extreme
examples. I wrongly assumed gcc did not use scasb to implement strlen inline.
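
For anyone who wants to check the inline expansion themselves, here is a
minimal sketch. The function names len_via_strlen and len_bytewise are made up
for illustration and do not come from either bug report. The first leaves the
choice of expansion to the compiler; the second is an explicit byte-at-a-time
scan, roughly the per-byte work pattern of a scasb loop.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: leaves it to the compiler whether strlen is
   expanded inline or emitted as a call into libc. */
size_t len_via_strlen(const char *s)
{
    return strlen(s);
}

/* Hypothetical helper: explicit scan, one comparison per byte. */
size_t len_bytewise(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

Compiling this with "gcc -Os -S" on x86 and reading the assembly for
len_via_strlen should show whether a scasb-based sequence or a call to strlen
was emitted; the result depends on the GCC version and target options.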

I think it's fair to ask whether gcc should stop using scasb/cmpsb by default
(I thought there was a bug for that but apparently there isn't?).

I doubt this supports the original point about attribute-cold being
inappropriate, though. If gcc is making a poor decision in cold regions, it
will be making the same poor decision everywhere under -Os, and it's fair to
demand that such decisions be revisited and improved (-Os is not "minimize
size at all costs").
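
To make the cold/-Os connection concrete, here is a small sketch with
hypothetical function names (error_message_len and hot_len are not from the
bug report). GCC documents that a function marked cold is optimized for size
rather than speed, so strlen in the cold function can be expected to get the
same treatment it would get in any function built with -Os.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cold path: GCC optimizes cold functions for size, so the
   strlen here is subject to the same size-oriented choices as under -Os. */
__attribute__((cold))
size_t error_message_len(const char *msg)
{
    return strlen(msg);
}

/* Hypothetical hot path: follows whatever -O level is on the command line. */
size_t hot_len(const char *s)
{
    return strlen(s);
}
```

Comparing the assembly of the two functions at -O2 against a build of the
whole file at -Os is one way to see whether the cold function and the -Os
build end up with the same strlen expansion.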
