Jeremy Hall <gcc.h...@gmail.com> writes:

> I wonder if it's possible to improve the code generation for inline
> stringops when the length is known to be a multiple of 4 bytes?

The selection of the algorithm is fairly complex and depends on the
specific processor you are tuning for.  See decide_alg in
config/i386/i386.c.  There has been quite a lot of work in this area
based on benchmarking on a range of processors.  I'm sure there is
plenty of room for improvement, but it should be based on real
benchmarks doing memcpy of various sizes.
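
For illustration only, here is a minimal sketch of the kind of
benchmark I mean.  It is not taken from the gcc sources; the names
DEFINE_COPY, copy_N and time_copy are made up for this example.  Each
copy_N function calls memcpy with a compile-time constant length, so
the compiler is free to expand it inline, and time_copy measures many
calls of one such function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Define copy_N, which copies a compile-time-constant N bytes.
   A constant length lets GCC choose an inline expansion for memcpy.
   (Hypothetical helper names, for illustration only.) */
#define DEFINE_COPY(N)                                  \
  static void copy_##N (char *dst, const char *src)     \
  {                                                     \
    memcpy (dst, src, N);                               \
  }

/* A few sizes that are multiples of 4 bytes. */
DEFINE_COPY (4)
DEFINE_COPY (16)
DEFINE_COPY (64)
DEFINE_COPY (256)

typedef void (*copy_fn) (char *, const char *);

/* Call FN ITERS times and return the elapsed CPU time in seconds.
   The figure includes the indirect call overhead, so compare sizes
   and tunings against each other rather than reading it as an
   absolute cost. */
static double
time_copy (copy_fn fn, char *dst, const char *src, long iters)
{
  assert (iters > 0);
  clock_t start = clock ();
  for (long i = 0; i < iters; i++)
    {
      fn (dst, src);
      /* Compiler barrier (GCC extension) so the copies are not
         optimized away or merged across iterations. */
      __asm__ volatile ("" : : "r" (dst) : "memory");
    }
  return (double) (clock () - start) / CLOCKS_PER_SEC;
}
```

You would call, say, time_copy (copy_64, dst, src, 100000000L) from
main for each size, build with -O2 and different -mtune= values (or
force an algorithm with -mstringop-strategy=), and compare the
timings.  Compiling with -S shows which expansion was actually
chosen.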

To be clear: the gcc@gcc.gnu.org mailing list is for discussion about
the development of gcc itself.  If you want to make suggestions for
improvements without digging into the gcc code, I recommend filing an
enhancement request at http://gcc.gnu.org/bugzilla/, ideally with
benchmarks in a case like this.

On this mailing list we'll be happy to tell you what to modify to fix
the compiler yourself.  In this case, decide_alg.  Note in particular
the comments there that a loop performs better for small values.

Ian
