Greg Stein wrote:
Ouch...
-1 on this commit, pending further explanation. I'm loathe to see all of that extra code go in there for a dubious performance benefit. Are you *truly* saving any time? I see a while() loop in there copying data. Isn't it almost always faster to let the compiler produce an optimized memcpy() or somesuch?
As Cliff noted, the compiler can't do an inlined memcpy() in this case. Without advance knowledge of the size of the region to be copied, there's not much that a compiler can do here. Even if it inlined a strcpy(), it couldn't use multi-byte copy operations without a ton of checks for non-aligned addresses and short source or target strings.
Brian? Let's see some details/numbers here, or else please revert.
Here are numbers from form_header_field(), the primary user of this function in the httpd.
The original code did a strlen in the calling function, followed by an
apr_brigade_write(). The new code uses apr_brigade_puts().
Before: 292 cycles per call (includes form_header_field() plus its descendants)
After: 268 cycles per call
So the change yielded an 8% reduction in the cost of writing the HTTP response headers.
Even more importantly, though, the optimization in apr_brigade_puts() helps keep the application code simple ("application code" meaning the httpd in this case). I.e., it's better to do the optimization tricks in low-level library functions like this, so that the application developers don't have to do optimization tricks in their code.
--Brian