On Mon, Jan 28, 2013 at 05:07:10PM +0100, Marc Glisse wrote:
> There is no sqrt, x/(n*n) is just one mul and one div, whereas with
> the call I see one mul, 3 movs to prepare for the call, and the
> call.

Ah, you're talking about the checked-in testcase, rather than the one I
mentioned in the description when asking whether the speed guard is
desirable there or not.  In the checked-in testcase, the code size problem
arises far earlier than that, already during folding, where
double u = x / (n * n);
is replaced by:
double u = x * __builtin_pow (n, -2.0e+0);
And this isn't something you can then size-optimize in the pow folder on its
own: return pow (n, -2.0e+0); will supposedly be shorter than
return 1.0 / (n * n); and the folding doesn't see that the result is used in
a multiplication, which could perhaps be changed into a division instead.

        Jakub
