I don't think there's anything wrong with specializing small powers. We 
already specialize small powers in openlibm's pow and even in inference.jl. 
If you want to compute x^2 at maximum speed, something, somewhere needs 
to know that when the power is 2 it should just call mulsd instead of doing 
whatever magic it does for arbitrary powers. Translating integer exponents 
to multiplies only kind of works for powers higher than 3 anyway; it will 
be faster but not as accurate as calling libm.

Another option would be to use a vector math library, which would make 
vectorized versions of everything faster. As far as I'm aware, VML is still 
the only option that claims 1 ulp accuracy, and it does indeed provide a 
substantial performance boost 
<https://github.com/simonster/VML.jl#performance> (it also seems to 
special case x^2). I don't think we could actually use it in 
Base, though; even if the VML license allows it, I don't think we could 
link against Rmath, UMFPACK, and whatever other GPL libraries if we did.

Simon

On Monday, September 8, 2014 4:02:34 PM UTC-4, Stefan Karpinski wrote:
>
> On Mon, Sep 8, 2014 at 7:48 PM, Tony Kelman wrote:
>
>>
>> Expectations based on the way Octave/Matlab work are really not 
>> applicable to a system that works completely differently.
>>
>
> That said, we do want to make this faster – but we don't want to cheat to 
> do it, and special casing small powers is kind of cheating. In this case it 
> may be worth it, but what we'd really like is a more general kind of 
> optimization here that has the same effect in this particular case.
>
