I don't think there's anything wrong with specializing small powers. We already specialize small powers in openlibm's pow and even in inference.jl. If you want to compute x^2 at maximum speed, something, somewhere needs to know that when the power is 2 it should just call mulsd instead of doing whatever magic it does for arbitrary powers. Translating integer exponents to multiplies only kind of works for powers higher than 3 anyway; it will be faster but not as accurate as calling libm.
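To make the idea concrete, here is a rough sketch in C of what specializing small integer powers could look like. The name powi_sketch is made up for illustration; this is not the code in openlibm or inference.jl. It assumes a non-negative integer exponent and falls back to repeated squaring for larger exponents, which is the part that is fast but accumulates rounding error.

```c
/* Hypothetical helper for illustration only; not openlibm's pow. */
static double powi_sketch(double x, unsigned n)
{
    /* Small exponents: a fixed, short chain of multiplies. */
    switch (n) {
    case 0: return 1.0;
    case 1: return x;
    case 2: return x * x;        /* one multiply, one rounding */
    case 3: return x * x * x;    /* two multiplies, two roundings */
    default: break;
    }

    /* Larger exponents: exponentiation by squaring. Every multiply
       rounds, so the error can grow with the number of steps. */
    double result = 1.0;
    while (n != 0) {
        if (n & 1u)
            result *= x;
        n >>= 1;
        if (n != 0)
            x *= x;
    }
    return result;
}
```

For example, powi_sketch(3.0, 2) is a single multiply. A real implementation would presumably still call libm's pow for non-integer exponents, and for larger integer exponents whenever accuracy matters more than speed.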
Another option would be to use a vector math library, which would make vectorized versions of everything faster. As far as I'm aware, VML is still the only option that claims 1 ulp accuracy, and it does indeed provide a substantial performance boost <https://github.com/simonster/VML.jl#performance> (it also seems to special-case x^2). I don't think we could actually use it in Base, though; even if the VML license allows it, I don't think we could link against Rmath, UMFPACK, and whatever other GPL libraries if we did.

Simon

On Monday, September 8, 2014 4:02:34 PM UTC-4, Stefan Karpinski wrote:
>
> On Mon, Sep 8, 2014 at 7:48 PM, Tony Kelman wrote:
>
>> Expectations based on the way Octave/Matlab work are really not
>> applicable to a system that works completely differently.
>
> That said, we do want to make this faster – but we don't want to cheat to
> do it, and special casing small powers is kind of cheating. In this case it
> may be worth it, but what we'd really like is a more general kind of
> optimization here that has the same effect in this particular case.
