On Wednesday, November 18, 2015 02:30:01 PM Stefan Karpinski wrote:
> Those numbers don't include any compilation (the allocations are too low).
> I'm seeing a similar thing. They're just implemented in really different
> ways. maxabs uses mapreduce, which seems to be a chronic source of
> less-than-optimal performance.
Not the problem:
julia> function mymaxabs(x)
           s = abs(x[1])
           @inbounds @simd for I in eachindex(x)
               s = max(s, abs(x[I]))
           end
           s
       end
mymaxabs (generic function with 1 method)
julia> x = randn(100000);
# warmup suppressed
julia> @time maxabs(x)
0.000425 seconds (5 allocations: 176 bytes)
4.513240114499124
julia> @time mymaxabs(x)
0.000642 seconds (5 allocations: 176 bytes)
4.513240114499124
(It doesn't actually get SIMDed, though.)
I'm not entirely surprised. Multiplication is fast, and with 10^5 elements the
sqrt should not be the bottleneck.
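For contrast, a sum-of-squares loop is a plain + reduction that LLVM vectorizes readily, which is one plausible reason the norm-style loop comes out ahead. A minimal sketch (mynorm is a hypothetical counterpart written here for comparison, not Base's norm implementation):

```julia
# Hypothetical sum-of-squares counterpart to mymaxabs above.
# The s += x[i]*x[i] accumulation is a straight reduction over +,
# which @simd/LLVM vectorizes readily; the max(abs(...)) reduction
# often does not.
function mynorm(x)
    s = zero(eltype(x))
    @inbounds @simd for i in eachindex(x)
        s += x[i] * x[i]
    end
    sqrt(s)
end
```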
--Tim
>
> On Wed, Nov 18, 2015 at 2:12 PM, Benjamin Deonovic <[email protected]>
>
> wrote:
> > Does norm use maxabs? If so this could be due to maxabs getting compiled.
> > try running both of the timed statements a second time.
> >
> > On Wednesday, November 18, 2015 at 10:41:48 AM UTC-6, Sisyphuss wrote:
> >> Interesting phenomenon: norm() is faster than maxabs()
> >>
> >> x = randn(100000)
> >> @time maxabs(x)
> >> @time norm(x)
> >>
> >>
> >> 0.000108 seconds (5 allocations: 176 bytes)
> >> 0.000040 seconds (5 allocations: 176 bytes)
> >>
> >> I had thought the contrary, since norm() requires N squarings and 1
> >> square root, while maxabs() requires 2N sign-bit changes and N
> >> comparisons.
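Benjamin's warm-up advice above amounts to the following pattern (a minimal sketch; maxabs existed on Julia 0.4, and maximum(abs, x) is the equivalent on versions where it has since been removed):

```julia
using LinearAlgebra  # provides norm on Julia >= 0.7; on 0.4 it was in Base

# The first call to each function triggers JIT compilation, so time
# only a later call to measure execution rather than compilation.
x = randn(100000)
maximum(abs, x)      # maxabs(x) on Julia 0.4; warm-up call compiles it
norm(x)
@time maximum(abs, x)
@time norm(x)
```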