On Wednesday, November 18, 2015 02:30:01 PM Stefan Karpinski wrote:
> Those numbers don't include any compilation (the allocations are too low).
> I'm seeing a similar thing. They're just implemented in really different
> ways. maxabs uses mapreduce, which seems to be a chronic source of
> less-than-optimal performance.

Not the problem:

julia> function mymaxabs(x)
           s = abs(x[1])
           @inbounds @simd for I in eachindex(x)
               s = max(s, abs(x[I]))
           end
           s
       end
mymaxabs (generic function with 1 method)

julia> x = randn(100000);

# warmup suppressed

julia> @time maxabs(x)
  0.000425 seconds (5 allocations: 176 bytes)
4.513240114499124

julia> @time mymaxabs(x)
  0.000642 seconds (5 allocations: 176 bytes)
4.513240114499124


(It doesn't actually get SIMDed, though.)
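
One common reason a max-reduction fails to vectorize is the branching/NaN semantics of `max` itself. As a hedged sketch (not the Base implementation, and whether it vectorizes depends on the Julia/LLVM version), replacing `max` with a branch-free `ifelse` sometimes lets `@simd` kick in:

```julia
# Hypothetical variant: ifelse() is a branch-free select, which can be
# friendlier to the vectorizer than max(). Works because abs(x[i]) >= 0,
# so zero is a valid identity element for the reduction.
function mymaxabs2(x)
    s = zero(eltype(x))
    @inbounds @simd for i in eachindex(x)
        v = abs(x[i])
        s = ifelse(v > s, v, s)
    end
    s
end
```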

I'm not entirely surprised. Multiplication is fast, and with 10^5 elements the 
sqrt should not be the bottleneck.
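
To see why the sum-of-squares loop reduces so cleanly, here is a naive hand-rolled 2-norm for comparison (an illustrative sketch only: unlike `norm` in Base, it does no rescaling to guard against overflow/underflow of the intermediate squares):

```julia
# Naive 2-norm: one multiply-add per element, then a single sqrt at the
# end. The accumulation is a straightforward SIMD-friendly reduction.
function mynorm(x)
    s = zero(eltype(x))
    @inbounds @simd for i in eachindex(x)
        s += x[i] * x[i]
    end
    sqrt(s)
end
```

The single `sqrt` is amortized over all 10^5 elements, so the per-element cost is essentially one multiply and one add.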

--Tim

> 
> On Wed, Nov 18, 2015 at 2:12 PM, Benjamin Deonovic <[email protected]>
> 
> wrote:
> > Does norm use maxabs? If so, this could be due to maxabs getting compiled.
> > Try running both of the timed statements a second time.
> > 
> > On Wednesday, November 18, 2015 at 10:41:48 AM UTC-6, Sisyphuss wrote:
> >> Interesting phenomenon: norm() is faster than maxabs()
> >> 
> >> x = randn(100000)
> >> @time maxabs(x)
> >> @time norm(x)
> >> 
> >> 
> >> 0.000108 seconds (5 allocations: 176 bytes)
> >> 0.000040 seconds (5 allocations: 176 bytes)
> >> 
> >> I had thought the contrary, since norm() requires N squarings and 1 square
> >> root, while maxabs() requires 2N sign-bit operations and N comparisons.
