On Fri, Nov 6, 2015 at 8:48 AM, Lionel du Peloux
<[email protected]> wrote:
>
> Yes, I should have done that before I posted ... I apologize ...
>
> However, as said in my previous answer, wrapping my bench in a function
> doesn't change the results. So I still don't know if I'm doing something
> wrong (and what ...) :

There are a few things happening here. They have different effects
and I'll just list them below in no specific order (I didn't
benchmark the effect of each of them individually). Please find my
code and results here [1].

1. As noted, don't do it in global scope
2. Your `cpu` is `Vector{Any}` (this shouldn't have any effect and I'm
just pointing that out)
3. Disabling GC is not necessarily a good thing. The GC time is
usually pretty small, and the cost of continuously allocating memory
without ever collecting can be higher. Calling the GC before each run
is enough to make the result more repeatable.
4. Your `a` is not type stable (it is either `pi` or a `Vector{Float64}`),
so the `sqrt(a)` call has to go through a runtime dispatch, increasing
the overhead.
5. Timing a single run might be too short for small array sizes. For
cheap functions you should measure the time of multiple runs together
to avoid the overhead of measuring the time itself (calling OS
routines to get the time can certainly take longer than cheap
arithmetic functions).
6*. If you look at my results, there's still an overhead for small
sizes. That's because the cost of memory allocation is not really
linear in the array size. If you compare the `sqrt_*.png` (with sqrt)
to `vector_*.png` (with allocation only), you can see that the two
curves follow each other very well. This is basically the overhead you
have to pay with vectorized operations without pre-allocated buffers.
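
Points 3 to 6 can be sketched roughly as follows. This is only a
minimal illustration, not the code from [1]. It assumes Julia 1.x
syntax (`GC.gc()` and `sqrt.(a)`), which is newer than the `gc()` and
`sqrt(a)` forms in the quoted code, and the names `make_input`,
`time_per_element` and `compare` are made up for this example.

```julia
# Sketch only: assumes Julia 1.x; all function names here are hypothetical.

# Type-stable input: always a Vector{Float64}, even when n == 1.
make_input(n::Integer) = rand(n)

# Time `reps` calls together so the clock is read only twice,
# then report nanoseconds per element.
function time_per_element(f, a::Vector{Float64}, reps::Integer)
    f(a)                  # warm-up call so compilation is not timed
    GC.gc()               # collect once up front instead of disabling the GC
    t0 = time_ns()
    for _ in 1:reps
        f(a)
    end
    t1 = time_ns()
    return (t1 - t0) / (reps * length(a))
end

# Compare an allocating sqrt with one writing into a pre-allocated buffer.
function compare(ns)
    results = NamedTuple{(:n, :alloc, :inplace), Tuple{Int, Float64, Float64}}[]
    for n in ns
        a = make_input(n)
        out = similar(a)                    # buffer reused by the in-place version
        reps = max(1, div(200_000, n))      # more repetitions for small arrays
        t_alloc = time_per_element(x -> sqrt.(x), a, reps)
        t_inplace = time_per_element(x -> (out .= sqrt.(x)), a, reps)
        push!(results, (n = n, alloc = t_alloc, inplace = t_inplace))
    end
    return results
end

for r in compare((1, 10, 1_000, 100_000))
    println(r.n, "\t", round(r.alloc, digits = 1), "\t", round(r.inplace, digits = 1))
end
```

The printed columns are n, ns/element with allocation, and ns/element
with the pre-allocated buffer. The actual numbers depend on the machine
and Julia version, so none are shown here.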


[1] 
https://github.com/yuyichao/explore/tree/6b1f3e722ff094eb286abf7ff7f999de472adf3d/julia/bench_vec


>
> julia> using DataFrames
>
> julia> using Gadfly
>
> julia> function bench_cpu_regarding_n()
>            N = [   1,2,3,4,5,6,7,8,9,
>                    10,20,30,40,50,60,70,80,90,100,200,300,400,500,750,
>                    1_000,2_500,5_000,7_500,10_000,100_000,1_000_000]
>
>            cpu = []
>            for n in N
>                n==1 ? a = pi : a = rand(n)
>                sqrt(a)
>                gc()
>                gc_enable(false)
>                t = mean([@elapsed sqrt(a) for i=1:100])*(1e9/n)
>                gc_enable(true)
>                push!(cpu,t)
>                println(round(cpu[end]))
>            end
>
>            df = DataFrame()
>            df[:N] = N
>            df[:CPU] = cpu
>
>            path = Pkg.dir("MKL") * "/benchmark/"
>            p = Gadfly.plot(
>                            layer(df,x="N",y="CPU",Geom.line),
>                            Scale.x_log10,
>                            Guide.xlabel("n-element vector"),
>                            Guide.ylabel("CPU time in nsec/element"),
>                            Guide.title("CPU time for sqrt(X) where X =
> Float64[] with n elements"))
>            draw(PNG(path*"sqrt_cpu(n).png", 20cm, 20cm), p)
>            p
>        end
>
> bench_cpu_regarding_n (generic function with 1 method)
>
> julia> @time bench_cpu_regarding_n()
> 281.0
> 155.0
> 109.0
> 81.0
> 95.0
> 67.0
> 50.0
> 79.0
> 40.0
> 36.0
> 57.0
> 24.0
> 22.0
> 13.0
> 12.0
> ...
>
>
