On Fri, Nov 6, 2015 at 6:53 AM, Lionel du Peloux <[email protected]>
wrote:
>
> Hello,
>
> I'm trying to understand how basic functions on vectors perform as a
> function of the vector size (n).
>
> When I track the elapsed time per element (in nano sec) regarding the
> vector size (n), I see that I need at least 100 elements in my vector to
> reach half the maximum speed (at 7ns/el for n=1e3).
>
> So my question is: what am I measuring between n=1 and n=100, and why
> is the performance drastically poorer in this region?
> Is this the cost of calling the function ?
>
No.
> Is this a problem with my profiling method ?
>
Yes. Don't run benchmarks in global scope; wrap the timed code in a
function so the compiler can specialize it on concrete types.
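A minimal sketch of what that looks like (the `bench_sqrt` name, the warm-up call, and taking the minimum over repetitions are my own choices, not from the original script; I've also used the dotted `sqrt.(a)` broadcast syntax of current Julia in place of the old vectorized `sqrt(a)`):

```julia
# Time sqrt over a vector from inside a function, so the loop runs as
# compiled, type-specialized code rather than in untyped global scope.
function bench_sqrt(n::Int, reps::Int = 100)
    a = rand(n)
    sqrt.(a)                     # warm up: force compilation before timing
    t = Inf
    for _ in 1:reps
        t = min(t, @elapsed sqrt.(a))   # keep the best (least noisy) run
    end
    return t * 1e9 / n           # nanoseconds per element
end

for n in (1, 10, 100, 1_000)
    println("n = ", n, ": ", bench_sqrt(n), " ns/el")
end
```

For serious measurements the BenchmarkTools.jl package (`@btime`, `@benchmark`) handles warm-up, repetition, and interpolation of global variables for you.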
>
> Thanks,
> Lionel
>
> CPU(1)     = 300 ns/el
> CPU(10)    =  40 ns/el
> CPU(100)   =  12 ns/el
> CPU(200)   =  10 ns/el
> CPU(1_000) =   7 ns/el = max speed
>
> using Gadfly
> N = [ 1,2,3,4,5,6,7,8,9,
> 10,20,30,40,50,60,70,80,90,100,200,300,400,500,750,
> 1_000,2_500,5_000,7_500,10_000,100_000,1_000_000]
>
> cpu = Float64[]
> for n in N
> a = n == 1 ? pi : rand(n)
> sqrt(a)
> gc()
> gc_enable(false)
> t = mean([@elapsed sqrt(a) for i=1:100])*(1e9/n)
> gc_enable(true)
> push!(cpu,t)
> end
>
> df = DataFrame()
> df[:N] = N
> df[:CPU] = cpu
>
> path = Pkg.dir("MKL") * "/benchmark/"
> p = Gadfly.plot(
> layer(df,x="N",y="CPU",Geom.line),
> Scale.x_log10,
> Guide.xlabel("n-element vector"),
> Guide.ylabel("CPU time in nsec/element"),
> Guide.title("CPU time for sqrt(X) where X = Float64[]
> with n elements"))
> draw(PNG(path*"sqrt_cpu(n).png", 20cm, 20cm), p)
> p
>