Hi Milan,

Did you run your benchmarks on 0.4?

Thanks,
Jan

On Thursday, 12 March 2015 at 19:19:08 UTC+1, Milan Bouchet-Valat wrote:
>
> On Thursday, 12 March 2015 at 11:01 -0500, Tim Holy wrote:
> > This is something that many people (understandably) have a hard time
> > appreciating, so I think this post should be framed and put up on the
> > julia wall.
> >
> > We go to considerable lengths to try to make code work efficiently in
> > the general case (check out subarray.jl and subarray2.jl in master some
> > time...), but sometimes there's no competing with a hand-rolled version
> > for a particular case. Folks should not be shy to implement such tricks
> > in their own code.
> Though with the new array views in 0.4, the vectorized version should be 
> more efficient than in 0.3. I've tried it, and indeed it looks like 
> unrolling is not really needed, though it's still faster and uses less 
> RAM: 
>
> X = rand(100_000, 5) 
>
> function f1(X, i, j) 
>     for _ in 1:1000 
>         X[:, i], X[:, j] = X[:, j], X[:, i] 
>     end 
> end 
>
> function f2(X, i, j) 
>     for _ in 1:1000 
>         a = sub(X, :, i) 
>         b = sub(X, :, j) 
>         a[:], b[:] = b, a 
>     end 
> end 
>
> function f3(X, i, j) 
>     for _ in 1:1000 
>         @inbounds for k in 1:size(X, 1) 
>             X[k, i], X[k, j] = X[k, j], X[k, i] 
>         end 
>     end 
> end 
>
>
> julia> f1(X, 1, 5); f2(X, 1, 5); f3(X, 1, 5); 
>
> julia> @time f1(X, 1, 5) 
> elapsed time: 1.027090951 seconds (1526 MB allocated, 3.63% gc time in 69 pauses with 0 full sweep)
>
> julia> @time f2(X, 1, 5) 
> elapsed time: 0.172375013 seconds (390 kB allocated) 
>
> julia> @time f3(X, 1, 5) 
> elapsed time: 0.155069259 seconds (80 bytes allocated) 
>
>
> Regards 
>
> > --Tim 
> > 
> > On Thursday, March 12, 2015 07:49:49 AM Steven G. Johnson wrote: 
> > > As a general rule, with Julia one needs to unlearn the instinct (from
> > > Matlab or Python) that "efficiency == clever use of library functions",
> > > which turns all optimization questions into "is there a built-in
> > > function for X" (and if the answer is "no" you are out of luck). Loops
> > > are fast, and you can easily beat general-purpose library functions
> > > with your own special-purpose code.
>
>
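A side note on the quoted f2 and f3, as a minimal sketch rather than a definitive answer. It assumes a current Julia (1.x), where `view` has replaced the 0.4-era `sub`; the function names `swapcols_views!` and `swapcols_loop!` are hypothetical and do not come from the thread. In the view-based pattern the right-hand tuple holds the two views themselves, not copies, so as far as I can tell the first assignment overwrites column i before it is read for the second one. The element-wise loop does not have this problem.

```julia
# Sketch for Julia 1.x (assumption): `view` stands in for the `sub` used in the quoted code.
# `swapcols_views!` and `swapcols_loop!` are illustrative names, not from the thread.

# Same pattern as the quoted f2: both sides of the assignment alias X.
function swapcols_views!(X, i, j)
    a = view(X, :, i)
    b = view(X, :, j)
    a[:], b[:] = b, a   # column i is overwritten before it is copied into column j
    return X
end

# Same pattern as the quoted f3: element-wise swap of scalars.
function swapcols_loop!(X, i, j)
    @inbounds for k in 1:size(X, 1)
        X[k, i], X[k, j] = X[k, j], X[k, i]
    end
    return X
end

X = [1.0 2.0; 3.0 4.0]
Y = swapcols_loop!(copy(X), 1, 2)    # columns are exchanged
Z = swapcols_views!(copy(X), 1, 2)   # both columns end up holding the old column 2
```

If that reading is right, the f2 timing measures two column copies rather than a true swap, so a correct view-based version would need a temporary copy of one column.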
