On Friday, 13 March 2015 at 03:14 -0700, Ján Dolinský wrote:
> Apparently it was 0.4 ... I tried your f2 on Julia v0.3.6 and it takes
> forever. f3, however, is a blast!
> 
> Here are my timings on Julia v0.3.6:
> 
> @time f1(X, 1, 5)
> elapsed time: 2.210965858 seconds (1600177296 bytes allocated, 65.31% gc time)
> 
> @time f2(X, 1, 5)
> elapsed time: 53.146697892 seconds (22368945936 bytes allocated, 41.76% gc time)
> 
> @time f3(X, 1, 5)
> elapsed time: 0.142597211 seconds (80 bytes allocated)
> 
> 
> I assume function sub() in v0.4 is substantially different.
Yes, that relies on the new array views in 0.4. Now you'll have a reason
to update when the release is out!
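
For anyone curious why f2 goes from ~53 s on 0.3.6 to a fraction of a
second on 0.4: sub() returns a SubArray that aliases the parent array
instead of copying a column, and the SubArray rewrite in 0.4 makes
indexing through such a view much cheaper than it was on 0.3. A minimal
sketch (variable names are just for illustration):

X = rand(4, 2)
a = sub(X, :, 1)    # a view: no data is copied, `a` aliases column 1 of X
a[1] = 0.0          # writes through to the parent array
X[1, 1] == 0.0      # true
c = X[:, 1]         # plain slicing allocates a fresh Vector (a copy)
c[2] = 0.0
X[2, 1] == 0.0      # false: the copy is independent of X

That per-iteration copy is also why f1 allocates on the order of a
gigabyte while f2 and f3 stay almost allocation-free.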


Regards

> Thanks,
> Jan
> 
> On Friday, 13 March 2015 at 10:35:45 UTC+1, Ján Dolinský wrote:
>         Hi Milan,
>         
>         Did you run your benchmarks on 0.4 ?
>         
>         Thanks,
>         Jan
>         
>         On Thursday, 12 March 2015 at 19:19:08 UTC+1, Milan Bouchet-Valat wrote:
>                 On Thursday, 12 March 2015 at 11:01 -0500, Tim Holy wrote: 
>                 > This is something that many people (understandably)
>                 have a hard time 
>                 > appreciating, so I think this post should be framed
>                 and put up on the julia 
>                 > wall. 
>                 > 
>                 > We go to considerable lengths to try to make code
>                 work efficiently in the 
>                 > general case (check out subarray.jl and subarray2.jl
>                 in master some time...), 
>                 > but sometimes there's no competing with a
>                 hand-rolled version for a particular 
>                 > case. Folks should not be shy to implement such
>                 tricks in their own code. 
>                 Though with the new array views in 0.4, the vectorized
>                 version should be 
>                 more efficient than in 0.3. I've tried it, and indeed
>                 it looks like 
>                 unrolling is not really needed, though it's still
>                 faster and uses less 
>                 RAM: 
>                 
>                 X = rand(100_000, 5) 
>                 
>                 function f1(X, i, j) 
>                     for _ in 1:1000 
>                         X[:, i], X[:, j] = X[:, j], X[:, i] 
>                     end 
>                 end 
>                 
>                 function f2(X, i, j) 
>                     for _ in 1:1000 
>                         a = sub(X, :, i) 
>                         b = sub(X, :, j) 
>                         a[:], b[:] = b, a 
>                     end 
>                 end 
>                 
>                 function f3(X, i, j) 
>                     for _ in 1:1000 
>                         @inbounds for k in 1:size(X, 1) 
>                             X[k, i], X[k, j] = X[k, j], X[k, i] 
>                         end 
>                     end 
>                 end 
>                 
>                 
>                 julia> f1(X, 1, 5); f2(X, 1, 5); f3(X, 1, 5); 
>                 
>                 julia> @time f1(X, 1, 5) 
>                 elapsed time: 1.027090951 seconds (1526 MB allocated, 3.63% gc time in 69 pauses with 0 full sweep) 
>                 
>                 julia> @time f2(X, 1, 5) 
>                 elapsed time: 0.172375013 seconds (390 kB allocated) 
>                 
>                 julia> @time f3(X, 1, 5) 
>                 elapsed time: 0.155069259 seconds (80 bytes allocated) 
>                 
>                 
>                 Regards 
>                 
>                 > --Tim 
>                 > 
>                 > On Thursday, March 12, 2015 07:49:49 AM Steven G.
>                 Johnson wrote: 
>                 > > As a general rule, with Julia one needs to unlearn
>                 the instinct (from 
>                 > > Matlab or Python) that "efficiency == clever use
>                 of library functions", 
>                 > > which turns all optimization questions into "is
>                 there a built-in function 
>                 > > for X" (and if the answer is "no" you are out of
>                 luck).   Loops are fast, 
>                 > > and you can easily beat general-purpose library
>                 functions with your own 
>                 > > special-purpose code. 
>                 
