Apparently it was 0.4 ... I tried your f2 on Julia v0.3.6 and it takes forever. f3, however, is a blast!
Here are my timings on Julia v0.3.6:

@time f1(X, 1, 5)
elapsed time: 2.210965858 seconds (1600177296 bytes allocated, 65.31% gc time)

@time f2(X, 1, 5)
elapsed time: 53.146697892 seconds (22368945936 bytes allocated, 41.76% gc time)

@time f3(X, 1, 5)
elapsed time: 0.142597211 seconds (80 bytes allocated)

I assume function sub() in v0.4 is substantially different.

Thanks,
Jan

On Friday, 13 March 2015 10:35:45 UTC+1, Ján Dolinský wrote:
>
> Hi Milan,
>
> Did you run your benchmarks on 0.4?
>
> Thanks,
> Jan
>
> On Thursday, 12 March 2015 19:19:08 UTC+1, Milan Bouchet-Valat wrote:
>>
>> On Thursday, 12 March 2015 at 11:01 -0500, Tim Holy wrote:
>> > This is something that many people (understandably) have a hard time
>> > appreciating, so I think this post should be framed and put up on the
>> > julia wall.
>> >
>> > We go to considerable lengths to try to make code work efficiently in
>> > the general case (check out subarray.jl and subarray2.jl in master some
>> > time...), but sometimes there's no competing with a hand-rolled version
>> > for a particular case. Folks should not be shy to implement such tricks
>> > in their own code.
>> Though with the new array views in 0.4, the vectorized version should be
>> more efficient than in 0.3.
>> I've tried it, and indeed it looks like unrolling is not really needed,
>> though it's still faster and uses less RAM:
>>
>> X = rand(100_000, 5)
>>
>> function f1(X, i, j)
>>     for _ in 1:1000
>>         X[:, i], X[:, j] = X[:, j], X[:, i]
>>     end
>> end
>>
>> function f2(X, i, j)
>>     for _ in 1:1000
>>         a = sub(X, :, i)
>>         b = sub(X, :, j)
>>         a[:], b[:] = b, a
>>     end
>> end
>>
>> function f3(X, i, j)
>>     for _ in 1:1000
>>         @inbounds for k in 1:size(X, 1)
>>             X[k, i], X[k, j] = X[k, j], X[k, i]
>>         end
>>     end
>> end
>>
>>
>> julia> f1(X, 1, 5); f2(X, 1, 5); f3(X, 1, 5);
>>
>> julia> @time f1(X, 1, 5)
>> elapsed time: 1.027090951 seconds (1526 MB allocated, 3.63% gc time in 69 pauses with 0 full sweep)
>>
>> julia> @time f2(X, 1, 5)
>> elapsed time: 0.172375013 seconds (390 kB allocated)
>>
>> julia> @time f3(X, 1, 5)
>> elapsed time: 0.155069259 seconds (80 bytes allocated)
>>
>>
>> Regards
>>
>> > --Tim
>> >
>> > On Thursday, March 12, 2015 07:49:49 AM Steven G. Johnson wrote:
>> > > As a general rule, with Julia one needs to unlearn the instinct (from
>> > > Matlab or Python) that "efficiency == clever use of library functions",
>> > > which turns all optimization questions into "is there a built-in
>> > > function for X" (and if the answer is "no" you are out of luck). Loops
>> > > are fast, and you can easily beat general-purpose library functions
>> > > with your own special-purpose code.
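
For anyone who wants to try the same comparison on a current release, here is a minimal sketch, assuming Julia 1.x, where `view` has replaced `sub`. The names `swapcols!` and `swapcols_view!` are hypothetical helpers, not functions from this thread. The loop version follows f3. The view version copies one column first, because as far as I can tell `a[:], b[:] = b, a` on two views of the same matrix copies column j into column i and then copies it straight back, so it does not swap anything. Timings are not shown since they depend on the machine and the Julia version.

```julia
# Hypothetical helper (not from the thread): swap columns i and j of X in
# place with an explicit loop, as f3 does.
function swapcols!(X::AbstractMatrix, i::Integer, j::Integer)
    # Validate the column indices up front so that @inbounds below is safe.
    (i in axes(X, 2) && j in axes(X, 2)) ||
        throw(ArgumentError("column index out of range"))
    @inbounds for k in axes(X, 1)
        X[k, i], X[k, j] = X[k, j], X[k, i]
    end
    return X
end

# Hypothetical view-based variant. Both views alias X, so one column is
# copied to a temporary before the other is overwritten.
function swapcols_view!(X::AbstractMatrix, i::Integer, j::Integer)
    a = view(X, :, i)
    b = view(X, :, j)
    tmp = copy(a)   # the only allocation: one column
    a .= b
    b .= tmp
    return X
end

X = rand(100_000, 5)
c1, c5 = X[:, 1], X[:, 5]     # copies kept only to check the result

swapcols!(X, 1, 5)
@assert X[:, 1] == c5 && X[:, 5] == c1

swapcols_view!(X, 1, 5)       # swaps the two columns back
@assert X[:, 1] == c1 && X[:, 5] == c5

@time swapcols!(X, 1, 5)      # already compiled by the call above
```

The loop version needs no temporary at all, which fits the "80 bytes allocated" figure reported for f3 above; the view version pays for one column copy per swap.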
