Apparently it was 0.4 ... I tried your f2 on Julia v0.3.6 and it takes
forever. f3, however, is a blast!

Here are my timings on Julia v0.3.6:

@time f1(X, 1, 5)
elapsed time: 2.210965858 seconds (1600177296 bytes allocated, 65.31% gc time)

@time f2(X, 1, 5)
elapsed time: 53.146697892 seconds (22368945936 bytes allocated, 41.76% gc time)

@time f3(X, 1, 5)
elapsed time: 0.142597211 seconds (80 bytes allocated)


I assume the sub() function in v0.4 is substantially different.
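
For anyone reading this with a later Julia release, here is a minimal sketch of the view-based and loop-based swaps. It rests on assumptions that are not part of the benchmarks in this thread: it targets Julia 1.x, where `sub` no longer exists and `view` takes its place, and the names `swap_view!` and `swap_loop!` are made up for illustration. One more caveat: with views, `a[:], b[:] = b, a` reads column i only after it has already been overwritten, so it does not appear to perform a true swap; the sketch uses an explicit temporary instead.

```julia
# Sketch for Julia 1.x (an assumption; the thread itself uses 0.3/0.4).
# `swap_view!` and `swap_loop!` are hypothetical names, not from the thread.

# Swap columns i and j through views. One explicit copy avoids aliasing:
# without it, `a .= b` would destroy column i before it is read back.
function swap_view!(X, i, j)
    a = view(X, :, i)
    b = view(X, :, j)
    tmp = copy(a)
    a .= b
    b .= tmp
    return X
end

# Swap columns i and j element by element, as f3 does; no temporaries.
function swap_loop!(X, i, j)
    @inbounds for k in axes(X, 1)
        X[k, i], X[k, j] = X[k, j], X[k, i]
    end
    return X
end

X = rand(100_000, 5)
Y = copy(X)
swap_view!(X, 1, 5)
swap_loop!(Y, 1, 5)
```

Both functions should leave the matrix in the same state; the loop version is the one that avoids allocating a temporary column.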

Thanks,
Jan

On Friday, 13 March 2015 at 10:35:45 UTC+1, Ján Dolinský wrote:
>
> Hi Milan,
>
> Did you run your benchmarks on 0.4?
>
> Thanks,
> Jan
>
> On Thursday, 12 March 2015 at 19:19:08 UTC+1, Milan Bouchet-Valat wrote:
>>
>> On Thursday, 12 March 2015 at 11:01 -0500, Tim Holy wrote:
>> > This is something that many people (understandably) have a hard time
>> > appreciating, so I think this post should be framed and put up on the
>> > Julia wall.
>> >
>> > We go to considerable lengths to try to make code work efficiently in
>> > the general case (check out subarray.jl and subarray2.jl in master some
>> > time...), but sometimes there's no competing with a hand-rolled version
>> > for a particular case. Folks should not be shy to implement such tricks
>> > in their own code.
>>
>> Though with the new array views in 0.4, the vectorized version should be 
>> more efficient than in 0.3. I've tried it, and indeed it looks like 
>> unrolling is not really needed, though it's still faster and uses less 
>> RAM: 
>>
>> X = rand(100_000, 5) 
>>
>> function f1(X, i, j) 
>>     for _ in 1:1000 
>>         X[:, i], X[:, j] = X[:, j], X[:, i] 
>>     end 
>> end 
>>
>> function f2(X, i, j) 
>>     for _ in 1:1000 
>>         a = sub(X, :, i) 
>>         b = sub(X, :, j) 
>>         a[:], b[:] = b, a 
>>     end 
>> end 
>>
>> function f3(X, i, j) 
>>     for _ in 1:1000 
>>         @inbounds for k in 1:size(X, 1) 
>>             X[k, i], X[k, j] = X[k, j], X[k, i] 
>>         end 
>>     end 
>> end 
>>
>>
>> julia> f1(X, 1, 5); f2(X, 1, 5); f3(X, 1, 5); 
>>
>> julia> @time f1(X, 1, 5) 
>> elapsed time: 1.027090951 seconds (1526 MB allocated, 3.63% gc time in 69 pauses with 0 full sweep)
>>
>> julia> @time f2(X, 1, 5) 
>> elapsed time: 0.172375013 seconds (390 kB allocated) 
>>
>> julia> @time f3(X, 1, 5) 
>> elapsed time: 0.155069259 seconds (80 bytes allocated) 
>>
>>
>> Regards 
>>
>> > --Tim 
>> > 
>> > On Thursday, March 12, 2015 07:49:49 AM Steven G. Johnson wrote: 
>> > > As a general rule, with Julia one needs to unlearn the instinct (from
>> > > Matlab or Python) that "efficiency == clever use of library
>> > > functions", which turns all optimization questions into "is there a
>> > > built-in function for X" (and if the answer is "no" you are out of
>> > > luck). Loops are fast, and you can easily beat general-purpose library
>> > > functions with your own special-purpose code.
>>
>>
