Finally, I found that Octave has an equivalent to sumabs2() called sumsq().
Just for the sake of completeness, here are the timings:

Octave
X = rand(7000);
tic; sumsq(X); toc;
Elapsed time is 0.0616651 seconds.

Julia v0.3
@time X = rand(7000,7000);
elapsed time: 0.285218597 seconds (392000160 bytes allocated)
@time sumabs2(X, 1);
elapsed time: 0.05705666 seconds (56496 bytes allocated)


Essentially, the speed is about the same, with Julia being a little faster.
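For completeness, the reduction sumabs2 performs can also be written as an explicit loop, which squares and sums in one pass and avoids allocating the X.^2 temporary; a minimal sketch (the helper name colsumsq is my own):

```julia
# Column-wise sum of squares without allocating X.^2,
# equivalent to sumabs2(X, 1) / Octave's sumsq(X).
function colsumsq(X::AbstractMatrix)
    m, n = size(X)
    d = zeros(eltype(X), 1, n)
    for j in 1:n
        s = zero(eltype(X))
        for i in 1:m
            s += abs2(X[i, j])   # abs2(x) == x*x for reals
        end
        d[j] = s
    end
    return d
end
```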

It was, however, interesting to observe that X = rand(7000,7000)
is about 2.5 times slower in Julia 0.3 than it was in Julia 0.2:

in Julia (v0.2.1):
 @time X = rand(7000,7000);
elapsed time: 0.114418731 seconds (392000128 bytes allocated)
 

Jan

On Tuesday, 9 September 2014 17:06:59 UTC+2, Ján Dolinský wrote:
>
>  Hello Andreas,
>
> Thanks for the tip. I'll check it out. Thumbs up for the 0.4!
>
> Jan
>
>  On 09.09.2014 17:04, Andreas Noack wrote:
>  
> If you need the speed now you can try one of the packages ArrayViews or 
> ArrayViewsAPL. It is something similar to the functionality in these 
> packages that we are trying to include in Base.
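In later Julia versions this functionality did land in Base as view/@view; a minimal sketch of the no-copy column slice (sizes chosen arbitrarily):

```julia
X = rand(100, 100)
# view returns a lightweight SubArray that shares X's memory,
# so the selected columns are not copied.
Xv = view(X, :, 51:100)
# Column-wise sum of squares over the view; no large temporary.
d = sum(abs2, Xv, dims=1)
```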
>
>  Best regards
>
> Andreas Noack
>  
> 2014-09-09 9:38 GMT-04:00 Ján Dolinský:
>
>>  OK, so basically there is nothing wrong with the syntax X[:,1001:end] ?   
>>
>>  d = sumabs2(X[:,1001:end], 1);
>>  and I should just wait until v0.4 is available (perhaps soon in the 
>> Julia Nightlies PPA).
>>
>> I did the benchmark with the floating point power function based on 
>> Simon's comment. Here are my results (after a couple of repeated 
>> iterations):
>>  @time X.^2;
>> elapsed time: 0.511988142 seconds (392000256 bytes allocated, 2.52% gc 
>> time)
>> @time X.^2.0;
>> elapsed time: 0.411791612 seconds (392000256 bytes allocated, 3.12% gc 
>> time)
>>  
>> Thanks, 
>> Jan Dolinsky
>>
>>   On 09.09.2014 14:06, Andreas Noack wrote:
>>  
>> The problem is that right now X[:,1001:end] makes a copy of the array. 
>> However, in 0.4 this will instead be a view of the original matrix, and 
>> therefore the computing time should be almost the same. 
>>
>>  It might also be worth repeating Simon's comment that the floating 
>> point power function has special handling of 2. The result is that
>>
>> julia> @time A.^2;
>> elapsed time: 1.402791357 seconds (200000256 bytes allocated, 5.90% gc 
>> time)
>>
>> julia> @time A.^2.0;
>> elapsed time: 0.554241105 seconds (200000256 bytes allocated, 15.04% gc 
>> time) 
>>
>>  I tend to agree with Simon that special casing of integer 2 would be 
>> reasonable.
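That special casing did eventually happen: modern Julia lowers literal integer exponents through Base.literal_pow, so X.^2 no longer hits the generic floating-point power path. A quick check that the spellings agree:

```julia
X = rand(3, 3)
# All three spellings compute elementwise squares; on modern Julia
# the literal exponent in X.^2 is lowered to a multiplication.
@assert X.^2 == X .* X
@assert X.^2 == abs2.(X)
```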
>>  
>>  Best regards
>>
>> Andreas Noack
>>  
>> 2014-09-09 4:24 GMT-04:00 Ján Dolinský:
>>
>>> Hello guys,
>>>
>>> Thanks a lot for the lengthy discussions. They helped me a lot to get a 
>>> feel for what Julia is like. I did some more performance comparisons as 
>>> suggested by the first two posts (thanks a lot for the tips). In the 
>>> meantime I upgraded to v0.3.
>>>  X = rand(7000,7000);
>>> @time d = sum(X.^2, 1);
>>> elapsed time: 0.573125833 seconds (392056672 bytes allocated, 2.25% gc 
>>> time)
>>> @time d = sum(X.*X, 1);
>>> elapsed time: 0.178715901 seconds (392057080 bytes allocated, 14.06% gc 
>>> time)
>>> @time d = sumabs2(X, 1);
>>> elapsed time: 0.067431808 seconds (56496 bytes allocated)
>>>  
>>> In Octave then
>>>  X = rand(7000);
>>> tic; d = sum(X.^2); toc;
>>> Elapsed time is 0.167578 seconds.
>>>  
>>> So the ultimate solution is the sumabs2 function, which is a blast. I am 
>>> coming from Matlab/Octave and I would expect X.^2 to be fast "out of the 
>>> box", but nevertheless, if I can get excellent performance by learning 
>>> some new paradigms, I will go for it.
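A note for readers on modern Julia: sumabs2 was later deprecated, and the same fused reduction is now spelled sum(abs2, X, dims=1); a minimal sketch:

```julia
X = rand(4, 3)
# Modern equivalent of sumabs2(X, 1): squares and sums in one pass,
# without materializing X.^2. Returns a 1x3 matrix.
d = sum(abs2, X, dims=1)
```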
>>>
>>> The above tests lead me to another question. I often need to calculate 
>>> the "self" dot product over a portion of a matrix, e.g.
>>>  @time d = sumabs2(X[:,1001:end], 1);
>>> elapsed time: 0.175333366 seconds (336048688 bytes allocated, 7.01% gc 
>>> time)
>>>  
>>> Apparently this is not the way to do it in Julia, because working on a 
>>> smaller 7000x6000 matrix more than doubles the computing time, and 
>>> furthermore it seems to allocate unnecessary memory.
>>>
>>> Best Regards,
>>> Jan
>>>
>>>
>>>
>>> On Monday, 8 September 2014 10:36:02 UTC+2, Ján Dolinský wrote: 
>>>  
>>>> Hello,
>>>>
>>>> I am a new Julia user. I am trying to write a function for computing 
>>>> "self" dot product of all columns in a matrix, i.e. calculating a square 
>>>> of 
>>>> each element of a matrix and computing a column-wise sum. I am interested 
>>>> in a proper way of doing it because I often need to process large matrices.
>>>>
>>>> I first put the focus on calculating the squares. For testing purposes I 
>>>> use a matrix of random floats of size 7000x7000. All timings here were 
>>>> taken after several repeated runs.
>>>>
>>>> I used to do it in Octave (v3.8.1) as follows:
>>>>  tic; X = rand(7000); toc;
>>>> Elapsed time is 0.579093 seconds.
>>>> tic; XX = X.^2; toc;
>>>> Elapsed time is 0.114737 seconds.
>>>>  
>>>>
>>>> I tried to do the same in Julia (v0.2.1):
>>>>  @time X = rand(7000,7000);
>>>> elapsed time: 0.114418731 seconds (392000128 bytes allocated)
>>>> @time XX = X.^2;
>>>> elapsed time: 0.369641268 seconds (392000224 bytes allocated)
>>>>  
>>>> I was surprised to see that Julia is about 3 times slower when 
>>>> calculating a square than my original routine in Octave. I then read 
>>>> "Performance tips" and found out that one should use * instead of 
>>>> raising to small integer powers, for example x*x*x instead of x^3. I 
>>>> therefore tested the following.
>>>>  @time XX = X.*X;
>>>> elapsed time: 0.146059577 seconds (392000968 bytes allocated)
>>>>  
>>>> This approach indeed resulted in a much shorter computing time. It is, 
>>>> however, still a little slower than my code in Octave. Can someone 
>>>> advise on any performance tips?
>>>>
>>>> I then finally do a sum over all columns of XX to get the "self" dot 
>>>> product, but first I'd like to fix the squaring part.
>>>>
>>>> Thanks a lot. 
>>>> Best Regards,
>>>> Jan 
>>>>
>>>> P.S. A while ago I found in the Julia manual an example of using a 
>>>> @vectorize macro with a squaring function, but I cannot find it any 
>>>> more. Perhaps the name of the macro was different ... 
>>>>   
>>>>  
>>>    
>>  
>>    
>  
> 
