I recommend you read through this blog post:
http://julialang.org/blog/2013/09/fast-numeric/
and maybe
http://www.johnmyleswhite.com/notebook/2013/12/22/the-relationship-between-vectorized-and-devectorized-code/
too while you're at it. Just replace "R" with "Octave" or "Matlab" and the 
conclusion is basically the same.

The macro you were looking for was probably @devec from the Devectorize.jl 
package.
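For reference, a minimal sketch of how @devec is typically used (this assumes the Devectorize.jl package is installed; syntax as of the Julia 0.2/0.3 era):

```julia
using Devectorize

X = rand(7000, 7000)

# Without @devec, sum(X .^ 2) first allocates a full temporary array
# for X .^ 2 and then sums it. @devec rewrites the expression into a
# single devectorized loop, avoiding the temporary entirely:
@devec r = sum(X .^ 2)
```

The macro works on whole assignment expressions like the one above, which is why it has to be spelled `@devec r = ...` rather than applied to the right-hand side alone.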

If you don't actually need the intermediate matrix full of the 49 million 
squared values, it's a huge waste of resources to allocate memory to store 
them all. If you upgrade to Julia 0.3.0, there's a function sumabs2(x, dim) 
that computes sum(abs(x).^2, dim) without allocating an intermediate 
temporary array for abs(x).^2. On my laptop, sumabs2(X, 1) is about 6 times 
faster and allocates hardly any memory at all, compared with sum(X.*X, 1), 
which needs to allocate almost 400 MB.
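Concretely, the comparison looks like this (Julia 0.3 syntax; the exact timings will of course vary by machine):

```julia
X = rand(7000, 7000)

# Allocates a large (~392 MB) temporary for X .* X before summing:
@time s1 = sum(X .* X, 1)

# Computes the same column-wise sums of squares with no big temporary:
@time s2 = sumabs2(X, 1)
```

Both calls return a 1x7000 row of column-wise sums of squares; only the memory and time differ.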


On Monday, September 8, 2014 1:36:02 AM UTC-7, Ján Dolinský wrote:
>
> Hello,
>
> I am a new Julia user. I am trying to write a function for computing 
> "self" dot product of all columns in a matrix, i.e. calculating a square of 
> each element of a matrix and computing a column-wise sum. I am interested 
> in a proper way of doing it because I often need to process large matrices.
>
> I first focused on calculating the squares. For testing purposes I use a 
> matrix of random floats of size 7000x7000. All timings here are taken 
> after several repeated runs.
>
> I used to do it in Octave (v3.8.1) as follows:
> tic; X = rand(7000); toc;
> Elapsed time is 0.579093 seconds.
> tic; XX = X.^2; toc;
> Elapsed time is 0.114737 seconds.
>
>
> I tried to do the same in Julia (v0.2.1):
> @time X = rand(7000,7000);
> elapsed time: 0.114418731 seconds (392000128 bytes allocated)
> @time XX = X.^2;
> elapsed time: 0.369641268 seconds (392000224 bytes allocated)
>
> I was surprised to see that Julia is about 3 times slower when calculating 
> a square than my original routine in Octave. I then read "Performance tips" 
> and found out that one should use * instead of raising to small integer 
> powers, for example x*x*x instead of x^3. I therefore tested the 
> following.
> @time XX = X.*X;
> elapsed time: 0.146059577 seconds (392000968 bytes allocated)
>
> This approach indeed resulted in a much shorter computing time. It is 
> still, however, a little slower than my code in Octave. Can someone advise 
> on any performance tips?
>
> Finally, I do a sum over all columns of XX to get the "self" dot product, 
> but first I'd like to fix the squaring part.
>
> Thanks a lot. 
> Best Regards,
> Jan 
>
> p.s. In the Julia manual I found a while ago an example of using a 
> @vectorize macro with a squaring function but cannot find it anymore. 
> Perhaps the name of the macro was different ... 
>   
>
