Without timing it, one likely performance issue is that taking slices of 
arrays (e.g. da[:,i] or Ds[:,:,i]) makes copies rather than views. If you 
compute things like the column-wise distances without making those copies, 
you should see a speedup.
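For example, here is a minimal sketch of filling Ds by indexing scalar elements directly, so that neither da[:,i] nor a full pairwise() result matrix is ever materialized (the name make_Ds2! is just for illustration, not from your code):

```julia
# Fill Ds[:,:,k] with the squared pairwise distances of column k of da,
# indexing scalars directly so no slice copies are allocated.
# Inner loop runs over i (the first index) to match column-major storage.
function make_Ds2!(da::Array{Float64}, Ds::Array{Float64})
    n = size(da, 1)
    for k = 1:size(da, 2)
        for j = 1:n
            for i = 1:n
                d = da[i, k] - da[j, k]
                Ds[i, j, k] = d * d
            end
        end
    end
    return Ds
end
```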

But I think you'll need profiling to get the complete answer.
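That said, the make_K! step can probably be fused into a single pass: hoist the exp(-2 * theta[i]) weights out of the loop and write each K[i,j] exactly once, which avoids both the Ds[:,:,i] slice copies and the whole-matrix temporaries. (As a side note, your make_K! rebinds the local K at the final exp line, so the caller's K is never updated; writing in place fixes that too.) A sketch, with make_K2! as an illustrative name and the three kernel dimensions hard-coded:

```julia
# Fused covariance evaluation: K[i,j] = exp(-2*theta[1] - 0.5 * sum_k Ds[i,j,k] * w_k)
# with w_k = exp(-2*theta[k+1]) precomputed, so each element is written once
# and no temporary matrices are allocated.
function make_K2!(theta::Array{Float64}, Ds::Array{Float64}, K::Array{Float64})
    n = size(K, 1)
    w1 = exp(-2 * theta[2])
    w2 = exp(-2 * theta[3])
    w3 = exp(-2 * theta[4])
    for j = 1:n
        for i = 1:n
            s = Ds[i, j, 1] * w1 + Ds[i, j, 2] * w2 + Ds[i, j, 3] * w3
            K[i, j] = exp(-2 * theta[1] - 0.5 * s)
        end
    end
    return K
end
```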

 -- John

On May 13, 2014, at 3:36 PM, Thomas Covert <[email protected]> wrote:

> I'm trying to wean myself off MATLAB and am trying to figure out how to do 
> things the julian way by porting some code that fits a Gaussian process.  
> Something my code does over and over again while fitting the GP is compute 
> the value of a covariance function - more or less a matrix of pairwise 
> distances passed through a kernel function.
> 
> I'd love for my Julia code to run as fast as (or faster than) what I'm currently 
> attaining in MATLAB, but right now this isn't happening.  Below I've attached 
> julia and MATLAB code to do a simplified version of the task: (1) compute 3 
> matrices of the pairwise distances between all points in a vector for 3 
> vectors - call these M1, M2 and M3 (2) compute exp(theta0 - .5 * 
> (M1*exp(-2*theta1) + M2*exp(-2*theta2) + M3*exp(-2*theta3))).  The 2nd step 
> is the one that needs to be really fast, since I'm ultimately trying to 
> estimate the thetas, so the second step gets called many times.
> 
> My MATLAB code is certainly not optimized, but I've been fiddling with the 
> Julia code for a while, trying devectorization tricks all day.  On my machine 
> (a mid-2009 MacBook Pro), the MATLAB code (including memory allocation, etc.) 
> runs in about 1.2 seconds, with the key part (converting the distance 
> matrices into a covariance function evaluation) taking about .5 seconds.  The 
> Julia code is slower: just the function evaluation of the second piece takes 
> 1.2 seconds.
> 
> Julia experts, what am I doing wrong?  I assume it has something to do with 
> being smarter about the order of operations across my multi-dimensional 
> array, but I'm not sure exactly how.  Note, the big dimension of this problem 
> (2699) is what I actually face in practice, so it seemed relevant to test it 
> here.
> 
> Thanks in advance for any advice!
> 
> MATLAB CODE
> % initialization
> tic
> da = randn(2699,3);
> Ds = zeros(size(da,1),size(da,1),size(da,2));
> toc
> 
> % equivalent code for make_Ds!
> tic
> for i = 1:size(da,2)
>     Ds(:,:,i) = bsxfun(@minus,da(:,i),da(:,i)') .^ 2;
> end
> toc
> 
> % more initialization
> tic
> K = zeros(size(da,1),size(da,1));
> theta = ones(5,1);
> toc
> 
> % equivalent code for make_K!
> tic
> for i = 1:size(Ds,3)
>     K = K + Ds(:,:,i) * exp(-2 * theta(i + 1));
> end
> K = exp(-2 * theta(1) - .5 * K);
> toc
> 
> JULIA CODE
> using Distance
> 
> ## FUNCTIONS ##
> # this function should just fill out a 3-dimensional array of pairwise
> # squared distances
> function make_Ds!(da::Array{Float64},Ds::Array{Float64})
>     for i = 1:size(da,2)
>         Ds[:,:,i] = pairwise(SqEuclidean(),da[:,i]');
>     end
> end
> 
> # this function adds up these squared differences, weights them by appropriate
> # length scales, and passes them through a normal kernel
> function make_K!(theta::Array{Float64},Ds::Array{Float64},K::Array{Float64})
>     for i = 1:size(Ds,3)
>         K += Ds[:,:,i] * exp(-2 * theta[i+1]);
>     end
>     K = exp(-2 * theta[1] .- .5 * K);
> end
> 
> ## CODE ##
> # this code generates 3 std normal vectors, computes the pairwise distances
> # between the points, and turns these pairwise distances into a covariance
> # function value
> 
> da = randn(2699,3);
> 
> Ds = zeros(size(da,1),size(da,1),size(da,2));
> make_Ds!(da,Ds);
> 
> K = zeros(Float64,size(da,1),size(da,1));
> make_K!(ones(5,1),Ds,K);
> 
> # run things through a second time to avoid accounting for compiler time
> tic(); make_Ds!(da,Ds); toc()
> K = zeros(Float64,size(da,1),size(da,1));
> tic(); make_K!(ones(5,1),Ds,K); toc()
> 
