Without timing it, one potential performance issue is that taking slices of arrays makes copies rather than views. If you calculate things like column-wise distances without making copies, you should see a speedup.
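Concretely, here is the kind of devectorized rewrite I have in mind — just a sketch, and `make_K2!` is a name I made up; I haven't benchmarked it on your problem size. It indexes `Ds` directly instead of taking `Ds[:,:,i]` slices, so no temporary matrices get allocated, and it writes the result into `K` in place:

```julia
# Sketch of a devectorized make_K! (untested; make_K2! is a made-up name).
# Indexing Ds elementwise avoids the temporary copies that slicing creates.
function make_K2!(theta, Ds, K)
    n, m = size(Ds, 1), size(Ds, 3)
    w = [exp(-2 * theta[i + 1]) for i = 1:m]  # precompute length-scale weights
    fill!(K, 0.0)
    # accumulate the weighted squared distances slab by slab
    # (k in the outermost loop matches Ds's column-major memory layout)
    for k = 1:m, j = 1:n, i = 1:n
        K[i, j] += Ds[i, j, k] * w[k]
    end
    # apply the kernel in place
    for j = 1:n, i = 1:n
        K[i, j] = exp(-2 * theta[1] - 0.5 * K[i, j])
    end
    return K
end
```

Since this mutates `K` through indexed assignment rather than rebinding it, repeated calls during your theta estimation can reuse one preallocated `K` with no per-call allocation.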
But I think you'll need profiling to get the complete answer.

-- John

On May 13, 2014, at 3:36 PM, Thomas Covert <[email protected]> wrote:

> I'm trying to wean myself off MATLAB and am trying to figure out how to do
> things the julian way by porting some code that fits a Gaussian process.
> Something my code does over and over again while fitting the GP is compute
> the value of a covariance function - more or less a matrix of pairwise
> distances passed through a kernel function.
>
> I'd love for my julia code to run as fast as (or faster than) what I'm
> currently attaining in MATLAB, but right now this isn't happening. Below
> I've attached julia and MATLAB code to do a simplified version of the task:
> (1) compute 3 matrices of the pairwise distances between all points in a
> vector, for 3 vectors - call these M1, M2 and M3; (2) compute
> exp(theta0 - .5 * (M1*exp(-2*theta1) + M2*exp(-2*theta2) + M3*exp(-2*theta3))).
> The 2nd step is the one that needs to be really fast, since I'm ultimately
> trying to estimate the thetas, so the second step gets called many times.
>
> My MATLAB code is certainly not optimized, but I've been fiddling with the
> Julia code for a while, trying devectorization tricks all day. On my machine
> (mid-2009 MacBook Pro), the MATLAB code (including memory allocation, etc.)
> runs in about 1.2 seconds, with the key part (converting the distance
> matrices into a covariance function evaluation) taking about .5 seconds.
> The Julia code is slower: just the function evaluation of the second piece
> takes 1.2 seconds.
>
> Julia experts, what am I doing wrong? I assume it has something to do with
> being smarter about the order of operations across my multi-dimensional
> array, but I'm not sure exactly how. Note, the big dimension of this
> problem (2699) is what I actually face in practice, so it seemed relevant
> to test it here.
>
> Thanks in advance for any advice!
>
> MATLAB CODE
>
> % initialization
> tic
> da = randn(2699,3);
> Ds = zeros(size(da,1),size(da,1),size(da,2));
> toc
>
> % equivalent code for make_Ds!
> tic
> for i = 1:size(da,2)
>     Ds(:,:,i) = bsxfun(@minus,da(:,i),da(:,i)') .^ 2;
> end
> toc
>
> % more initialization
> tic
> K = zeros(size(da,1),size(da,1));
> theta = ones(5,1);
> toc
>
> % equivalent code for make_K!
> tic
> for i = 1:size(Ds,3)
>     K = K + Ds(:,:,i) * exp(-2 * theta(i + 1));
> end
> K = exp(-2 * theta(1) - .5 * K);
> toc
>
> JULIA CODE
>
> using Distance
>
> ## FUNCTIONS ##
> # this function should just fill out a 3-dimensional array of pairwise
> # squared distances
> function make_Ds!(da::Array{Float64},Ds::Array{Float64})
>     for i = 1:size(da,2)
>         Ds[:,:,i] = pairwise(SqEuclidean(),da[:,i]');
>     end
> end
>
> # this function adds up these squared differences, weights them by
> # appropriate length scales, and passes them through a normal kernel
> function make_K!(theta::Array{Float64},Ds::Array{Float64},K::Array{Float64})
>     for i = 1:size(Ds,3)
>         K += Ds[:,:,i] * exp(-2 * theta[i+1]);
>     end
>     K = exp(-2 * theta[1] .- .5 * K);
> end
>
> ## CODE ##
> # this code generates 3 std normal vectors, computes the pairwise distances
> # between the points, and turns these pairwise distances into a covariance
> # function value
>
> da = randn(2699,3);
>
> Ds = zeros(size(da,1),size(da,1),size(da,2));
> make_Ds!(da,Ds);
>
> K = zeros(Float64,size(da,1),size(da,1));
> make_K!(ones(5,1),Ds,K);
>
> # run things through a second time to avoid accounting for compiler time
> tic(); make_Ds!(da,Ds); toc()
> K = zeros(Float64,size(da,1),size(da,1));
> tic(); make_K!(ones(5,1),Ds,K); toc()
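P.S. One thing I noticed in the Julia version, separate from the speed question: `make_K!` never actually mutates the `K` you pass in. Both `K += ...` and `K = exp(...)` rebind the local variable `K` to a freshly allocated array, so the caller's `K` is left untouched (your timing still measures the work, but the `!` convention is misleading here). To mutate the caller's array you need indexed assignment. A tiny illustration (my own example, not your code):

```julia
# Rebinding vs. in-place assignment inside a function.
function rebinds!(K)
    K = K .+ 1      # allocates a new array; the caller's K is unchanged
end

function mutates!(K)
    K[:] = K .+ 1   # indexed assignment writes into the caller's array
end
```

After `K = zeros(2,2)`, calling `rebinds!(K)` leaves `K` all zeros, while `mutates!(K)` makes it all ones.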
