[julia-users] speed question from a recovering MATLAB addict

Thomas Covert Tue, 13 May 2014 15:39:33 -0700

I'm trying to wean myself off MATLAB and am trying to figure out how to do 
things the julian way by porting some code that fits a Gaussian process. 
 Something my code does over and over again while fitting the GP is compute 
the value of a covariance function - more or less a matrix of pairwise 
distances passed through a kernel function.


I'd love for my julia code to run as fast (or faster) than what I'm 
currently attaining in MATLAB, but right now this isn't happening.  below 
I've attached julia and MATLAB code to do a simplified version of the task: 
(1) compute 3 matrices of the pairwise distances between all points in a 
vector for 3 vectors - call these M1, M2 and M3 (2) compute exp(theta0 - .5 
* (M1*exp(-2*theta1) + M2*exp(-2*theta2) + M3*exp(-2*theta3))).  The 2nd 
step is the one that needs to be really fast, since I'm ultimately trying 
to estimate the thetas, so the second step gets called many times.

My MATLAB code is certainly not optimized, but I've been fiddling with the 
Julia code for a while, trying devectorization tricks all day.  On my 
machine (mid-2009 macbook pro), the MATLAB code (including memory 
allocation, etc) runs in about 1.2 seconds, with the key part (converting 
the distance matrices into a covariance function evaluation) taking about 
.5 seconds.  The julia code is slower: just the function evaluation of the 
second piece takes 1.2 seconds.

Julia experts, what am I doing wrong?  I assume it has something to do with 
being smarter about the order of operations across my multi-dimensional 
array, but I'm not sure exactly how.  Note, the big dimension of this 
problem (2699) is what I actually face in practice, so it seemed relevant 
to test it here.

Thanks in advance for any advice!

MATLAB CODE
% initialization
tic
da = randn(2699,3);
Ds = zeros(size(da,1),size(da,1),size(da,2));
toc

% equivalent code for make_Ds!
tic
for i = 1:size(da,2)
    Ds(:,:,i) = bsxfun(@minus,da(:,i),da(:,i)') .^ 2;
end
toc

% more initialization
tic
K = zeros(size(da,1),size(da,1));
theta = ones(5,1);
toc

% equivalent code for make_K!
tic
for i = 1:size(Ds,3)
    K = K + Ds(:,:,i) * exp(-2 * theta(i + 1));
end
K = exp(-2 * theta(1) - .5 * K);
toc

JULIA CODE
using Distance

## FUNCTIONS ##
# this function should just fill out a 3-dimensional array of pairwise
# squared distances
function make_Ds!(da::Array{Float64},Ds::Array{Float64})
    for i = 1:size(da,2)
        Ds[:,:,i] = pairwise(SqEuclidean(),da[:,i]');
    end
end

# this function adds up these squared differences, weights them by 
appropriate
# length scales, and passes them through a normal kernel
function make_K!(theta::Array{Float64},Ds::Array{Float64},K::Array{Float64})
    for i = 1:size(Ds,3)
        K += Ds[:,:,i] * exp(-2 * theta[i+1]);
    end
    K = exp(-2 * theta[1] .- .5 * K);
end

## CODE ##
# this code generates 3 std normal vectors, computes the pairwise distances
# between the points, and turns these pairwise distances into a covariance
# function value

da = randn(2699,3);

Ds = zeros(size(da,1),size(da,1),size(da,2));
make_Ds!(da,Ds);

K = zeros(Float64,size(da,1),size(da,1));
make_K!(ones(5,1),Ds,K);

# run things through a second time to avoid accounting for compiler time
tic(); make_Ds!(da,Ds); toc()
K = zeros(Float64,size(da,1),size(da,1));
tic(); make_K!(ones(5,1),Ds,K); toc()

[julia-users] speed question from a recovering MATLAB addict

Reply via email to