Since K is an array, you need to assign into K[:] to modify its contents. At the moment, you are rebinding K to a new value on each iteration, which doesn't mutate the original array.
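Here's a minimal sketch of the difference (the function names are just illustrative, not from your code):

```julia
# Sketch: rebinding vs. in-place mutation of an array argument.
function rebind!(K::Vector{Float64})
    K = K .+ 1.0      # rebinds the local name K; the caller's array is untouched
    return nothing
end

function mutate!(K::Vector{Float64})
    K[:] = K .+ 1.0   # assigns through K[:], writing into the caller's array
    return nothing
end

K = zeros(3)
rebind!(K)    # K is still all zeros
mutate!(K)    # K is now all ones
```

Note that in your make_K!, both the accumulation (`K += ...`) and the final `K = exp(...)` rebind the local name, so both would need to write through the array.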
-- John

On May 13, 2014, at 4:48 PM, Thomas Covert <[email protected]> wrote:

> I've now realized an even more confusing problem: the Julia function
> make_K!() that I wrote doesn't seem to actually mutate its third argument,
> but I can't figure out why. After a call to make_K!(), the storage
> variable K still seems to be zeroed out. Any idea why this is?
>
> On Tuesday, May 13, 2014 5:43:41 PM UTC-5, John Myles White wrote:
> Without timing it, one potential performance issue is that taking slices
> of arrays makes copies rather than views. If you calculate things like
> column-wise distances without making copies, you should see a speedup.
>
> But I think you'll need profiling to get the complete answer.
>
> -- John
>
> On May 13, 2014, at 3:36 PM, Thomas Covert <[email protected]> wrote:
>
> > I'm trying to wean myself off MATLAB and am trying to figure out how to
> > do things the Julian way by porting some code that fits a Gaussian
> > process. Something my code does over and over again while fitting the
> > GP is compute the value of a covariance function - more or less a
> > matrix of pairwise distances passed through a kernel function.
> >
> > I'd love for my Julia code to run as fast as (or faster than) what I'm
> > currently attaining in MATLAB, but right now this isn't happening.
> > Below I've attached Julia and MATLAB code for a simplified version of
> > the task: (1) compute 3 matrices of the pairwise distances between all
> > points in a vector, for 3 vectors - call these M1, M2 and M3; (2)
> > compute exp(theta0 - .5 * (M1*exp(-2*theta1) + M2*exp(-2*theta2) +
> > M3*exp(-2*theta3))). The 2nd step is the one that needs to be really
> > fast, since I'm ultimately trying to estimate the thetas, so the second
> > step gets called many times.
> >
> > My MATLAB code is certainly not optimized, but I've been fiddling with
> > the Julia code for a while, trying devectorization tricks all day.
> > On my machine (mid-2009 MacBook Pro), the MATLAB code (including memory
> > allocation, etc.) runs in about 1.2 seconds, with the key part
> > (converting the distance matrices into a covariance function
> > evaluation) taking about .5 seconds. The Julia code is slower: just the
> > function evaluation of the second piece takes 1.2 seconds.
> >
> > Julia experts, what am I doing wrong? I assume it has something to do
> > with being smarter about the order of operations across my
> > multi-dimensional array, but I'm not sure exactly how. Note, the big
> > dimension of this problem (2699) is what I actually face in practice,
> > so it seemed relevant to test it here.
> >
> > Thanks in advance for any advice!
> >
> > MATLAB CODE
> > % initialization
> > tic
> > da = randn(2699,3);
> > Ds = zeros(size(da,1),size(da,1),size(da,2));
> > toc
> >
> > % equivalent code for make_Ds!
> > tic
> > for i = 1:size(da,2)
> >     Ds(:,:,i) = bsxfun(@minus,da(:,i),da(:,i)') .^ 2;
> > end
> > toc
> >
> > % more initialization
> > tic
> > K = zeros(size(da,1),size(da,1));
> > theta = ones(5,1);
> > toc
> >
> > % equivalent code for make_K!
> > tic
> > for i = 1:size(Ds,3)
> >     K = K + Ds(:,:,i) * exp(-2 * theta(i + 1));
> > end
> > K = exp(-2 * theta(1) - .5 * K);
> > toc
> >
> > JULIA CODE
> > using Distance
> >
> > ## FUNCTIONS ##
> > # this function should just fill out a 3-dimensional array of pairwise
> > # squared distances
> > function make_Ds!(da::Array{Float64},Ds::Array{Float64})
> >     for i = 1:size(da,2)
> >         Ds[:,:,i] = pairwise(SqEuclidean(),da[:,i]');
> >     end
> > end
> >
> > # this function adds up these squared differences, weights them by
> > # appropriate length scales, and passes them through a normal kernel
> > function make_K!(theta::Array{Float64},Ds::Array{Float64},K::Array{Float64})
> >     for i = 1:size(Ds,3)
> >         K += Ds[:,:,i] * exp(-2 * theta[i+1]);
> >     end
> >     K = exp(-2 * theta[1] .- .5 * K);
> > end
> >
> > ## CODE ##
> > # this code generates 3 std normal vectors, computes the pairwise
> > # distances between the points, and turns these pairwise distances into
> > # a covariance function value
> >
> > da = randn(2699,3);
> >
> > Ds = zeros(size(da,1),size(da,1),size(da,2));
> > make_Ds!(da,Ds);
> >
> > K = zeros(Float64,size(da,1),size(da,1));
> > make_K!(ones(5,1),Ds,K);
> >
> > # run things through a second time to avoid accounting for compiler time
> > tic(); make_Ds!(da,Ds); toc()
> > K = zeros(Float64,size(da,1),size(da,1));
> > tic(); make_K!(ones(5,1),Ds,K); toc()
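For what it's worth, here is one way make_K! could be rewritten so that it both mutates K in place and avoids the slice copies from Ds[:,:,i]. This is only a sketch (the name make_K_inplace! and the explicit loops are my own, and I haven't benchmarked it), but it computes the same quantity as the quoted code:

```julia
# Sketch: fill K in place with exp(-2*theta[1] - 0.5 * sum_k Ds[:,:,k]*exp(-2*theta[k+1])),
# indexing Ds elementwise so no temporary slice copies are allocated.
function make_K_inplace!(theta::Vector{Float64},
                         Ds::Array{Float64,3},
                         K::Matrix{Float64})
    n = size(K, 1)
    w = [exp(-2 * theta[k+1]) for k in 1:size(Ds, 3)]  # length-scale weights
    c = exp(-2 * theta[1])
    for j in 1:n, i in 1:n
        s = 0.0
        for k in 1:size(Ds, 3)
            s += Ds[i, j, k] * w[k]      # no slice copies here
        end
        K[i, j] = c * exp(-0.5 * s)      # writes through K into the caller's array
    end
    return K
end
```

Since c * exp(-0.5 * s) equals exp(-2*theta[1] - 0.5*s), this matches the final line of the quoted make_K!, but the result lands in the K the caller passed in.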
