Gotcha - that does fix the problem. How does changing the line

    K += Ds[:,:,i] * exp(-2 * theta[i+1]);

to

    K[:,:] += Ds[:,:,i] * exp(-2 * theta[i+1]);

differ? Until I tell the function to refer to a specific piece of K, the
function doesn't do any work on K? Still confused.

On Tuesday, May 13, 2014 6:50:11 PM UTC-5, John Myles White wrote:
>
> Since K is an array, you need to do K[:] to talk about its contents. At
> the moment, you are rebinding K to new values on each iteration, which
> won't mutate the original value.
>
> -- John
>
> On May 13, 2014, at 4:48 PM, Thomas Covert <[email protected]> wrote:
>
>> I've now realized an even more confusing problem: the Julia function
>> make_K!() that I wrote doesn't seem to actually mutate its third
>> argument, but I can't figure out why. After a call to make_K!(), the
>> storage variable K still seems to be zeroed out. Any idea why this is?
>>
>> On Tuesday, May 13, 2014 5:43:41 PM UTC-5, John Myles White wrote:
>>>
>>> Without timing it, one potential performance issue is that taking
>>> slices of arrays makes copies rather than views. If you calculate
>>> things like column-wise distances without making copies, you should
>>> see a speedup.
>>>
>>> But I think you'll need profiling to get the complete answer.
>>>
>>> -- John
>>>
>>> On May 13, 2014, at 3:36 PM, Thomas Covert <[email protected]> wrote:
>>>
>>>> I'm trying to wean myself off MATLAB and am trying to figure out how
>>>> to do things the Julian way by porting some code that fits a Gaussian
>>>> process. Something my code does over and over again while fitting the
>>>> GP is compute the value of a covariance function - more or less a
>>>> matrix of pairwise distances passed through a kernel function.
>>>>
>>>> I'd love for my Julia code to run as fast as (or faster than) what I'm
>>>> currently attaining in MATLAB, but right now this isn't happening.
>>>> Below I've attached Julia and MATLAB code to do a simplified version
>>>> of the task: (1) compute 3 matrices of the pairwise distances between
>>>> all points in a vector, for 3 vectors - call these M1, M2 and M3;
>>>> (2) compute exp(theta0 - .5 * (M1*exp(-2*theta1) + M2*exp(-2*theta2)
>>>> + M3*exp(-2*theta3))). The 2nd step is the one that needs to be
>>>> really fast, since I'm ultimately trying to estimate the thetas, so
>>>> the second step gets called many times.
>>>>
>>>> My MATLAB code is certainly not optimized, but I've been fiddling
>>>> with the Julia code for a while, trying devectorization tricks all
>>>> day. On my machine (mid-2009 MacBook Pro), the MATLAB code (including
>>>> memory allocation, etc.) runs in about 1.2 seconds, with the key part
>>>> (converting the distance matrices into a covariance function
>>>> evaluation) taking about .5 seconds. The Julia code is slower: just
>>>> the function evaluation of the second piece takes 1.2 seconds.
>>>>
>>>> Julia experts, what am I doing wrong? I assume it has something to do
>>>> with being smarter about the order of operations across my
>>>> multi-dimensional array, but I'm not sure exactly how. Note, the big
>>>> dimension of this problem (2699) is what I actually face in practice,
>>>> so it seemed relevant to test it here.
>>>>
>>>> Thanks in advance for any advice!
>>>>
>>>> MATLAB CODE
>>>>
>>>> % initialization
>>>> tic
>>>> da = randn(2699,3);
>>>> Ds = zeros(size(da,1),size(da,1),size(da,2));
>>>> toc
>>>>
>>>> % equivalent code for make_Ds!
>>>> tic
>>>> for i = 1:size(da,2)
>>>>     Ds(:,:,i) = bsxfun(@minus,da(:,i),da(:,i)') .^ 2;
>>>> end
>>>> toc
>>>>
>>>> % more initialization
>>>> tic
>>>> K = zeros(size(da,1),size(da,1));
>>>> theta = ones(5,1);
>>>> toc
>>>>
>>>> % equivalent code for make_K!
>>>> tic
>>>> for i = 1:size(Ds,3)
>>>>     K = K + Ds(:,:,i) * exp(-2 * theta(i + 1));
>>>> end
>>>> K = exp(-2 * theta(1) - .5 * K);
>>>> toc
>>>>
>>>> JULIA CODE
>>>>
>>>> using Distance
>>>>
>>>> ## FUNCTIONS ##
>>>> # this function should just fill out a 3-dimensional array of
>>>> # pairwise squared distances
>>>> function make_Ds!(da::Array{Float64},Ds::Array{Float64})
>>>>     for i = 1:size(da,2)
>>>>         Ds[:,:,i] = pairwise(SqEuclidean(),da[:,i]');
>>>>     end
>>>> end
>>>>
>>>> # this function adds up these squared differences, weights them by
>>>> # appropriate length scales, and passes them through a normal kernel
>>>> function make_K!(theta::Array{Float64},Ds::Array{Float64},K::Array{Float64})
>>>>     for i = 1:size(Ds,3)
>>>>         K += Ds[:,:,i] * exp(-2 * theta[i+1]);
>>>>     end
>>>>     K = exp(-2 * theta[1] .- .5 * K);
>>>> end
>>>>
>>>> ## CODE ##
>>>> # this code generates 3 std normal vectors, computes the pairwise
>>>> # distances between the points, and turns these pairwise distances
>>>> # into a covariance function value
>>>>
>>>> da = randn(2699,3);
>>>>
>>>> Ds = zeros(size(da,1),size(da,1),size(da,2));
>>>> make_Ds!(da,Ds);
>>>>
>>>> K = zeros(Float64,size(da,1),size(da,1));
>>>> make_K!(ones(5,1),Ds,K);
>>>>
>>>> # run things through a second time to avoid accounting for compile
>>>> # time
>>>> tic(); make_Ds!(da,Ds); toc()
>>>> K = zeros(Float64,size(da,1),size(da,1));
>>>> tic(); make_K!(ones(5,1),Ds,K); toc()
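[Editor's note] The rebinding-vs-mutation distinction John describes above can be seen in a minimal sketch. The function names `rebind!` and `mutate!` are illustrative, not from the thread:

```julia
# A minimal sketch of why `K = ...` inside a function does not affect the
# caller's array, while indexed assignment `K[:, :] = ...` does.

function rebind!(K)
    K = K .+ 1.0      # rebinds the *local* name K to a brand-new array;
    return nothing    # the caller's array is untouched
end

function mutate!(K)
    K[:, :] = K .+ 1.0  # indexed assignment writes into the caller's array
    return nothing
end

K1 = zeros(2, 2)
rebind!(K1)   # K1 is still all zeros
K2 = zeros(2, 2)
mutate!(K2)   # K2 is now all ones
```

The same logic applies to `+=`: `K += X` is sugar for `K = K + X`, a rebinding, whereas `K[:,:] += X` desugars to `K[:,:] = K[:,:] + X`, an indexed write into the existing array.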

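[Editor's note] Putting both of John's suggestions together - mutating K through indexed writes rather than rebinding it, and avoiding the copies that `Ds[:,:,i]` slices allocate - a `make_K!` along these lines actually updates its third argument. This is a sketch in present-day Julia syntax (the dot-broadcast operators postdate this 2014 thread), not the thread's final code:

```julia
# A make_K! sketch that genuinely mutates K: all updates go through
# indexed writes or broadcast-assignment, never plain rebinding, and the
# elementwise loop avoids allocating a copy for each Ds[:,:,i] slice.
function make_K!(theta, Ds, K)
    fill!(K, 0.0)                         # reset K in place
    n = size(K, 1)
    for i in 1:size(Ds, 3)
        w = exp(-2 * theta[i + 1])
        for c in 1:n, r in 1:n            # column-major order: c outer, r inner
            K[r, c] += Ds[r, c, i] * w    # indexed writes mutate K in place
        end
    end
    K .= exp.(-2 * theta[1] .- 0.5 .* K)  # .= writes through to K's storage
    return K
end
```

On current Julia the inner loop could also be written `K .+= @view(Ds[:, :, i]) .* w`, using a view instead of a copying slice.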