Here's the Java code.
import java.util.Random;

public class LeastSquaresError
{
    public static void main(String[] args)
    {
        int N = 100000;
        int K = 100;
        double rate = 1e-2;
        int ITERATIONS = 100;
        double[] y = new double[N];
        double[] x = new double[N*K];
        double[] w = new double[K];
        Random rand = new Random();
        for(int n=0;n<N;n++)
        {
            y[n] = rand.nextDouble();
            for(int k=0;k<K;k++)
            {
                x[n*K + k] = rand.nextDouble();
            }
        }
        for(int k=0;k<K;k++)
        {
            w[k] = 0.0;
        }
        long t1 = System.currentTimeMillis();
        for(int i=0;i<ITERATIONS;i++)
        {
            for(int n=0;n<N;n++)
            {
                double y_hat = 0.0;
                for(int k=0;k<K;k++)
                {
                    y_hat += w[k] * x[n*K + k];
                }
                for(int k=0;k<K;k++)
                {
                    w[k] += rate * (y[n] - y_hat) * x[n*K + k];
                }
            }
        }
        long t2 = System.currentTimeMillis();
        double elapsed = (double)(t2-t1)/1000.0;
        System.out.println(String.format("Time elapsed: %e", elapsed));
    }
}
On Sunday, April 27, 2014 2:46:19 PM UTC+8, Elliot Saba wrote:
>
> It might also help to see the equivalent Java code, to make sure that
> we're actually doing the same things. Ivar's comment about temporaries is
> spot on; sometimes it's the things about the language that we take for
> granted that are killing us performance-wise, so it's always best to make
> sure we're comparing apples to apples.
> -E
>
>
> On Sat, Apr 26, 2014 at 11:44 PM, Ivar Nesje wrote:
>
>> The key is to structure it more like a C/Java program and less like a
>> MATLAB script. MathWorks has made great efforts to be able to run poorly
>> structured programs fast. Julia focuses on generating fast machine code,
>> but we currently don't optimize well for the common case where global
>> variables don't change their type, so we end up doing the slow multiple
>> dispatch lookup at every step of the loop, instead of only once at compile
>> time.
>>
>> Solution: wrap the code in a function, so that Julia can analyze the
>> types.
>>
>> To get really high performance, it is worth noting that Julia doesn't have
>> a fast garbage collector. (Nobody really does, but many are apparently
>> faster than ours.) It will often be useful to reduce the number of
>> temporarily allocated objects, so that GC kicks in less often.
>>
>> Solution: devectorize your code and manipulate arrays in place, to reduce
>> the number of temporary arrays that are needed.
>>
>
>
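
For reference, Ivar's point about temporaries can be illustrated on the Java side as well. The following is only a minimal sketch, not part of the benchmark above: the class name InPlaceUpdateSketch and the methods stepAllocating and stepInPlace are hypothetical names introduced for illustration. The first method mimics a vectorized-style update that allocates a fresh K-element array per sample; the second performs the same arithmetic but overwrites w in place, which is what the inner loops of the Java benchmark already do.

```java
import java.util.Arrays;

// Hypothetical illustration; names are not from the original benchmark.
class InPlaceUpdateSketch
{
    // Vectorized style: returns a new weight array, so every sample
    // allocates a K-element temporary that the garbage collector must reclaim.
    static double[] stepAllocating(double[] w, double[] x, int offset, double y, double rate)
    {
        double yHat = 0.0;
        for (int k = 0; k < w.length; k++)
        {
            yHat += w[k] * x[offset + k];
        }
        double[] wNew = new double[w.length];
        for (int k = 0; k < w.length; k++)
        {
            wNew[k] = w[k] + rate * (y - yHat) * x[offset + k];
        }
        return wNew;
    }

    // Devectorized style: same arithmetic, but w is overwritten in place
    // and nothing is allocated inside the hot loop.
    static void stepInPlace(double[] w, double[] x, int offset, double y, double rate)
    {
        double yHat = 0.0;
        for (int k = 0; k < w.length; k++)
        {
            yHat += w[k] * x[offset + k];
        }
        for (int k = 0; k < w.length; k++)
        {
            w[k] += rate * (y - yHat) * x[offset + k];
        }
    }

    public static void main(String[] args)
    {
        double[] x = {1.0, 2.0};   // one sample with K = 2 features
        double[] w = {0.0, 0.0};
        double[] wNew = stepAllocating(w, x, 0, 1.0, 0.5);
        stepInPlace(w, x, 0, 1.0, 0.5);
        // Both variants should produce the same weights.
        System.out.println(Arrays.toString(wNew) + " " + Arrays.toString(w));
    }
}
```

Both variants compute the same weights; the only difference is whether each step creates garbage. With N = 100000 and ITERATIONS = 100, the allocating style would create ten million short-lived arrays, which is the kind of pressure on the collector that the advice above is meant to avoid.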