Re: [math] Performance optimization in the vector classes

luc . maisonobe Mon, 02 Nov 2009 07:41:45 -0800

----- "Jake Mannix" <[email protected]> a écrit :

> Hey all,
> 
>   In digging through ArrayRealVector and OpenMapRealVector, while
> trying to
> draw up patches for MATH-312 and MATH-314, I found a number of
> performance (
> and other) issues:
> 
> 1) in add(double[]), subtract(double[]), and mapXXX methods, the idiom
> is:
> 
>   double[] v = new double[data.length];
>   for(int i=0; i<v.length; i++) {
>     v[i] = data[i] + otherVector[i]; // for example, for add()
>   }
>   return new ArrayRealVector(v);
> 
> if this were instead:
> 
>   double[] v = data.clone();
>   for(int i=0; i<v.length; i++) {
>     v[i] += otherVector[i];
>   }
>   return new ArrayRealVector(v, false);   // <- ack!  not passing
> false here
> forces a clone() call when you least need it!


This is the reason why the boolean is there.

> 
> The difference of doing only one array access inside the loop alone
> saved
> 30% on profiling tests on my laptop with 1M long double arrays.  I
> don't
> even want to think about the cost of making superflous array copies as
> well.

Fine, good catch. Let's implement it.

> 
> 2) In OpenMapRealVector, any case where two OpenMapRealVectors are
> combined
> together, iterating over the smaller should be preferred, but this is
> never
> checked for.

I think we discussed about similar things in the last few months, and simply 
forgot to do it.

> 
> 3) what's with the idiom of:
> 
>   try { return methodName( (ArrayRealVector) v); } catch
> (ClassCastException
> cce) {
> 
> instead of the standard
> 
>   if(v instanceof ArrayRealVector) { return methodName(
> (ArrayRealVector)
> v); } else {
> 
> Is there something I'm missing on why the former shows up all over
> the
> place?  Maybe this isn't a performance issue, but I just noticed it.

In the cast, the check for the class is done internally, and can be optimised 
by the compiler, particularly when this method is inlined. So this is an 
attempt to avoid one if and to let the compiler directly call the specific 
methods for ArrayRealVector which is much faster than the generic one. I 
understand it is a bit weird.

> 
> 4) ArrayRealVector#unitVector() calls norm() once, and then it calls
> it
> again.  That's clearly unnecessary.

You are right. the lat line should simply use the norm local variable which in 
fact was introduced just for this reason.

> Also, why the special case for
> ArithmeticExcption here, for the zero norm case?  Why not just let
> java just
> try to divide, and if it divides by zero, well, it'll throw an
> ArithmeticException itself...  seems like the whole method should just
> be
> implemented as " return mapDivide(norm()); "

No. A division by zero does not lead to ArithmeticException when dealing with 
double variables, it leads to a NaN being produced. ArithmeticException is 
triggered when an integer division by zero occurs.

> 
> I've fixed many of these in my patch on MATH-312, but there's another
> version of that I need to post once I correctly handle the sparse
> nonzero
> defaultValue case.   I also wasn't sure if you guys usually liked to
> separate your fixes into a bunch of little ones or not.  It's just
> hard for
> me to keep a dozen different patches in sync my end without
> conflicting with
> myself now and again.

We really prefer separate patches as we commit them separately.
This allows having a clear history in the svn repository.

Luc

> 
>   -jake

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [math] Performance optimization in the vector classes

Reply via email to