At Wed, 2 Jan 2008 15:44:40 +0100,
Riccardo Lucchese wrote:
> Anyway on calls like ddot this can be 2 times faster than now, and
> as in my tests simple functions like ddot don't take any advantage
> in using sse ecc.. (especially due to the calls overhead).
>
> Maybe I'm all wrong :) ? Any other ideas?

To do the equivalent of 'NOSIZECHECK' I would call the cblas_ routines
directly, instead of the gsl_blas routines.  This will eliminate the
size-checking overhead.

If function call overhead is still a problem (i.e. 'FORCEINLINE') then
your vectors must be quite small; otherwise the computation inside the
BLAS routine itself will dominate.  BLAS was really designed for large
vectors.

In the small vector case some equivalent inline functions could help.
James Bergstra wrote some sample inline BLAS routines a while back, but
I believe he decided they did not give enough of a performance
advantage unless the vector was extremely small (like a few elements).
I think there could be some cases (like DDOT, as you say) where it
could be worthwhile though.

It would make sense to write such functions as an alternative to any
cblas_ library.  With small vectors one is usually working with a fixed
rather than variable length, which is another distinction from the
usual BLAS routines.

-- 
Brian Gough

Network Theory Ltd,
Publishing Free Software Manuals --- http://www.network-theory.co.uk/

_______________________________________________
Help-gsl mailing list
http://lists.gnu.org/mailman/listinfo/help-gsl
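
For illustration, here is a minimal sketch of the kind of inline small-vector
routine discussed in the reply above.  It is not James Bergstra's code and not
part of GSL or any cblas_ library: the names small_ddot3, inline_ddot and
SMALL_N are hypothetical, and the strided version assumes non-negative
increments (the real cblas_ddot also accepts negative ones).

```c
#include <stddef.h>

/* Hypothetical fixed length, known at compile time so the compiler
   can fully unroll the loop. */
#define SMALL_N 3

/* Hypothetical helper: dot product of two vectors of fixed length SMALL_N. */
static inline double small_ddot3(const double x[SMALL_N],
                                 const double y[SMALL_N])
{
    double sum = 0.0;
    for (size_t i = 0; i < SMALL_N; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Hypothetical helper: variable-length, strided dot product whose
   arguments mirror those of cblas_ddot (n, x, incx, y, incy).
   Assumes incx and incy are non-negative element strides. */
static inline double inline_ddot(size_t n,
                                 const double *x, size_t incx,
                                 const double *y, size_t incy)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i * incx] * y[i * incy];
    return sum;
}
```

For comparison, skipping the size check without any inlining means calling the
CBLAS routine on the raw storage of the gsl_vector arguments, along the lines of
cblas_ddot(x->size, x->data, x->stride, y->data, y->stride), instead of
gsl_blas_ddot(x, y, &result).  Whether either variant is measurably faster
depends on the vector length and the compiler, so it is worth timing on the
actual workload.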
