On Tue, Jan 3, 2012 at 17:48, Barry Smith <bsmith at mcs.anl.gov> wrote:
> Yes the Blas norm is often a good bit (much) slower than the Blas dot for
> the reason Jack points out. This is a very real measurable result using
> blas obtained from the Fortran reference that has not been optimized (by
> taking out the stability crap)

It seems silly to optimize for the reference BLAS. If the concern is just
this routine and just on x86-64, I would be inclined to write a simple
vectorized implementation (probably using SSE intrinsics) that still
includes the stability stuff. Whatever the case, I'm not a fan of replacing
nrm2() with dot().
