On Tue, Jan 3, 2012 at 17:48, Barry Smith <bsmith at mcs.anl.gov> wrote:

> Yes the Blas norm is often a good bit (much) slower than the Blas dot for
> the reason Jack points out. This is a very real measurable result using
> blas obtained from the Fortran reference that has not been optimized (by
> taking out the stability crap)


It seems silly to optimize for the reference BLAS. If the concern is just
this routine and just on x86-64, I would be inclined to write a simple
vectorized implementation (probably using SSE intrinsics) that still
includes the stability stuff.

Whatever the case, I'm not a fan of replacing nrm2() with dot().