It is possible that the BLAS dot could be faster than the BLAS nrm2, but I am skeptical. The reason to prefer nrm2 is that the result of dnrm2 on a vector u is more stable than the square root of the inner product of u with itself via ddot: dnrm2 scales the intermediate squared terms, so they do not overflow or underflow when the entries of u are very large or very small: http://www.netlib.org/blas/dnrm2.f
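To make the scaling argument concrete, here is a minimal sketch in plain Python (no BLAS is called). The names naive_norm and scaled_norm are hypothetical, and scaled_norm only imitates the running scale / sum-of-squares update used by the classic reference dnrm2; it is not the library routine itself.

```python
import math

def naive_norm(u):
    # Square root of the inner product of u with itself,
    # i.e. what sqrt(ddot(n, u, 1, u, 1)) computes.
    return math.sqrt(sum(x * x for x in u))

def scaled_norm(u):
    # Keep a running scale (largest |x| seen so far) and a sum of
    # squares of the entries divided by that scale. Assumption: this
    # mirrors the scale/ssq loop of the classic reference dnrm2.
    scale = 0.0
    ssq = 1.0
    for x in u:
        if x != 0.0:
            a = abs(x)
            if scale < a:
                # New largest entry: rescale the accumulated sum.
                ssq = 1.0 + ssq * (scale / a) ** 2
                scale = a
            else:
                ssq += (a / scale) ** 2
    return scale * math.sqrt(ssq)

if __name__ == "__main__":
    big = [1e200, 1e200]      # squares overflow in double precision
    tiny = [1e-200, 1e-200]   # squares underflow to zero
    print(naive_norm(big), scaled_norm(big))
    print(naive_norm(tiny), scaled_norm(tiny))
```

With entries near 1e200 the naive version overflows to infinity, and with entries near 1e-200 it underflows to zero, while the scaled version returns a finite, nonzero result in both cases. The price is a division and a branch per entry, which is the extra work that ddot gets to skip.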
Thus, if you don't care about accuracy, then it is _possible_ that ddot would be faster, but I doubt it, and it is likely a bad idea to give up some stability.

Jack

On Tue, Jan 3, 2012 at 4:33 PM, Jed Brown <jedbrown at mcs.anl.gov> wrote:

> http://petsc.cs.iit.edu/petsc/petsc-dev/rev/a8a483b98169
>
> This baffles me. I can think of no good reason for this, which gives me
> the impression that we are optimizing for an implementation quirk. If you
> have evidence that the performance of BLAS dot() is better than nrm2()
> across platforms and implementations, then we are witnessing a major
> implementation failure and people need to be shamed.
>
> Aliasing is also *explicitly disallowed* by Fortran, so the result of
>
> BLASdot_(&bn,xx,&one,xx,&one);
>
> is not defined.
