On Tue, Jan 3, 2012 at 21:37, Barry Smith <bsmith at mcs.anl.gov> wrote:
> Come on, 95% of all Fortran users wouldn't even understand the above
> sentence.

Has anyone tried just unrolling the loop four times in C or Fortran, with a separate "counter" for each stripe? The reference implementation will force this to be totally sequential. All we have to do is hit the memory bandwidth limit, which should be pretty easy.

Did you have a stand-alone benchmark or were you just measuring with -log_summary?
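To make the unrolling suggestion concrete, here is a rough sketch in C. The message does not say which loop is meant, so the kernel (a plain sum of squares) and the name sumsq_unrolled4 are assumptions chosen only for illustration. The point is that each of the four stripes gets its own accumulator, so the additions no longer form a single dependency chain.

```c
#include <stddef.h>

/* Hypothetical kernel: sum of squares of x[0..n-1], unrolled four times.
 * Each stripe (i, i+1, i+2, i+3) accumulates into its own variable, so
 * the four running sums are independent of one another. */
double sumsq_unrolled4(const double *x, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    /* Main loop: four independent accumulators, one per stripe. */
    for (; i + 4 <= n; i += 4) {
        s0 += x[i]     * x[i];
        s1 += x[i + 1] * x[i + 1];
        s2 += x[i + 2] * x[i + 2];
        s3 += x[i + 3] * x[i + 3];
    }

    /* Remainder: the last n % 4 entries go into the first accumulator. */
    for (; i < n; i++) {
        s0 += x[i] * x[i];
    }

    /* Combine the partial sums once, after the loop. */
    return (s0 + s1) + (s2 + s3);
}
```

Note that this changes the order of the floating-point additions, so the result can differ in the last bits from a strictly sequential loop. Whether it actually reaches the memory bandwidth limit would have to be measured.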
