The sparse matrix is MUCH too big for the cache, so it has to be streamed
from memory on every pass; hence the huge number of "cache misses". The
kernels are limited by memory bandwidth, not by the floating-point rate.
This same performance issue occurs on all modern systems.
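
To put a rough number on this, here is a minimal sketch of a bandwidth-bound
estimate for a sparse matrix-vector product in compressed-row storage. It
assumes 8-byte values and 4-byte column indices, counts only the matrix
traffic (row pointers and vector entries are ignored), and uses a purely
hypothetical sustained bandwidth of 1.2 GB/s; none of these figures were
measured on the Altix. The function name is made up for illustration.

```python
def csr_matvec_flops_bound(bandwidth_bytes_per_s, value_bytes=8, index_bytes=4):
    """Upper bound on flop/s for a CSR mat-vec that streams the matrix from memory.

    Assumption: each stored nonzero costs one value plus one column index of
    memory traffic and contributes one multiply and one add.
    """
    flops_per_nonzero = 2
    bytes_per_nonzero = value_bytes + index_bytes
    return bandwidth_bytes_per_s * flops_per_nonzero / bytes_per_nonzero


# Hypothetical sustained memory bandwidth, NOT a measured value.
assumed_bandwidth = 1.2e9  # bytes per second
bound = csr_matvec_flops_bound(assumed_bandwidth)
print(f"{bound / 1e6:.0f} MFLOP/s")
```

Under these assumptions the bound is 2 flops per 12 bytes, so the achievable
rate scales with the memory bandwidth and is independent of the processor's
peak flop rate. Measuring the real sustained bandwidth (for example with a
STREAM-type test) and plugging it in gives a more realistic ceiling.
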
Barry
On Oct 13, 2008, at 7:12 AM, Christoph Statz wrote:
> Dear PETSc-users,
>
> I'm trying to work with PETSc on a ccNUMA system, where I am
> confronted with severe performance problems.
> Is anyone using PETSc on, e.g., an SGI Altix system?
> Which are the best kernels to use on cache-coherent systems?
> The Fortran kernels produce many cache misses (in functions like
> fsolve and fmatmul), slowing down a 3 GFLOP/s machine to about
> 200 MFLOP/s.
> Does anyone have advice on increasing speed on a ccNUMA system?
>
> Sincerely,
>
> Christoph Statz
>
> --
> Christoph Statz
>
> Institut für Nachrichtentechnik
> Technische Universität Dresden
> 01062 Dresden
>
> Email: christoph.statz at mailbox.tu-dresden.de
> Phone: +49 351 463 32287
>