On Mon, Jul 21, 2008 at 4:35 PM, Anders Logg <[EMAIL PROTECTED]> wrote:
> On Mon, Jul 21, 2008 at 04:03:11PM -0500, Matthew Knepley wrote:
>> On Mon, Jul 21, 2008 at 3:55 PM, Matthew Knepley <[EMAIL PROTECTED]> wrote:
>> > On Mon, Jul 21, 2008 at 3:50 PM, Garth N. Wells <[EMAIL PROTECTED]> wrote:
>> >> Anders Logg wrote:
>> >>> On Mon, Jul 21, 2008 at 01:48:23PM +0100, Garth N. Wells wrote:
>> >>>> Anders Logg wrote:
>> >>>>> I have updated the assembly benchmark to also include MTL4, see
>> >>>>>
>> >>>>>     bench/fem/assembly/
>> >>>>>
>> >>>>> Here are the current results:
>> >>>>>
>> >>>>> Assembly benchmark | Elasticity3D  PoissonP1  PoissonP2  PoissonP3  THStokes2D  NSEMomentum3D  StabStokes2D
>> >>>>> ---------------------------------------------------------------------------------------------------------
>> >>>>> uBLAS              |       9.0789    0.45645     3.8042     8.0736      14.937         9.2507        3.8455
>> >>>>> PETSc              |       7.7758    0.42798     3.5483     7.3898      13.945         8.1632        3.258
>> >>>>> Epetra             |       8.9516    0.45448     3.7976     8.0679      15.404         9.2341        3.8332
>> >>>>> MTL4               |       8.9729    0.45554     3.7966     8.0759      14.94          9.2568        3.8658
>> >>>>> Assembly           |       7.474     0.43673     3.7341     8.3793      14.633         7.6695        3.3878
>> >>
>> >> I specified in MTL4Matrix a maximum of 30 nonzeros per row, and the results change quite a bit:
>> >>
>> >> Assembly benchmark | Elasticity3D  PoissonP1  PoissonP2  PoissonP3  THStokes2D  NSEMomentum3D  StabStokes2D
>> >> ---------------------------------------------------------------------------------------------------------
>> >> uBLAS              |       7.1881    0.32748     2.7633     5.8311      10.968         7.0735        2.8184
>> >> PETSc              |       5.7868    0.30673     2.5489     5.2344       9.8896        6.069         2.3661
>> >> MTL4               |       2.8641    0.18339     1.6628     2.6811       2.8519        3.4843        0.85029
>> >> Assembly           |       5.5564    0.30896     2.6858     5.9675      10.622         5.7144        2.4519
>> >>
>> >> MTL4 is a lot faster in all cases.
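[Editor's note: the speedup above comes from telling MTL4Matrix an upper bound on nonzeros per row, so insertion during assembly never has to grow the sparsity structure. The following is a minimal sketch of that idea in Python; it is my own illustration of row-wise preallocation, not MTL4's actual implementation, and the class name is hypothetical.]

```python
# Sketch (not MTL4 code): a sparse matrix with a fixed number of
# preallocated slots per row, mirroring the "max 30 nonzeros per row"
# hint given to MTL4Matrix. Insertion accumulates in place and never
# reallocates as long as each row stays within its capacity.

class PreallocatedSparseMatrix:
    def __init__(self, nrows, nnz_per_row):
        self.nnz_per_row = nnz_per_row
        # Fixed-size column/value slots per row; -1 marks an empty slot.
        self.cols = [[-1] * nnz_per_row for _ in range(nrows)]
        self.vals = [[0.0] * nnz_per_row for _ in range(nrows)]

    def add(self, i, j, value):
        """Accumulate into A[i, j], assembly-style (+=)."""
        row_cols, row_vals = self.cols[i], self.vals[i]
        for k in range(self.nnz_per_row):
            if row_cols[k] == j:      # existing entry: accumulate
                row_vals[k] += value
                return
            if row_cols[k] == -1:     # first free slot: create entry
                row_cols[k] = j
                row_vals[k] = value
                return
        raise ValueError("row %d exceeds %d preallocated slots" % (i, self.nnz_per_row))

    def get(self, i, j):
        for k in range(self.nnz_per_row):
            if self.cols[i][k] == j:
                return self.vals[i][k]
        return 0.0

# Element-by-element assembly hits the same entry repeatedly, so the
# accumulate path dominates and no memory management is needed.
A = PreallocatedSparseMatrix(nrows=4, nnz_per_row=3)
A.add(0, 0, 1.0)
A.add(0, 0, 2.0)
assert A.get(0, 0) == 3.0
```

The same principle is why PETSc asks for preallocation hints (e.g. via MatSeqAIJSetPreallocation): insertion into an unpreallocated row forces a costly reallocation.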
>>
>> Okay, if you run KSP ex2 (Poisson 2D) and add a logging stage that times
>> assembly (I checked it in to petsc-dev), then 1M unknowns takes about 1s:
>>
>> Matrix Object:
>>   type=seqaij, rows=1000000, cols=1000000
>>   total: nonzeros=4996000, allocated nonzeros=5000000
>>     not using I-node routines
>>
>> Summary of Stages:  ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
>>                       Avg     %Total    Avg      %Total   counts  %Total      Avg        %Total    counts  %Total
>>  0:  Main Stage: 1.4997e+00  56.3%  3.8891e+08 100.0%  0.000e+00   0.0%  0.000e+00       0.0%  2.200e+01  51.2%
>>  1:    Assembly: 1.1648e+00  43.7%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00       0.0%  0.000e+00   0.0%
>>
>> I just cut the solve off. Thus all those numbers are extremely fishy.
>>
>>    Matt
>
> We shouldn't trust those numbers just yet. Some of it may be Python
> overhead (calling the FFC JIT compiler etc).
>
> Does 1M unknowns mean a unit square divided into 2x1000x1000 right
> triangles?
It's FD Poisson, which gives the same sparsity and values as P1 Poisson, so it's a 1000x1000 quadrilateral grid. This was just to time insertion.

   Matt

> --
> Anders
>
> _______________________________________________
> DOLFIN-dev mailing list
> [email protected]
> http://www.fenics.org/mailman/listinfo/dolfin-dev

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener
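[Editor's note: the nonzero count in the PETSc log above is consistent with a 5-point FD stencil on a 1000x1000 grid. A quick check of that arithmetic (my own, not from the thread): every grid point contributes a diagonal entry, plus one off-diagonal entry for each of its horizontal or vertical neighbors.]

```python
# Nonzero count of a 5-point Laplacian on an n-by-n grid: n*n diagonal
# entries, plus one off-diagonal entry per ordered pair of adjacent
# grid points (2 directions, each with n*(n-1) unordered pairs).
def five_point_nnz(n):
    diagonal = n * n
    off_diagonal = 2 * 2 * n * (n - 1)
    return diagonal + off_diagonal

print(five_point_nnz(1000))  # -> 4996000, matching "nonzeros=4996000"
```

The "allocated nonzeros=5000000" line is the simple upper bound of 5 entries for every row, which slightly overcounts because boundary rows have fewer neighbors.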
