[EMAIL PROTECTED] wrote:
> Sounds amazing!
>
> I'd like to see that code, although I cannot promise you too
> much response during my holiday, which is starting tomorrow.
>
> Have you compared matrix vector product with vector products using uBlas
> or PETSc?
Will do.

More good news: After some discussion about insertion operations on the MTL list, Peter Gottschling (the lead developer of MTL4) implemented some optimizations based on some other benchmarks I ran. For insertion into very sparse matrices (like Poisson) I got a further 30% speedup.

I could put the experimental MTL4 stuff in the sandbox. Does that sound good? I'm going on vacation too.

/Dag

>
> Kent
>
>
>> Hello!
>>
>> In light of the long and interesting discussion we had a while ago about
>> assembler performance I decided to try to squeeze more out of the uBlas
>> backend. This was not very successful.
>>
>> However, I've been following the development of MTL4
>> (http://www.osl.iu.edu/research/mtl/mtl4/) with a keen eye on the
>> interesting insertion scheme they provide. I implemented a backend --
>> without sparsity pattern computation -- for the dolfin assembler and here
>> are some first benchmark results:
>>
>> Incomp Navier Stokes on 50x50x50 unit cube
>>
>> MTL --------------------------------------------------------
>> assembly time: 8.510000
>> reassembly time: 6.750000
>> vector assembly time: 6.070000
>>
>> memory: 230 mb
>>
>> UBLAS ------------------------------------------------------
>> assembly time: 23.030000
>> reassembly time: 12.140000
>> vector assembly time: 6.030000
>>
>> memory: 642 mb
>>
>> Poisson on 2000x2000 unit square
>>
>> MTL --------------------------------------------------------
>> assembly time: 9.520000
>> reassembly time: 6.650000
>> assembly time: 4.730000
>> vector linear solve: 0.000000
>>
>> memory: 452 mb
>>
>> UBLAS ------------------------------------------------------
>> assembly time: 15.400000
>> reassembly time: 7.520000
>> vector assembly time: 5.020000
>>
>> memory: 1169 mb
>>
>> Conclusions? MTL is more than twice as fast and allocates less than half
>> the memory (since there is no sparsity pattern computation) across a set
>> of forms I've tested.
>>
>> The code is not perfectly done yet, but I'd still be happy to share it
>> with whoever wants to mess around with it.
>>
>> Cheers!
>>
>> /Dag
>>
>> _______________________________________________
>> DOLFIN-dev mailing list
>> [email protected]
>> http://www.fenics.org/mailman/listinfo/dolfin-dev
>>
>
>
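For anyone who wants to check the quoted figures against the conclusion, here is a small Python sketch that computes the uBLAS/MTL ratios for the two benchmarks above. It is not part of the DOLFIN or MTL4 code; the dictionary layout and the names `BENCHMARKS`, `ublas_over_mtl` and `RATIOS` are just illustrative, and the numbers are copied from the output as printed.

```python
# Matrix assembly/reassembly times and memory use, copied from the quoted
# benchmark output (units as printed there). Names are illustrative only.
BENCHMARKS = {
    "navier_stokes": {
        "mtl": {"assembly": 8.51, "reassembly": 6.75, "memory": 230},
        "ublas": {"assembly": 23.03, "reassembly": 12.14, "memory": 642},
    },
    "poisson": {
        "mtl": {"assembly": 9.52, "reassembly": 6.65, "memory": 452},
        "ublas": {"assembly": 15.40, "reassembly": 7.52, "memory": 1169},
    },
}


def ublas_over_mtl(results):
    """Return the uBLAS/MTL ratio for every quantity (values > 1 favour MTL)."""
    return {key: results["ublas"][key] / results["mtl"][key]
            for key in results["mtl"]}


RATIOS = {name: ublas_over_mtl(results) for name, results in BENCHMARKS.items()}

if __name__ == "__main__":
    for name, ratio in RATIOS.items():
        for key, value in ratio.items():
            print(f"{name:14s} {key:10s} uBLAS/MTL = {value:.2f}")
```

On these two runs the matrix assembly ratio comes out at about 2.7 for Navier-Stokes and about 1.6 for Poisson, while the memory ratio is above 2.5 in both cases.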
