Hello! In light of the long and interesting discussion we had a while ago about assembler performance, I decided to try to squeeze more performance out of the uBlas backend. This was not very successful.
However, I've been following the development of MTL4 (http://www.osl.iu.edu/research/mtl/mtl4/) with a keen eye on the interesting insertion scheme it provides. I implemented a backend -- without sparsity pattern computation -- for the DOLFIN assembler, and here are some first benchmark results:

Incompressible Navier-Stokes on 50x50x50 unit cube

MTL
--------------------------------------------------------
assembly time:        8.510000
reassembly time:      6.750000
vector assembly time: 6.070000
memory:               230 MB

UBLAS
------------------------------------------------------
assembly time:        23.030000
reassembly time:      12.140000
vector assembly time: 6.030000
memory:               642 MB

Poisson on 2000x2000 unit square

MTL
--------------------------------------------------------
assembly time:        9.520000
reassembly time:      6.650000
vector assembly time: 4.730000
linear solve:         0.000000
memory:               452 MB

UBLAS
------------------------------------------------------
assembly time:        15.400000
reassembly time:      7.520000
vector assembly time: 5.020000
memory:               1169 MB

Conclusions? Across the set of forms I've tested, MTL is more than twice as fast and allocates less than half the memory (since there is no sparsity pattern computation). The code is not perfectly done yet, but I'd still be happy to share it with whoever wants to mess around with it.

Cheers!

/Dag

_______________________________________________
DOLFIN-dev mailing list
[email protected]
http://www.fenics.org/mailman/listinfo/dolfin-dev
