I'll check. It's just really painful to rebuild with ufc changes... Is it really necessary to rebuild all of dolfin after ufc changes? The dolfin build system is not really doing its job in this situation.
Martin

On 9 May 2014 21:55, Anders Logg <[email protected]> wrote:

> On Fri, May 09, 2014 at 03:27:20PM +0200, Martin Sandve Alnæs wrote:
> > Hi all,
> > I've implemented selective local evaluation of coefficient functions in the
> > assembler depending on which functions each integral depends on. It's currently
> > in branches called
> > martinal/topic-add-enabled-coefficients-per-integral
> > in ufl, ffc and dolfin (must be used together).
> > Note that this changes ufc interface so everything must be recompiled.
> >
> > To show the performance improvement, here's a simple benchmark script,
> > assembling two forms (called a and b) that depend on one and two coefficients
> > (f and (f and g) respectively) but yield the exact same integral and assembly
> > result when assembled without any subdomains (the dx(1) term in form b is never
> > executed). Each form is assembled twice for semi-robust timing and I first ran
> > the script to keep the jit out of the picture. (Performance numbers below the
> > code).
> >
> >
> > from dolfin import *
> > import time
> >
> > n = 60
> > mesh = UnitCubeMesh(n, n, n)
> > V = FunctionSpace(mesh, "Lagrange", 1)
> > f = Function(V)
> > g = Function(V)
> >
> > a = f*dx()
> > b = f*dx() + g*dx(1)
> >
> > t1 = time.time()
> > A1 = assemble(a)
> > t2 = time.time()
> > A2 = assemble(a)
> > t3 = time.time()
> >
> > print "A1:", (t2-t1)
> > print "A2:", (t3-t2)
> >
> > t1 = time.time()
> > B1 = assemble(b)
> > t2 = time.time()
> > B2 = assemble(b)
> > t3 = time.time()
> >
> > print "B1:", (t2-t1)
> > print "B2:", (t3-t2)
> >
> >
> > Resulting time to assemble with current master:
> >
> > A1: 0.467525005341
> > A2: 0.465034008026
> > B1: 0.882906198502
> > B2: 0.830652952194
> >
> > Note how the additional coefficient in form b gives very significant overhead
> > for this simple functional even though it's never used in the computations.
> >
> > The time to assemble with the new branches:
> >
> > A1: 0.531542062759
> > A2: 0.530611991882
> > B1: 0.540424108505
> > B2: 0.535769939423
> >
> > Note two things:
> > - The performance is a bit lower for the simple case. It might be possible to
> > optimize this.
> > - The performance is the same for both cases, significantly faster for form b
> > because the function g is never restricted.
> >
> >
> > The cases that will benefit from this feature performance wise are forms with
> > two or more integrals involving different coefficients.
> >
> > The cases that will have a small regression performance wise are forms with
> > only one integral, with no coefficients, or where all integrals use the same
> > coefficients. The relative performance regression is most noticeable for simple
> > forms such as mass and stiffness matrices.
> >
> > There are multiple future features that depend on this functionality:
> > - it allows for functions that cannot be evaluated everywhere to be called only
> > in their valid domain (examples are functions only living on subdomains, a
> > partially overlapping mesh, or the boundary).
> > - possible refactoring of preprocessing in ufl to reduce the amount of symbolic
> > processing done for forms that are already in the jit cache.
> >
> > The functionality is obviously highly beneficial, so is it ok if I push it now
> > even with the performance regression for simple forms?
>
> Could you first check what the performance regression is (if any) for
> assembling a standard right-hand side vector f*dx and Poisson
> stiffness matrix?
>
> Perhaps this is only noticeable for functionals.
>
> --
> Anders
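
For reference, the relative changes implied by the timings quoted above can be worked out with a short script. This is only a sketch: the numbers are copied verbatim from the quoted benchmark output, while the helper names (`mean`, `relative_change`) and the choice to average the two runs of each form are assumptions made here for illustration, not part of the original benchmark.

```python
# Timings in seconds, copied from the quoted benchmark output.
master = {"a": [0.467525005341, 0.465034008026],
          "b": [0.882906198502, 0.830652952194]}
branch = {"a": [0.531542062759, 0.530611991882],
          "b": [0.540424108505, 0.535769939423]}


def mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def relative_change(old, new):
    """Hypothetical helper: (mean(new) - mean(old)) / mean(old).

    A positive value means the new timings are slower.
    """
    return (mean(new) - mean(old)) / mean(old)


# Relative change in assembly time per form, master -> new branches.
changes = {form: relative_change(master[form], branch[form])
           for form in master}

for form in sorted(changes):
    print("form %s: %+.1f%% assembly time" % (form, 100.0 * changes[form]))
```

With these numbers, form a comes out roughly 14% slower on the new branches, and form b takes roughly 37% less time.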
_______________________________________________
fenics mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics
