Great, thanks for checking.

--
Anders

On Mon, May 12, 2014 at 11:06:20AM +0200, Martin Sandve Alnæs wrote:
> The functional here is the same as in the previous email, just on a newer
> faster laptop. On this computer I don't see any slowdown for the functional
> either.
>
> Functional (a=f*dx, b=f*dx+g*dx(1)):
> Before:
>   A: 0.261495113373
>   B: 0.431048870087
> After:
>   A: 0.251796007156
>   B: 0.249910116196
>
> Linear form (a=f*v*dx, b=f*v*dx+g*v*dx(1)):
> Before:
>   A: 0.302114009857
>   B: 0.478947877884
> After:
>   A: 0.292839050293
>   B: 0.29376912117
>
> Bilinear form (a=f*v*u*dx, b=f*v*u*dx+g*v*u*dx(1)):
> Before:
>   A: 0.665670156479
>   B: 0.849056959152
> After:
>   A: 0.648552894592
>   B: 0.685650110245
>
> I'll go ahead with merging then.
>
> Martin
>
>
> On 12 May 2014 10:27, Martin Sandve Alnæs <[email protected]> wrote:
>
> I'll check. It's just really painful to rebuild with ufc changes... Is it
> really necessary to rebuild all of dolfin after ufc changes? The dolfin
> build system is not really doing its job in this situation.
>
> Martin
>
>
> On 9 May 2014 21:55, Anders Logg <[email protected]> wrote:
>
> On Fri, May 09, 2014 at 03:27:20PM +0200, Martin Sandve Alnæs wrote:
> > Hi all,
> > I've implemented selective local evaluation of coefficient functions in
> > the assembler depending on which functions each integral depends on.
> > It's currently in branches called
> > martinal/topic-add-enabled-coefficients-per-integral
> > in ufl, ffc and dolfin (must be used together).
> > Note that this changes the ufc interface so everything must be
> > recompiled.
> >
> > To show the performance improvement, here's a simple benchmark script,
> > assembling two forms (called a and b) that depend on one and two
> > coefficients (f and (f and g) respectively) but yield the exact same
> > integral and assembly result when assembled without any subdomains (the
> > dx(1) term in form b is never executed). Each form is assembled twice
> > for semi-robust timing and I first ran the script to keep the jit out of
> > the picture. (Performance numbers below the code).
> >
> >
> > from dolfin import *
> > import time
> >
> > n = 60
> > mesh = UnitCubeMesh(n, n, n)
> > V = FunctionSpace(mesh, "Lagrange", 1)
> > f = Function(V)
> > g = Function(V)
> >
> > a = f*dx()
> > b = f*dx() + g*dx(1)
> >
> > t1 = time.time()
> > A1 = assemble(a)
> > t2 = time.time()
> > A2 = assemble(a)
> > t3 = time.time()
> >
> > print "A1:", (t2-t1)
> > print "A2:", (t3-t2)
> >
> > t1 = time.time()
> > B1 = assemble(b)
> > t2 = time.time()
> > B2 = assemble(b)
> > t3 = time.time()
> >
> > print "B1:", (t2-t1)
> > print "B2:", (t3-t2)
> >
> >
> > Resulting time to assemble with current master:
> >
> > A1: 0.467525005341
> > A2: 0.465034008026
> > B1: 0.882906198502
> > B2: 0.830652952194
> >
> > Note how the additional coefficient in form b gives very significant
> > overhead for this simple functional even though it's never used in the
> > computations.
> >
> > The time to assemble with the new branches:
> >
> > A1: 0.531542062759
> > A2: 0.530611991882
> > B1: 0.540424108505
> > B2: 0.535769939423
> >
> > Note two things:
> > - The performance is a bit lower for the simple case. It might be
> >   possible to optimize this.
> > - The performance is the same for both cases, significantly faster for
> >   form b because the function g is never restricted.
> >
> >
> > The cases that will benefit from this feature performance wise are forms
> > with two or more integrals involving different coefficients.
> >
> > The cases that will have a small regression performance wise are forms
> > with only one integral, with no coefficients, or where all integrals use
> > the same coefficients. The relative performance regression is most
> > noticeable for simple forms such as mass and stiffness matrices.
> >
> > There are multiple future features that depend on this functionality:
> > - it allows for functions that cannot be evaluated everywhere to be
> >   called only in their valid domain (examples are functions only living
> >   on subdomains, a partially overlapping mesh, or the boundary).
> > - possible refactoring of preprocessing in ufl to reduce the amount of
> >   symbolic processing done for forms that are already in the jit cache.
> >
> > The functionality is obviously highly beneficial, so is it ok if I push
> > it now even with the performance regression for simple forms?
>
> Could you first check what the performance regression is (if any) for
> assembling a standard right-hand side vector f*dx and Poisson
> stiffness matrix?
>
> Perhaps this is only noticeable for functionals.

_______________________________________________
fenics mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics
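For reference, the relative effect of the change can be read off the
Before/After numbers quoted in this thread. The following is only a minimal
sketch: the timings are copied from the 12 May message, while the dictionary
layout and the helper name percent_change are made up for illustration.

```python
# Timings in seconds, copied from the Before/After numbers in this thread.
# Layout (an assumption for this sketch): form type -> form -> (before, after).
reported = {
    "functional": {"A": (0.261495113373, 0.251796007156),
                   "B": (0.431048870087, 0.249910116196)},
    "linear":     {"A": (0.302114009857, 0.292839050293),
                   "B": (0.478947877884, 0.29376912117)},
    "bilinear":   {"A": (0.665670156479, 0.648552894592),
                   "B": (0.849056959152, 0.685650110245)},
}


def percent_change(before, after):
    """Relative change in percent; a negative value means 'after' is faster."""
    return 100.0 * (after - before) / before


for form_type in ("functional", "linear", "bilinear"):
    for label in ("A", "B"):
        before, after = reported[form_type][label]
        change = percent_change(before, after)
        print("%-10s %s: %+6.1f%%" % (form_type, label, change))
```

With these numbers, form b gets roughly 42% (functional), 39% (linear form)
and 19% (bilinear form) faster, while form a changes by only a few percent.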

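The benchmark script above times each assemble call by hand after a separate
first run that keeps the jit out of the picture. A minimal sketch of the same
idea as a reusable helper follows. The name time_calls and the stand-in
workload are hypothetical; in the actual benchmark the callable would be
something like lambda: assemble(a), which needs a working dolfin install.

```python
import time


def time_calls(func, repeats=2, warmup=1):
    """Call func `warmup` times untimed, then return `repeats` timings in seconds."""
    for _ in range(warmup):
        func()  # untimed, e.g. to trigger JIT compilation or fill caches
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return samples


def workload():
    # Stand-in for an expensive call such as assembling a form (assumption).
    return sum(i * i for i in range(100000))


samples = time_calls(workload, repeats=2)
for i, seconds in enumerate(samples, start=1):
    print("run %d: %.6f s" % (i, seconds))
```

Reporting every sample instead of an average makes it easy to spot a run
that was disturbed by something else on the machine.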