I think we'll try this soon. The idea is that if you have 1000 variables that all have multiple components (say, higher-order monomials on elements), then we would like the same component of every variable to be stored together, so that we can multiply each scalar shape function by its 1000 components and have the sum vectorize. As it is now, you get all the components for one variable, then the next, then the next... so there isn't an efficient way to multiply phi_0 by comp_0, then phi_1 by comp_1, across all variables.
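
To make the memory-layout argument concrete, here is a minimal sketch of the two orderings written out in the quoted message further down. The function names and signatures are hypothetical illustrations, not libMesh API; only the two index formulas come from the thread, and the accumulation loop is just one assumed shape the "phi_i times all dofs for i" operation could take.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration only; these are NOT libMesh functions.

// Current ordering from the thread: id = base + var_in_vg*ncomp + comp.
// Component `comp` of consecutive variables sits `ncomp` entries apart.
inline std::size_t dof_var_major(std::size_t base, std::size_t var_in_vg,
                                 std::size_t comp, std::size_t n_var_in_vg,
                                 std::size_t ncomp)
{
  assert(var_in_vg < n_var_in_vg && comp < ncomp);
  return base + var_in_vg * ncomp + comp;
}

// Proposed ordering from the thread: id = base + comp*n_var_in_vg + var_in_vg.
// Component `comp` of consecutive variables is contiguous (unit stride).
inline std::size_t dof_comp_major(std::size_t base, std::size_t var_in_vg,
                                  std::size_t comp, std::size_t n_var_in_vg,
                                  std::size_t ncomp)
{
  assert(var_in_vg < n_var_in_vg && comp < ncomp);
  return base + comp * n_var_in_vg + var_in_vg;
}

// Assumed form of the vector operation: for one shape function value `phi`
// belonging to component `comp`, add phi * coefficient into u[v] for every
// variable v in the group. With the proposed ordering the coefficients are
// one contiguous block, so the loop is a plain unit-stride multiply-add.
inline void accumulate_comp_major(const std::vector<double>& dofs,
                                  std::size_t base, std::size_t comp,
                                  std::size_t n_var_in_vg, double phi,
                                  std::vector<double>& u)
{
  assert(u.size() >= n_var_in_vg);
  assert(base + (comp + 1) * n_var_in_vg <= dofs.size());
  const double* block = dofs.data() + base + comp * n_var_in_vg;
  for (std::size_t v = 0; v < n_var_in_vg; ++v)
    u[v] += phi * block[v];
}
```

With 3 variables of 2 components each, the current ordering places component 1 of variables 0, 1, 2 at offsets 1, 3, 5, while the proposed ordering places them at offsets 3, 4, 5. Whether a compiler actually vectorizes the loop, and how much that helps assembly overall, is something only measurement can answer.
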
I'll open a ticket so we don't collectively forget :-)

Thanks,
Derek

On Fri, Apr 26, 2019 at 8:09 AM Stogner, Roy H <royst...@ices.utexas.edu> wrote:

> On Thu, 25 Apr 2019, Derek Gaston wrote:
>
> > This is an email from 3 years ago... no one responded :-)
>
> Man, and I was just starting to feel proud of myself for starting to
> catch up on *months*-old issues...
>
> > This is coming up again because we're looking at "array variables"
> > again... and this would be a large optimization.
>
> Are you sure? IIRC one of Paul's students experimented years ago with
> what I had *thought* would be the lowest-hanging fruit on
> vectorization, switching the order of the shape-function and
> quadrature-point indices in our FE element-local arrays, but then
> reported only minimal speedup on assembly: single-digit percentages,
> not double.
>
> > Any comments?
>
> Well, there's certainly no harm in trying. I'm done digging about in
> DofObject for the #2095 changes, and those were actually surprisingly
> orthogonal to the dof_number code anyway, so even if I suddenly hanker
> to finish #1438 it might not step on any toes.
>
> We currently have a ton of code that assumes dof_number is sorted
> first by owning processor_id, but other than that we're flexible (e.g.
> variable vs node sorting) and we should be able to become more
> flexible still without breaking anything.
>
> > I'm working on some low-level optimization stuff... and one of the
> > things I want to do is more vectorization when computing the value
> > of variables and when computing residuals, etc. I'm using the
> > variable groups stuff to be able to do large vector operations. To
> > that end... I think that the current choice for dof-ordering within
> > variable groups could be changed to be more amenable to
> > vectorization.
> >
> > Currently DofObject uses dof numbering based on this ordering for
> > variable groups:
> >
> > id = base + var_in_vg*ncomp + comp
> >
> > The problem with this is that I would like to do a vector operation
> > that is like this:
> >
> > phi_i * all_dofs_in_var_group_corresponding_to_i
> >
> > With any FE types that have more than one component the above
> > ordering means that the dofs corresponding to that shape function
> > are spread out in memory (i.e. they're NOT contiguous) and that
> > would preclude vectorization of the above operation.
>
> So you're doing operations directly on the DoFs, not on evaluations at
> quadrature points?
>
> > Instead, if we use a dof ordering like this:
> >
> > id = base + comp*n_var_in_vg + var_in_vg
> >
> > All of the dofs that need to multiply the same shape function would
> > be contiguous and easily vectorized.
> >
> > I don't think this change would affect anyone. We've never
> > guaranteed this ordering (and it's fairly new anyway)... I think
> > everyone is probably using the API instead of thinking of raw memory
> > access like this (And I know I probably should be too... but I've
> > been doing it that way for over 10 years and I have a few
> > applications that have hundreds to tens-of-thousands of variables
> > now that could really use this optimization).
>
> The most obvious catch here is that dof_number is so far into inner
> loops that my usual "make as much stuff runtime-selectable as
> possible" demand is completely overridden by performance concerns;
> this would have to be a configure-time option IMHO.
>
> If you want to do it yourself I don't see any objections; if you'd
> like me to take first crack then start up an issue and assign me so I
> don't forget about the idea again?
> ---
> Roy
_______________________________________________
Libmesh-devel mailing list
Libmesh-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libmesh-devel