Hi all:
I first want to clarify the 10% speedup I originally reported: that
speedup was for the quadrature point (QP) loop only, not the total
runtime. For my original test case, the overall runtime was
approximately 6.5 sec and the QP loop was 0.1 sec (not 1 sec, as I
originally stated). Looking back, that may not have been entirely
clear, so sorry for any confusion.
I played around with adding more variables to the problem, but I hit a
roadblock. Still using example 3, with a 15x15 mesh and 300 variables,
the total runtime was 123 sec. The build_sparsity() function took 116
sec of that, about 94% of the total runtime, while the QP loop only
grew to about 0.23 sec. I was hoping to make the QP loop more prominent
and get a better reading on the speedup gained by altering the shape
functions, but even achieving a 1 sec QP loop time would require a
total runtime far too great for my patience.
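
For reference, the variable scaling boils down to something like this
(a minimal sketch, not my exact test code -- I modified example 3
rather than writing a standalone driver, and the system/variable names
here are illustrative):

  #include "libmesh/libmesh.h"
  #include "libmesh/mesh.h"
  #include "libmesh/mesh_generation.h"
  #include "libmesh/equation_systems.h"
  #include "libmesh/linear_implicit_system.h"
  #include <string>

  using namespace libMesh;

  int main (int argc, char ** argv)
  {
    LibMeshInit init (argc, argv);

    Mesh mesh (init.comm());
    MeshTools::Generation::build_square (mesh, 15, 15);

    EquationSystems es (mesh);
    LinearImplicitSystem & system =
      es.add_system<LinearImplicitSystem> ("Poisson");

    // 300 copies of the same second-order variable; es.init() below
    // is where build_sparsity() eats nearly all of the runtime.
    for (unsigned int v = 0; v < 300; ++v)
      system.add_variable ("u" + std::to_string(v), SECOND);

    es.init ();
    return 0;
  }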
I also looked into a FEMSystem example, namely fem_system/ex4. In
addition to replacing the shape functions, I pulled in the code that
evaluates 'grad_T' and routed it through my shape functions class as
well. Doing both achieved a speedup of about 12% on the QP loop, which
accounts for only 4.5% of the total runtime.
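
For context, the 'grad_T' evaluation I pulled in is just the usual
coefficient-weighted sum of shape function gradients at each quadrature
point; roughly (a generic sketch, not the exact ex4 code):

  // Inside the element loop, at quadrature point qp:
  //   dphi[i][qp]  -- gradient of shape function i at qp (from the FE)
  //   T_coeffs[i]  -- solution coefficient for local dof i
  Gradient grad_T;
  for (unsigned int i = 0; i != n_dofs; ++i)
    grad_T.add_scaled (dphi[i][qp], T_coeffs[i]);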
At this point, it seems like achieving a meaningful speedup will take
more time than I currently have. I have to switch to another project in
GRINS before the semester starts back up, so this is as far as I can
take it for now.
Thanks for all the suggestions and help!
~Tim
On Tue, Jan 5, 2016 at 5:09 PM, Jed Brown <j...@jedbrown.org> wrote:
> Derek Gaston <fried...@gmail.com> writes:
> > 4. Loading up dense matrices of shape function evaluations so that all
> > variables' values and gradients (of the same FEType) can be computed
> > simultaneously with one single mat*vec. This creates perfectly
> > vectorizable operations. I tried using Eigen for this as well as the
> > stuff in libMesh and even our own ColumnMajorMatrix implementation. It
> > never made much difference though... At most a 2x speedup at the
> > extreme end of things.
>
> Is it a mat*vec or mat*mat? If you have several variables of the same
> type, you can stack them as multiple columns instead of blowing up the
> matrix.
>
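
Concretely, the stacking suggested here might look like the following
Eigen sketch (sizes and names illustrative; matrices zero-filled just
so the sketch runs):

  #include <Eigen/Dense>

  void evaluate_variables ()
  {
    // Illustrative sizes: a Q2 quad has 9 dofs; 3x3 quadrature, 9 points.
    const int n_qp = 9, n_dofs = 9, n_vars = 4;

    // B(q,i) = phi_i at quadrature point q on the reference element.
    Eigen::MatrixXd B = Eigen::MatrixXd::Zero (n_qp, n_dofs);

    // One variable at a time: a mat*vec per variable.
    Eigen::VectorXd u_coeffs = Eigen::VectorXd::Zero (n_dofs);
    Eigen::VectorXd u_at_qps = B * u_coeffs;

    // All same-FEType variables at once: stack their coefficient
    // vectors as columns of U, so a single mat*mat replaces n_vars
    // separate mat*vecs.
    Eigen::MatrixXd U = Eigen::MatrixXd::Zero (n_dofs, n_vars);
    Eigen::MatrixXd U_at_qps = B * U; // column v = variable v at the qps
  }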
> Moreover, if you were to rearrange the metric terms, you could evaluate
> several elements with a single mat*mat. (It's actually fewer flops than
> pulling metric terms into the shape functions, unless it can be
> amortized over many fields.) Also note that if you rearrange metric
> terms in this way, the left "mat" has tensor product structure when the
> reference element and quadrature do, so the cost can be further reduced.
> It's a 3x win for 3D elasticity or viscous flow on Q2 elements.
>
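
One reading of that rearrangement, as an Eigen sketch (assuming 2D
affine elements so each inverse Jacobian is constant per element; all
names and sizes are illustrative):

  #include <Eigen/Dense>
  #include <vector>

  void evaluate_batch ()
  {
    const int dim = 2, n_qp = 9, n_dofs = 9, n_elems = 64;

    // D is the same for every element: reference-element shape
    // gradients, laid out as (dim*n_qp) x n_dofs. U holds a batch of
    // elements' coefficients, one column per element.
    Eigen::MatrixXd D = Eigen::MatrixXd::Zero (dim*n_qp, n_dofs);
    Eigen::MatrixXd U = Eigen::MatrixXd::Zero (n_dofs, n_elems);

    // One mat*mat yields reference-space gradients for the whole batch.
    Eigen::MatrixXd G = D * U; // (dim*n_qp) x n_elems

    // Metric terms are applied afterward, per element and qp.
    std::vector<Eigen::Matrix2d> Jinv (n_elems,
                                       Eigen::Matrix2d::Identity());
    for (int e = 0; e < n_elems; ++e)
      for (int q = 0; q < n_qp; ++q)
      {
        const Eigen::Vector2d g_ref = G.block<2,1> (dim*q, e);
        G.block<2,1> (dim*q, e) = Jinv[e].transpose() * g_ref;
      }

    // When the reference element and quadrature rule are tensor
    // products, D itself factors into 1D pieces, cutting the cost more.
  }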
--
*~Tim Adowski*