Tim,
Thanks for the work on this... and for reporting back with some results.
When you want to work on this again I can definitely help you nail down a
good test problem.
Derek
On Mon, Jan 11, 2016 at 10:02 AM Tim Adowski <trado...@buffalo.edu> wrote:
> Hi all:
>
> I first want to make a clarification about the 10% speedup I originally
> reported. That speedup was for the quadrature point loop only, not the
> total runtime. For my original test case, the overall runtime was roughly
> 6.5 sec and the QP loop took 0.1 sec (not 1 sec as I originally stated).
> That may not have been entirely clear looking back, so sorry for any
> confusion.
>
> I played around with adding more variables to the problem, but I hit a
> roadblock. Still using example 3, for a 15x15 mesh and 300 variables, the
> total runtime was 123 sec. The build_sparsity() function took up 116 sec,
> or about 94% of the total runtime, while the QP loop only grew to about
> 0.23 sec. I was hoping to increase the prominence of the QP loop and get
> a better reading on the speedup gained by altering the shape functions,
> but even achieving a 1 sec QP loop time would require a total runtime
> that is far too great for my patience.
>
> I also looked into a FEMSystem example, namely fem_system/ex4. In addition
> to replacing the shape functions, I brought in the code that evaluates
> 'grad_T' and used my shape functions class there as well. Doing both of
> these achieved a speedup of about 12% on the QP loop, which accounts for
> only 4.5% of the total runtime.
>
> At this point, it seems like achieving a meaningful speedup will take more
> time than I currently have. I have to switch to another project in GRINS
> before the semester starts back up, so it looks like this is as far as I
> can take it for now.
>
> Thanks for all the suggestions and help!
>
> ~Tim
>
> On Tue, Jan 5, 2016 at 5:09 PM, Jed Brown <j...@jedbrown.org> wrote:
>
>> Derek Gaston <fried...@gmail.com> writes:
>> > 4. Loading up dense matrices of shape function evaluations so that all
>> > variables' values and gradients (of the same FEType) can be computed
>> > simultaneously with a single mat*vec. This creates perfectly
>> > vectorizable operations. I tried using Eigen for this, as well as the
>> > stuff in libMesh and even our own ColumnMajorMatrix implementation. It
>> > never made much difference though... at most a 2x speedup at the
>> > extreme end of things.
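>> >
>> > For concreteness, here is an untested Eigen sketch of what I mean; the
>> > names are just illustrative, not libMesh's actual API:
>> >
>> >   #include <Eigen/Dense>
>> >
>> >   // B stacks the phi and dphi rows for one FEType: rows [0, n_qp)
>> >   // hold phi_i(x_q), the remaining rows hold the gradient components,
>> >   // so one dense mat*vec yields a variable's values and gradients at
>> >   // every quadrature point together.
>> >   // B: (n_qp*(1+dim)) x n_dofs, u_dofs: n_dofs, out: n_qp*(1+dim)
>> >   void eval_var_at_qps(const Eigen::MatrixXd & B,
>> >                        const Eigen::VectorXd & u_dofs,
>> >                        Eigen::VectorXd & out)
>> >   {
>> >     out.noalias() = B * u_dofs;  // one mat*vec, fully vectorizable
>> >   }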
>>
>> Is it a mat*vec or mat*mat? If you have several variables of the same
>> type, you can stack them as multiple columns instead of blowing up the
>> matrix.
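>>
>> Untested Eigen sketch of the stacking idea (names are illustrative):
>> put each same-FEType variable's element coefficients in a column of U
>> and evaluate them all with one product.
>>
>>   #include <Eigen/Dense>
>>
>>   // B: n_qp x n_dofs shape functions at QPs, U: n_dofs x n_vars.
>>   // Result is n_qp x n_vars: every variable at every QP at once.
>>   Eigen::MatrixXd values_at_qps(const Eigen::MatrixXd & B,
>>                                 const Eigen::MatrixXd & U)
>>   {
>>     return B * U;  // mat*mat instead of one mat*vec per variable
>>   }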
>>
>> Moreover, if you were to rearrange the metric terms, you could evaluate
>> several elements with a single mat*mat. (It's actually fewer flops than
>> pulling the metric terms into the shape functions, unless that can be
>> amortized over many fields.) Also note that if you rearrange the metric
>> terms in this way, the left "mat" has tensor product structure when the
>> reference element and quadrature do, so the cost can be further reduced.
>> It's a 3x win for 3D elasticity or viscous flow on Q2 elements.
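>>
>> Untested sketch of the element-batching part (Eigen, illustrative
>> names; the tensor product factorization is not shown): keep the
>> derivative matrices in reference coordinates so they are identical for
>> every element, put one element's coefficients in each column of U, and
>> apply the metric terms (inverse Jacobians) per quadrature point
>> afterwards.
>>
>>   #include <Eigen/Dense>
>>
>>   // Dxi, Deta: n_qp x n_dofs reference derivative matrices (2D case).
>>   // U: n_dofs x n_elem, one column of coefficients per element.
>>   void batched_ref_gradients(const Eigen::MatrixXd & Dxi,
>>                              const Eigen::MatrixXd & Deta,
>>                              const Eigen::MatrixXd & U,
>>                              Eigen::MatrixXd & du_dxi,   // n_qp x n_elem
>>                              Eigen::MatrixXd & du_deta)  // n_qp x n_elem
>>   {
>>     du_dxi.noalias()  = Dxi  * U;  // one mat*mat covers all elements
>>     du_deta.noalias() = Deta * U;  // physical grads follow by applying
>>                                    // the per-(elem, qp) inverse Jacobian
>>   }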
>>
>
>
>
> --
> *~Tim Adowski*
>