Tim,
Thanks for the work on this... and for reporting back with some results.
When you want to work on this again I can definitely help you nail down a
good test problem.
Derek
On Mon, Jan 11, 2016 at 10:02 AM Tim Adowski wrote:
> Hi all:
>
> I first want to make a clarification about the 10%
Hi all:
I first want to make a clarification about the 10% speedup I originally
reported. That speedup was for the quadrature point loop only, not the
total runtime. For my original test case, the overall runtime was approx
6.5sec and the QP loop was 0.1sec (not 1sec as was originally given). That
Derek Gaston writes:
> 4. Loading up dense matrices of shape function evaluations so that all
> variables' values and gradients (of the same FEType) can be computed
> simultaneously with one single mat*vec. This creates perfectly vectorizable
> operations. I tried using Eigen for this as well as t
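The mat*vec idea in item 4 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code that was benchmarked: it uses plain loops rather than Eigen, handles a single variable, and the names (values_at_qps, phi, coeffs) and the row-major layout are hypothetical. The point is that with the shape function values stored as one dense n_qp x n_dofs block, the inner loop runs over contiguous memory and is an ordinary dot product.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: evaluate one variable at every quadrature point.
// phi is a dense row-major matrix of shape function values,
// phi[qp * n_dofs + j] = value of shape function j at quadrature point qp.
// coeffs holds the n_dofs local solution coefficients.
std::vector<double> values_at_qps(const std::vector<double>& phi,
                                  const std::vector<double>& coeffs,
                                  std::size_t n_qp)
{
  const std::size_t n_dofs = coeffs.size();
  std::vector<double> u(n_qp, 0.0);

  for (std::size_t qp = 0; qp < n_qp; ++qp)
  {
    // Each row is contiguous, so this is a unit-stride dot product.
    const double* row = phi.data() + qp * n_dofs;
    double sum = 0.0;
    for (std::size_t j = 0; j < n_dofs; ++j)
      sum += row[j] * coeffs[j];
    u[qp] = sum;
  }
  return u;
}
```

For several variables of the same FEType, the same matrix could presumably be reused with one coefficient vector per variable; that extension is not shown here.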
On Tue, Jan 5, 2016 at 8:43 AM, Paul T. Bauman wrote:
>
>
> On Tue, Jan 5, 2016 at 12:30 AM, Roy Stogner wrote:
>
>>
>> On Mon, 4 Jan 2016, Tim Adowski wrote:
>>
>> > However, all versions of GCC were unable to vectorize the Ke loop
>> > due to "bad data ref", and both Intel versions required
On Tue, Jan 5, 2016 at 9:33 AM, John Peterson wrote:
>
>
> I wonder if you would have more luck doing your vectorization within
> FEMSystem rather than on "old style" codes like ex3? In FEMSystem, the
> framework has more control over the loops, iteration counters, and variable
> storage, so it m
On Mon, Jan 4, 2016 at 8:58 PM, Tim Adowski wrote:
> Hello libMesh-devel:
>
> Using the START_LOG/STOP_LOG macros to time the QP loop, I got the
> following results for the 1D shape function implementation:
> Original: 1028ms (average)
> Modified with Fe loop vectorized: 920ms (average)
> Differe
On Tue, Jan 5, 2016 at 2:31 AM, Derek Gaston wrote:
> Maybe I never discussed these results anywhere... because I can't find any
> discussion of them :-)
>
> Here are the timing results:
> https://docs.google.com/spreadsheets/d/1dS0MjyTYRQ_o8pBsuGncJtSYxEYqKw42Chr8ougD1ek/edit?usp=docslist_api
On Tue, Jan 5, 2016 at 1:50 AM, Derek Gaston wrote:
>
>
> I did a bunch of work on this myself a few years ago... all I was
> attempting to speed up was just variable value evaluation... not Re/Ke
> evaluation as a whole. Let me see if I can dig up what I did... (I'll do some
> searching and send an
On Tue, Jan 5, 2016 at 12:30 AM, Roy Stogner wrote:
>
> On Mon, 4 Jan 2016, Tim Adowski wrote:
>
> > However, all versions of GCC were unable to vectorize the Ke loop
> > due to "bad data ref", and both Intel versions required "#pragma
> > ivdep" in order to vectorize the Ke loop.
>
> One last th
Maybe I never discussed these results anywhere... because I can't find any
discussion of them :-)
Here are the timing results:
https://docs.google.com/spreadsheets/d/1dS0MjyTYRQ_o8pBsuGncJtSYxEYqKw42Chr8ougD1ek/edit?usp=docslist_api
I'll attempt to explain a little...
All I was timing was variab
Yes, try to do the vector indexing yourself first to see if it's the
operator calls that are throwing things off.
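A minimal sketch of that experiment, assuming a simple weighted sum as the loop body. None of this is libMesh code: the names (Nested, accumulate_nested, accumulate_flat) and the flat row-major layout are hypothetical. The two functions compute the same result, once through nested std::vector operator[] calls and once through raw pointers with the index arithmetic done by hand, so the compiler's vectorization report can be compared for the two loops.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the shape function storage: phi[i][qp].
using Nested = std::vector<std::vector<double>>;

// Baseline: every access goes through two std::vector::operator[] calls.
double accumulate_nested(const Nested& phi,
                         const std::vector<double>& JxW,
                         std::size_t i)
{
  double sum = 0.0;
  for (std::size_t qp = 0; qp < JxW.size(); ++qp)
    sum += JxW[qp] * phi[i][qp];
  return sum;
}

// Manual indexing: one contiguous buffer, row i starts at i * n_qp.
double accumulate_flat(const double* phi_flat,
                       const double* JxW,
                       std::size_t n_qp,
                       std::size_t i)
{
  const double* row = phi_flat + i * n_qp;
  double sum = 0.0;
  for (std::size_t qp = 0; qp < n_qp; ++qp)
    sum += JxW[qp] * row[qp];
  return sum;
}
```

If only the second loop vectorizes, that would point at the operator calls (or the nested storage) rather than the arithmetic itself.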
I did a bunch of work on this myself a few years ago... all I was
attempting to speed up was just variable value evaluation... not Re/Ke
evaluation as a whole. Let me see if I can dig
On Mon, 4 Jan 2016, Tim Adowski wrote:
> However, all versions of GCC were unable to vectorize the Ke loop
> due to "bad data ref", and both Intel versions required "#pragma
> ivdep" in order to vectorize the Ke loop.
One last thought: is it possible that what is confusing gcc isn't your
class,
On Mon, 4 Jan 2016, Tim Adowski wrote:
> Using the START_LOG/STOP_LOG macros to time the QP loop, I got the following
> results for the 1D shape function implementation:
> Original: 1028ms (average)
> Modified with Fe loop vectorized: 920ms (average)
> Difference: ~10%
> So there is a significan
Hello libMesh-devel:
I have been working to replace the current std::vector<std::vector<Real> >
implementation of the shape functions, with the goal of achieving
vectorization when the shape function values are used.
However, I have only had partial success, and I am all out of ideas. I
would appreciate any feedba