[Libmesh-devel] Shape function vectorization

2016-01-04 Thread Tim Adowski
Hello libMesh-devel: I have been working to replace the current nested std::vector implementation of the shape functions, with the goal of achieving vectorization when the shape function values are used. However, I have only had partial success, and I am all out of ideas. I would appreciate any feedba

Re: [Libmesh-devel] Shape function vectorization

2016-01-04 Thread Roy Stogner
On Mon, 4 Jan 2016, Tim Adowski wrote:
> Using the START_LOG/STOP_LOG macros to time the QP loop, I got the following
> results for the 1D shape function implementation:
> Original: 1028ms (average)
> Modified with Fe loop vectorized: 920ms (average)
> Difference: ~10%
> So there is a significan
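For readers without a libMesh build at hand, the kind of measurement quoted above can be approximated in plain C++. The sketch below is only an illustration: it uses std::chrono as a stand-in for the START_LOG/STOP_LOG macros, and the names assemble_fe and average_ms, the flat quadrature-point-major storage of phi, and the constant forcing term f are all assumptions made for this example, not code from the thread.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an Fe (element residual) accumulation:
//   Fe[i] += JxW[qp] * phi(i, qp) * f
// phi is stored flat and quadrature-point-major, phi[qp * n_dofs + i],
// so the inner i loop walks contiguous memory (an assumption of this sketch).
void assemble_fe(std::vector<double>& Fe,
                 const std::vector<double>& phi,
                 const std::vector<double>& JxW,
                 double f)
{
  const std::size_t n_dofs = Fe.size();
  const std::size_t n_qp = JxW.size();
  assert(phi.size() == n_dofs * n_qp);

  for (std::size_t qp = 0; qp < n_qp; ++qp)
    for (std::size_t i = 0; i < n_dofs; ++i)
      Fe[i] += JxW[qp] * phi[qp * n_dofs + i] * f;
}

// Average wall-clock time, in milliseconds, of `repeats` calls to `work`.
template <typename Work>
double average_ms(Work work, int repeats)
{
  assert(repeats > 0);
  const auto start = std::chrono::steady_clock::now();
  for (int r = 0; r < repeats; ++r)
    work();
  const auto stop = std::chrono::steady_clock::now();
  const std::chrono::duration<double, std::milli> elapsed = stop - start;
  return elapsed.count() / repeats;
}
```

Calling average_ms with a lambda that runs assemble_fe over realistically sized arrays, once per build variant, gives averages comparable in spirit to the figures quoted above. Absolute numbers depend on compiler, flags, and hardware.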

Re: [Libmesh-devel] Shape function vectorization

2016-01-04 Thread Roy Stogner
On Mon, 4 Jan 2016, Tim Adowski wrote:
> However, all versions of GCC were unable to vectorize the Ke loop
> due to "bad data ref", and both Intel versions required "#pragma
> ivdep" in order to vectorize the Ke loop.

One last thought: is it possible that what is confusing gcc isn't your class,
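To make the quoted Ke loop concrete, here is a minimal sketch of a mass-matrix-style accumulation with a dependency hint on the innermost loop. It is not the code under discussion: assemble_ke, the flat row-major Ke, and the quadrature-point-major phi are assumptions for illustration. The Intel compiler spelling is "#pragma ivdep" as quoted above; GCC 4.9 and later accept "#pragma GCC ivdep". The hint is only valid because Ke and phi are separate arrays, so no iteration of the j loop reads what another iteration writes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical Ke accumulation:
//   Ke(i, j) += JxW[qp] * phi(i, qp) * phi(j, qp)
// Ke is flat row-major (Ke[i * n_dofs + j]); phi is flat and
// quadrature-point-major (phi[qp * n_dofs + i]). Both layouts are
// assumptions of this sketch.
void assemble_ke(std::vector<double>& Ke,
                 const std::vector<double>& phi,
                 const std::vector<double>& JxW,
                 std::size_t n_dofs)
{
  const std::size_t n_qp = JxW.size();
  assert(Ke.size() == n_dofs * n_dofs);
  assert(phi.size() == n_qp * n_dofs);
  assert(&Ke != &phi); // the ivdep hint below relies on distinct storage

  double* ke = Ke.data();
  const double* p = phi.data();

  for (std::size_t qp = 0; qp < n_qp; ++qp)
    {
      const double* phi_qp = p + qp * n_dofs;
      for (std::size_t i = 0; i < n_dofs; ++i)
        {
          const double w = JxW[qp] * phi_qp[i];
          double* row = ke + i * n_dofs;
          // Tell the compiler to ignore assumed dependencies in the j loop.
#if defined(__INTEL_COMPILER)
#pragma ivdep
#elif defined(__GNUC__) && !defined(__clang__)
#pragma GCC ivdep
#endif
          for (std::size_t j = 0; j < n_dofs; ++j)
            row[j] += w * phi_qp[j];
        }
    }
}
```

Whether a given compiler actually vectorizes the j loop can be checked from its vectorization report (for example GCC's -fopt-info-vec family of options). The sketch makes no claim about what any particular version will do.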

Re: [Libmesh-devel] Shape function vectorization

2016-01-04 Thread Derek Gaston
Yes, try to do the vector indexing yourself first to see if it's the operator calls that are throwing things off. I did a bunch of work on this myself a few years ago... all I was attempting to speed up was just variable value evaluation... not Re/Ke evaluation as a whole. Let me see if I can dig
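The experiment suggested here, indexing by hand to see whether the operator calls are what defeats the vectorizer, might look roughly like the sketch below. ShapeTable is a hypothetical container invented for this example (it is neither libMesh's class nor the one from the original post), and the two functions compute the same variable values u(qp) = sum_i coeffs[i] * phi(i, qp), once through operator() and once through raw pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical shape-function table with flat storage, phi(i, qp).
class ShapeTable
{
public:
  ShapeTable(std::size_t n_dofs, std::size_t n_qp)
    : _n_qp(n_qp), _data(n_dofs * n_qp, 0.0) {}

  double& operator()(std::size_t i, std::size_t qp)
  { return _data[i * _n_qp + qp]; }

  double operator()(std::size_t i, std::size_t qp) const
  { return _data[i * _n_qp + qp]; }

  // Raw pointer to the contiguous values of shape function i.
  const double* row(std::size_t i) const
  { return _data.data() + i * _n_qp; }

  std::size_t n_qp() const { return _n_qp; }

private:
  std::size_t _n_qp;
  std::vector<double> _data;
};

// Variant A: every access goes through operator().
void values_with_operator(const ShapeTable& phi,
                          const std::vector<double>& coeffs,
                          std::vector<double>& u)
{
  assert(u.size() == phi.n_qp());
  for (std::size_t i = 0; i < coeffs.size(); ++i)
    for (std::size_t qp = 0; qp < u.size(); ++qp)
      u[qp] += coeffs[i] * phi(i, qp);
}

// Variant B: hoist raw pointers and do the indexing by hand.
void values_with_manual_indexing(const ShapeTable& phi,
                                 const std::vector<double>& coeffs,
                                 std::vector<double>& u)
{
  assert(u.size() == phi.n_qp());
  double* out = u.data();
  const std::size_t n_qp = u.size();
  for (std::size_t i = 0; i < coeffs.size(); ++i)
    {
      const double c = coeffs[i];
      const double* phi_i = phi.row(i);
      for (std::size_t qp = 0; qp < n_qp; ++qp)
        out[qp] += c * phi_i[qp];
    }
}
```

If only variant B vectorizes under the same flags, that points at the accessor calls rather than the data layout. If neither does, the cause is more likely elsewhere.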

Re: [Libmesh-devel] Shape function vectorization

2016-01-04 Thread Derek Gaston
Maybe I never discussed these results anywhere... because I can't find any discussion of them :-) Here are the timing results: https://docs.google.com/spreadsheets/d/1dS0MjyTYRQ_o8pBsuGncJtSYxEYqKw42Chr8ougD1ek/edit?usp=docslist_api I'll attempt to explain a little... All I was timing was variab