Mani Chandra <[email protected]> writes:

> Hi Everyone,
>
> Would it be a good idea to arrange the data in fastest direction in the
> following manner for the ease of aligned loads and vector operations?
>
> Total grid points = 4n
> 0, n, 2n, 3n, 1, n+1, 2n+1, 3n+1 and so on
> Ref: "Tuning a Finite Difference Computation for Parallel Vector Processors"
> http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6341495

This paper does not solve a problem. It performs AVX optimization for repeated application of a 1D stencil update using an s-step technique. There are many reasons why the present form of this work is impractical: it cannot be used with preconditioning or with standard Krylov methods, and generalizing to multiple spatial dimensions, variable coefficients, etc., throws a wrench in the works. I recommend skipping this premature optimization of 1% of the actual problem and actually getting your science done. There are more practical ways to optimize when the time comes.
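
For concreteness, the ordering in the quoted question can be written as an index permutation. This is only a minimal sketch of the layout as described in the question, not code from the paper; the names `interleaved_order`, `to_interleaved`, and the `width` parameter (4 lanes, matching "Total grid points = 4n") are assumptions made for illustration.

```python
def interleaved_order(n, width=4):
    """Original grid indices in storage order: 0, n, 2n, 3n, 1, n+1, ...

    `width` is the assumed number of vector lanes (hypothetical parameter);
    the grid is assumed to hold exactly width * n points.
    """
    return [lane * n + i for i in range(n) for lane in range(width)]


def to_interleaved(values, width=4):
    """Copy `values` (natural ordering) into the interleaved ordering."""
    n, remainder = divmod(len(values), width)
    if remainder:
        raise ValueError("number of grid points must be a multiple of width")
    return [values[k] for k in interleaved_order(n, width)]


if __name__ == "__main__":
    # 12 grid points, 4 lanes, so n = 3.
    print(interleaved_order(3))
```

In this layout, storage slots 4i through 4i+3 hold grid points i, n+i, 2n+i, and 3n+i, so the left and right neighbors of all four lanes sit in the adjacent aligned 4-wide groups. Only the seams between the four chunks need separate handling.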
