"iwaddington ." <[email protected]> writes: > Thanks Jed, so if I understood correctly in fact there is nothing to be > done, since I was already recomputing the staggered values when passing > from one node to its neighbour.
I think if you benchmark and don't do something spuriously wasteful, you'll find that computing the averages is not really a bottleneck. What happens if you replace the harmonic average with an arithmetic average (trivially cheap)? If that's a bottleneck, you can unroll-and-jam your stencil operation and vectorize the division.
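
For concreteness, here is a minimal sketch of the two averages on a 1D row of cells, so the swap is a one-line change when benchmarking. The function names and the array layout (face[i] sits between cell[i] and cell[i+1]) are assumptions made for illustration, not taken from your code.

```c
#include <stddef.h>

/* Harmonic mean 2ab/(a+b): one division per face.
 * Assumes a + b != 0 (e.g. strictly positive coefficients). */
static inline double harmonic_avg(double a, double b)
{
    return 2.0 * a * b / (a + b);
}

/* Arithmetic mean: no division at all, just an add and a multiply. */
static inline double arithmetic_avg(double a, double b)
{
    return 0.5 * (a + b);
}

/* Fill the n-1 staggered (face) values from n cell-centered values.
 * face[i] is the value between cell[i] and cell[i+1] (hypothetical layout). */
void face_coeffs_harmonic(size_t n, const double *cell, double *face)
{
    for (size_t i = 0; i + 1 < n; i++)
        face[i] = harmonic_avg(cell[i], cell[i + 1]);
}

void face_coeffs_arithmetic(size_t n, const double *cell, double *face)
{
    for (size_t i = 0; i + 1 < n; i++)
        face[i] = arithmetic_avg(cell[i], cell[i + 1]);
}
```

Timing the full residual evaluation with each variant should show how much the division actually costs. Both loops have independent iterations, so a compiler can usually vectorize them as written at -O3.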
