Folks, I am wondering how PETSc's BuildSystem currently chooses the memory alignment to use. In the example file I provide in the repo for the Cori KNL nodes, I specify '--with-memalign=64' to match the cache line width. If I don't do this, configure chooses a 16-byte alignment. Do people think we should make a better effort to choose a more appropriate alignment?
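To illustrate why the alignment choice can matter, here is a small C sketch. It is not PETSc's allocator and says nothing about how configure picks the value. The helper names `cache_lines_touched` and `alloc_aligned` are hypothetical, and the 64-byte line size is simply the figure used above. The first helper counts how many cache lines a block of bytes spans for a given starting address; the second requests aligned memory through the POSIX call posix_memalign.

```c
#define _POSIX_C_SOURCE 200112L  /* expose posix_memalign in <stdlib.h> */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: number of cache lines of size `line` (in bytes)
 * touched by `nbytes` bytes starting at address `addr`. */
static size_t cache_lines_touched(uintptr_t addr, size_t nbytes, size_t line)
{
    if (nbytes == 0) return 0;
    uintptr_t first = addr / line;                /* index of first line */
    uintptr_t last  = (addr + nbytes - 1) / line; /* index of last line  */
    return (size_t)(last - first + 1);
}

/* Hypothetical helper: allocate `nbytes` aligned to `alignment`, which must
 * be a power of two and a multiple of sizeof(void *). Returns NULL on
 * failure. Release the memory with free(). */
static void *alloc_aligned(size_t alignment, size_t nbytes)
{
    void *p = NULL;
    if (posix_memalign(&p, alignment, nbytes) != 0) return NULL;
    return p;
}
```

For example, cache_lines_touched(16, 64, 64) is 2 while cache_lines_touched(0, 64, 64) is 1: a 64-byte block (eight doubles) that is only 16-byte aligned can straddle two cache lines, whereas a 64-byte-aligned one occupies exactly one. A pointer returned by alloc_aligned(64, n) satisfies (uintptr_t)p % 64 == 0.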
I believe that all modern x86 CPUs use a 64-byte cache line, so should we be defaulting to that? I don't know how much this matters on a Xeon processor, but it can be important on Xeon Phi. --Richard
