> working out dispatch in MatCreate_XXX() instead of for each function.

Or use compiler extensions for multiversioned functions (I recall GCC has something similar): https://clang.llvm.org/docs/AttributeReference.html#target
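
To make the multiversioning suggestion concrete, here is a minimal sketch using the `target_clones` function attribute, which GCC provides and recent Clang releases also accept. The kernel name `demo_axpy`, the `KERNEL_CLONES` macro, and the choice of "avx2" as the extra target are assumptions for illustration, not anything taken from PETSc. The guard is deliberately conservative (x86-64 Linux with glibc only), because the attribute relies on ifunc support; everywhere else the macro expands to nothing and a single generic version is built.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: only enable target_clones where ifunc-based dispatch is known
   to be available (x86-64, Linux, glibc) and the compiler reports support. */
#if defined(__x86_64__) && defined(__linux__) && defined(__GLIBC__) && defined(__has_attribute)
#  if __has_attribute(target_clones)
#    define KERNEL_CLONES __attribute__((target_clones("avx2", "default")))
#  endif
#endif
#ifndef KERNEL_CLONES
#  define KERNEL_CLONES /* fallback: one generic version only */
#endif

/* Hypothetical kernel: y[i] += a * x[i].
   With target_clones the compiler emits one copy per listed target and
   selects among them when the program is loaded, so callers just call
   demo_axpy() as usual. */
KERNEL_CLONES
void demo_axpy(size_t n, double a, const double *x, double *y)
{
  assert(x != NULL && y != NULL);
  for (size_t i = 0; i < n; i++) y[i] += a * x[i];
}
```

Only the annotated function is built more than once, so the rest of the library can keep portable flags. The selection is still made per function, which is the cost the quoted text wants to avoid.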
Best regards,

Jacob Faibussowitsch
(Jacob Fai - booss - oh - vitch)
Cell: (312) 694-3391

> On Feb 14, 2021, at 17:07, Jed Brown <[email protected]> wrote:
>
> Barry Smith <[email protected]> writes:
>
>>> On Feb 14, 2021, at 12:13 PM, Jed Brown <[email protected]> wrote:
>>>
>>> Barry Smith <[email protected]> writes:
>>>
>>>>> This is a reasonable message to print on the screen, but I don’t think
>>>>> this is a reasonable flag to impose by default.
>>>>> You are basically asking all package managers to add a new flag
>>>>> (-march=generic) which was previously not needed.
>>>>
>>>> This is a tough constraint, package managers should not have to do
>>>> anything to get portability but users get great performance without
>>>> needing to be sophisticated. Seems to put the burden on the
>>>> unsophisticated folks (all users) and not on the sophisticated folks
>>>> (packages).
>>>
>>> The sophisticated folks are us, the upstream developers who know what needs
>>> to be optimized for specific hardware (src/mat/impls/) and what does not
>>> (most of the rest of PETSc). We've been typical self-centered scientific
>>> software developers who just assume people who care about performance will
>>> build a custom library for each machine they run on.
>>
>> I'm still bit confused. We need to build fat binaries otherwise the stuff
>> in src/mat/impls/ will either be not portable or will be generic and
>> potentially slower, right? Are you saying it is not worth building fat
>> binaries for most of the source? So we mark either by directory or file
>> where we want fat binaries built?
>
> Yes, binaries only need multiple versions for specific parts (kernels that
> benefit from vectorization). That keeps them smaller and the build fast. We
> can do a better job than "fat" compiler flags by working out dispatch in
> MatCreate_XXX() instead of for each function.
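
For comparison, the "work out dispatch in MatCreate_XXX()" idea from the quoted text could look roughly like the sketch below: the CPU is queried once when the object is created, and a function pointer to the chosen kernel is stored on the object. Everything named `Demo...` is hypothetical and stands in for the real PETSc types and routines, which are not shown in this thread. The AVX2 variant uses the GCC/Clang `target` attribute and `__builtin_cpu_supports`, so it is compiled only on x86-64 with those compilers; other platforms get the generic kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a matrix type (a diagonal matrix); not PETSc's Mat. */
typedef struct DemoMat DemoMat;
struct DemoMat {
  size_t        n;           /* number of diagonal entries */
  const double *diag;        /* diagonal values, owned by the caller */
  const char   *kernel_name; /* which kernel was selected at creation */
  void (*mult)(const DemoMat *, const double *, double *);
};

/* Portable kernel, built with the library's default flags. */
static void mult_generic(const DemoMat *A, const double *x, double *y)
{
  for (size_t i = 0; i < A->n; i++) y[i] = A->diag[i] * x[i];
}

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#define DEMO_HAVE_AVX2_KERNEL 1
/* Same loop, but this one function may be compiled with AVX2 instructions. */
__attribute__((target("avx2")))
static void mult_avx2(const DemoMat *A, const double *x, double *y)
{
  for (size_t i = 0; i < A->n; i++) y[i] = A->diag[i] * x[i];
}
#endif

/* Dispatch happens here, once per object, not on every kernel call. */
void DemoMatCreate_Diag(DemoMat *A, size_t n, const double *diag)
{
  A->n           = n;
  A->diag        = diag;
  A->mult        = mult_generic;
  A->kernel_name = "generic";
#ifdef DEMO_HAVE_AVX2_KERNEL
  __builtin_cpu_init();
  if (__builtin_cpu_supports("avx2")) {
    A->mult        = mult_avx2;
    A->kernel_name = "avx2";
  }
#endif
}

/* Callers go through the stored pointer and never see the variants. */
void DemoMatMult(const DemoMat *A, const double *x, double *y)
{
  assert(A->mult != NULL);
  A->mult(A, x, y);
}
```

Because the choice is made per object, a whole family of related kernels can be switched together, and only the files holding those kernels need non-default target options.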
