Pierre Jolivet <[email protected]> writes:
>>>> Expecting PETSc users to automatically add -march= is not realistic. I
>>>> will try to rig something up in configure where if the user does not
>>>> provide march something reasonable is selected.
>>> A softer (yet trivial to implement) option might also be to just alert the
>>> user that these flags exist in the usual message about using default
>>> optimization flags. Something like this would encourage users to do what
>>> Jed is doing:
>>>
>>> ***** WARNING: Using default optimization C flags -g -O3
>>> You might consider manually setting optimal optimization flags for your
>>> system with
>>> COPTFLAGS="optimization flags" see config/examples/arch-*-opt.py for
>>> examples.
>>> In particular, you may want to supply specific flags (e.g. -march=native)
>>> to take advantage of higher-performance instructions.
>>
>> I think this is a reasonable thing to do.
>
> This is a reasonable message to print on the screen, but I don’t think this
> is a reasonable flag to impose by default.
> You are basically asking all package managers to add a new flag
> (-march=generic) which was previously not needed.
>
> I’m crossing my fingers Jed has a clever way of "making portable binaries
> that run-time detected when to use newer instructions where it matters",
> because -march=native by default is just not practical when deploying
> software.

immintrin.h provides

  if (_may_i_use_cpu_feature(_FEATURE_FMA | _FEATURE_AVX2)) {
    fancy_version_that_needs_fma_and_avx2();
  } else {
    fallback_version();
  }

https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_may_i_use&expand=3677,3677

I believe this function is slightly expensive because it probably calls the
CPUID instruction each time. BLIS has code to cache the result and query
features with simple bitwise math.
https://github.com/flame/blis/blob/master/frame/base/bli_cpuid.h
https://github.com/flame/blis/blob/master/frame/base/bli_cpuid.c
Of course this bit of dispatch should typically be done at object creation
time, not every iteration.
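
A minimal sketch of what I mean, with the feature query cached in a bitmask and the kernel chosen once at creation time. This is not BLIS's code and not anything in PETSc: the names (cached_features, VecOps, vec_ops_create, axpy_*) are made up for illustration. Since, as far as I know, _may_i_use_cpu_feature is only available with Intel's compilers, the sketch assumes GCC or Clang on x86 and uses their __builtin_cpu_supports instead; on anything else it just keeps the fallback. The one-time initialization is not thread-safe as written.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC/Clang on x86; elsewhere only the fallback is built. */
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1
#else
#define HAVE_X86_DISPATCH 0
#endif

/* Hypothetical feature bits, so later queries are plain bitwise math. */
enum { FEAT_FMA = 1 << 0, FEAT_AVX2 = 1 << 1 };

/* Query the CPU once and cache the answer (not thread-safe as written). */
static unsigned cached_features(void)
{
  static int      initialized = 0;
  static unsigned features    = 0;
  if (!initialized) {
#if HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("fma"))  features |= FEAT_FMA;
    if (__builtin_cpu_supports("avx2")) features |= FEAT_AVX2;
#endif
    initialized = 1;
  }
  return features;
}

typedef void (*axpy_fn)(size_t n, double a, const double *x, double *y);

/* Baseline kernel, compiled with whatever flags the build uses. */
static void axpy_fallback(size_t n, double a, const double *x, double *y)
{
  for (size_t i = 0; i < n; i++) y[i] += a * x[i];
}

#if HAVE_X86_DISPATCH
/* Same loop, but the compiler may use AVX2/FMA in this one function only. */
__attribute__((target("avx2,fma")))
static void axpy_avx2_fma(size_t n, double a, const double *x, double *y)
{
  for (size_t i = 0; i < n; i++) y[i] += a * x[i];
}
#endif

/* Hypothetical object holding the kernel selected at creation time. */
typedef struct {
  axpy_fn axpy;
} VecOps;

static void vec_ops_create(VecOps *ops)
{
  assert(ops != NULL);
  ops->axpy = axpy_fallback;
#if HAVE_X86_DISPATCH
  {
    const unsigned need = FEAT_FMA | FEAT_AVX2;
    if ((cached_features() & need) == need) ops->axpy = axpy_avx2_fma;
  }
#endif
}
```

After vec_ops_create(&ops), every call goes through ops.axpy(n, a, x, y) with no feature test in the inner loop, and the binary still runs on machines without AVX2 or FMA because only the one attributed function is compiled for them.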