> On Feb 13, 2021, at 11:58 PM, Jed Brown <[email protected]> wrote:
>
> I usually configure --with-debugging=0 COPTFLAGS='-O2 -march=native' or
> similar. There's a tension here between optimizing aggressively for the
> current machine and making binaries that work on other machines. Most
> configure systems default to making somewhat portable binaries, so that's a
> principle of least surprise. (Though you're no novice and seem to have been
> surprised anyway.)
>
> I'd kinda prefer if we recommended making portable binaries that run-time
> detected when to use newer instructions where it matters.
How do we do this? What can we put in configure to do this?
Yes, I never paid attention to the AVX nonsense over the years and never
realized that Intel and GNU (and hence PETSc) both compile by default for
machines I used in my twenties.
Expecting PETSc users to automatically add -march= is not realistic. I will
try to rig something up in configure so that if the user does not provide
-march, something reasonable is selected.
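
A minimal sketch of the run-time detection idea discussed above, assuming GCC
or Clang on x86. The `target` function attribute and `__builtin_cpu_supports`
are compiler extensions; the names `axpy_baseline`, `axpy_avx2`, and
`select_axpy` are made up for illustration and are not PETSc API. The file is
built with no -march flag, so the binary stays portable, and only one function
is allowed to use AVX2/FMA:

```c
#include <stddef.h>

/* Hypothetical kernel signature used only for this illustration. */
typedef void (*axpy_fn)(size_t n, double a, const double *x, double *y);

/* Portable kernel: compiled for the default target (plain x86_64, SSE2). */
static void axpy_baseline(size_t n, double a, const double *x, double *y)
{
  for (size_t i = 0; i < n; i++) y[i] += a * x[i];
}

/* Assumption: GCC or Clang on x86. Other compilers (icc is excluded here on
   purpose) simply get the baseline kernel. */
#if defined(__GNUC__) && !defined(__INTEL_COMPILER) && \
    (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1

/* Same loop, but the compiler may emit AVX2 and FMA instructions for this
   one function even though the file is built without -march=. */
__attribute__((target("avx2,fma")))
static void axpy_avx2(size_t n, double a, const double *x, double *y)
{
  for (size_t i = 0; i < n; i++) y[i] += a * x[i];
}
#endif

/* Pick a kernel by asking the CPU at run time what it supports. */
axpy_fn select_axpy(void)
{
#if defined(HAVE_X86_DISPATCH)
  __builtin_cpu_init();
  if (__builtin_cpu_supports("avx2") && __builtin_cpu_supports("fma"))
    return axpy_avx2;
#endif
  return axpy_baseline;
}
```

A caller would do `axpy_fn axpy = select_axpy();` once at startup and then
call `axpy(n, a, x, y)` through the pointer. This only covers hand-picked
kernels; it does not change what the rest of the library is compiled for, so
it does not settle what configure should pass by default.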
Barry
>
> Barry Smith <[email protected]> writes:
>
>> Shouldn't configure be setting something appropriate for this
>> automatically? This is nuts, it means when users do a ./configure make
>> unless they pass weird arguments they sure as heck don't know about to the
>> compiler they won't get any of the glory that they expect and that has been
>> in almost all Intel systems forever.
>>
>> Barry
>>
>> I run ./configure --with-debugging=0 and I get none of the stuff added by
>> Intel for 15+ years?
>>
>>
>>> On Feb 13, 2021, at 11:26 PM, Jed Brown <[email protected]> wrote:
>>>
>>> Use -march=native or similar. The default target is basic x86_64, which has
>>> only SSE2.
>>>
>>> Barry Smith <[email protected]> writes:
>>>
>>>> PETSc source has code like defined(__AVX2__) in the source but it does not
>>>> seem to be able to find any of these macros (icc or gcc) on the petsc-02
>>>> system
>>>>
>>>> Are these macros supposed to be defined? How does one get them to be
>>>> defined? Why are they not defined? What am I doing wrong?
>>>>
>>>> Keep reading
>>>>
>>>> $ lscpu
>>>> Architecture: x86_64
>>>> CPU op-mode(s): 32-bit, 64-bit
>>>> Byte Order: Little Endian
>>>> CPU(s): 64
>>>> On-line CPU(s) list: 0-63
>>>> Thread(s) per core: 2
>>>> Core(s) per socket: 16
>>>> Socket(s): 2
>>>> NUMA node(s): 2
>>>> Vendor ID: GenuineIntel
>>>> CPU family: 6
>>>> Model: 85
>>>> Model name: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
>>>> Stepping: 7
>>>> CPU MHz: 1000.603
>>>> CPU max MHz: 2301.0000
>>>> CPU min MHz: 1000.0000
>>>> BogoMIPS: 4600.00
>>>> Virtualization: VT-x
>>>> L1d cache: 32K
>>>> L1i cache: 32K
>>>> L2 cache: 1024K
>>>> L3 cache: 22528K
>>>> NUMA node0 CPU(s): 0-15,32-47
>>>> NUMA node1 CPU(s): 16-31,48-63
>>>> Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
>>>> mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
>>>> nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl
>>>> xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl
>>>> vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2
>>>> x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm
>>>> abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin
>>>> ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept
>>>> vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm mpx rdt_a
>>>> avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd
>>>> avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc
>>>> cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke avx512_vnni
>>>> md_clear flush_l1d arch_capabilities
>>>>
>>>> Test program
>>>>
>>>> #if defined(__FMA__)
>>>> #error FMA
>>>> #endif
>>>>
>>>> #if defined(__AVX512F__)
>>>> #error AVX512F
>>>> #endif
>>>>
>>>> #if defined(__AVX2__)
>>>> #error AVX2
>>>> #endif
>>>>
>>>>
>>>> icc mytest.c
>>>> /usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o: In
>>>> function `_start':
>>>> (.text+0x20): undefined reference to `main'