> >> If CPU just skips this instruction we will lost all the prefetching
> >> optimizations because all the calls will be replaced by non-existent
> >> 'prefetchwt1'.
> >
> > [Bhanu] I would be worried if core generates an exception treating it as
> > illegal instruction. Instead pipeline units treat this as NOP if it
> > doesn't support it.
> > So the micro optimizations doesn't really do any thing on the processors
> > that doesn't support it.
>
> This could be an issue. If someday we'll have real performance optimization
> based on OPCH_HTW prefetch, we will have prefetchwt1 on system that supports
> it and NOP on others even if they have usual prefetchw which could provide
> performance improvement too.
>
> As I understand, checking of '-mprefetchwt1' is equal to checking compiler
> version. It doesn't check anything about supporting of this instruction in
> CPU.
> This could end up with non-working performance optimizations and even
> degradation on systems that supports usual prefetches but not prefetchwt1
> (useless NOPs degrades performance if they are on a hot path).
>
> IMHO, This compiler option should be passed only if CPU really supports it.
> I guess, the maximum that we can do is add a note into performance
> optimization guide that '-mprefetchwt1' could be passed via CFLAGS if user
> sure that it supported by target CPU.

That is my thinking as well. The people/organizations building OVS
packages for deployment are responsible for specifying the minimum
requirements of the target architecture and passing them to the compiler
via CFLAGS. That may well lean towards the lower end of capabilities,
maximizing compatibility at the cost of some performance on high-end
CPUs.

The specialized prefetch macros should be mapped to the best available
target instructions by the compiler and/or by conditional compile
directives based on the architecture settings in CFLAGS.

We would gather all these target-specific compiler optimization
guidelines in the advanced DPDK documentation of OVS.

Of course, developers and benchmark testers are free to use -march=native
or similar at their discretion in their local test beds for the best
possible performance.

BR, Jan
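
P.S. To make the conditional-compile idea a bit more concrete, here is a
rough sketch, not the code from the patch. All SKETCH_* names and both
functions are made up for illustration. It assumes that GCC-compatible
compilers predefine __PREFETCHWT1__ when built with -mprefetchwt1 and
__PRFCHW__ when built with -mprfchw (or a -march that implies them), and
it leaves the choice of the actual instruction for a given locality hint
to the compiler.

```c
#include <stddef.h>

/* Hypothetical macro names; selection is driven purely by what the
 * packager put into CFLAGS, never by a compiler-version check. */
#if defined(__PREFETCHWT1__)
/* Target was declared to support prefetchwt1. */
#define SKETCH_PREFETCH_WRITE(addr) __builtin_prefetch((addr), 1, 2)
#define SKETCH_PREFETCH_WRITE_KIND "prefetchwt1"
#elif defined(__PRFCHW__)
/* Fall back to the usual write prefetch instead of a NOP. */
#define SKETCH_PREFETCH_WRITE(addr) __builtin_prefetch((addr), 1, 3)
#define SKETCH_PREFETCH_WRITE_KIND "prefetchw"
#elif defined(__GNUC__)
/* No specific feature requested: let the compiler pick a generic hint. */
#define SKETCH_PREFETCH_WRITE(addr) __builtin_prefetch((addr), 1, 3)
#define SKETCH_PREFETCH_WRITE_KIND "generic"
#else
/* Unknown compiler: the prefetch becomes a no-op. */
#define SKETCH_PREFETCH_WRITE(addr) ((void) (addr))
#define SKETCH_PREFETCH_WRITE_KIND "none"
#endif

/* Reports which mapping was selected at compile time. */
const char *
sketch_prefetch_write_kind(void)
{
    return SKETCH_PREFETCH_WRITE_KIND;
}

/* Toy hot loop: sums and clears a buffer, prefetching a few elements
 * ahead for writing. The prefetch is only a hint and never changes the
 * result. */
long
sketch_sum_and_clear(long *buf, size_t n)
{
    long sum = 0;

    for (size_t i = 0; i < n; i++) {
        if (i + 8 < n) {
            SKETCH_PREFETCH_WRITE(&buf[i + 8]);
        }
        sum += buf[i];
        buf[i] = 0;
    }
    return sum;
}
```

With something like this, a build with CFLAGS="-O2 -mprefetchwt1" would
take the first branch, while a default build would still get a generic
write prefetch rather than losing the optimization altogether.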
