>>On Mon, Dec 04, 2017 at 08:16:47PM +0000, Bhanuprakash Bodireddy wrote:
>>> Processors support prefetch instructions in anticipation of a write, but
>>> compilers (gcc) won't use them unless explicitly asked to do so, even
>>> with '-march=native' specified.
>>>
>>> [Problem]
>>>   Case A:
>>>     OVS_PREFETCH_CACHE(addr, OPCH_HTW)
>>>        __builtin_prefetch(addr, 1, 3)
>>>          leaq    -112(%rbp), %rax        [Assembly]
>>>          prefetchw  (%rax)
>>>
>>>   Case B:
>>>     OVS_PREFETCH_CACHE(addr, OPCH_LTW)
>>>        __builtin_prefetch(addr, 1, 1)
>>>          leaq    -112(%rbp), %rax        [Assembly]
>>>          prefetchw  (%rax)             <***problem***>
>>>
>>>   In spite of specifying -march=native and using Low Temporal Write
>>>   (OPCH_LTW), the compiler generates the 'prefetchw' instruction instead
>>>   of the 'prefetchwt1' instruction available on the processor.
>>>
>>> [Solution]
>>>   Include -mprefetchwt1
>>>
>>>   Case B:
>>>     OVS_PREFETCH_CACHE(addr, OPCH_LTW)
>>>        __builtin_prefetch(addr, 1, 1)
>>>          leaq    -112(%rbp), %rax        [Assembly]
>>>          prefetchwt1  (%rax)
>>>
>>> [Testing]
>>>   $ ./boot.sh
>>>   $ ./configure
>>>      checking target hint for cgcc... x86_64
>>>      checking whether gcc accepts -mprefetchwt1... yes
>>>   $ make -j
>>>
>>> Signed-off-by: Bhanuprakash Bodireddy
>>> <bhanuprakash.bodireddy at intel.com>
>>
>>Does this have any effect if the architecture or CPU configured for use does
>>not support prefetchwt1?
> 
> That's a good question and I spent a reasonable amount of time today trying
> to figure this out. I have Haswell, Broadwell and Skylake CPUs and they all
> support this instruction.

Hmm. I have 2 different Broadwell machines (Xeon E5 v4 and i7-6800K) and
neither of them has the prefetchwt1 instruction according to cpuid:
        
        PREFETCHWT1                              = false
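
For reference, the same bit can be read directly from C. This is only a
minimal sketch: cpu_has_prefetchwt1() is a hypothetical helper name (not an
OVS function), and it assumes a GCC- or clang-style <cpuid.h> that provides
__get_cpuid_count(). PREFETCHWT1 is reported in CPUID leaf 7, subleaf 0,
ECX bit 0.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical helper: returns 1 if CPUID reports PREFETCHWT1, else 0. */
static int
cpu_has_prefetchwt1(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 7, subleaf 0 holds the structured extended feature flags.
     * __get_cpuid_count() returns 0 when that leaf is not available. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        return 0;
    }
    return ecx & 1u;            /* ECX bit 0 is PREFETCHWT1. */
#else
    return 0;                   /* Not an x86 target. */
#endif
}
```

Printing the result of cpu_has_prefetchwt1() on each machine should agree
with what the cpuid tool reports there.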

This means that introducing this change will break binary compatibility even
between CPUs of the same generation, i.e. I will not be able to run binaries
compiled on your system on mine.

If that's true, I'd prefer not to have this change.

Anyway, adding this change will make it impossible to compile a generic binary
for different platforms if your build server supports prefetchwt1. There should
be a way to disable this arch-specific compiler flag even if it is supported on
my current platform.

Best regards, Ilya Maximets.

> But I found that this instruction isn't enabled by default even with
> -march=native, so it needs to be enabled explicitly.
> 
> Coming to your question, there won't be side effects from using OPCH_LTW.
> On processors that *don't* support PREFETCHW and PREFETCHWT1, the compiler
> generates a 'prefetcht1' instruction.
> On processors that support PREFETCHW, the compiler generates a 'prefetchw'
> instruction.
> On processors that support PREFETCHW & PREFETCHWT1, the compiler generates a
> 'prefetchwt1' instruction with -mprefetchwt1 explicitly enabled.
> 
>>If it could lead to that situation, then this does not
>>seem like the right thing to do, and we might want to fall back to
>>recommending use of the option when the person building knows that the
>>software will run on a machine with prefetchwt1.
> 
> According to the above, on processors that don't support this instruction, a
> 'prefetcht1' instruction would be generated, which doesn't have side effects.
> I verified this using https://gcc.godbolt.org/ by carefully checking the
> instructions generated for different compiler versions and -march flags.
> 
> - Bhanuprakash.
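
To make the mapping discussed in the quoted text concrete, here is a rough
sketch in C. The names OVS_PREFETCH_CACHE, OPCH_HTW and OPCH_LTW and their
__builtin_prefetch() arguments come from the patch description; the enum
layout, the macro body, and the helpers write_prefetch_isa() and
touch_buffer() are assumptions made for illustration and are not the actual
OVS code. It relies on the __PREFETCHWT1__ and __PRFCHW__ macros that gcc
predefines when the corresponding -m options are in effect.

```c
#include <stddef.h>

/* Hint names follow the patch; this enum layout is an assumption. */
enum ovs_prefetch_hint {
    OPCH_HTW,   /* High temporal locality, prefetch for write. */
    OPCH_LTW    /* Low temporal locality, prefetch for write. */
};

/* __builtin_prefetch() wants constant rw/locality arguments, so each
 * case spells them out as literals. */
#define OVS_PREFETCH_CACHE(ADDR, HINT)              \
    do {                                            \
        switch (HINT) {                             \
        case OPCH_HTW:                              \
            __builtin_prefetch((ADDR), 1, 3);       \
            break;                                  \
        case OPCH_LTW:                              \
            __builtin_prefetch((ADDR), 1, 1);       \
            break;                                  \
        }                                           \
    } while (0)

/* Hypothetical helper: names the write-prefetch extension this
 * translation unit was compiled with. */
static const char *
write_prefetch_isa(void)
{
#if defined(__PREFETCHWT1__)
    return "prefetchwt1";
#elif defined(__PRFCHW__)
    return "prefetchw";
#else
    return "none";
#endif
}

/* Hypothetical usage: hint a low-temporal write, then do the write. */
static int
touch_buffer(void)
{
    int buf[16] = { 0 };

    OVS_PREFETCH_CACHE(buf, OPCH_LTW);
    buf[0] = 42;
    return buf[0];
}
```

Since the choice of instruction is fixed when the file is compiled,
write_prefetch_isa() describes the build options rather than the CPU the
binary eventually runs on.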
