>On Mon, Dec 04, 2017 at 08:16:47PM +0000, Bhanuprakash Bodireddy wrote:
>> Processors support prefetch instruction in anticipation of write but
>> compilers(gcc) won't use them unless explicitly asked to do so even
>> with '-march=native' specified.
>>
>> [Problem]
>>   Case A:
>>     OVS_PREFETCH_CACHE(addr, OPCH_HTW)
>>        __builtin_prefetch(addr, 1, 3)
>>          leaq    -112(%rbp), %rax        [Assembly]
>>          prefetchw  (%rax)
>>
>>   Case B:
>>     OVS_PREFETCH_CACHE(addr, OPCH_LTW)
>>        __builtin_prefetch(addr, 1, 1)
>>          leaq    -112(%rbp), %rax        [Assembly]
>>          prefetchw  (%rax)             <***problem***>
>>
>>   In spite of specifying -march=native and using Low Temporal Write
>>   (OPCH_LTW), the compiler generates a 'prefetchw' instruction instead of
>>   the 'prefetchwt1' instruction available on the processor.
>>
>> [Solution]
>>   Include -mprefetchwt1
>>
>>   Case B:
>>     OVS_PREFETCH_CACHE(addr, OPCH_LTW)
>>        __builtin_prefetch(addr, 1, 1)
>>          leaq    -112(%rbp), %rax        [Assembly]
>>          prefetchwt1  (%rax)
>>
>> [Testing]
>>   $ ./boot.sh
>>   $ ./configure
>>      checking target hint for cgcc... x86_64
>>      checking whether gcc accepts -mprefetchwt1... yes
>>   $ make -j
>>
>> Signed-off-by: Bhanuprakash Bodireddy
>> <[email protected]>
>
>Does this have any effect if the architecture or CPU configured for use does
>not support prefetchwt1?

That's a good question, and I spent a fair amount of time today figuring this
out. I have Haswell, Broadwell and Skylake CPUs, and they all support this
instruction.  However, I found that it isn't enabled by default even with
-march=native, so it needs to be enabled explicitly.

Coming to your question, there won't be side effects from using OPCH_LTW.
On processors that *don't* support PREFETCHW and PREFETCHWT1, the compiler
generates a 'prefetcht1' instruction.
On processors that support PREFETCHW, the compiler generates a 'prefetchw'
instruction.
On processors that support PREFETCHW and PREFETCHWT1, the compiler generates a
'prefetchwt1' instruction when -mprefetchwt1 is explicitly enabled.
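
To make the three cases concrete, here is a minimal sketch that reports which
case a given build falls into.  It assumes GCC predefines __PRFCHW__ and
__PREFETCHWT1__ when the corresponding ISA options are in effect; the enum and
function names are hypothetical and are not part of the OVS tree.

```c
#include <assert.h>

/* Hypothetical classification of the three cases described above. */
enum ltw_prefetch_kind {
    LTW_PLAIN,          /* Neither PREFETCHW nor PREFETCHWT1 is enabled. */
    LTW_PREFETCHW,      /* Only PREFETCHW is enabled. */
    LTW_PREFETCHWT1     /* PREFETCHWT1 is enabled (e.g. -mprefetchwt1). */
};

/* Returns the case selected by the ISA options this file was compiled with.
 * Assumption: GCC defines __PREFETCHWT1__ and __PRFCHW__ for those options. */
enum ltw_prefetch_kind
ltw_prefetch_kind(void)
{
#if defined(__PREFETCHWT1__)
    return LTW_PREFETCHWT1;
#elif defined(__PRFCHW__)
    return LTW_PREFETCHW;
#else
    return LTW_PLAIN;
#endif
}

/* Returns a short human-readable label for 'kind'. */
const char *
ltw_prefetch_kind_name(enum ltw_prefetch_kind kind)
{
    static const char *const names[] = {
        "plain (non-write) prefetch",
        "prefetchw",
        "prefetchwt1",
    };

    assert((unsigned int) kind < sizeof names / sizeof names[0]);
    return names[kind];
}
```

Building this with and without -mprefetchwt1 should flip the reported case,
which is a quick way to confirm that the configure check took effect.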

>If it could lead to that situation, then this does not
>seem like the right thing to do, and we might want to fall back to
>recommending use of the option when the person building knows that the
>software will run on a machine with prefetchwt1.

As described above, on processors that don't support this instruction, a
'prefetcht1' instruction is generated instead, and it has no side effects.
I verified this using https://gcc.godbolt.org/ by carefully checking the
instructions generated for different compiler versions and -march flags.
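
For anyone who wants to repeat the check, a small translation unit along these
lines should be enough.  This is only a sketch: the function names are
hypothetical, and the 64-byte stride is an assumed cache-line size.  Compile it
to assembly (gcc -O2 -S) with different -march settings, with and without
-mprefetchwt1, and look at which prefetch instruction appears.

```c
#include <assert.h>
#include <stddef.h>

#define ASSUMED_CACHE_LINE 64   /* Assumption: 64-byte cache lines. */

/* Hypothetical stand-in for OVS_PREFETCH_CACHE(addr, OPCH_LTW):
 * prefetch for write (rw = 1) with low temporal locality (locality = 1). */
static inline void
prefetch_low_temporal_write(const void *addr)
{
    __builtin_prefetch(addr, 1, 1);
}

/* Fills 'dst' with 'value', hinting one cache line ahead of each line that
 * is about to be written.  Returns the number of bytes written. */
size_t
fill_with_prefetch(unsigned char *dst, size_t n, unsigned char value)
{
    assert(dst != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        /* Only prefetch addresses that are still inside the buffer. */
        if (i % ASSUMED_CACHE_LINE == 0 && n - i > ASSUMED_CACHE_LINE) {
            prefetch_low_temporal_write(dst + i + ASSUMED_CACHE_LINE);
        }
        dst[i] = value;
    }
    return n;
}
```

The prefetch is only a hint, so the function behaves the same whichever
instruction the compiler selects; only the generated assembly differs.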

- Bhanuprakash.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
