Hi Akira,
> On Oct 13, 2023, at 11:07 PM, Akira Yokosawa <[email protected]> wrote:
>
> Hi Joel,
>
>> On 2023/10/13 10:22, Joel Fernandes (Google) wrote:
>> smp_mb() uses lock;add for x86 in the Linux kernel. Add information
>> about this.
>>
>> Cc: [email protected]
>> Signed-off-by: Joel Fernandes (Google) <[email protected]>
>> ---
>> Not even build tested, just focused on the content and to keep my promise I'd
>> send this out (better than never sending it) ;-). I appreciate maintainers of
>> perfbook taking this forward ;-). Thanks!
>
> I've just tested this...
> And it failed to build.
>
> I think I'll post a v2 which will build, with some wordsmithing
> I can think of.
>
> A few quick comments below.

Thank you very much for your help! I looked through the v2 and everything LGTM.

- Joel

>
>>
>> bib/hw.bib | 8 ++++++++
>
> bib/memorymodel.bib looks like a suitable destination.
>
>> memorder/memorder.tex | 8 ++++++++
>> 2 files changed, 16 insertions(+)
>>
>> diff --git a/bib/hw.bib b/bib/hw.bib
>> index b0885e74..b1dfd119 100644
>> --- a/bib/hw.bib
>> +++ b/bib/hw.bib
>> @@ -1159,3 +1159,11 @@ Luis Stevens and Anoop Gupta and John Hennessy",
>>   note="\url{https://github.com/google/fuzzing/blob/master/docs/silifuzz.pdf}",
>> }
>>
>> +@unpublished{Tsirkin2017,
>> + Author="Michael S. Tsirkin",
>> + Title="locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE",
> "_" in title needs an escape.
>
>> + month="November",
>> + day="10",
>> + year="2017",
>> + note="\url{https://lore.kernel.org/all/[email protected]/}",
>> +}
>> diff --git a/memorder/memorder.tex b/memorder/memorder.tex
>> index 5c978fbe..b28ac4f0 100644
>> --- a/memorder/memorder.tex
>> +++ b/memorder/memorder.tex
>> @@ -6081,6 +6081,14 @@ A few older variants of the x86 CPU have a mode bit that enables out-of-order
>> stores, and for these CPUs, \co{smp_wmb()} must also be defined to
>> be \co{lock;addl}.
>>
>> +A 2017 kernel commit by Michael S. Tsirkin replaced \co{mfence} with
>> +\co{lock add} in \co{smp_mb()}, achieving a 60 percent performance
>> +boost~\cite{Tsirkin2017}. The change used a 4-byte negative offset from
> perfbook's LaTeX source convention needs a line break at the end of each
> sentence, so "The change used ..." should start on a new line.
>
>> +the \co{SP} to avoid slowness due to false data-dependencies,
>> +instead of directly modifying the \co{SP}. \co{clflush} users still
>> +need to use \co{mfence} for ordering, so they have been converted to use
>> +\co{mb} instead of \co{smp_mb}, which uses an \co{mfence} as before.
>> +
>> Although newer x86 implementations accommodate self-modifying code
>> without any special instructions, to be fully compatible with
>> past and potential future x86 implementations, a given CPU must
>
> Anyway, please wait for my v2.
>
> Thanks, Akira
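
For readers following the thread, the barrier described in the patch text
can be sketched in user-space C. This is only an illustration of the idea
(a locked add of zero at a 4-byte negative offset from the stack pointer,
with mfence kept for clflush users), not the kernel's actual macros. The
names sketch_smp_mb(), sketch_mb(), and demo_store_then_load() are
hypothetical, and the non-x86-64 fallback to the GCC/Clang builtin
__sync_synchronize() is an assumption made so the sketch builds elsewhere.

```c
/* Hypothetical stand-in for smp_mb() as the patch text describes it. */
static inline void sketch_smp_mb(void)
{
#if defined(__x86_64__)
	/*
	 * Locked add of 0 to the word 4 bytes below the stack pointer.
	 * Memory contents are unchanged; the LOCK prefix provides the
	 * full-barrier ordering.  The negative offset avoids touching
	 * the word at SP itself.
	 */
	__asm__ __volatile__("lock; addl $0,-4(%%rsp)" ::: "memory", "cc");
#else
	/* Assumption: fall back to the compiler's full-barrier builtin. */
	__sync_synchronize();
#endif
}

/* Hypothetical stand-in for mb(), which keeps mfence for clflush users. */
static inline void sketch_mb(void)
{
#if defined(__x86_64__)
	__asm__ __volatile__("mfence" ::: "memory");
#else
	__sync_synchronize();
#endif
}

/* Store to *x, full barrier, then load from *y. */
static int demo_store_then_load(volatile int *x, volatile int *y)
{
	*x = 1;
	sketch_smp_mb();	/* order the store before the load */
	return *y;
}
```

In a store-buffering test, two threads would each call something like
demo_store_then_load() with the arguments swapped. With the barrier in
place, the outcome where both loads return 0 should not be observed.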