RE: [PATCH] refcount_t: documentation for memory ordering differences

2017-12-05 Thread Reshetova, Elena
 On Wed, Nov 29, 2017 at 4:36 AM, Elena Reshetova
>  wrote:
> > Some functions from the refcount_t API provide different
> > memory ordering guarantees than their atomic counterparts.
> > This adds a document outlining these differences.
> >
> > Signed-off-by: Elena Reshetova 
> 
> Thanks for the improvements!
> 
> I have some markup changes to add, but I'll send that as a separate patch.

Thank you Kees! I guess I was too minimal in my markup use, so the doc was
pretty plain before. I have now joined your changes with mine and put both of
our sign-offs on the resulting patch. I think this way it is easier for
reviewers, since ultimately the content is the same.
I will now fix one more thing Randy noticed and then send it to linux-doc and
Jon Corbet.

Best Regards,
Elena.
> 
> Acked-by: Kees Cook 
> 
> -Kees
> 
> > ---
> >  Documentation/core-api/index.rst  |   1 +
> >  Documentation/core-api/refcount-vs-atomic.rst | 129
> ++
> >  2 files changed, 130 insertions(+)
> >  create mode 100644 Documentation/core-api/refcount-vs-atomic.rst
> >
> > diff --git a/Documentation/core-api/index.rst 
> > b/Documentation/core-api/index.rst
> > index d5bbe03..d4d54b0 100644
> > --- a/Documentation/core-api/index.rst
> > +++ b/Documentation/core-api/index.rst
> > @@ -14,6 +14,7 @@ Core utilities
> > kernel-api
> > assoc_array
> > atomic_ops
> > +   refcount-vs-atomic
> > cpu_hotplug
> > local_ops
> > workqueue
> > diff --git a/Documentation/core-api/refcount-vs-atomic.rst
> b/Documentation/core-api/refcount-vs-atomic.rst
> > new file mode 100644
> > index 000..5619d48
> > --- /dev/null
> > +++ b/Documentation/core-api/refcount-vs-atomic.rst
> > @@ -0,0 +1,129 @@
> > +===================================
> > +refcount_t API compared to atomic_t
> > +===================================
> > +
> > +The goal of the refcount_t API is to provide a minimal API for implementing
> > +an object's reference counters. While a generic architecture-independent
> > +implementation from lib/refcount.c uses atomic operations underneath,
> > +there are a number of differences between some of the refcount_*() and
> > +atomic_*() functions with regard to their memory ordering guarantees.
> > +This document outlines the differences and provides respective examples
> > +in order to help maintainers validate their code against the change in
> > +these memory ordering guarantees.
> > +
> > +memory-barriers.txt and atomic_t.txt provide more background on
> > +memory ordering in general and on atomic operations specifically.
> > +
> > +Relevant types of memory ordering
> > +=================================
> > +
> > +**Note**: the following section only covers some of the memory
> > +ordering types that are relevant to atomics and reference counters
> > +and used throughout this document. For a much broader picture
> > +please consult the memory-barriers.txt document.
> > +
> > +In the absence of any memory ordering guarantees (i.e. fully unordered),
> > +atomics & refcounters only provide atomicity and a program order (po)
> > +relation (on the same CPU): each atomic_*() and refcount_*() operation
> > +is atomic, and instructions are executed in program order on a single
> > +CPU. This is implemented using READ_ONCE()/WRITE_ONCE() and
> > +compare-and-swap primitives.
> > +
> > +A strong (full) memory ordering guarantees that all prior loads and
> > +stores (all po-earlier instructions) on the same CPU are completed
> > +before any po-later instruction is executed on the same CPU.
> > +It also guarantees that all po-earlier stores on the same CPU
> > +and all propagated stores from other CPUs must propagate to all
> > +other CPUs before any po-later instruction is executed on the original
> > +CPU (A-cumulative property). This is implemented using smp_mb().
> > +
> > +A RELEASE memory ordering guarantees that all prior loads and
> > +stores (all po-earlier instructions) on the same CPU are completed
> > +before the operation. It also guarantees that all po-earlier
> > +stores on the same CPU and all propagated stores from other CPUs
> > +must propagate to all other CPUs before the release operation
> > +(A-cumulative property). This is implemented using smp_store_release().
> > +
> > +A control dependency (on success) for refcounters guarantees that
> > +if a reference for an object was successfully obtained (the reference
> > +counter was incremented or added to, and the function returned true),
> > +then further stores are ordered against this operation.
> > +A control dependency on stores is not implemented using any explicit
> > +barriers, but relies on the CPU not speculating on stores. This is
> > +only a single-CPU relation and provides no guarantees for other CPUs.
> > +
> > +
> > +Comparison of functions
> > +=======================
> > +
> > +case 1) - non-"Read/Modify/Write" (RMW) ops
> > +-------------------------------------------
> > +
> > +Function changes:
> > +   

RE: [PATCH] refcount_t: documentation for memory ordering differences

2017-12-04 Thread Reshetova, Elena
 On 11/29/2017 04:36 AM, Elena Reshetova wrote:
> > Some functions from the refcount_t API provide different
> > memory ordering guarantees than their atomic counterparts.
> > This adds a document outlining these differences.
> >
> > Signed-off-by: Elena Reshetova 
> > ---
> >  Documentation/core-api/index.rst  |   1 +
> >  Documentation/core-api/refcount-vs-atomic.rst | 129
> ++
> >  2 files changed, 130 insertions(+)
> >  create mode 100644 Documentation/core-api/refcount-vs-atomic.rst
> 
> > diff --git a/Documentation/core-api/refcount-vs-atomic.rst
> b/Documentation/core-api/refcount-vs-atomic.rst
> > new file mode 100644
> > index 000..5619d48
> > --- /dev/null
> > +++ b/Documentation/core-api/refcount-vs-atomic.rst
> > @@ -0,0 +1,129 @@
> > +===
> > +refcount_t API compared to atomic_t
> > +===
> > +
> > +The goal of refcount_t API is to provide a minimal API for implementing
> > +an object's reference counters. While a generic architecture-independent
> > +implementation from lib/refcount.c uses atomic operations underneath,
> > +there are a number of differences between some of the refcount_*() and
> > +atomic_*() functions with regards to the memory ordering guarantees.
> > +This document outlines the differences and provides respective examples
> > +in order to help maintainers validate their code against the change in
> > +these memory ordering guarantees.
> > +
> > +memory-barriers.txt and atomic_t.txt provide more background to the
> > +memory ordering in general and for atomic operations specifically.
> > +
> > +Relevant types of memory ordering
> > +=
> > +
> > +**Note**: the following section only covers some of the memory
> > +ordering types that are relevant for the atomics and reference
> > +counters and used through this document. For a much broader picture
> > +please consult memory-barriers.txt document.
> > +
> > +In the absence of any memory ordering guarantees (i.e. fully unordered)
> > +atomics & refcounters only provide atomicity and
> > +program order (po) relation (on the same CPU). It guarantees that
> > +each atomic_*() and refcount_*() operation is atomic and instructions
> > +are executed in program order on a single CPU.
> > +This is implemented using READ_ONCE()/WRITE_ONCE() and
> > +compare-and-swap primitives.
> > +
> > +A strong (full) memory ordering guarantees that all prior loads and
> > +stores (all po-earlier instructions) on the same CPU are completed
> > +before any po-later instruction is executed on the same CPU.
> > +It also guarantees that all po-earlier stores on the same CPU
> > +and all propagated stores from other CPUs must propagate to all
> > +other CPUs before any po-later instruction is executed on the original
> > +CPU (A-cumulative property). This is implemented using smp_mb().
> 
> I don't know what "A-cumulative property" means, and google search didn't
> either.
> 
> Is it non-cumulative, similar to typical vs. atypical, where atypical
> roughly means non-typical?  Or is it accumulative (something being
> accumulated, summed up, gathered up)?
> 
> Or is it something else.. TBD?


Sorry, I should also have mentioned explicitly in this document where the
terms come from. I mentioned it in the cover letter, but failed to say so
here. I will fix it.

Thank you for catching this! I see that Andrea has already replied to it.

Best Regards,
Elena


> 
> > +A RELEASE memory ordering guarantees that all prior loads and
> > +stores (all po-earlier instructions) on the same CPU are completed
> > +before the operation. It also guarantees that all po-earlier
> > +stores on the same CPU and all propagated stores from other CPUs
> > +must propagate to all other CPUs before the release operation
> > +(A-cumulative property). This is implemented using smp_store_release().
> 
> thanks.
> --
> ~Randy



Re: [PATCH] refcount_t: documentation for memory ordering differences

2017-12-03 Thread Randy Dunlap
On 12/02/2017 10:20 PM, Andrea Parri wrote:
> On Fri, Dec 01, 2017 at 12:34:23PM -0800, Randy Dunlap wrote:
>> On 11/29/2017 04:36 AM, Elena Reshetova wrote:
>>> Some functions from the refcount_t API provide different
>>> memory ordering guarantees than their atomic counterparts.
>>> This adds a document outlining these differences.
>>>
>>> Signed-off-by: Elena Reshetova 
>>> ---
>>>  Documentation/core-api/index.rst  |   1 +
>>>  Documentation/core-api/refcount-vs-atomic.rst | 129 
>>> ++
>>>  2 files changed, 130 insertions(+)
>>>  create mode 100644 Documentation/core-api/refcount-vs-atomic.rst
>>
>>> diff --git a/Documentation/core-api/refcount-vs-atomic.rst 
>>> b/Documentation/core-api/refcount-vs-atomic.rst
>>> new file mode 100644
>>> index 000..5619d48
>>> --- /dev/null
>>> +++ b/Documentation/core-api/refcount-vs-atomic.rst
>>> @@ -0,0 +1,129 @@
>>> +===
>>> +refcount_t API compared to atomic_t
>>> +===
>>> +
>>> +The goal of refcount_t API is to provide a minimal API for implementing
>>> +an object's reference counters. While a generic architecture-independent
>>> +implementation from lib/refcount.c uses atomic operations underneath,
>>> +there are a number of differences between some of the refcount_*() and
>>> +atomic_*() functions with regards to the memory ordering guarantees.
>>> +This document outlines the differences and provides respective examples
>>> +in order to help maintainers validate their code against the change in
>>> +these memory ordering guarantees.
>>> +
>>> +memory-barriers.txt and atomic_t.txt provide more background to the
>>> +memory ordering in general and for atomic operations specifically.
>>> +
>>> +Relevant types of memory ordering
>>> +=
>>> +
>>> +**Note**: the following section only covers some of the memory
>>> +ordering types that are relevant for the atomics and reference
>>> +counters and used through this document. For a much broader picture
>>> +please consult memory-barriers.txt document.
>>> +
>>> +In the absence of any memory ordering guarantees (i.e. fully unordered)
>>> +atomics & refcounters only provide atomicity and
>>> +program order (po) relation (on the same CPU). It guarantees that
>>> +each atomic_*() and refcount_*() operation is atomic and instructions
>>> +are executed in program order on a single CPU.
>>> +This is implemented using READ_ONCE()/WRITE_ONCE() and
>>> +compare-and-swap primitives.
>>> +
>>> +A strong (full) memory ordering guarantees that all prior loads and
>>> +stores (all po-earlier instructions) on the same CPU are completed
>>> +before any po-later instruction is executed on the same CPU.
>>> +It also guarantees that all po-earlier stores on the same CPU
>>> +and all propagated stores from other CPUs must propagate to all
>>> +other CPUs before any po-later instruction is executed on the original
>>> +CPU (A-cumulative property). This is implemented using smp_mb().
>>
>> I don't know what "A-cumulative property" means, and google search didn't
>> either.
> 
> The description above seems to follow the (informal) definition given in:
> 
>   
> https://github.com/aparri/memory-model/blob/master/Documentation/explanation.txt
>   (c.f., in part., Sect. 13-14)
> 
> and formalized by the LKMM. (The notion of A-cumulativity also appears, in
> different contexts, in some memory consistency literature, e.g.,
> 
>   http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/index.html
>   http://www.cl.cam.ac.uk/~pes20/armv8-mca/
>   https://arxiv.org/abs/1308.6810 )

Got it.  Thanks.


-- 
~Randy



Re: [PATCH] refcount_t: documentation for memory ordering differences

2017-12-02 Thread Andrea Parri
On Sun, Dec 03, 2017 at 07:20:03AM +0100, Andrea Parri wrote:
> On Fri, Dec 01, 2017 at 12:34:23PM -0800, Randy Dunlap wrote:
> > On 11/29/2017 04:36 AM, Elena Reshetova wrote:
> > > Some functions from the refcount_t API provide different
> > > memory ordering guarantees than their atomic counterparts.
> > > This adds a document outlining these differences.
> > > 
> > > Signed-off-by: Elena Reshetova 
> > > ---
> > >  Documentation/core-api/index.rst  |   1 +
> > >  Documentation/core-api/refcount-vs-atomic.rst | 129 
> > > ++
> > >  2 files changed, 130 insertions(+)
> > >  create mode 100644 Documentation/core-api/refcount-vs-atomic.rst
> > 
> > > diff --git a/Documentation/core-api/refcount-vs-atomic.rst 
> > > b/Documentation/core-api/refcount-vs-atomic.rst
> > > new file mode 100644
> > > index 000..5619d48
> > > --- /dev/null
> > > +++ b/Documentation/core-api/refcount-vs-atomic.rst
> > > @@ -0,0 +1,129 @@
> > > +===
> > > +refcount_t API compared to atomic_t
> > > +===
> > > +
> > > +The goal of refcount_t API is to provide a minimal API for implementing
> > > +an object's reference counters. While a generic architecture-independent
> > > +implementation from lib/refcount.c uses atomic operations underneath,
> > > +there are a number of differences between some of the refcount_*() and
> > > +atomic_*() functions with regards to the memory ordering guarantees.
> > > +This document outlines the differences and provides respective examples
> > > +in order to help maintainers validate their code against the change in
> > > +these memory ordering guarantees.
> > > +
> > > +memory-barriers.txt and atomic_t.txt provide more background to the
> > > +memory ordering in general and for atomic operations specifically.
> > > +
> > > +Relevant types of memory ordering
> > > +=
> > > +
> > > +**Note**: the following section only covers some of the memory
> > > +ordering types that are relevant for the atomics and reference
> > > +counters and used through this document. For a much broader picture
> > > +please consult memory-barriers.txt document.
> > > +
> > > +In the absence of any memory ordering guarantees (i.e. fully unordered)
> > > +atomics & refcounters only provide atomicity and
> > > +program order (po) relation (on the same CPU). It guarantees that
> > > +each atomic_*() and refcount_*() operation is atomic and instructions
> > > +are executed in program order on a single CPU.
> > > +This is implemented using READ_ONCE()/WRITE_ONCE() and
> > > +compare-and-swap primitives.
> > > +
> > > +A strong (full) memory ordering guarantees that all prior loads and
> > > +stores (all po-earlier instructions) on the same CPU are completed
> > > +before any po-later instruction is executed on the same CPU.
> > > +It also guarantees that all po-earlier stores on the same CPU
> > > +and all propagated stores from other CPUs must propagate to all
> > > +other CPUs before any po-later instruction is executed on the original
> > > +CPU (A-cumulative property). This is implemented using smp_mb().
> > 
> > I don't know what "A-cumulative property" means, and google search didn't
> > either.
> 
> The description above seems to follow the (informal) definition given in:
> 
>   
> https://github.com/aparri/memory-model/blob/master/Documentation/explanation.txt
>   (cf., in particular, Sect. 13-14)
> 
> and formalized by the LKMM. (The notion of A-cumulativity also appears, in
> different contexts, in some memory consistency literature, e.g.,
> 
>   http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/index.html
>   http://www.cl.cam.ac.uk/~pes20/armv8-mca/
>   https://arxiv.org/abs/1308.6810 )
> 
> A typical illustration of A-cumulativity (for smp_store_release(), say) is
> given with the following program:
> 
> int x = 0;
> int y = 0;
> 
> void thread0()
> {
>   WRITE_ONCE(x, 1);
> }
> 
> void thread1()
> {
>   int r0;
> 
>   r0 = READ_ONCE(x);
>   smp_store_release(&y, 1);
> }
> 
> void thread2()
> {
>   int r1;
>   int r2;
> 
>   r1 = READ_ONCE(y);
>   smp_rmb();
>   r2 = READ_ONCE(x);
> }
> 
> (This is a variation of the so called "message-passing" pattern, where the
>  stores are "distributed" over two threads; see also
> 
>   
> https://github.com/aparri/memory-model/blob/master/litmus-tests/WRC%2Bpooncerelease%2Brmbonceonce%2BOnce.litmus
>  )
> 
> The question we want to address is whether the final state
> 
>   (r0 == 1 && r1 == 1 && r2 == 0)
> 
> can be reached/is allowed, and the answer is no (due to the A-cumulativity
> of the store-release).
> 
> By contrast, dependencies provide no (A-)cumulativity; for example, if we
> modify the previous program by replacing the store-release with a data dep.
> as follows:
> 
> int x = 0;
> int y = 0;
> 
> void thread0()
> {
>   WRITE_ONCE(x, 1);
> }
> 
> void thread1()
> {
>   int r0;

>  )
> 
> The question we want to address is whether the final state
> 
>   (r0 == 1 && r1 == 1 && r2 == 0)
> 
> can be reached/is allowed, and the answer is no (due to the A-cumulativity
> of the store-release).
> 
> By contrast, dependencies provides no (A-)cumulativity; for example, if we
> modify the previous program by replacing the store-release with a data dep.
> as follows:
> 
> int x = 0;
> int y = 0;
> 
> void thread0()
> {
>   WRITE_ONCE(x, 1);
> }
> 
> void thread1()
> {
>   int r0;
> 
>   r0 = 

Re: [PATCH] refcount_t: documentation for memory ordering differences

2017-12-02 Thread Andrea Parri
On Fri, Dec 01, 2017 at 12:34:23PM -0800, Randy Dunlap wrote:
> On 11/29/2017 04:36 AM, Elena Reshetova wrote:
> > Some functions from the refcount_t API provide different
> > memory ordering guarantees than their atomic counterparts.
> > This adds a document outlining these differences.
> > 
> > Signed-off-by: Elena Reshetova 
> > ---
> >  Documentation/core-api/index.rst  |   1 +
> >  Documentation/core-api/refcount-vs-atomic.rst | 129 
> > ++
> >  2 files changed, 130 insertions(+)
> >  create mode 100644 Documentation/core-api/refcount-vs-atomic.rst
> 
> > diff --git a/Documentation/core-api/refcount-vs-atomic.rst 
> > b/Documentation/core-api/refcount-vs-atomic.rst
> > new file mode 100644
> > index 000..5619d48
> > --- /dev/null
> > +++ b/Documentation/core-api/refcount-vs-atomic.rst
> > @@ -0,0 +1,129 @@
> > +===
> > +refcount_t API compared to atomic_t
> > +===
> > +
> > +The goal of the refcount_t API is to provide a minimal API for implementing
> > +an object's reference counters. While a generic architecture-independent
> > +implementation from lib/refcount.c uses atomic operations underneath,
> > +there are a number of differences between some of the refcount_*() and
> > +atomic_*() functions with regard to the memory ordering guarantees.
> > +This document outlines the differences and provides respective examples
> > +in order to help maintainers validate their code against the change in
> > +these memory ordering guarantees.
> > +
> > +memory-barriers.txt and atomic_t.txt provide more background on
> > +memory ordering in general and on atomic operations specifically.
> > +
> > +Relevant types of memory ordering
> > +=
> > +
> > +**Note**: the following section only covers some of the memory
> > +ordering types that are relevant for the atomics and reference
> > +counters and used throughout this document. For a much broader picture,
> > +please consult the memory-barriers.txt document.
> > +
> > +In the absence of any memory ordering guarantees (i.e. fully unordered)
> > +atomics & refcounters only provide atomicity and
> > +program order (po) relation (on the same CPU). It guarantees that
> > +each atomic_*() and refcount_*() operation is atomic and instructions
> > +are executed in program order on a single CPU.
> > +This is implemented using READ_ONCE()/WRITE_ONCE() and
> > +compare-and-swap primitives.
> > +
> > +A strong (full) memory ordering guarantees that all prior loads and
> > +stores (all po-earlier instructions) on the same CPU are completed
> > +before any po-later instruction is executed on the same CPU.
> > +It also guarantees that all po-earlier stores on the same CPU
> > +and all propagated stores from other CPUs must propagate to all
> > +other CPUs before any po-later instruction is executed on the original
> > +CPU (A-cumulative property). This is implemented using smp_mb().
> 
> I don't know what "A-cumulative property" means, and google search didn't
> either.

The description above seems to follow the (informal) definition given in:

  
https://github.com/aparri/memory-model/blob/master/Documentation/explanation.txt
  (cf., in particular, Sects. 13-14)

and formalized by the LKMM. (The notion of A-cumulativity also appears, in
different contexts, in some memory consistency literature, e.g.,

  http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/index.html
  http://www.cl.cam.ac.uk/~pes20/armv8-mca/
  https://arxiv.org/abs/1308.6810 )

A typical illustration of A-cumulativity (for smp_store_release(), say) is
given with the following program:

int x = 0;
int y = 0;

void thread0()
{
  WRITE_ONCE(x, 1);
}

void thread1()
{
  int r0;

  r0 = READ_ONCE(x);
  smp_store_release(&y, 1);
}

void thread2()
{
  int r1;
  int r2;

  r1 = READ_ONCE(y);
  smp_rmb();
  r2 = READ_ONCE(x);
}

(This is a variation of the so-called "message-passing" pattern, where the
 stores are "distributed" over two threads; see also

  
https://github.com/aparri/memory-model/blob/master/litmus-tests/WRC%2Bpooncerelease%2Brmbonceonce%2BOnce.litmus
 )

The question we want to address is whether the final state

  (r0 == 1 && r1 == 1 && r2 == 0)

can be reached/is allowed, and the answer is no (due to the A-cumulativity
of the store-release).

By contrast, dependencies provide no (A-)cumulativity; for example, if we
modify the previous program by replacing the store-release with a data
dependency as follows:

int x = 0;
int y = 0;

void thread0()
{
  WRITE_ONCE(x, 1);
}

void thread1()
{
  int r0;

  r0 = READ_ONCE(x);
  WRITE_ONCE(y, r0);
}

void thread2()
{
  int r1;
  int r2;

  r1 = READ_ONCE(y);
  smp_rmb();
  r2 = READ_ONCE(x);
}

then that same final state is allowed (and observed on some PPC machines).

  Andrea
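For reference, the first (store-release) program above corresponds to the
linked WRC litmus test; the following is an illustrative sketch in the C
litmus format used by the LKMM tools (names and exact syntax approximate
the linked file, not a verbatim copy):

```
C WRC+pooncerelease+rmbonceonce+Once

{}

P0(int *x)
{
  WRITE_ONCE(*x, 1);
}

P1(int *x, int *y)
{
  int r0;

  r0 = READ_ONCE(*x);
  smp_store_release(y, 1);
}

P2(int *x, int *y)
{
  int r1;
  int r2;

  r1 = READ_ONCE(*y);
  smp_rmb();
  r2 = READ_ONCE(*x);
}

exists (1:r0=1 /\ 2:r1=1 /\ 2:r2=0)
```

Running such a test under the LKMM (herd7) is expected to show the exists
clause unreachable, matching the A-cumulativity argument above.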



Re: [PATCH] refcount_t: documentation for memory ordering differences

2017-12-01 Thread Randy Dunlap
On 11/29/2017 04:36 AM, Elena Reshetova wrote:
> Some functions from the refcount_t API provide different
> memory ordering guarantees than their atomic counterparts.
> This adds a document outlining these differences.
> 
> Signed-off-by: Elena Reshetova 
> ---
>  Documentation/core-api/index.rst  |   1 +
>  Documentation/core-api/refcount-vs-atomic.rst | 129 
> ++
>  2 files changed, 130 insertions(+)
>  create mode 100644 Documentation/core-api/refcount-vs-atomic.rst

> diff --git a/Documentation/core-api/refcount-vs-atomic.rst 
> b/Documentation/core-api/refcount-vs-atomic.rst
> new file mode 100644
> index 000..5619d48
> --- /dev/null
> +++ b/Documentation/core-api/refcount-vs-atomic.rst
> @@ -0,0 +1,129 @@
> +===
> +refcount_t API compared to atomic_t
> +===
> +
> +The goal of the refcount_t API is to provide a minimal API for implementing
> +an object's reference counters. While a generic architecture-independent
> +implementation from lib/refcount.c uses atomic operations underneath,
> +there are a number of differences between some of the refcount_*() and
> +atomic_*() functions with regard to the memory ordering guarantees.
> +This document outlines the differences and provides respective examples
> +in order to help maintainers validate their code against the change in
> +these memory ordering guarantees.
> +
> +memory-barriers.txt and atomic_t.txt provide more background on
> +memory ordering in general and on atomic operations specifically.
> +
> +Relevant types of memory ordering
> +=
> +
> +**Note**: the following section only covers some of the memory
> +ordering types that are relevant for the atomics and reference
> +counters and used throughout this document. For a much broader picture,
> +please consult the memory-barriers.txt document.
> +
> +In the absence of any memory ordering guarantees (i.e. fully unordered)
> +atomics & refcounters only provide atomicity and
> +program order (po) relation (on the same CPU). It guarantees that
> +each atomic_*() and refcount_*() operation is atomic and instructions
> +are executed in program order on a single CPU.
> +This is implemented using READ_ONCE()/WRITE_ONCE() and
> +compare-and-swap primitives.
> +
> +A strong (full) memory ordering guarantees that all prior loads and
> +stores (all po-earlier instructions) on the same CPU are completed
> +before any po-later instruction is executed on the same CPU.
> +It also guarantees that all po-earlier stores on the same CPU
> +and all propagated stores from other CPUs must propagate to all
> +other CPUs before any po-later instruction is executed on the original
> +CPU (A-cumulative property). This is implemented using smp_mb().

I don't know what "A-cumulative property" means, and google search didn't
either.

Is it non-cumulative, similar to typical vs. atypical, where atypical
roughly means non-typical.  Or is it accumulative (something being
accumulated, summed up, gathered up)?

Or is it something else.. TBD?

> +A RELEASE memory ordering guarantees that all prior loads and
> +stores (all po-earlier instructions) on the same CPU are completed
> +before the operation. It also guarantees that all po-earlier
> +stores on the same CPU and all propagated stores from other CPUs
> +must propagate to all other CPUs before the release operation
> +(A-cumulative property). This is implemented using smp_store_release().

thanks.
-- 
~Randy


Re: [PATCH] refcount_t: documentation for memory ordering differences

2017-11-29 Thread Kees Cook
On Wed, Nov 29, 2017 at 4:36 AM, Elena Reshetova
 wrote:
> Some functions from the refcount_t API provide different
> memory ordering guarantees than their atomic counterparts.
> This adds a document outlining these differences.
>
> Signed-off-by: Elena Reshetova 

Thanks for the improvements!

I have some markup changes to add, but I'll send that as a separate patch.

Acked-by: Kees Cook 

-Kees

> ---
>  Documentation/core-api/index.rst  |   1 +
>  Documentation/core-api/refcount-vs-atomic.rst | 129 
> ++
>  2 files changed, 130 insertions(+)
>  create mode 100644 Documentation/core-api/refcount-vs-atomic.rst
>
> diff --git a/Documentation/core-api/index.rst 
> b/Documentation/core-api/index.rst
> index d5bbe03..d4d54b0 100644
> --- a/Documentation/core-api/index.rst
> +++ b/Documentation/core-api/index.rst
> @@ -14,6 +14,7 @@ Core utilities
> kernel-api
> assoc_array
> atomic_ops
> +   refcount-vs-atomic
> cpu_hotplug
> local_ops
> workqueue
> diff --git a/Documentation/core-api/refcount-vs-atomic.rst 
> b/Documentation/core-api/refcount-vs-atomic.rst
> new file mode 100644
> index 000..5619d48
> --- /dev/null
> +++ b/Documentation/core-api/refcount-vs-atomic.rst
> @@ -0,0 +1,129 @@
> +===
> +refcount_t API compared to atomic_t
> +===
> +
> +The goal of the refcount_t API is to provide a minimal API for implementing
> +an object's reference counters. While a generic architecture-independent
> +implementation from lib/refcount.c uses atomic operations underneath,
> +there are a number of differences between some of the refcount_*() and
> +atomic_*() functions with regard to the memory ordering guarantees.
> +This document outlines the differences and provides respective examples
> +in order to help maintainers validate their code against the change in
> +these memory ordering guarantees.
> +
> +memory-barriers.txt and atomic_t.txt provide more background on
> +memory ordering in general and on atomic operations specifically.
> +
> +Relevant types of memory ordering
> +=
> +
> +**Note**: the following section only covers some of the memory
> +ordering types that are relevant for the atomics and reference
> +counters and used throughout this document. For a much broader picture,
> +please consult the memory-barriers.txt document.
> +
> +In the absence of any memory ordering guarantees (i.e. fully unordered)
> +atomics & refcounters only provide atomicity and
> +program order (po) relation (on the same CPU). It guarantees that
> +each atomic_*() and refcount_*() operation is atomic and instructions
> +are executed in program order on a single CPU.
> +This is implemented using READ_ONCE()/WRITE_ONCE() and
> +compare-and-swap primitives.
> +
> +A strong (full) memory ordering guarantees that all prior loads and
> +stores (all po-earlier instructions) on the same CPU are completed
> +before any po-later instruction is executed on the same CPU.
> +It also guarantees that all po-earlier stores on the same CPU
> +and all propagated stores from other CPUs must propagate to all
> +other CPUs before any po-later instruction is executed on the original
> +CPU (A-cumulative property). This is implemented using smp_mb().
> +
> +A RELEASE memory ordering guarantees that all prior loads and
> +stores (all po-earlier instructions) on the same CPU are completed
> +before the operation. It also guarantees that all po-earlier
> +stores on the same CPU and all propagated stores from other CPUs
> +must propagate to all other CPUs before the release operation
> +(A-cumulative property). This is implemented using smp_store_release().
> +
> +A control dependency (on success) for refcounters guarantees that
> +if a reference for an object was successfully obtained (reference
> +counter increment or addition happened, function returned true),
> +then further stores are ordered against this operation.
> +The control dependency on stores is not implemented using any explicit
> +barriers, but relies on the CPU not speculating on stores. This is only
> +a single-CPU relation and provides no guarantees for other CPUs.
> +
> +
> +Comparison of functions
> +===
> +
> +case 1) - non-"Read/Modify/Write" (RMW) ops
> +---
> +
> +Function changes:
> +atomic_set() --> refcount_set()
> +atomic_read() --> refcount_read()
> +
> +Memory ordering guarantee changes:
> +none (both fully unordered)
> +
> +case 2) - increment-based ops that return no value
> +--
> +
> +Function changes:
> +atomic_inc() --> refcount_inc()
> +atomic_add() --> refcount_add()
> +
> +Memory ordering guarantee changes:
> +none (both fully unordered)
> +
> +
> 


[PATCH] refcount_t: documentation for memory ordering differences

2017-11-29 Thread Elena Reshetova
Some functions from the refcount_t API provide different
memory ordering guarantees than their atomic counterparts.
This adds a document outlining these differences.

Signed-off-by: Elena Reshetova 
---
 Documentation/core-api/index.rst  |   1 +
 Documentation/core-api/refcount-vs-atomic.rst | 129 ++
 2 files changed, 130 insertions(+)
 create mode 100644 Documentation/core-api/refcount-vs-atomic.rst

diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
index d5bbe03..d4d54b0 100644
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -14,6 +14,7 @@ Core utilities
kernel-api
assoc_array
atomic_ops
+   refcount-vs-atomic
cpu_hotplug
local_ops
workqueue
diff --git a/Documentation/core-api/refcount-vs-atomic.rst 
b/Documentation/core-api/refcount-vs-atomic.rst
new file mode 100644
index 000..5619d48
--- /dev/null
+++ b/Documentation/core-api/refcount-vs-atomic.rst
@@ -0,0 +1,129 @@
+===
+refcount_t API compared to atomic_t
+===
+
+The goal of the refcount_t API is to provide a minimal API for implementing
+an object's reference counters. While a generic architecture-independent
+implementation from lib/refcount.c uses atomic operations underneath,
+there are a number of differences between some of the refcount_*() and
+atomic_*() functions with regard to the memory ordering guarantees.
+This document outlines the differences and provides respective examples
+in order to help maintainers validate their code against the change in
+these memory ordering guarantees.
+
+memory-barriers.txt and atomic_t.txt provide more background on
+memory ordering in general and on atomic operations specifically.
+
+Relevant types of memory ordering
+=
+
+**Note**: the following section only covers some of the memory
+ordering types that are relevant for the atomics and reference
+counters and used throughout this document. For a much broader picture,
+please consult the memory-barriers.txt document.
+
+In the absence of any memory ordering guarantees (i.e. fully unordered)
+atomics & refcounters only provide atomicity and
+program order (po) relation (on the same CPU). It guarantees that
+each atomic_*() and refcount_*() operation is atomic and instructions
+are executed in program order on a single CPU.
+This is implemented using READ_ONCE()/WRITE_ONCE() and
+compare-and-swap primitives.
+
+A strong (full) memory ordering guarantees that all prior loads and
+stores (all po-earlier instructions) on the same CPU are completed
+before any po-later instruction is executed on the same CPU.
+It also guarantees that all po-earlier stores on the same CPU
+and all propagated stores from other CPUs must propagate to all
+other CPUs before any po-later instruction is executed on the original
+CPU (A-cumulative property). This is implemented using smp_mb().
+
+A RELEASE memory ordering guarantees that all prior loads and
+stores (all po-earlier instructions) on the same CPU are completed
+before the operation. It also guarantees that all po-earlier
+stores on the same CPU and all propagated stores from other CPUs
+must propagate to all other CPUs before the release operation
+(A-cumulative property). This is implemented using smp_store_release().
+
+A control dependency (on success) for refcounters guarantees that
+if a reference for an object was successfully obtained (reference
+counter increment or addition happened, function returned true),
+then further stores are ordered against this operation.
+A control dependency on stores is not implemented using any explicit
+barriers, but relies on the CPU not speculating on stores. This is only
+a single-CPU relation and provides no guarantees for other CPUs.
+
+
+Comparison of functions
+=======================
+
+case 1) - non-"Read/Modify/Write" (RMW) ops
+-------------------------------------------
+
+Function changes:
+atomic_set() --> refcount_set()
+atomic_read() --> refcount_read()
+
+Memory ordering guarantee changes:
+none (both fully unordered)
+
+case 2) - increment-based ops that return no value
+--------------------------------------------------
+
+Function changes:
+atomic_inc() --> refcount_inc()
+atomic_add() --> refcount_add()
+
+Memory ordering guarantee changes:
+none (both fully unordered)
+
+
+case 3) - decrement-based RMW ops that return no value
+------------------------------------------------------
+
+Function changes:
+atomic_dec() --> refcount_dec()
+
+Memory ordering guarantee changes:
+fully unordered --> RELEASE ordering
+
+
+case 4) - increment-based RMW ops that return a value
+-----------------------------------------------------
+
+Function changes:
+atomic_inc_not_zero() --> refcount_inc_not_zero()
+no atomic counterpart --> refcount_add_not_zero()
+
+Memory ordering guarantee changes:
+fully ordered --> control dependency on success for stores
+
+*Note*: we really assume here that necessary ordering is provided as a
+result of obtaining a pointer to the object!

[PATCH] refcount_t: documentation for memory ordering differences

2017-11-29 Thread Elena Reshetova
Some functions from refcount_t API provide different
memory ordering guarantees that their atomic counterparts.
This adds a document outlining these differences.

Signed-off-by: Elena Reshetova 
---
 Documentation/core-api/index.rst  |   1 +
 Documentation/core-api/refcount-vs-atomic.rst | 129 ++
 2 files changed, 130 insertions(+)
 create mode 100644 Documentation/core-api/refcount-vs-atomic.rst

diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
index d5bbe03..d4d54b0 100644
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -14,6 +14,7 @@ Core utilities
kernel-api
assoc_array
atomic_ops
+   refcount-vs-atomic
cpu_hotplug
local_ops
workqueue
diff --git a/Documentation/core-api/refcount-vs-atomic.rst 
b/Documentation/core-api/refcount-vs-atomic.rst
new file mode 100644
index 000..5619d48
--- /dev/null
+++ b/Documentation/core-api/refcount-vs-atomic.rst
@@ -0,0 +1,129 @@
+===
+refcount_t API compared to atomic_t
+===
+
+The goal of refcount_t API is to provide a minimal API for implementing
+an object's reference counters. While a generic architecture-independent
+implementation from lib/refcount.c uses atomic operations underneath,
+there are a number of differences between some of the refcount_*() and
+atomic_*() functions with regards to the memory ordering guarantees.
+This document outlines the differences and provides respective examples
+in order to help maintainers validate their code against the change in
+these memory ordering guarantees.
+
+memory-barriers.txt and atomic_t.txt provide more background to the
+memory ordering in general and for atomic operations specifically.
+
+Relevant types of memory ordering
+=
+
+**Note**: the following section only covers some of the memory
+ordering types that are relevant for the atomics and reference
+counters and used through this document. For a much broader picture
+please consult memory-barriers.txt document.
+
+In the absence of any memory ordering guarantees (i.e. fully unordered)
+atomics & refcounters only provide atomicity and
+program order (po) relation (on the same CPU). It guarantees that
+each atomic_*() and refcount_*() operation is atomic and instructions
+are executed in program order on a single CPU.
+This is implemented using READ_ONCE()/WRITE_ONCE() and
+compare-and-swap primitives.
+
+A strong (full) memory ordering guarantees that all prior loads and
+stores (all po-earlier instructions) on the same CPU are completed
+before any po-later instruction is executed on the same CPU.
+It also guarantees that all po-earlier stores on the same CPU
+and all propagated stores from other CPUs must propagate to all
+other CPUs before any po-later instruction is executed on the original
+CPU (A-cumulative property). This is implemented using smp_mb().
+
+A RELEASE memory ordering guarantees that all prior loads and
+stores (all po-earlier instructions) on the same CPU are completed
+before the operation. It also guarantees that all po-earlier
+stores on the same CPU and all propagated stores from other CPUs
+must propagate to all other CPUs before the release operation
+(A-cumulative property). This is implemented using smp_store_release().
+
+A control dependency (on success) for refcounters guarantees that
+if a reference for an object was successfully obtained (reference
+counter increment or addition happened, function returned true),
+then further stores are ordered against this operation.
+Control dependency on stores are not implemented using any explicit
+barriers, but rely on CPU not to speculate on stores. This is only
+a single CPU relation and provides no guarantees for other CPUs.
+
+
+Comparison of functions
+===
+
+case 1) - non-"Read/Modify/Write" (RMW) ops
+---
+
+Function changes:
+atomic_set() --> refcount_set()
+atomic_read() --> refcount_read()
+
+Memory ordering guarantee changes:
+none (both fully unordered)
+
+case 2) - increment-based ops that return no value
+--
+
+Function changes:
+atomic_inc() --> refcount_inc()
+atomic_add() --> refcount_add()
+
+Memory ordering guarantee changes:
+none (both fully unordered)
+
+
+case 3) - decrement-based RMW ops that return no value
+--
+Function changes:
+atomic_dec() --> refcount_dec()
+
+Memory ordering guarantee changes:
+fully unordered --> RELEASE ordering
+
+
+case 4) - increment-based RMW ops that return a value
+-
+
+Function changes:
+atomic_inc_not_zero() --> refcount_inc_not_zero()
no atomic counterpart --> refcount_add_not_zero()

RE: [PATCH] refcount_t: documentation for memory ordering differences

2017-11-17 Thread Reshetova, Elena
Hi Kees, 

Thank you for the proof reading. I will fix the typos/language, but
see the comments on bigger things inside. 

> On Tue, Nov 14, 2017 at 11:55 PM, Elena Reshetova
>  wrote:
> > Some functions from refcount_t API provide different
> > memory ordering guarantees that their atomic counterparts.
> > This adds a document outlining these differences.
> 
> Thanks for writing this up! One bike-shedding thing I'll bring up
> before anyone else does is: please format this in ReST and link to it
> from somewhere (likely developer documentation) in the Documentation/
> index.rst file somewhere.
> 
> Perhaps in Documentation/core-api/index.rst ?

Sure, I can do it. 
Peter do you have any objections?

> 
> Lots of notes here:
> https://www.kernel.org/doc/html/latest/doc-guide/sphinx.html#writing-
> documentation
> 
> > Signed-off-by: Elena Reshetova 
> > ---
> >  Documentation/refcount-vs-atomic.txt | 124
> +++
> >  1 file changed, 124 insertions(+)
> >  create mode 100644 Documentation/refcount-vs-atomic.txt
> >
> > diff --git a/Documentation/refcount-vs-atomic.txt 
> > b/Documentation/refcount-vs-
> atomic.txt
> > new file mode 100644
> > index 000..e703039
> > --- /dev/null
> > +++ b/Documentation/refcount-vs-atomic.txt
> > @@ -0,0 +1,124 @@
> > +==
> > +refcount_t API compare to atomic_t
> 
> "compared"
> 
> > +==
> > +
> > +The goal of refcount_t API is to provide a minimal API for implementing
> > +object's reference counters. While a generic architecture-independent
> 
> "an object's"
> 
> > +implementation from lib/refcount.c uses atomic operations underneath,
> > +there are a number of differences between some of the refcount_*() and
> > +atomic_*() functions with regards to the memory ordering guarantees.
> > +
> > +This document outlines the differences and provides respective examples
> > +in order to help maintainers validate their code against the change in
> > +these memory ordering guarantees.
> > +
> > +memory-barriers.txt and atomic_t.txt provide more background to the
> > +memory ordering in general and for atomic operations specifically.
> > +
> > +Notation
> 
> Should this section be called "Types of memory ordering"?

Well, these are only some types of ordering and explained mostly around
refcount_t vs. atomic_t, so it doesn't cover everything...

> 
> > +
> > +
> > +An absence of memory ordering guarantees (i.e. fully unordered)
> > +in case of atomics & refcounters only provides atomicity and
> 
> I can't parse this. "In an absense ... atomics & refcounts only provide ... "?
> 
> > +program order (po) relation (on the same CPU). It guarantees that
> > +each atomic_*() and refcount_*() operation is atomic and instructions
> > +are executed in program order on a single CPU.
> > +Implemented using READ_ONCE()/WRITE_ONCE() and
> > +compare-and-swap primitives.
> 
> For here an later, maybe "This is implemented ..."
> 
> > +
> > +A strong (full) memory ordering guarantees that all prior loads and
> > +stores (all po-earlier instructions) on the same CPU are completed
> > +before any po-later instruction is executed on the same CPU.
> > +It also guarantees that all po-earlier stores on the same CPU
> > +and all propagated stores from other CPUs must propagate to all
> > +other CPUs before any po-later instruction is executed on the original
> > +CPU (A-cumulative property). Implemented using smp_mb().
> > +
> > +A RELEASE memory ordering guarantees that all prior loads and
> > +stores (all po-earlier instructions) on the same CPU are completed
> > +before the operation. It also guarantees that all po-earlier
> > +stores on the same CPU and all propagated stores from other CPUs
> > +must propagate to all other CPUs before the release operation
> > +(A-cumulative property). Implemented using smp_store_release().
> > +
> > +A control dependency (on success) for refcounters guarantees that
> > +if a reference for an object was successfully obtained (reference
> > +counter increment or addition happened, function returned true),
> > +then further stores are ordered against this operation.
> > +Control dependency on stores are not implemented using any explicit
> > +barriers, but rely on CPU not to speculate on stores. This is only
> > +a single CPU relation and provides no guarantees for other CPUs.
> > +
> > +
> > +Comparison of functions
> > +==
> > +
> > +case 1) - non-RMW ops
> 
> Should this be spelled out "Read/Modify/Write"?

Sure.

> 
> > +-
> > +
> > +Function changes:
> > +atomic_set() --> refcount_set()
> > +atomic_read() --> refcount_read()
> > +
> > +Memory ordering guarantee changes:
> > +fully unordered --> fully unordered
> 
> Maybe say: "none (both fully unordered)"

Ok

> 
> > +case 2) - increment-based ops that return no value

Re: [PATCH] refcount_t: documentation for memory ordering differences

2017-11-16 Thread Kees Cook
On Tue, Nov 14, 2017 at 11:55 PM, Elena Reshetova
 wrote:
> Some functions from refcount_t API provide different
> memory ordering guarantees that their atomic counterparts.
> This adds a document outlining these differences.

Thanks for writing this up! One bike-shedding thing I'll bring up
before anyone else does is: please format this in ReST and link to it
from somewhere (likely developer documentation) in the Documentation/
index.rst file somewhere.

Perhaps in Documentation/core-api/index.rst ?

Lots of notes here:
https://www.kernel.org/doc/html/latest/doc-guide/sphinx.html#writing-documentation

> Signed-off-by: Elena Reshetova 
> ---
>  Documentation/refcount-vs-atomic.txt | 124 
> +++
>  1 file changed, 124 insertions(+)
>  create mode 100644 Documentation/refcount-vs-atomic.txt
>
> diff --git a/Documentation/refcount-vs-atomic.txt 
> b/Documentation/refcount-vs-atomic.txt
> new file mode 100644
> index 000..e703039
> --- /dev/null
> +++ b/Documentation/refcount-vs-atomic.txt
> @@ -0,0 +1,124 @@
> +==
> +refcount_t API compare to atomic_t

"compared"

> +==
> +
> +The goal of refcount_t API is to provide a minimal API for implementing
> +object's reference counters. While a generic architecture-independent

"an object's"

> +implementation from lib/refcount.c uses atomic operations underneath,
> +there are a number of differences between some of the refcount_*() and
> +atomic_*() functions with regards to the memory ordering guarantees.
> +
> +This document outlines the differences and provides respective examples
> +in order to help maintainers validate their code against the change in
> +these memory ordering guarantees.
> +
> +memory-barriers.txt and atomic_t.txt provide more background to the
> +memory ordering in general and for atomic operations specifically.
> +
> +Notation

Should this section be called "Types of memory ordering"?

> +
> +
> +An absence of memory ordering guarantees (i.e. fully unordered)
> +in case of atomics & refcounters only provides atomicity and

I can't parse this. "In an absense ... atomics & refcounts only provide ... "?

> +program order (po) relation (on the same CPU). It guarantees that
> +each atomic_*() and refcount_*() operation is atomic and instructions
> +are executed in program order on a single CPU.
> +Implemented using READ_ONCE()/WRITE_ONCE() and
> +compare-and-swap primitives.

For here an later, maybe "This is implemented ..."

> +
> +A strong (full) memory ordering guarantees that all prior loads and
> +stores (all po-earlier instructions) on the same CPU are completed
> +before any po-later instruction is executed on the same CPU.
> +It also guarantees that all po-earlier stores on the same CPU
> +and all propagated stores from other CPUs must propagate to all
> +other CPUs before any po-later instruction is executed on the original
> +CPU (A-cumulative property). Implemented using smp_mb().
> +
> +A RELEASE memory ordering guarantees that all prior loads and
> +stores (all po-earlier instructions) on the same CPU are completed
> +before the operation. It also guarantees that all po-earlier
> +stores on the same CPU and all propagated stores from other CPUs
> +must propagate to all other CPUs before the release operation
> +(A-cumulative property). Implemented using smp_store_release().
> +
> +A control dependency (on success) for refcounters guarantees that
> +if a reference for an object was successfully obtained (reference
> +counter increment or addition happened, function returned true),
> +then further stores are ordered against this operation.
> +Control dependency on stores are not implemented using any explicit
> +barriers, but rely on CPU not to speculate on stores. This is only
> +a single CPU relation and provides no guarantees for other CPUs.
> +
> +
> +Comparison of functions
> +==
> +
> +case 1) - non-RMW ops

Should this be spelled out "Read/Modify/Write"?

> +-
> +
> +Function changes:
> +atomic_set() --> refcount_set()
> +atomic_read() --> refcount_read()
> +
> +Memory ordering guarantee changes:
> +fully unordered --> fully unordered

Maybe say: "none (both fully unordered)"

> +case 2) - increment-based ops that return no value
> +--
> +
> +Function changes:
> +atomic_inc() --> refcount_inc()
> +atomic_add() --> refcount_add()
> +
> +Memory ordering guarantee changes:
> +fully unordered --> fully unordered

Same.

> +case 3) - decrement-based RMW ops that return no value
> +--
> +Function changes:
> +atomic_dec() --> refcount_dec()
> +
> +Memory ordering guarantee changes:
> +fully unordered --> RELEASE ordering

[PATCH] refcount_t: documentation for memory ordering differences

2017-11-14 Thread Elena Reshetova
Some functions from refcount_t API provide different
memory ordering guarantees that their atomic counterparts.
This adds a document outlining these differences.

Signed-off-by: Elena Reshetova 
---
 Documentation/refcount-vs-atomic.txt | 124 +++
 1 file changed, 124 insertions(+)
 create mode 100644 Documentation/refcount-vs-atomic.txt

diff --git a/Documentation/refcount-vs-atomic.txt 
b/Documentation/refcount-vs-atomic.txt
new file mode 100644
index 000..e703039
--- /dev/null
+++ b/Documentation/refcount-vs-atomic.txt
@@ -0,0 +1,124 @@
+==
+refcount_t API compare to atomic_t
+==
+
+The goal of refcount_t API is to provide a minimal API for implementing
+object's reference counters. While a generic architecture-independent
+implementation from lib/refcount.c uses atomic operations underneath,
+there are a number of differences between some of the refcount_*() and
+atomic_*() functions with regards to the memory ordering guarantees.
+This document outlines the differences and provides respective examples
+in order to help maintainers validate their code against the change in
+these memory ordering guarantees.
+
+memory-barriers.txt and atomic_t.txt provide more background to the
+memory ordering in general and for atomic operations specifically.
+
+Notation
+
+
+An absence of memory ordering guarantees (i.e. fully unordered)
+in case of atomics & refcounters only provides atomicity and
+program order (po) relation (on the same CPU). It guarantees that
+each atomic_*() and refcount_*() operation is atomic and instructions
+are executed in program order on a single CPU.
+Implemented using READ_ONCE()/WRITE_ONCE() and
+compare-and-swap primitives.
+
+A strong (full) memory ordering guarantees that all prior loads and
+stores (all po-earlier instructions) on the same CPU are completed
+before any po-later instruction is executed on the same CPU.
+It also guarantees that all po-earlier stores on the same CPU
+and all propagated stores from other CPUs must propagate to all
+other CPUs before any po-later instruction is executed on the original
+CPU (A-cumulative property). Implemented using smp_mb().
+
+A RELEASE memory ordering guarantees that all prior loads and
+stores (all po-earlier instructions) on the same CPU are completed
+before the operation. It also guarantees that all po-earlier
+stores on the same CPU and all propagated stores from other CPUs
+must propagate to all other CPUs before the release operation
+(A-cumulative property). Implemented using smp_store_release().
+
+A control dependency (on success) for refcounters guarantees that
+if a reference for an object was successfully obtained (reference
+counter increment or addition happened, function returned true),
+then further stores are ordered against this operation.
+Control dependency on stores are not implemented using any explicit
+barriers, but rely on CPU not to speculate on stores. This is only
+a single CPU relation and provides no guarantees for other CPUs.
+
+
+Comparison of functions
+==
+
+case 1) - non-RMW ops
+-
+
+Function changes:
+atomic_set() --> refcount_set()
+atomic_read() --> refcount_read()
+
+Memory ordering guarantee changes:
+fully unordered --> fully unordered
+
+case 2) - increment-based ops that return no value
+--
+
+Function changes:
+atomic_inc() --> refcount_inc()
+atomic_add() --> refcount_add()
+
+Memory ordering guarantee changes:
+fully unordered --> fully unordered
+
+
+case 3) - decrement-based RMW ops that return no value
+--
+Function changes:
+atomic_dec() --> refcount_dec()
+
+Memory ordering guarantee changes:
+fully unordered --> RELEASE ordering
+
+
+case 4) - increment-based RMW ops that return a value
+-
+
+Function changes:
+atomic_inc_not_zero() --> refcount_inc_not_zero()
+no atomic counterpart --> refcount_add_not_zero()
+
+Memory ordering guarantees changes:
+fully ordered --> control dependency on success for stores
+
+*Note*: we really assume here that necessary ordering is provided as a result
+of obtaining pointer to the object!
+
+
+case 5) - decrement-based RMW ops that return a value
+-
+
+Function changes:
+atomic_dec_and_test() --> refcount_dec_and_test()
+atomic_sub_and_test() --> refcount_sub_and_test()
+no atomic counterpart --> refcount_dec_if_one()
+atomic_add_unless(, -1, 1) --> refcount_dec_not_one()
+
+Memory ordering guarantees changes:
+fully ordered --> RELEASE ordering

[PATCH] refcount_t: documentation for memory ordering differences

2017-11-14 Thread Elena Reshetova
Some functions from refcount_t API provide different
memory ordering guarantees that their atomic counterparts.
This adds a document outlining these differences.

Signed-off-by: Elena Reshetova 
---
 Documentation/refcount-vs-atomic.txt | 124 +++
 1 file changed, 124 insertions(+)
 create mode 100644 Documentation/refcount-vs-atomic.txt

diff --git a/Documentation/refcount-vs-atomic.txt 
b/Documentation/refcount-vs-atomic.txt
new file mode 100644
index 000..e703039
--- /dev/null
+++ b/Documentation/refcount-vs-atomic.txt
@@ -0,0 +1,124 @@
+==
+refcount_t API compare to atomic_t
+==
+
+The goal of refcount_t API is to provide a minimal API for implementing
+object's reference counters. While a generic architecture-independent
+implementation from lib/refcount.c uses atomic operations underneath,
+there are a number of differences between some of the refcount_*() and
+atomic_*() functions with regards to the memory ordering guarantees.
+This document outlines the differences and provides respective examples
+in order to help maintainers validate their code against the change in
+these memory ordering guarantees.
+
+memory-barriers.txt and atomic_t.txt provide more background to the
+memory ordering in general and for atomic operations specifically.
+
+Notation
+
+
+An absence of memory ordering guarantees (i.e. fully unordered)
+in case of atomics & refcounters only provides atomicity and
+program order (po) relation (on the same CPU). It guarantees that
+each atomic_*() and refcount_*() operation is atomic and instructions
+are executed in program order on a single CPU.
+Implemented using READ_ONCE()/WRITE_ONCE() and
+compare-and-swap primitives.
+
+A strong (full) memory ordering guarantees that all prior loads and
+stores (all po-earlier instructions) on the same CPU are completed
+before any po-later instruction is executed on the same CPU.
+It also guarantees that all po-earlier stores on the same CPU
+and all propagated stores from other CPUs must propagate to all
+other CPUs before any po-later instruction is executed on the original
+CPU (A-cumulative property). Implemented using smp_mb().
+
+A RELEASE memory ordering guarantees that all prior loads and
+stores (all po-earlier instructions) on the same CPU are completed
+before the operation. It also guarantees that all po-earlier
+stores on the same CPU and all propagated stores from other CPUs
+must propagate to all other CPUs before the release operation
+(A-cumulative property). Implemented using smp_store_release().
+
+A control dependency (on success) for refcounters guarantees that
+if a reference for an object was successfully obtained (reference
+counter increment or addition happened, function returned true),
+then further stores are ordered against this operation.
+A control dependency on stores is not implemented using any explicit
+barriers, but relies on the CPU not speculating on stores. This is
+only a single-CPU relation and provides no guarantees for other CPUs.
+
+
+Comparison of functions
+=======================
+
+case 1) - non-RMW ops
+---------------------
+
+Function changes:
+atomic_set() --> refcount_set()
+atomic_read() --> refcount_read()
+
+Memory ordering guarantee changes:
+fully unordered --> fully unordered
+
+case 2) - increment-based ops that return no value
+--------------------------------------------------
+
+Function changes:
+atomic_inc() --> refcount_inc()
+atomic_add() --> refcount_add()
+
+Memory ordering guarantee changes:
+fully unordered --> fully unordered
+
+
+case 3) - decrement-based RMW ops that return no value
+------------------------------------------------------
+
+Function changes:
+atomic_dec() --> refcount_dec()
+
+Memory ordering guarantee changes:
+fully unordered --> RELEASE ordering
+
+
+case 4) - increment-based RMW ops that return a value
+-----------------------------------------------------
+
+Function changes:
+atomic_inc_not_zero() --> refcount_inc_not_zero()
+no atomic counterpart --> refcount_add_not_zero()
+
+Memory ordering guarantee changes:
+fully ordered --> control dependency on success for stores
+
+*Note*: we really assume here that the necessary ordering is provided
+as a result of obtaining the pointer to the object!
+
+
+case 5) - decrement-based RMW ops that return a value
+-----------------------------------------------------
+
+Function changes:
+atomic_dec_and_test() --> refcount_dec_and_test()
+atomic_sub_and_test() --> refcount_sub_and_test()
+no atomic counterpart --> refcount_dec_if_one()
+atomic_add_unless(&var, -1, 1) --> refcount_dec_not_one(&var)
+
+Memory ordering guarantee changes:
+fully ordered --> RELEASE ordering + control dependency on success

RE: [PATCH] refcount_t: documentation for memory ordering differences

2017-11-07 Thread Reshetova, Elena

Hi Randy, 

Thank you for your corrections! I will fix the language-related issues in the
next version. More on content below.

> On 11/06/2017 05:32 AM, Elena Reshetova wrote:
> > Some functions from refcount_t API provide different
> > memory ordering guarantees that their atomic counterparts.
> > This adds a document outlining the differences and
> > showing examples.
> >
> > Signed-off-by: Elena Reshetova 
> > ---
> >  Documentation/refcount-vs-atomic.txt | 234
> +++
> >  1 file changed, 234 insertions(+)
> >  create mode 100644 Documentation/refcount-vs-atomic.txt
> >
> > diff --git a/Documentation/refcount-vs-atomic.txt 
> > b/Documentation/refcount-vs-
> atomic.txt
> > new file mode 100644
> > index 000..09efd2b
> > --- /dev/null
> > +++ b/Documentation/refcount-vs-atomic.txt
> > @@ -0,0 +1,234 @@
> > +==
> > +refcount_t API compare to atomic_t
> > +==
> > +
> > +The goal of refcount_t API is to provide a minimal API for implementing
> > +object's reference counters. While a generic architecture-independent
> > +implementation from lib/refcount.c uses atomic operations underneath,
> > +there is a number of differences between some of the refcount_*() and
> 
>there are
> 
> > +atomic_*() functions with regards to the memory ordering guarantees.
> > +This document outlines the differences and provides respective examples
> > +in order to help maintainers validate their code against the change in
> > +these memory ordering guarantees.
> > +
> > +memory-barriers.txt and atomic_t.txt provide more background to the
> > +memory ordering in general and for atomic operations specifically.
> > +
> > +Summary of the differences
> > +==
> > +
> > + 1) There is no difference between respective non-RMW ops, i.e.
> > +   refcount_set() & refcount_read() have exactly the same ordering
> > +   guarantees (meaning fully unordered) as atomic_set() and atomic_read().
> > + 2) For the increment-based ops that return no value (namely
> > +   refcount_inc() & refcount_add()) memory ordering guarantees are
> > +   exactly the same (meaning fully unordered) as respective atomic
> > +   functions (atomic_inc() & atomic_add()).
> > + 3) For the decrement-based ops that return no value (namely
> > +   refcount_dec()) memory ordering guarantees are slightly
> > +   stronger than respective atomic counterpart (atomic_dec()).
> > +   While atomic_dec() is fully unordered, refcount_dec() does
> > +   provide a RELEASE memory ordering guarantee (see next section).
> > + 4) For the rest of increment-based RMW ops (refcount_inc_not_zero(),
> > +   refcount_add_not_zero()) the memory ordering guarantees are relaxed
> > +   compare to their atomic counterparts (atomic_inc_not_zero()).
> 
>   compared
> 
> > +   Refcount variants provide no memory ordering guarantees apart from
> > +   control dependency on success, while atomics provide a full memory
> 
>provide full memory
> 
> > +   ordering guarantees (see next section).
> > + 5) The rest of decrement-based RMW ops (refcount_dec_and_test(),
> > +   refcount_sub_and_test(), refcount_dec_if_one(), refcount_dec_not_one())
> > +   provide only RELEASE memory ordering and control dependency on success
> > +   (see next section). The respective atomic counterparts
> > +   (atomic_dec_and_test(), atomic_sub_and_test()) provide full memory 
> > ordering.
> > + 6) The lock-based RMW ops (refcount_dec_and_lock() &
> > +   refcount_dec_and_mutex_lock()) alway provide RELEASE memory ordering
> > +   and ACQUIRE memory ordering & control dependency on success
> > +   (see next section). The respective atomic counterparts
> > +   (atomic_dec_and_lock() & atomic_dec_and_mutex_lock())
> > +   provide full memory ordering.
> > +
> > +
> > +
> > +Details and examples
> > +
> > +
> > +Here we consider the cases 3)-6) that do present differences together
> > +with respective examples.
> > +
> > +case 3) - decrement-based RMW ops that return no value
> > +--
> > +
> > +Function changes:
> > +atomic_dec() --> refcount_dec()
> > +
> > +Memory ordering guarantee changes:
> > +fully unordered --> RELEASE ordering
> > +
> > +RELEASE ordering guarantees that prior loads and stores are
> > +completed before the operation. Implemented using smp_store_release().
> > +
> > +Examples:
> > +~
> > +
> > +For fully unordered operations stores to a, b and c can
> > +happen in any sequence:
> > +
> > +P0(int *a, int *b, int *c)
> > +  {
> > + WRITE_ONCE(*a, 1);
> > + WRITE_ONCE(*b, 1);
> > + WRITE_ONCE(*c, 1);
> > +  }
> > +
> > +
> > +For a RELEASE ordered operation, read and write from/to @a
> 
> read or write  (??)
> 
> > +is guaranteed to happen before store to @b. There are no
> > +guarantees on the order of store/read to/from @c:
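
The litmus-style sketch from the v1 patch that this sentence introduces
(kernel pseudocode, not a standalone program; it is quoted intact
elsewhere in the thread):

```c
P0(int *a, int *b, int *c)
  {
  READ_ONCE(*a);
  WRITE_ONCE(*a, 1);
  smp_store_release(b, 1);
  WRITE_ONCE(*c, 1);
  READ_ONCE(*c);
  }
```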

Re: [PATCH] refcount_t: documentation for memory ordering differences

2017-11-06 Thread Randy Dunlap
On 11/06/2017 05:32 AM, Elena Reshetova wrote:
> Some functions from refcount_t API provide different
> memory ordering guarantees that their atomic counterparts.
> This adds a document outlining the differences and
> showing examples.
> 
> Signed-off-by: Elena Reshetova 
> ---
>  Documentation/refcount-vs-atomic.txt | 234 
> +++
>  1 file changed, 234 insertions(+)
>  create mode 100644 Documentation/refcount-vs-atomic.txt
> 
> diff --git a/Documentation/refcount-vs-atomic.txt 
> b/Documentation/refcount-vs-atomic.txt
> new file mode 100644
> index 000..09efd2b
> --- /dev/null
> +++ b/Documentation/refcount-vs-atomic.txt
> @@ -0,0 +1,234 @@
> +==
> +refcount_t API compare to atomic_t
> +==
> +
> +The goal of refcount_t API is to provide a minimal API for implementing
> +object's reference counters. While a generic architecture-independent
> +implementation from lib/refcount.c uses atomic operations underneath,
> +there is a number of differences between some of the refcount_*() and

   there are

> +atomic_*() functions with regards to the memory ordering guarantees.
> +This document outlines the differences and provides respective examples
> +in order to help maintainers validate their code against the change in
> +these memory ordering guarantees.
> +
> +memory-barriers.txt and atomic_t.txt provide more background to the
> +memory ordering in general and for atomic operations specifically.
> +
> +Summary of the differences
> +==
> +
> + 1) There is no difference between respective non-RMW ops, i.e.
> +   refcount_set() & refcount_read() have exactly the same ordering
> +   guarantees (meaning fully unordered) as atomic_set() and atomic_read().
> + 2) For the increment-based ops that return no value (namely
> +   refcount_inc() & refcount_add()) memory ordering guarantees are
> +   exactly the same (meaning fully unordered) as respective atomic
> +   functions (atomic_inc() & atomic_add()).
> + 3) For the decrement-based ops that return no value (namely
> +   refcount_dec()) memory ordering guarantees are slightly
> +   stronger than respective atomic counterpart (atomic_dec()).
> +   While atomic_dec() is fully unordered, refcount_dec() does
> +   provide a RELEASE memory ordering guarantee (see next section).
> + 4) For the rest of increment-based RMW ops (refcount_inc_not_zero(),
> +   refcount_add_not_zero()) the memory ordering guarantees are relaxed
> +   compare to their atomic counterparts (atomic_inc_not_zero()).

  compared

> +   Refcount variants provide no memory ordering guarantees apart from
> +   control dependency on success, while atomics provide a full memory

   provide full memory

> +   ordering guarantees (see next section).
> + 5) The rest of decrement-based RMW ops (refcount_dec_and_test(),
> +   refcount_sub_and_test(), refcount_dec_if_one(), refcount_dec_not_one())
> +   provide only RELEASE memory ordering and control dependency on success
> +   (see next section). The respective atomic counterparts
> +   (atomic_dec_and_test(), atomic_sub_and_test()) provide full memory 
> ordering.
> + 6) The lock-based RMW ops (refcount_dec_and_lock() &
> +   refcount_dec_and_mutex_lock()) alway provide RELEASE memory ordering
> +   and ACQUIRE memory ordering & control dependency on success
> +   (see next section). The respective atomic counterparts
> +   (atomic_dec_and_lock() & atomic_dec_and_mutex_lock())
> +   provide full memory ordering.
> +
> +
> +
> +Details and examples
> +
> +
> +Here we consider the cases 3)-6) that do present differences together
> +with respective examples.
> +
> +case 3) - decrement-based RMW ops that return no value
> +--
> +
> +Function changes:
> +atomic_dec() --> refcount_dec()
> +
> +Memory ordering guarantee changes:
> +fully unordered --> RELEASE ordering
> +
> +RELEASE ordering guarantees that prior loads and stores are
> +completed before the operation. Implemented using smp_store_release().
> +
> +Examples:
> +~
> +
> +For fully unordered operations stores to a, b and c can
> +happen in any sequence:
> +
> +P0(int *a, int *b, int *c)
> +  {
> +   WRITE_ONCE(*a, 1);
> +   WRITE_ONCE(*b, 1);
> +   WRITE_ONCE(*c, 1);
> +  }
> +
> +
> +For a RELEASE ordered operation, read and write from/to @a

read or write  (??)

> +is guaranteed to happen before store to @b. There are no

If you want to keep "read and write" above, please change "is" to "are".

Are "write" and "store" the same?  They seem to be used interchangeably.

> +guarantees on the order of store/read to/from @c:
> +
> +P0(int *a, int *b, int *c)
> +  {
> +  READ_ONCE(*a);
> +  WRITE_ONCE(*a, 1);
> +  smp_store_release(b, 1);
> +  WRITE_ONCE(*c, 1);
> +  READ_ONCE(*c);
> +  }

[PATCH] refcount_t: documentation for memory ordering differences

2017-11-06 Thread Elena Reshetova
Some functions from refcount_t API provide different
memory ordering guarantees that their atomic counterparts.
This adds a document outlining the differences and
showing examples.

Signed-off-by: Elena Reshetova 
---
 Documentation/refcount-vs-atomic.txt | 234 +++
 1 file changed, 234 insertions(+)
 create mode 100644 Documentation/refcount-vs-atomic.txt

diff --git a/Documentation/refcount-vs-atomic.txt 
b/Documentation/refcount-vs-atomic.txt
new file mode 100644
index 000..09efd2b
--- /dev/null
+++ b/Documentation/refcount-vs-atomic.txt
@@ -0,0 +1,234 @@
+==
+refcount_t API compare to atomic_t
+==
+
+The goal of refcount_t API is to provide a minimal API for implementing
+object's reference counters. While a generic architecture-independent
+implementation from lib/refcount.c uses atomic operations underneath,
+there is a number of differences between some of the refcount_*() and
+atomic_*() functions with regards to the memory ordering guarantees.
+This document outlines the differences and provides respective examples
+in order to help maintainers validate their code against the change in
+these memory ordering guarantees.
+
+memory-barriers.txt and atomic_t.txt provide more background to the
+memory ordering in general and for atomic operations specifically.
+
+Summary of the differences
+==
+
+ 1) There is no difference between respective non-RMW ops, i.e.
+   refcount_set() & refcount_read() have exactly the same ordering
+   guarantees (meaning fully unordered) as atomic_set() and atomic_read().
+ 2) For the increment-based ops that return no value (namely
+   refcount_inc() & refcount_add()) memory ordering guarantees are
+   exactly the same (meaning fully unordered) as respective atomic
+   functions (atomic_inc() & atomic_add()).
+ 3) For the decrement-based ops that return no value (namely
+   refcount_dec()) memory ordering guarantees are slightly
+   stronger than respective atomic counterpart (atomic_dec()).
+   While atomic_dec() is fully unordered, refcount_dec() does
+   provide a RELEASE memory ordering guarantee (see next section).
+ 4) For the rest of increment-based RMW ops (refcount_inc_not_zero(),
+   refcount_add_not_zero()) the memory ordering guarantees are relaxed
+   compare to their atomic counterparts (atomic_inc_not_zero()).
+   Refcount variants provide no memory ordering guarantees apart from
+   control dependency on success, while atomics provide a full memory
+   ordering guarantees (see next section).
+ 5) The rest of decrement-based RMW ops (refcount_dec_and_test(),
+   refcount_sub_and_test(), refcount_dec_if_one(), refcount_dec_not_one())
+   provide only RELEASE memory ordering and control dependency on success
+   (see next section). The respective atomic counterparts
+   (atomic_dec_and_test(), atomic_sub_and_test()) provide full memory ordering.
+ 6) The lock-based RMW ops (refcount_dec_and_lock() &
+   refcount_dec_and_mutex_lock()) alway provide RELEASE memory ordering
+   and ACQUIRE memory ordering & control dependency on success
+   (see next section). The respective atomic counterparts
+   (atomic_dec_and_lock() & atomic_dec_and_mutex_lock())
+   provide full memory ordering.
+
+
+
+Details and examples
+
+
+Here we consider the cases 3)-6) that do present differences together
+with respective examples.
+
+case 3) - decrement-based RMW ops that return no value
+--
+
+Function changes:
+atomic_dec() --> refcount_dec()
+
+Memory ordering guarantee changes:
+fully unordered --> RELEASE ordering
+
+RELEASE ordering guarantees that prior loads and stores are
+completed before the operation. Implemented using smp_store_release().
+
+Examples:
+~
+
+For fully unordered operations stores to a, b and c can
+happen in any sequence:
+
+P0(int *a, int *b, int *c)
+  {
+ WRITE_ONCE(*a, 1);
+ WRITE_ONCE(*b, 1);
+ WRITE_ONCE(*c, 1);
+  }
+
+
+For a RELEASE ordered operation, read and write from/to @a
+is guaranteed to happen before store to @b. There are no
+guarantees on the order of store/read to/from @c:
+
+P0(int *a, int *b, int *c)
+  {
+  READ_ONCE(*a);
+  WRITE_ONCE(*a, 1);
+  smp_store_release(b, 1);
+  WRITE_ONCE(*c, 1);
+  READ_ONCE(*c);
+  }
+
+
+case 4) - increment-based RMW ops that return a value
+-
+
+Function changes:
+atomic_inc_not_zero() --> refcount_inc_not_zero()
+no atomic counterpart --> refcount_add_not_zero()
+
+Memory ordering guarantees changes:
+fully ordered --> control dependency on success for stores
+
+Control dependency on success guarantees that if a reference for an
+object was successfully obtained (reference counter increment or
+addition happened, function returned true), then further stores are
+ordered against this operation.
