Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Thu, Nov 16, 2017 at 02:44:30PM +0100, Michal Hocko wrote:
> On Mon 13-11-17 09:09:57, Reshetova, Elena wrote:
> > > > Note that there's work done on better documents and updates to this one.
> > > > One document that might be good to read (I have not in fact had time to
> > > > read it myself yet :-():
> > > >
> > > > https://github.com/aparri/memory-model/blob/master/Documentation/explanation.txt
> >
> > I have just finished reading over this and must say that this is excellent.
>
> I fully second this. The main problem with
> Documentation/memory-barriers.txt is that it jumps from the "easy to
> follow" parts to the "blow your head" parts just too quickly. The above
> explanation, on the other hand, builds the picture from basics and piles
> new layers on top of previous ones, so I found it much easier to grasp.
> I cannot really speak for the correctness in all aspects, but it
> certainly makes a lot of sense to me. If you are really interested then
> feel free to add:
>
> Acked-by: Michal Hocko

Thank you, props to Alan for the writing, and I have applied your ack!

							Thanx, Paul
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Mon 13-11-17 09:09:57, Reshetova, Elena wrote:
> > > Note that there's work done on better documents and updates to this one.
> > > One document that might be good to read (I have not in fact had time to
> > > read it myself yet :-():
> > >
> > > https://github.com/aparri/memory-model/blob/master/Documentation/explanation.txt
>
> I have just finished reading over this and must say that this is excellent.

I fully second this. The main problem with
Documentation/memory-barriers.txt is that it jumps from the "easy to
follow" parts to the "blow your head" parts just too quickly. The above
explanation, on the other hand, builds the picture from basics and piles
new layers on top of previous ones, so I found it much easier to grasp.
I cannot really speak for the correctness in all aspects, but it
certainly makes a lot of sense to me. If you are really interested then
feel free to add:

Acked-by: Michal Hocko
-- 
Michal Hocko
SUSE Labs
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Thu, Nov 16, 2017 at 09:58:04AM +0100, Peter Zijlstra wrote:
> On Wed, Nov 15, 2017 at 10:01:11PM +0100, Andrea Parri wrote:
> > > And in specific things like:
> > >
> > >	135e8c9250dd5
> > >	ecf7d01c229d1
> > >
> > > which use the release of rq->lock paired with the next acquire of the
> > > same rq->lock to match with an smp_rmb().
> >
> > Those cycles are currently forbidden by LKMM _when_ you consider the
> > smp_mb__after_spinlock() from schedule(). See rfi-rel-acq-is-not-mb
> > from my previous email and Alan's remarks about cumul-fence.
>
> I'm not sure I get your point; and you all seem to forget I do not in
> fact speak the ordering lingo. So I have no idea what
> rfi-blah-blah or cumul-fence mean.

Let me expand on my comment. Consider the following test:

C T1

{}

P0(int *x, int *y, spinlock_t *s)
{
	spin_lock(s);
	WRITE_ONCE(*x, 1);
	spin_unlock(s);
	spin_lock(s);
	WRITE_ONCE(*y, 1);
	spin_unlock(s);
}

P1(int *x, int *y)
{
	int r0;
	int r1;

	r0 = READ_ONCE(*y);
	smp_rmb();
	r1 = READ_ONCE(*x);
}

exists (1:r0=1 /\ 1:r1=0)

According to LKMM, the store to x happens before the store to y, but
there is no guarantee that the former store propagates (to P1) before
the latter, which is what we would need in order to forbid that state.
As a result, the state in the "exists" clause is _allowed_ by LKMM.
The LKMM encodes happens-before (or execution) ordering with a relation
named "hb", while it encodes "propagation ordering" with "cumul-fence".

  Andrea

> I know rel-acq isn't smp_mb() and I don't think any of the above patches
> need it to be. They just need it to be a local ordering, no?
>
> Even without smp_mb__after_spinlock() we get that:
>
>	spin_lock()
>	x = 1
>	spin_unlock()
>	spin_lock()
>	y = 1
>	spin_unlock()
>
> guarantees that x happens-before y, right?
>
> And that should be sufficient to then order something else against, like
> for example:
>
>	r2 = y
>	smp_rmb()
>	r1 = x
>
> no?
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Wed, Nov 15, 2017 at 10:01:11PM +0100, Andrea Parri wrote:
> > And in specific things like:
> >
> >	135e8c9250dd5
> >	ecf7d01c229d1
> >
> > which use the release of rq->lock paired with the next acquire of the
> > same rq->lock to match with an smp_rmb().
>
> Those cycles are currently forbidden by LKMM _when_ you consider the
> smp_mb__after_spinlock() from schedule(). See rfi-rel-acq-is-not-mb
> from my previous email and Alan's remarks about cumul-fence.

I'm not sure I get your point; and you all seem to forget I do not in
fact speak the ordering lingo. So I have no idea what
rfi-blah-blah or cumul-fence mean.

I know rel-acq isn't smp_mb() and I don't think any of the above patches
need it to be. They just need it to be a local ordering, no?

Even without smp_mb__after_spinlock() we get that:

	spin_lock()
	x = 1
	spin_unlock()
	spin_lock()
	y = 1
	spin_unlock()

guarantees that x happens-before y, right?

And that should be sufficient to then order something else against, like
for example:

	r2 = y
	smp_rmb()
	r1 = x

no?
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Wed, Nov 15, 2017 at 03:22:33PM -0500, Alan Stern wrote:
> You know, sometimes I feel like I'm losing my mind.

Hehe, I'm very familiar with that feeling ;-)
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Wed, Nov 15, 2017 at 09:03:07PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 15, 2017 at 02:15:19PM -0500, Alan Stern wrote:
> > On Wed, 15 Nov 2017, Will Deacon wrote:
> > > On Thu, Nov 02, 2017 at 04:21:56PM -0400, Alan Stern wrote:
> > > > I was trying to think of something completely different. If you have a
> > > > release/acquire to the same address, it creates a happens-before
> > > > ordering:
> > > >
> > > >	Access x
> > > >	Release a
> > > >	Acquire a
> > > >	Access y
> > > >
> > > > Here the access to x happens-before the access to y. This is true
> > > > even on x86, even in the presence of forwarding -- the CPU still has to
> > > > execute the instructions in order. But if the release and acquire are
> > > > to different addresses:
> > > >
> > > >	Access x
> > > >	Release a
> > > >	Acquire b
> > > >	Access y
> > > >
> > > > then there is no happens-before ordering for x and y -- the CPU can
> > > > execute the last two instructions before the first two. x86 and
> > > > PowerPC won't do this, but I believe ARMv8 can. (Please correct me if
> > > > it can't.)
> > >
> > > Release/Acquire are RCsc on ARMv8, so they are ordered irrespective of
> > > address.
> >
> > Ah, okay, thanks.
> >
> > In any case, we have considered removing this ordering constraint
> > (store-release followed by load-acquire for the same location) from the
> > Linux-kernel memory model.
>
> Why? It's a perfectly sensible construct.
>
> > I'm not aware of any code in the kernel that depends on it. Do any of
> > you happen to know of any examples?
>
> All locks? Something like:
>
>	spin_lock()
>	/* foo */
>	spin_unlock()
>	spin_lock()
>	/* bar */
>	spin_unlock()
>
> has a fairly high foo happens-before bar expectation level.
>
> And in specific things like:
>
>	135e8c9250dd5
>	ecf7d01c229d1
>
> which use the release of rq->lock paired with the next acquire of the
> same rq->lock to match with an smp_rmb().

Those cycles are currently forbidden by LKMM _when_ you consider the
smp_mb__after_spinlock() from schedule(). See rfi-rel-acq-is-not-mb
from my previous email and Alan's remarks about cumul-fence.

  Andrea
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Wed, 15 Nov 2017, Peter Zijlstra wrote:
> On Wed, Nov 15, 2017 at 02:15:19PM -0500, Alan Stern wrote:
> > On Wed, 15 Nov 2017, Will Deacon wrote:
> > > On Thu, Nov 02, 2017 at 04:21:56PM -0400, Alan Stern wrote:
> > > > I was trying to think of something completely different. If you have a
> > > > release/acquire to the same address, it creates a happens-before
> > > > ordering:
> > > >
> > > >	Access x
> > > >	Release a
> > > >	Acquire a
> > > >	Access y
> > > >
> > > > Here the access to x happens-before the access to y. This is true
> > > > even on x86, even in the presence of forwarding -- the CPU still has to
> > > > execute the instructions in order. But if the release and acquire are
> > > > to different addresses:
> > > >
> > > >	Access x
> > > >	Release a
> > > >	Acquire b
> > > >	Access y
> > > >
> > > > then there is no happens-before ordering for x and y -- the CPU can
> > > > execute the last two instructions before the first two. x86 and
> > > > PowerPC won't do this, but I believe ARMv8 can. (Please correct me if
> > > > it can't.)
> > >
> > > Release/Acquire are RCsc on ARMv8, so they are ordered irrespective of
> > > address.
> >
> > Ah, okay, thanks.
> >
> > In any case, we have considered removing this ordering constraint
> > (store-release followed by load-acquire for the same location) from the
> > Linux-kernel memory model.
>
> Why? It's a perfectly sensible construct.
>
> > I'm not aware of any code in the kernel that depends on it. Do any of
> > you happen to know of any examples?
>
> All locks? Something like:
>
>	spin_lock()
>	/* foo */
>	spin_unlock()
>	spin_lock()
>	/* bar */
>	spin_unlock()
>
> has a fairly high foo happens-before bar expectation level.
>
> And in specific things like:
>
>	135e8c9250dd5
>	ecf7d01c229d1
>
> which use the release of rq->lock paired with the next acquire of the
> same rq->lock to match with an smp_rmb().

You know, sometimes I feel like I'm losing my mind.

Yes, of course -- this was in fact the original reason for adding that
constraint to the memory model in the first place! An unlock-to-lock
link between two CPUs naturally creates an ordering relation, and we
wanted the same to be true when everything occurred on a single CPU.

I'll shut up now...

Alan
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Wed, Nov 15, 2017 at 02:15:19PM -0500, Alan Stern wrote:
> On Wed, 15 Nov 2017, Will Deacon wrote:
> > On Thu, Nov 02, 2017 at 04:21:56PM -0400, Alan Stern wrote:
> > > I was trying to think of something completely different. If you have a
> > > release/acquire to the same address, it creates a happens-before
> > > ordering:
> > >
> > >	Access x
> > >	Release a
> > >	Acquire a
> > >	Access y
> > >
> > > Here the access to x happens-before the access to y. This is true
> > > even on x86, even in the presence of forwarding -- the CPU still has to
> > > execute the instructions in order. But if the release and acquire are
> > > to different addresses:
> > >
> > >	Access x
> > >	Release a
> > >	Acquire b
> > >	Access y
> > >
> > > then there is no happens-before ordering for x and y -- the CPU can
> > > execute the last two instructions before the first two. x86 and
> > > PowerPC won't do this, but I believe ARMv8 can. (Please correct me if
> > > it can't.)
> >
> > Release/Acquire are RCsc on ARMv8, so they are ordered irrespective of
> > address.
>
> Ah, okay, thanks.
>
> In any case, we have considered removing this ordering constraint
> (store-release followed by load-acquire for the same location) from the
> Linux-kernel memory model.

Why? It's a perfectly sensible construct.

> I'm not aware of any code in the kernel that depends on it. Do any of
> you happen to know of any examples?

All locks? Something like:

	spin_lock()
	/* foo */
	spin_unlock()
	spin_lock()
	/* bar */
	spin_unlock()

has a fairly high foo happens-before bar expectation level.

And in specific things like:

	135e8c9250dd5
	ecf7d01c229d1

which use the release of rq->lock paired with the next acquire of the
same rq->lock to match with an smp_rmb().
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Wed, 15 Nov 2017, Will Deacon wrote:
> On Thu, Nov 02, 2017 at 04:21:56PM -0400, Alan Stern wrote:
> > I was trying to think of something completely different. If you have a
> > release/acquire to the same address, it creates a happens-before
> > ordering:
> >
> >	Access x
> >	Release a
> >	Acquire a
> >	Access y
> >
> > Here the access to x happens-before the access to y. This is true
> > even on x86, even in the presence of forwarding -- the CPU still has to
> > execute the instructions in order. But if the release and acquire are
> > to different addresses:
> >
> >	Access x
> >	Release a
> >	Acquire b
> >	Access y
> >
> > then there is no happens-before ordering for x and y -- the CPU can
> > execute the last two instructions before the first two. x86 and
> > PowerPC won't do this, but I believe ARMv8 can. (Please correct me if
> > it can't.)
>
> Release/Acquire are RCsc on ARMv8, so they are ordered irrespective of
> address.

Ah, okay, thanks.

In any case, we have considered removing this ordering constraint
(store-release followed by load-acquire for the same location) from the
Linux-kernel memory model.

I'm not aware of any code in the kernel that depends on it. Do any of
you happen to know of any examples?

Alan
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Thu, Nov 02, 2017 at 04:21:56PM -0400, Alan Stern wrote:
> I was trying to think of something completely different. If you have a
> release/acquire to the same address, it creates a happens-before
> ordering:
>
>	Access x
>	Release a
>	Acquire a
>	Access y
>
> Here the access to x happens-before the access to y. This is true
> even on x86, even in the presence of forwarding -- the CPU still has to
> execute the instructions in order. But if the release and acquire are
> to different addresses:
>
>	Access x
>	Release a
>	Acquire b
>	Access y
>
> then there is no happens-before ordering for x and y -- the CPU can
> execute the last two instructions before the first two. x86 and
> PowerPC won't do this, but I believe ARMv8 can. (Please correct me if
> it can't.)

Release/Acquire are RCsc on ARMv8, so they are ordered irrespective of
address.

Will
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Tue, Nov 14, 2017 at 11:23:53AM +, Reshetova, Elena wrote:
> > On Mon, Nov 13, 2017 at 04:01:11PM +, Reshetova, Elena wrote:
> > >
> > > > On Mon, Nov 13, 2017 at 09:09:57AM +, Reshetova, Elena wrote:
> > > > >
> > > > > > > Note that there's work done on better documents and updates to this one.
> > > > > > > One document that might be good to read (I have not in fact had time to
> > > > > > > read it myself yet :-():
> > > > > > >
> > > > > > > https://github.com/aparri/memory-model/blob/master/Documentation/explanation.txt
> > > > > >
> > > > > > I have just finished reading over this and must say that this is excellent.
> > > > > > If I had started reading on this topic from this doc and then moved
> > > > > > to other in-tree docs, including memory-barriers.txt, I would have had
> > > > > > many fewer issues/questions and it would have been much less of a
> > > > > > bumpy read.
> > > > >
> > > > > Glad you like it! May we have your Acked-by?
> > > >
> > > > I think my Acked-by has little value in this case since I am really a
> > > > beginner in it myself, but I would strongly suggest that Peter and others
> > > > consider inclusion of this doc into the tree since I do see a value in it.
> > > > Again, I am not really an important person here :)
> > >
> > > I respectfully disagree.
> > >
> > > The fact that this file was helpful to a self-described beginner
> > > such as yourself is -way- more important than it being helpful to us
> > > battle-scarred veterans.
> > >
> > > Besides, Peter already agreed, perhaps in a moment of weakness, to
> > > allow his name to be added to this patch.
> > >
> > > In any case, it is your choice, but I personally would value your
> > > Acked-by.
> >
> > Sure, please feel free to add it if it is of any help/value!

Very good, I have added it, thank you!

							Thanx, Paul

> > Best Regards,
> > Elena.
> >
> > > > > > Is there any plan to include it into the official kernel doc tree? I really
> > > > > > think it would be very helpful for others also, even just to explain the
> > > > > > notations, properties, and language people use to talk about these issues,
> > > > > > and to give examples.
> > > > >
> > > > > Yes, we do plan to submit it for inclusion.
> > > >
> > > > Great, I think it would help people!
> > >
> > > Here is hoping! ;-)
> > >
> > > 							Thanx, Paul
> > >
> > > > Best Regards,
> > > > Elena.
> > > >
> > > > > > I will try to improve a bit the new doc I have previously sent a
> > > > > > patch for, in the spirit of this reading.
> > > > > >
> > > > > > Best Regards,
> > > > > > Elena.
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Tue, Nov 14, 2017 at 11:23:53AM +, Reshetova, Elena wrote: > > On Mon, Nov 13, 2017 at 04:01:11PM +, Reshetova, Elena wrote: > > > > > > > On Mon, Nov 13, 2017 at 09:09:57AM +, Reshetova, Elena wrote: > > > > > > > > > > > > > > > > Note that there's work done on better documents and updates to this > > > > > > one. > > > > > > One document that might be good to read (I have not in fact had > > > > > > time to > > > > > > read it myself yet :-(): > > > > > > > > > > > > https://github.com/aparri/memory- > > > > > > model/blob/master/Documentation/explanation.txt > > > > > > > > > > I have just finished reading over this and must say that this is > > > > > excellent. > > > > > If I would have started reading on this topic from this doc and then > > > > > move > > > > > to other in-tree docs, including memory-barriers.txt, I would > > > > > have had much less issues/questions and it would be much less of a > > > > > bumpy > > > > > read. > > > > > > > > Glad you like it! May we have your Acked-by? > > > > > > I think my Acked-by has little value in this case since I am really a > > > beginner in it > > > myself, but I would strongly suggest to Peter and others to consider > > > inclusion > > > of this doc into the tree since I do see a value in it. Again, I am not > > > really an > > > important person here :) > > > > I respectfully disagree. > > > > The fact that this file was helpful to a self-described beginner > > such as yourself is -way- more important than it being helpful to us > > battlescarred veterans. > > > > Besides, Peter already agreed, perhaps in a moment of weakness, to > > allow his name to be added to this patch. > > > > In any case, it is your choice, but I personally would value your > > Acked-by. > > Sure, please feel free to add it if it is of any help/value! Very good, I have added it, thank you! Thanx, Paul > Best Regards, > Elena. > > > > > > > > Is there any plan to include it into official kernel doc tree? 
> > > > > I really think it > > > > > would be very helpful for others also even basically to explain the > > > > > notations, > > > > properties > > > > > and language people talk about these issues and give examples. > > > > > > > > Yes, we do plan to submit it for inclusion. > > > > > > Great, I think it would help people! > > > > Here is hoping! ;-) > > > > > > Thanx, Paul > > > > > Best Regards, > > > Elena. > > > > > > > > > > > > > > > Thanx, Paul > > > > > > > > > I will try to improve a bit the new doc I have previously sent a > > > > > patch for in the > > > > > spirit of this reading. > > > > > > > > > > > > > > > Best Regards, > > > > > Elena. > > > > >
RE: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
> On Mon, Nov 13, 2017 at 04:01:11PM +, Reshetova, Elena wrote: > > > > > On Mon, Nov 13, 2017 at 09:09:57AM +, Reshetova, Elena wrote: > > > > > > > > > > > > > Note that there's work done on better documents and updates to this > > > > > one. > > > > > One document that might be good to read (I have not in fact had time > > > > > to > > > > > read it myself yet :-(): > > > > > > > > > > https://github.com/aparri/memory- > > > > > model/blob/master/Documentation/explanation.txt > > > > > > > > I have just finished reading over this and must say that this is > > > > excellent. > > > > If I would have started reading on this topic from this doc and then > > > > move > > > > to other in-tree docs, including memory-barriers.txt, I would > > > > have had much less issues/questions and it would be much less of a bumpy > > > > read. > > > > > > Glad you like it! May we have your Acked-by? > > > > I think my Acked-by has little value in this case since I am really a > > beginner in it > > myself, but I would strongly suggest to Peter and others to consider > > inclusion > > of this doc into the tree since I do see a value in it. Again, I am not > > really an > > important person here :) > > I respectfully disagree. > > The fact that this file was helpful to a self-described beginner > such as yourself is -way- more important than it being helpful to us > battlescarred veterans. > > Besides, Peter already agreed, perhaps in a moment of weakness, to > allow his name to be added to this patch. > > In any case, it is your choice, but I personally would value your > Acked-by. Sure, please feel free to add it if it is of any help/value! Best Regards, Elena. > > > > > Is there any plan to include it into official kernel doc tree? I really > > > > think it > > > > would be very helpful for others also even basically to explain the > > > > notations, > > > properties > > > > and language people talk about these issues and give examples. 
> > > > > > Yes, we do plan to submit it for inclusion. > > > > Great, I think it would help people! > > Here is hoping! ;-) > > > Thanx, Paul > > > Best Regards, > > Elena. > > > > > > > > > > > Thanx, Paul > > > > > > > I will try to improve a bit the new doc I have previously sent a patch > > > > for in the > > > > spirit of this reading. > > > > > > > > > > > > Best Regards, > > > > Elena. > > > > > >
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Mon, Nov 13, 2017 at 04:01:11PM +, Reshetova, Elena wrote: > > > On Mon, Nov 13, 2017 at 09:09:57AM +, Reshetova, Elena wrote: > > > > > > > > > > Note that there's work done on better documents and updates to this one. > > > > One document that might be good to read (I have not in fact had time to > > > > read it myself yet :-(): > > > > > > > > https://github.com/aparri/memory- > > > > model/blob/master/Documentation/explanation.txt > > > > > > I have just finished reading over this and must say that this is > > > excellent. > > > If I would have started reading on this topic from this doc and then move > > > to other in-tree docs, including memory-barriers.txt, I would > > > have had much less issues/questions and it would be much less of a bumpy > > > read. > > > > Glad you like it! May we have your Acked-by? > > I think my Acked-by has little value in this case since I am really a > beginner in it > myself, but I would strongly suggest to Peter and others to consider > inclusion > of this doc into the tree since I do see a value in it. Again, I am not > really an > important person here :) I respectfully disagree. The fact that this file was helpful to a self-described beginner such as yourself is -way- more important than it being helpful to us battlescarred veterans. Besides, Peter already agreed, perhaps in a moment of weakness, to allow his name to be added to this patch. In any case, it is your choice, but I personally would value your Acked-by. > > > Is there any plan to include it into official kernel doc tree? I really > > > think it > > > would be very helpful for others also even basically to explain the > > > notations, > > properties > > > and language people talk about these issues and give examples. > > > > Yes, we do plan to submit it for inclusion. > > Great, I think it would help people! Here is hoping! ;-) Thanx, Paul > Best Regards, > Elena. 
> > > > > > > Thanx, Paul > > > > > I will try to improve a bit the new doc I have previously sent a patch > > > for in the > > > spirit of this reading. > > > > > > > > > Best Regards, > > > Elena. > > > >
RE: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
> On Mon, Nov 13, 2017 at 09:09:57AM +, Reshetova, Elena wrote: > > > > > > > Note that there's work done on better documents and updates to this one. > > > One document that might be good to read (I have not in fact had time to > > > read it myself yet :-(): > > > > > > https://github.com/aparri/memory- > > > model/blob/master/Documentation/explanation.txt > > > > I have just finished reading over this and must say that this is excellent. > > If I would have started reading on this topic from this doc and then move > > to other in-tree docs, including memory-barriers.txt, I would > > have had much less issues/questions and it would be much less of a bumpy > > read. > > Glad you like it! May we have your Acked-by? I think my Acked-by has little value in this case since I am really a beginner in it myself, but I would strongly suggest to Peter and others to consider inclusion of this doc into the tree since I do see a value in it. Again, I am not really an important person here :) > > > Is there any plan to include it into official kernel doc tree? I really > > think it > > would be very helpful for others also even basically to explain the > > notations, > properties > > and language people talk about these issues and give examples. > > Yes, we do plan to submit it for inclusion. Great, I think it would help people! Best Regards, Elena. > > > Thanx, Paul > > > I will try to improve a bit the new doc I have previously sent a patch for > > in the > > spirit of this reading. > > > > > > Best Regards, > > Elena. > >
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Mon, Nov 13, 2017 at 09:09:57AM +, Reshetova, Elena wrote: > > > > Note that there's work done on better documents and updates to this one. > > One document that might be good to read (I have not in fact had time to > > read it myself yet :-(): > > > > https://github.com/aparri/memory- > > model/blob/master/Documentation/explanation.txt > > I have just finished reading over this and must say that this is excellent. > If I would have started reading on this topic from this doc and then move > to other in-tree docs, including memory-barriers.txt, I would > have had much less issues/questions and it would be much less of a bumpy > read. Glad you like it! May we have your Acked-by? > Is there any plan to include it into official kernel doc tree? I really think > it > would be very helpful for others also even basically to explain the > notations, properties > and language people talk about these issues and give examples. Yes, we do plan to submit it for inclusion. Thanx, Paul > I will try to improve a bit the new doc I have previously sent a patch for in > the > spirit of this reading. > > > Best Regards, > Elena. >
RE: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
> Note that there's work done on better documents and updates to this one. > One document that might be good to read (I have not in fact had time to > read it myself yet :-(): > > https://github.com/aparri/memory- > model/blob/master/Documentation/explanation.txt > I have just finished reading over this and must say that this is excellent. If I would have started reading on this topic from this doc and then move to other in-tree docs, including memory-barriers.txt, I would have had much less issues/questions and it would be much less of a bumpy read. Is there any plan to include it into official kernel doc tree? I really think it would be very helpful for others also even basically to explain the notations, properties and language people talk about these issues and give examples. I will try to improve a bit the new doc I have previously sent a patch for in the spirit of this reading. Best Regards, Elena.
RE: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
> On Thu, Nov 02, 2017 at 11:04:53AM +, Reshetova, Elena wrote: > > > Well refcount_dec_and_test() is not the only function that has different > > memory ordering specifics. So, the full answer then for any arbitrary case > > according to your points above would be smth like: > > > > for each substituted function (atomic_* --> refcount_*) that actually > > has changes in memory ordering *** perform the following: > > - mention the difference > > - mention the actual code place where the change would affect ( > >various put and etc. functions) > > - potentially try to understand if it would make a difference for > > this code (again here is where I am not sure how to do it properly) > > > > > > *** the actual list of these functions to me looks like: > > > 1) atomic_inc_not_zero -> refcount_inc_not_zero. Change is from > > underlying atomic_cmpxchg() to atomic_try_cmpxchg_relaxed () First > > one implies SMP-conditional general memory barrier (smp_mb()) on each > > side of the actual operation (at least according to documentation, In > > reality it goes through so many changes, ifdefs and conditionals that > > one gets lost Easily in the process). > > That matches _inc(), which doesn't imply anything. So you are saying that atomic_inc_not_zero has the same guarantees (meaning no guarantees) as atomic_inc? If yes, then I am now confused here because atomic_inc_not_zero is based on atomic_cmpxchg(), which according to this https://elixir.free-electrons.com/linux/latest/source/Documentation/memory-barriers.txt#L2527 does imply the smp_mb() > > The reasoning being that you must already have obtained a pointer to the > object; and that will necessarily include enough ordering to observe the > object. Therefore increasing the refcount doesn't require further > constraints. 
> > > Second one according to the comment implies no memory barriers > > whatsoever, BUT "provides a control dependency which will order future > > stores against the inc" So, every store (not load) operation (I guess > > we are talking here only about store operations that touch the same > > object, but I wonder how it is defined in terms of memory location? > > Memory location is irrelevant. I was just trying to understand which "stores" the control dependency barrier applies to here. You mean it applies to absolutely all stores on all objects? I guess we are talking about the same object here, just trying to understand how object is defined in terms of memory location. > > > (overlapping?) that comes inside "if refcount_inc_not_zero(){}" cause > > would only be executed if functions returns true. > > The point is that a CPU must NOT speculate on stores. So while it can > speculate a branch, any store inside the branch must not become visible > until it can commit to that branch. > > The whole point being that possible modifications to the object to which > we've _conditionally_ acquired a reference, will only happen after we're > sure to indeed have acquired this reference. > > Otherwise its similar to _inc(). Yes, now I understand this part. > > > So, practically what we might "loose" here is any updates on the > > object protected by this refcounter done by another CPU. But for this > > purpose we expect the developer to take some other lock/memory barrier > > into use, right? > > Correct, object modification had better be serialized. Refcounts cannot > (even with atomic_t) help with that. > > > We only care of incrementing the refcount atomically and make sure we > > don't do anything with object unless it is ready for us to be used. > > Just so.. > > > If yes, then I guess it might be a big change for the code that > > previously relied on atomic-provided smp_mb() barriers and now instead > > needs to take an explicit locks/barriers by itself. 
> > Right, however such memory ordering should be explicitly documented. > Unknown and hidden memory ordering is a straight up bug, because > modifications to the code (be they introducing refcount_t or anything > else) can easily break things. Yes, this is what has been mentioned before many times, but again reality might be different, so better be prepared here also. > > > 2) atomic_** -> refcount_add_not_zero. Fortunately these are super > > rare and need to see per each case dependent on actual atomic function > > substituted. > > See inc_not_zero. > > > 3) atomic_add() --> refcount_add(). This should not make any change > > since both do not provide memory ordering at all, but there is a > > comment in the refcount.c that says that refcount_add " provide a > > control dependency and thereby orders future stores". How is it done > > given that refcount_add is void returning function?? I am fully > > confused with this one. > > Weird, mostly comes from being implemented using add_not_zero I suppose. Yes, underlying is add_not_zero, but is it still correct to talk about any control dependencies here? How would it possibly look in
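[Editor's note: the inc-not-zero pattern discussed above can be sketched in userspace C11 atomics. This is a hypothetical analogue, not the kernel's implementation; the struct and function names are invented for illustration. The compare-and-swap itself is relaxed, mirroring atomic_try_cmpxchg_relaxed(); any ordering of the caller's later stores comes only from the control dependency on the boolean result.]

```c
/*
 * Hypothetical userspace analogue of refcount_inc_not_zero(),
 * built on C11 atomics (NOT the kernel implementation).
 * The CAS uses relaxed ordering; later stores in the caller are
 * ordered only by the control dependency on the return value.
 */
#include <stdatomic.h>
#include <stdbool.h>

struct obj {
	atomic_int refcount;
};

/* Take a reference only if the object is still live (refcount != 0). */
static bool obj_inc_not_zero(struct obj *o)
{
	int old = atomic_load_explicit(&o->refcount, memory_order_relaxed);

	do {
		if (old == 0)
			return false;	/* object already being freed */
	} while (!atomic_compare_exchange_weak_explicit(
			&o->refcount, &old, old + 1,
			memory_order_relaxed, memory_order_relaxed));

	return true;
}
```

A caller would write `if (obj_inc_not_zero(o)) { /* stores to *o */ }`; since CPUs must not speculate stores, the stores inside the branch cannot become visible before the increment is known to have succeeded, which is exactly the "control dependency orders future stores" guarantee discussed above.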
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Thu, 2 Nov 2017, Andrea Parri wrote: > > This is forbidden. It would remain forbidden even if the smp_mb in P1 > > were replaced by a similar release/acquire pair for the same memory > > location. > > Hopefully, the LKMM does not agree with this assessment... ;-) No, it doesn't. > Here's a two-threads example showing that "(w)mb is _not_ rfi-rel-acq": > > C rfi-rel-acq-is-not-mb > > {} > > P0(int *x, int *y, int *a) > { > WRITE_ONCE(*x, 1); > smp_store_release(a, 1); > r1 = smp_load_acquire(a); > WRITE_ONCE(*y, 1); > } > > P1(int *x, int *y) > { > int r0; > int r1; > > r0 = READ_ONCE(*y); > smp_rmb(); > r1 = READ_ONCE(*x); > } > > exists (1:r0=1 /\ 1:r1=0) Right. There is a happens-before edge between the two WRITE_ONCE calls in P0 but no cumul-fence edge, and therefore the test is allowed. Here is an example where a happens-before edge suffices to provide ordering: P0(int *x, int *y) { WRITE_ONCE(*x, 1); smp_wmb(); WRITE_ONCE(*y, 1); } P1(int *x, int *y, int *a) { int rx, ry, ra; ry = READ_ONCE(*y); smp_store_release(a, 1); ra = smp_load_acquire(a); rx = READ_ONCE(*x); } exists (1:rx=0 /\ 1:ry=1) This test is forbidden, but it would be allowed if the release and acquire accessed different locations. Alan Stern
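[Editor's note: Alan's second example, given inline above, can be written out as a self-contained litmus test in the same herd7 format as the rfi-rel-acq-is-not-mb test he quotes; the test name below is invented, everything else follows his code verbatim.]

```
C rel-acq-same-loc-gives-hb

{}

P0(int *x, int *y)
{
	WRITE_ONCE(*x, 1);
	smp_wmb();
	WRITE_ONCE(*y, 1);
}

P1(int *x, int *y, int *a)
{
	int rx;
	int ry;
	int ra;

	ry = READ_ONCE(*y);
	smp_store_release(a, 1);
	ra = smp_load_acquire(a);
	rx = READ_ONCE(*x);
}

exists (1:rx=0 /\ 1:ry=1)
```

Per the message above, this outcome is forbidden because the release and acquire access the same location `a`; changing the acquire to read a different location would make it allowed.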
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Thu, 2 Nov 2017, Will Deacon wrote: > > Right. To address your point: release + acquire isn't the same as a > > full barrier either. The SB pattern illustrates the difference: > > > > P0 P1 > > Write x=1 Write y=1 > > Release a smp_mb > > Acquire b Read x=0 > > Read y=0 > > > > This would not be allowed if the release + acquire sequence was > > replaced by smp_mb. But as it stands, this is allowed because nothing > > prevents the CPU from interchanging the order of the release and the > > acquire -- and then you're back to the acquire + release case. > > > > However, there is one circumstance where this interchange isn't > > allowed: when the release and acquire access the same memory > > location. Thus: > > > > P0(int *x, int *y, int *a) > > { > > int r0; > > > > WRITE_ONCE(*x, 1); > > smp_store_release(a, 1); > > smp_load_acquire(a); > > r0 = READ_ONCE(*y); > > } > > > > P1(int *x, int *y) > > { > > int r1; > > > > WRITE_ONCE(*y, 1); > > smp_mb(); > > r1 = READ_ONCE(*x); > > } > > > > exists (0:r0=0 /\ 1:r1=0) > > > > This is forbidden. It would remain forbidden even if the smp_mb in P1 > > were replaced by a similar release/acquire pair for the same memory > > location. I have to apologize; this was totally wrong. This test is not forbidden under the LKMM, and it certainly isn't forbidden if the smp_mb is replaced by a release/acquire pair. I was trying to think of something completely different. If you have a release/acquire to the same address, it creates a happens-before ordering: Access x Release a Acquire a Access y Here the access to x happens-before the access to y. This is true even on x86, even in the presence of forwarding -- the CPU still has to execute the instructions in order. But if the release and acquire are to different addresses: Access x Release a Acquire b Access y then there is no happens-before ordering for x and y -- the CPU can execute the last two instructions before the first two. 
x86 and PowerPC won't do this, but I believe ARMv8 can. (Please correct me if it can't.) But happens-before is much weaker than a strong fence. So in short, release + acquire, even to the same address, is no replacement for smp_mb(). > Isn't this allowed on x86 mapping smp_mb() to mfence, store-release to plain > store and load-acquire to plain load? All we're saying is that you can forward > from a release to an acquire, which is fine for RCpc semantics. > > e.g. > > X86 SB+mfence+po-rfi-po > "MFencedWR Fre PodWW Rfi PodRR Fre" > Generator=diyone7 (version 7.46+3) > Prefetch=0:x=F,0:y=T,1:y=F,1:x=T > Com=Fr Fr > Orig=MFencedWR Fre PodWW Rfi PodRR Fre > { > } > P0 | P1 ; > MOV [x],$1 | MOV [y],$1 ; > MFENCE | MOV [z],$1 ; > MOV EAX,[y] | MOV EAX,[z] ; > | MOV EBX,[x] ; > exists > (0:EAX=0 /\ 1:EAX=1 /\ 1:EBX=0) > > which herd says is allowed: > > Test SB+mfence+po-rfi-po Allowed > States 4 > 0:EAX=0; 1:EAX=1; 1:EBX=0; > 0:EAX=0; 1:EAX=1; 1:EBX=1; > 0:EAX=1; 1:EAX=1; 1:EBX=0; > 0:EAX=1; 1:EAX=1; 1:EBX=1; > Ok > Witnesses > Positive: 1 Negative: 3 > Condition exists (0:EAX=0 /\ 1:EAX=1 /\ 1:EBX=0) > Observation SB+mfence+po-rfi-po Sometimes 1 3 > Time SB+mfence+po-rfi-po 0.00 > Hash=0f983e2d7579e5c04c332f9ac620c31f > > and I can reproduce using litmus to actually run it on my x86 box: > > %% > % Results for SB+mfence+po-rfi-po.litmus % > %% > X86 SB+mfence+po-rfi-po > "MFencedWR Fre PodWW Rfi PodRR Fre" > > {} > > P0 | P1 ; > MOV [x],$1 | MOV [y],$1 ; > MFENCE | MOV [z],$1 ; > MOV EAX,[y] | MOV EAX,[z] ; > | MOV EBX,[x] ; > > exists (0:EAX=0 /\ 1:EAX=1 /\ 1:EBX=0) > Generated assembler > #START _litmus_P1 > movl $1,(%r8,%rcx) > movl $1,(%r9,%rcx) > movl (%r9,%rcx),%eax > movl (%rdi,%rcx),%edx > #START _litmus_P0 > movl $1,(%rdx,%rcx) > mfence > movl (%rdi,%rcx),%eax > > Test SB+mfence+po-rfi-po Allowed > Histogram (4 states) > 8 *>0:EAX=0; 1:EAX=1; 1:EBX=0; > 1999851:>0:EAX=1; 1:EAX=1; 1:EBX=0; > 1999549:>0:EAX=0; 1:EAX=1; 1:EBX=1; > 592 :>0:EAX=1; 1:EAX=1; 1:EBX=1; 
> Ok > > Witnesses > Positive: 8, Negative: 392 > Condition exists (0:EAX=0 /\ 1:EAX=1 /\ 1:EBX=0) is validated > Hash=0f983e2d7579e5c04c332f9ac620c31f > Generator=diyone7 (version 7.46+3) > Com=Fr Fr > Orig=MFencedWR Fre PodWW Rfi PodRR Fre > Observation SB+mfence+po-rfi-po Sometimes 8 392 > Time SB+mfence+po-rfi-po 0.17 Yes, you are quite correct. Thanks for pointing out my mistake. Alan Stern
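As a cross-check on the herd/litmus results quoted above, the "Sometimes" outcome can also be reproduced in a toy model of x86-TSO. The sketch below is a hypothetical, simplified simulator (not herd or litmus): each CPU gets a FIFO store buffer with same-CPU forwarding, and MFENCE retires only once the local buffer has drained. It exhaustively explores all schedules of the SB+mfence+po-rfi-po test and confirms that 0:EAX=0 /\ 1:EAX=1 /\ 1:EBX=0 is reachable.

```python
# Programs for SB+mfence+po-rfi-po: P0 = Wx; MFENCE; REAX=y
#                                   P1 = Wy; Wz; REAX=z; REBX=x
P0 = (("st", "x", 1), ("mf",), ("ld", "y", "0:EAX"))
P1 = (("st", "y", 1), ("st", "z", 1), ("ld", "z", "1:EAX"), ("ld", "x", "1:EBX"))
PROGS = (P0, P1)

def frz(d):                      # hashable snapshot of a dict
    return tuple(sorted(d.items()))

def successors(state):
    mem, pcs, bufs, regs = state
    for cpu in (0, 1):
        if bufs[cpu]:            # nondeterministically flush the oldest store
            loc, val = bufs[cpu][0]
            nmem = dict(mem); nmem[loc] = val
            yield (frz(nmem), pcs, bufs[:cpu] + (bufs[cpu][1:],) + bufs[cpu+1:], regs)
        if pcs[cpu] < len(PROGS[cpu]):
            ins = PROGS[cpu][pcs[cpu]]
            npcs = pcs[:cpu] + (pcs[cpu] + 1,) + pcs[cpu+1:]
            if ins[0] == "st":   # stores enter the local buffer, not memory
                nbufs = bufs[:cpu] + (bufs[cpu] + ((ins[1], ins[2]),),) + bufs[cpu+1:]
                yield (mem, npcs, nbufs, regs)
            elif ins[0] == "mf": # mfence may retire only with an empty buffer
                if not bufs[cpu]:
                    yield (mem, npcs, bufs, regs)
            else:                # loads forward from the local buffer if possible
                fwd = [v for l, v in bufs[cpu] if l == ins[1]]
                val = fwd[-1] if fwd else dict(mem)[ins[1]]
                yield (mem, npcs, bufs, frz({**dict(regs), ins[2]: val}))

init = (frz({"x": 0, "y": 0, "z": 0}), (0, 0), ((), ()), frz({}))
seen, stack, finals = {init}, [init], set()
while stack:                     # depth-first search over all reachable states
    state = stack.pop()
    if state[1] == (len(P0), len(P1)):
        finals.add(state[3])
    for nxt in successors(state):
        if nxt not in seen:
            seen.add(nxt); stack.append(nxt)

witness = frz({"0:EAX": 0, "1:EAX": 1, "1:EBX": 0})
print(witness in finals)         # True: the flagged outcome is reachable
```

The witness run mirrors the hardware explanation in the thread: P1's two stores sit in its buffer, its load of [z] is satisfied by forwarding (1:EAX=1) while [x] is still 0 in memory (1:EBX=0), and P0's load of [y] then sees 0 because P1's store of [y] has not yet drained. The model also reaches exactly the four final states herd lists.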
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Thu, Nov 02, 2017 at 01:08:52PM -0400, Alan Stern wrote:
> On Thu, 2 Nov 2017, Peter Zijlstra wrote:
>
> > On Thu, Nov 02, 2017 at 11:40:35AM -0400, Alan Stern wrote:
> > > On Thu, 2 Nov 2017, Peter Zijlstra wrote:
> > >
> > > > > Lock functions such as refcount_dec_and_lock() &
> > > > > refcount_dec_and_mutex_lock() Provide exactly the same guarantees as
> > > > > they atomic counterparts.
> > > >
> > > > Nope. The atomic_dec_and_lock() provides smp_mb() while
> > > > refcount_dec_and_lock() merely orders all prior load/store's against all
> > > > later load/store's.
> > >
> > > In fact there is no guaranteed ordering when refcount_dec_and_lock()
> > > returns false;
> >
> > It should provide a release:
> >
> >  - if !=1, dec_not_one will provide release
> >
> >  - if ==1, dec_not_one will no-op, but then we'll acquire the lock and
> >    dec_and_test will provide the release, even if the test fails and we
> >    unlock again it should still dec.
> >
> > The one exception is when the counter is saturated, but in that case
> > we'll never free the object and the ordering is moot in any case.
>
> Also if the counter is 0, but that will never happen if the
> refcounting is correct.
>
> > > it provides ordering only if the return value is true.
> > > In which case it provides acquire ordering (thanks to the spin_lock),
> > > and both release ordering and a control dependency (thanks to the
> > > refcount_dec_and_test).
> >
> > > > The difference is subtle and involves at least 3 CPUs. I can't seem to
> > > > write up anything simple, keeps turning into monsters :/ Will, Paul,
> > > > have you got anything simple around?
> > >
> > > The combination of acquire + release is not the same as smp_mb, because
> >
> > acquire+release is nothing, its release+acquire that I meant which
> > should order things locally, but now that you've got me looking at it
> > again, we don't in fact do that.
> >
> > So refcount_dec_and_lock() will provide a release, irrespective of the
> > return value (assuming we're not saturated). If it returns true, it also
> > does an acquire for the lock.
> >
> > But combined they're acquire+release, which is unfortunate.. it means
> > the lock section and the refcount stuff overlaps, but I don't suppose
> > that's actually a problem. Need to consider more.
>
> Right.  To address your point: release + acquire isn't the same as a
> full barrier either.  The SB pattern illustrates the difference:
>
>         P0                      P1
>         Write x=1               Write y=1
>         Release a               smp_mb
>         Acquire b               Read x=0
>         Read y=0
>
> This would not be allowed if the release + acquire sequence was
> replaced by smp_mb.  But as it stands, this is allowed because nothing
> prevents the CPU from interchanging the order of the release and the
> acquire -- and then you're back to the acquire + release case.
>
> However, there is one circumstance where this interchange isn't
> allowed: when the release and acquire access the same memory
> location.  Thus:
>
>         P0(int *x, int *y, int *a)
>         {
>                 int r0;
>
>                 WRITE_ONCE(*x, 1);
>                 smp_store_release(a, 1);
>                 smp_load_acquire(a);
>                 r0 = READ_ONCE(*y);
>         }
>
>         P1(int *x, int *y)
>         {
>                 int r1;
>
>                 WRITE_ONCE(*y, 1);
>                 smp_mb();
>                 r1 = READ_ONCE(*x);
>         }
>
>         exists (0:r0=0 /\ 1:r1=0)
>
> This is forbidden.  It would remain forbidden even if the smp_mb in P1
> were replaced by a similar release/acquire pair for the same memory
> location.

Hopefully, the LKMM does not agree with this assessment...  ;-)

> To see the difference between smp_mb and release/acquire requires three
> threads:
>
>         P0              P1              P2
>         Write x=1       Read y=1        Read z=1
>         Release a       data dep.       smp_rmb
>         Acquire a       Write z=1       Read x=0
>         Write y=1
>
> The Linux Kernel Memory Model allows this execution, although as far as
> I know, no existing hardware will do it.  But with smp_mb in P0, the
> execution would be forbidden.

Here's a two-thread example showing that "(w)mb is _not_ rfi-rel-acq":

C rfi-rel-acq-is-not-mb

{}

P0(int *x, int *y, int *a)
{
        int r1;

        WRITE_ONCE(*x, 1);
        smp_store_release(a, 1);
        r1 = smp_load_acquire(a);
        WRITE_ONCE(*y, 1);
}

P1(int *x, int *y)
{
        int r0;
        int r1;

        r0 = READ_ONCE(*y);
        smp_rmb();
        r1 = READ_ONCE(*x);
}

exists (1:r0=1 /\ 1:r1=0)

  Andrea

> None of this should be a problem for refcount_dec_and_lock, assuming it
> is used purely for reference counting.
>
> Alan Stern
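What makes Andrea's test a useful probe is that the flagged outcome is impossible under sequential consistency: any model or hardware that produces r0=1 /\ r1=0 is thereby demonstrating that P0's rel-acq pair is weaker than a full barrier. The hypothetical checker below enumerates every SC interleaving of the two threads (under SC, the rel/acq to the same location collapse to plain accesses and smp_rmb to nothing) and shows the outcome never appears.

```python
# Enumerate all interleavings of the two threads that respect program order.
def interleavings(a, b):
    if not a or not b:
        yield tuple(a or b)
        return
    for rest in interleavings(a[1:], b):
        yield (a[0],) + rest
    for rest in interleavings(a, b[1:]):
        yield (b[0],) + rest

# P0: WRITE_ONCE(*x,1); smp_store_release(a,1); smp_load_acquire(a); WRITE_ONCE(*y,1)
# P1: r0 = READ_ONCE(*y); smp_rmb(); r1 = READ_ONCE(*x)
p0 = (("w", "x"), ("w", "a"), ("r", "a", None), ("w", "y"))
p1 = (("r", "y", "r0"), ("r", "x", "r1"))

outcomes = set()
for sched in interleavings(p0, p1):
    mem = {"x": 0, "y": 0, "a": 0}
    regs = {}
    for op in sched:
        if op[0] == "w":
            mem[op[1]] = 1
        elif op[2] is not None:      # the acquire load of a has no destination
            regs[op[2]] = mem[op[1]]
    outcomes.add((regs["r0"], regs["r1"]))

print(sorted(outcomes))              # (1, 0) is absent: SC forbids it
```

If r0=1 then P0's write to y, and hence its earlier write to x, already executed, so under SC the later read of x must return 1. Weaker models (and the LKMM) may still allow 1:r0=1 /\ 1:r1=0, which is exactly Andrea's point that rfi-rel-acq does not imply a (w)mb.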
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Thu, Nov 02, 2017 at 05:16:44PM +, Will Deacon wrote:
> On Thu, Nov 02, 2017 at 01:08:52PM -0400, Alan Stern wrote:
> > Right.  To address your point: release + acquire isn't the same as a
> > full barrier either.  The SB pattern illustrates the difference:
> >
> >         P0                      P1
> >         Write x=1               Write y=1
> >         Release a               smp_mb
> >         Acquire b               Read x=0
> >         Read y=0
> >
> > This would not be allowed if the release + acquire sequence was
> > replaced by smp_mb.  But as it stands, this is allowed because nothing
> > prevents the CPU from interchanging the order of the release and the
> > acquire -- and then you're back to the acquire + release case.
> >
> > However, there is one circumstance where this interchange isn't
> > allowed: when the release and acquire access the same memory
> > location.  Thus:
> >
> >         P0(int *x, int *y, int *a)
> >         {
> >                 int r0;
> >
> >                 WRITE_ONCE(*x, 1);
> >                 smp_store_release(a, 1);
> >                 smp_load_acquire(a);
> >                 r0 = READ_ONCE(*y);
> >         }
> >
> >         P1(int *x, int *y)
> >         {
> >                 int r1;
> >
> >                 WRITE_ONCE(*y, 1);
> >                 smp_mb();
> >                 r1 = READ_ONCE(*x);
> >         }
> >
> >         exists (0:r0=0 /\ 1:r1=0)
> >
> > This is forbidden.  It would remain forbidden even if the smp_mb in P1
> > were replaced by a similar release/acquire pair for the same memory
> > location.
>
> Isn't this allowed on x86 mapping smp_mb() to mfence, store-release to plain
> store and load-acquire to plain load? All we're saying is that you can forward
> from a release to an acquire, which is fine for RCpc semantics.

Yeah, as it happens I talked to Will about that exact case while writing
that email :-), this is why he has that thing handy.
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Thu, Nov 02, 2017 at 01:08:52PM -0400, Alan Stern wrote:
> On Thu, 2 Nov 2017, Peter Zijlstra wrote:
>
> > On Thu, Nov 02, 2017 at 11:40:35AM -0400, Alan Stern wrote:
> > > On Thu, 2 Nov 2017, Peter Zijlstra wrote:
> > >
> > > > > Lock functions such as refcount_dec_and_lock() &
> > > > > refcount_dec_and_mutex_lock() Provide exactly the same guarantees as
> > > > > they atomic counterparts.
> > > >
> > > > Nope. The atomic_dec_and_lock() provides smp_mb() while
> > > > refcount_dec_and_lock() merely orders all prior load/store's against all
> > > > later load/store's.
> > >
> > > In fact there is no guaranteed ordering when refcount_dec_and_lock()
> > > returns false;
> >
> > It should provide a release:
> >
> >  - if !=1, dec_not_one will provide release
> >
> >  - if ==1, dec_not_one will no-op, but then we'll acquire the lock and
> >    dec_and_test will provide the release, even if the test fails and we
> >    unlock again it should still dec.
> >
> > The one exception is when the counter is saturated, but in that case
> > we'll never free the object and the ordering is moot in any case.
>
> Also if the counter is 0, but that will never happen if the
> refcounting is correct.
>
> > > it provides ordering only if the return value is true.
> > > In which case it provides acquire ordering (thanks to the spin_lock),
> > > and both release ordering and a control dependency (thanks to the
> > > refcount_dec_and_test).
> >
> > > > The difference is subtle and involves at least 3 CPUs. I can't seem to
> > > > write up anything simple, keeps turning into monsters :/ Will, Paul,
> > > > have you got anything simple around?
> > >
> > > The combination of acquire + release is not the same as smp_mb, because
> >
> > acquire+release is nothing, its release+acquire that I meant which
> > should order things locally, but now that you've got me looking at it
> > again, we don't in fact do that.
> >
> > So refcount_dec_and_lock() will provide a release, irrespective of the
> > return value (assuming we're not saturated). If it returns true, it also
> > does an acquire for the lock.
> >
> > But combined they're acquire+release, which is unfortunate.. it means
> > the lock section and the refcount stuff overlaps, but I don't suppose
> > that's actually a problem. Need to consider more.
>
> Right.  To address your point: release + acquire isn't the same as a
> full barrier either.  The SB pattern illustrates the difference:
>
>         P0                      P1
>         Write x=1               Write y=1
>         Release a               smp_mb
>         Acquire b               Read x=0
>         Read y=0
>
> This would not be allowed if the release + acquire sequence was
> replaced by smp_mb.  But as it stands, this is allowed because nothing
> prevents the CPU from interchanging the order of the release and the
> acquire -- and then you're back to the acquire + release case.
>
> However, there is one circumstance where this interchange isn't
> allowed: when the release and acquire access the same memory
> location.  Thus:
>
>         P0(int *x, int *y, int *a)
>         {
>                 int r0;
>
>                 WRITE_ONCE(*x, 1);
>                 smp_store_release(a, 1);
>                 smp_load_acquire(a);
>                 r0 = READ_ONCE(*y);
>         }
>
>         P1(int *x, int *y)
>         {
>                 int r1;
>
>                 WRITE_ONCE(*y, 1);
>                 smp_mb();
>                 r1 = READ_ONCE(*x);
>         }
>
>         exists (0:r0=0 /\ 1:r1=0)
>
> This is forbidden.  It would remain forbidden even if the smp_mb in P1
> were replaced by a similar release/acquire pair for the same memory
> location.

Isn't this allowed on x86 mapping smp_mb() to mfence, store-release to plain
store and load-acquire to plain load? All we're saying is that you can forward
from a release to an acquire, which is fine for RCpc semantics.

e.g.

X86 SB+mfence+po-rfi-po
"MFencedWR Fre PodWW Rfi PodRR Fre"
Generator=diyone7 (version 7.46+3)
Prefetch=0:x=F,0:y=T,1:y=F,1:x=T
Com=Fr Fr
Orig=MFencedWR Fre PodWW Rfi PodRR Fre
{
}
 P0          | P1          ;
 MOV [x],$1  | MOV [y],$1  ;
 MFENCE      | MOV [z],$1  ;
 MOV EAX,[y] | MOV EAX,[z] ;
             | MOV EBX,[x] ;
exists
(0:EAX=0 /\ 1:EAX=1 /\ 1:EBX=0)

which herd says is allowed:

Test SB+mfence+po-rfi-po Allowed
States 4
0:EAX=0; 1:EAX=1; 1:EBX=0;
0:EAX=0; 1:EAX=1; 1:EBX=1;
0:EAX=1; 1:EAX=1; 1:EBX=0;
0:EAX=1; 1:EAX=1; 1:EBX=1;
Ok
Witnesses
Positive: 1 Negative: 3
Condition exists (0:EAX=0 /\ 1:EAX=1 /\ 1:EBX=0)
Observation SB+mfence+po-rfi-po Sometimes 1 3
Time SB+mfence+po-rfi-po 0.00
Hash=0f983e2d7579e5c04c332f9ac620c31f

and I can reproduce using litmus to actually run it on my x86 box:

%%
% Results for SB+mfence+po-rfi-po.litmus %
%%
X86 SB+mfence+po-rfi-po
"MFencedWR Fre PodWW Rfi PodRR Fre"

{}

 P0          | P1          ;
 MOV [x],$1  | MOV [y],$1  ;
 MFENCE      | MOV [z],$1  ;
 MOV EAX,[y] | MOV EAX,[z] ;
             | MOV EBX,[x] ;

exists (0:EAX=0 /\ 1:EAX=1 /\ 1:EBX=0)
Generated assembler
#START _litmus_P1
        movl $1,(%r8,%rcx)
        movl $1,(%r9,%rcx)
        movl (%r9,%rcx),%eax
        movl (%rdi,%rcx),%edx
#START _litmus_P0
        movl $1,(%rdx,%rcx)
        mfence
        movl (%rdi,%rcx),%eax

Test SB+mfence+po-rfi-po Allowed
Histogram (4 states)
8      *>0:EAX=0; 1:EAX=1; 1:EBX=0;
1999851:>0:EAX=1; 1:EAX=1; 1:EBX=0;
1999549:>0:EAX=0; 1:EAX=1; 1:EBX=1;
592    :>0:EAX=1; 1:EAX=1; 1:EBX=1;
Ok

Witnesses
Positive: 8, Negative: 392
Condition exists (0:EAX=0 /\ 1:EAX=1 /\ 1:EBX=0) is validated
Hash=0f983e2d7579e5c04c332f9ac620c31f
Generator=diyone7 (version 7.46+3)
Com=Fr Fr
Orig=MFencedWR Fre PodWW Rfi PodRR Fre
Observation SB+mfence+po-rfi-po Sometimes 8 392
Time SB+mfence+po-rfi-po 0.17
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Thu, 2 Nov 2017, Peter Zijlstra wrote:

> On Thu, Nov 02, 2017 at 11:40:35AM -0400, Alan Stern wrote:
> > On Thu, 2 Nov 2017, Peter Zijlstra wrote:
> >
> > > > Lock functions such as refcount_dec_and_lock() &
> > > > refcount_dec_and_mutex_lock() Provide exactly the same guarantees as
> > > > they atomic counterparts.
> > >
> > > Nope. The atomic_dec_and_lock() provides smp_mb() while
> > > refcount_dec_and_lock() merely orders all prior load/store's against all
> > > later load/store's.
> >
> > In fact there is no guaranteed ordering when refcount_dec_and_lock()
> > returns false;
>
> It should provide a release:
>
>  - if !=1, dec_not_one will provide release
>
>  - if ==1, dec_not_one will no-op, but then we'll acquire the lock and
>    dec_and_test will provide the release, even if the test fails and we
>    unlock again it should still dec.
>
> The one exception is when the counter is saturated, but in that case
> we'll never free the object and the ordering is moot in any case.

Also if the counter is 0, but that will never happen if the
refcounting is correct.

> > it provides ordering only if the return value is true.
> > In which case it provides acquire ordering (thanks to the spin_lock),
> > and both release ordering and a control dependency (thanks to the
> > refcount_dec_and_test).
>
> > > The difference is subtle and involves at least 3 CPUs. I can't seem to
> > > write up anything simple, keeps turning into monsters :/ Will, Paul,
> > > have you got anything simple around?
> >
> > The combination of acquire + release is not the same as smp_mb, because
>
> acquire+release is nothing, its release+acquire that I meant which
> should order things locally, but now that you've got me looking at it
> again, we don't in fact do that.
>
> So refcount_dec_and_lock() will provide a release, irrespective of the
> return value (assuming we're not saturated). If it returns true, it also
> does an acquire for the lock.
>
> But combined they're acquire+release, which is unfortunate.. it means
> the lock section and the refcount stuff overlaps, but I don't suppose
> that's actually a problem. Need to consider more.

Right.  To address your point: release + acquire isn't the same as a
full barrier either.  The SB pattern illustrates the difference:

        P0                      P1
        Write x=1               Write y=1
        Release a               smp_mb
        Acquire b               Read x=0
        Read y=0

This would not be allowed if the release + acquire sequence was
replaced by smp_mb.  But as it stands, this is allowed because nothing
prevents the CPU from interchanging the order of the release and the
acquire -- and then you're back to the acquire + release case.

However, there is one circumstance where this interchange isn't
allowed: when the release and acquire access the same memory
location.  Thus:

        P0(int *x, int *y, int *a)
        {
                int r0;

                WRITE_ONCE(*x, 1);
                smp_store_release(a, 1);
                smp_load_acquire(a);
                r0 = READ_ONCE(*y);
        }

        P1(int *x, int *y)
        {
                int r1;

                WRITE_ONCE(*y, 1);
                smp_mb();
                r1 = READ_ONCE(*x);
        }

        exists (0:r0=0 /\ 1:r1=0)

This is forbidden.  It would remain forbidden even if the smp_mb in P1
were replaced by a similar release/acquire pair for the same memory
location.

To see the difference between smp_mb and release/acquire requires three
threads:

        P0              P1              P2
        Write x=1       Read y=1        Read z=1
        Release a       data dep.       smp_rmb
        Acquire a       Write z=1       Read x=0
        Write y=1

The Linux Kernel Memory Model allows this execution, although as far as
I know, no existing hardware will do it.  But with smp_mb in P0, the
execution would be forbidden.

None of this should be a problem for refcount_dec_and_lock, assuming it
is used purely for reference counting.

Alan Stern
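A quick sanity check on the three-thread example above: the outcome "P1 reads y=1, P2 reads z=1 and x=0" cannot occur under sequential consistency, which is the ordering smp_mb in P0 would restore between the write to x and the write to y. The sketch below is a hypothetical enumerator, with the rel/acq pair treated as plain program order and the data dependency and smp_rmb likewise modeled simply as program order.

```python
from itertools import permutations

# Program-order event lists for the three threads.
P0 = (("w", "x"), ("w", "y"))                 # Write x=1; (rel a; acq a;) Write y=1
P1 = (("r", "y", "ry"), ("w", "z"))           # Read y;  data dep.;  Write z=1
P2 = (("r", "z", "rz"), ("r", "x", "rx"))     # Read z;  smp_rmb;    Read x
PROGS = (P0, P1, P2)

events = [(t, i) for t, prog in enumerate(PROGS) for i in range(len(prog))]

def sc_runs():
    # Keep only total orders that respect each thread's program order.
    for perm in permutations(events):
        pos = {e: n for n, e in enumerate(perm)}
        if all(pos[(t, i)] < pos[(t, i + 1)]
               for t, prog in enumerate(PROGS)
               for i in range(len(prog) - 1)):
            yield perm

outcomes = set()
for sched in sc_runs():
    mem = {"x": 0, "y": 0, "z": 0}
    regs = {}
    for t, i in sched:
        op = PROGS[t][i]
        if op[0] == "w":
            mem[op[1]] = 1
        else:
            regs[op[2]] = mem[op[1]]
    outcomes.add((regs["ry"], regs["rz"], regs["rx"]))

# The LKMM-allowed execution (y=1, z=1, x=0) never shows up under SC:
print((1, 1, 0) in outcomes)   # False
```

The reasoning the enumerator confirms: if P2 reads z=1 then P1's write to z, and therefore P1's earlier read of y=1, already happened, which means P0's writes to x and y both completed; so under SC, P2's final read of x must return 1. The LKMM allows (1, 1, 0) anyway, which is exactly the gap between release/acquire and smp_mb the message describes.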
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Thu, Nov 02, 2017 at 05:02:37PM +0100, Peter Zijlstra wrote:
> On Thu, Nov 02, 2017 at 11:40:35AM -0400, Alan Stern wrote:
> > On Thu, 2 Nov 2017, Peter Zijlstra wrote:
> > > > Lock functions such as refcount_dec_and_lock() &
> > > > refcount_dec_and_mutex_lock() provide exactly the same
> > > > guarantees as their atomic counterparts.
> > >
> > > Nope. The atomic_dec_and_lock() provides smp_mb() while
> > > refcount_dec_and_lock() merely orders all prior load/store's
> > > against all later load/store's.
> >
> > In fact there is no guaranteed ordering when refcount_dec_and_lock()
> > returns false;
>
> It should provide a release:
>
>  - if !=1, dec_not_one will provide release
>  - if ==1, dec_not_one will no-op, but then we'll acquire the lock and
>    dec_and_test will provide the release, even if the test fails and
>    we unlock again it should still dec.
>
> The one exception is when the counter is saturated, but in that case
> we'll never free the object and the ordering is moot in any case.
>
> > it provides ordering only if the return value is true.
> > In which case it provides acquire ordering (thanks to the
> > spin_lock), and both release ordering and a control dependency
> > (thanks to the refcount_dec_and_test).
> >
> > > The difference is subtle and involves at least 3 CPUs. I can't
> > > seem to write up anything simple, keeps turning into monsters :/
> > > Will, Paul, have you got anything simple around?
> >
> > The combination of acquire + release is not the same as smp_mb,
> > because
>
> acquire+release is nothing, it's release+acquire that I meant which
> should order things locally, but now that you've got me looking at it
> again, we don't in fact do that.
>
> So refcount_dec_and_lock() will provide a release, irrespective of the
> return value (assuming we're not saturated). If it returns true, it
> also does an acquire for the lock.
>
> But combined they're acquire+release, which is unfortunate.. it means
> the lock section and the refcount stuff overlaps, but I don't suppose
> that's actually a problem. Need to consider more.

Right, so in that case we have refcount==0 and are guaranteed no
concurrency. So it's fine.
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Thu, Nov 02, 2017 at 11:40:35AM -0400, Alan Stern wrote:
> On Thu, 2 Nov 2017, Peter Zijlstra wrote:
> > > Lock functions such as refcount_dec_and_lock() &
> > > refcount_dec_and_mutex_lock() provide exactly the same guarantees
> > > as their atomic counterparts.
> >
> > Nope. The atomic_dec_and_lock() provides smp_mb() while
> > refcount_dec_and_lock() merely orders all prior load/store's against
> > all later load/store's.
>
> In fact there is no guaranteed ordering when refcount_dec_and_lock()
> returns false;

It should provide a release:

 - if !=1, dec_not_one will provide release
 - if ==1, dec_not_one will no-op, but then we'll acquire the lock and
   dec_and_test will provide the release, even if the test fails and we
   unlock again it should still dec.

The one exception is when the counter is saturated, but in that case
we'll never free the object and the ordering is moot in any case.

> it provides ordering only if the return value is true.
> In which case it provides acquire ordering (thanks to the spin_lock),
> and both release ordering and a control dependency (thanks to the
> refcount_dec_and_test).
>
> > The difference is subtle and involves at least 3 CPUs. I can't seem
> > to write up anything simple, keeps turning into monsters :/ Will,
> > Paul, have you got anything simple around?
>
> The combination of acquire + release is not the same as smp_mb,
> because

acquire+release is nothing, it's release+acquire that I meant which
should order things locally, but now that you've got me looking at it
again, we don't in fact do that.

So refcount_dec_and_lock() will provide a release, irrespective of the
return value (assuming we're not saturated). If it returns true, it
also does an acquire for the lock.

But combined they're acquire+release, which is unfortunate.. it means
the lock section and the refcount stuff overlaps, but I don't suppose
that's actually a problem. Need to consider more.
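[Editor's note: the two cases Peter describes -- "dec_not_one" giving a release whether or not the lock is taken, then lock plus "dec_and_test" on the slow path -- can be sketched in userspace C11 terms. This is a hypothetical analogue, not the kernel implementation; the function name and the pthread mutex standing in for the spinlock are illustrative, and saturation handling is omitted.]

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Userspace sketch of the refcount_dec_and_lock() structure discussed
 * above (illustrative only, not the kernel code).
 */
static bool dec_and_lock(atomic_int *r, pthread_mutex_t *lock)
{
	int old = atomic_load_explicit(r, memory_order_relaxed);

	/*
	 * "dec_not_one": if the count is not 1, just decrement.  Note
	 * the release ordering is provided on this path too, i.e.
	 * irrespective of the return value.
	 */
	while (old != 1) {
		if (atomic_compare_exchange_weak_explicit(
				r, &old, old - 1,
				memory_order_release, memory_order_relaxed))
			return false;	/* decremented; lock not taken */
	}

	/* Count was 1: take the lock, then "dec_and_test". */
	pthread_mutex_lock(lock);	/* acquire */
	if (atomic_fetch_sub_explicit(r, 1, memory_order_release) == 1)
		return true;		/* count hit 0; caller holds lock */
	pthread_mutex_unlock(lock);	/* test failed, but we still dec'ed */
	return false;
}
```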
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Thu, 2 Nov 2017, Peter Zijlstra wrote:

> > Lock functions such as refcount_dec_and_lock() &
> > refcount_dec_and_mutex_lock() provide exactly the same guarantees
> > as their atomic counterparts.
>
> Nope. The atomic_dec_and_lock() provides smp_mb() while
> refcount_dec_and_lock() merely orders all prior load/store's against
> all later load/store's.

In fact there is no guaranteed ordering when refcount_dec_and_lock()
returns false; it provides ordering only if the return value is true.
In which case it provides acquire ordering (thanks to the spin_lock),
and both release ordering and a control dependency (thanks to the
refcount_dec_and_test).

> The difference is subtle and involves at least 3 CPUs. I can't seem to
> write up anything simple, keeps turning into monsters :/ Will, Paul,
> have you got anything simple around?

The combination of acquire + release is not the same as smp_mb, because
they allow things to pass by in one direction.  Example:

	C C-refcount-vs-atomic-dec-and-lock

	{ }

	P0(int *x, int *y, refcount_t *r)
	{
		refcount_set(r, 1);
		WRITE_ONCE(*x, 1);
		smp_wmb();
		WRITE_ONCE(*y, 1);
	}

	P1(int *x, int *y, refcount_t *r, spinlock_t *s)
	{
		int rx, ry;
		bool r1;

		ry = READ_ONCE(*y);
		r1 = refcount_dec_and_lock(r, s);
		if (r1)
			rx = READ_ONCE(*x);
	}

	exists (1:ry=1 /\ 1:r1=1 /\ 1:rx=0)

This is allowed.  The idea is that the CPU can take:

	Read y
	Acquire
	Release
	Read x

and execute the first read after the Acquire and the second read
before the Release:

	Acquire
	Read y
	Read x
	Release

and then the CPU can reorder the reads:

	Acquire
	Read x
	Read y
	Release

If the program had used atomic_dec_and_lock() instead, which provides a
full smp_mb barrier, this outcome would not be possible.

Alan Stern
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Thu, Nov 02, 2017 at 11:04:53AM +, Reshetova, Elena wrote:
> Well refcount_dec_and_test() is not the only function that has
> different memory ordering specifics. So, the full answer then for any
> arbitrary case according to your points above would be smth like:
>
> for each substituted function (atomic_* --> refcount_*) that actually
> has changes in memory ordering *** perform the following:
>
>  - mention the difference
>  - mention the actual code place where the change would affect
>    (various put and etc. functions)
>  - potentially try to understand if it would make a difference for
>    this code (again here is where I am not sure how to do it properly)
>
> *** the actual list of these functions to me looks like:
>
> 1) atomic_inc_not_zero -> refcount_inc_not_zero. Change is from
> underlying atomic_cmpxchg() to atomic_try_cmpxchg_relaxed(). First
> one implies SMP-conditional general memory barrier (smp_mb()) on each
> side of the actual operation (at least according to documentation; in
> reality it goes through so many changes, ifdefs and conditionals that
> one gets lost easily in the process).

That matches _inc(), which doesn't imply anything.

The reasoning being that you must already have obtained a pointer to
the object; and that will necessarily include enough ordering to
observe the object. Therefore increasing the refcount doesn't require
further constraints.

> Second one according to the comment implies no memory barriers
> whatsoever, BUT "provides a control dependency which will order future
> stores against the inc". So, every store (not load) operation (I guess
> we are talking here only about store operations that touch the same
> object, but I wonder how it is defined in terms of memory location?

Memory location is irrelevant.

> (overlapping?)) that comes inside the "if refcount_inc_not_zero(){}"
> clause would only be executed if the function returns true.

The point is that a CPU must NOT speculate on stores. So while it can
speculate a branch, any store inside the branch must not become visible
until it can commit to that branch.

The whole point being that possible modifications to the object to
which we've _conditionally_ acquired a reference, will only happen
after we're sure to indeed have acquired this reference.

Otherwise it's similar to _inc().

> So, practically what we might "loose" here is any updates on the
> object protected by this refcounter done by another CPU. But for this
> purpose we expect the developer to take some other lock/memory barrier
> into use, right?

Correct, object modification had better be serialized. Refcounts cannot
(even with atomic_t) help with that.

> We only care of incrementing the refcount atomically and make sure we
> don't do anything with object unless it is ready for us to be used.

Just so..

> If yes, then I guess it might be a big change for the code that
> previously relied on atomic-provided smp_mb() barriers and now instead
> needs to take explicit locks/barriers by itself.

Right, however such memory ordering should be explicitly documented.
Unknown and hidden memory ordering is a straight up bug, because
modifications to the code (be they introducing refcount_t or anything
else) can easily break things.

> 2) atomic_** -> refcount_add_not_zero. Fortunately these are super
> rare and need to be seen per each case, dependent on the actual atomic
> function substituted.

See inc_not_zero.

> 3) atomic_add() --> refcount_add(). This should not make any change
> since both do not provide memory ordering at all, but there is a
> comment in refcount.c that says that refcount_add "provide a control
> dependency and thereby orders future stores". How is it done given
> that refcount_add is a void returning function?? I am fully confused
> with this one.

Weird, mostly comes from being implemented using add_not_zero I
suppose.

> 4) atomic_dec() --> refcount_dec(). This one we have discussed
> extensively before. Again the first one implies SMP-conditional
> general memory barrier (smp_mb()) on each side of the actual
> operation.

No, atomic_dec() does not in fact imply anything..

> Second one only provides "release" ordering, meaning that prior
> loads and stores must both be completed before the operation.
>
> So, is it correct to express the difference in this way:
>
> atomic_dec():                  refcount_dec():
>
> smp_mb();                      smp_mb();
> do_atomic_dec_operation;       do_atomic_dec_operation;
> smp_mb();                      /* note: no barrier here! */

No, on two points: atomic_dec() does not imply _anything_ and while
refcount_dec() does the release, that is distinctly different from
smp_mb() before.

	C C-peterz-release-vs-mb

	{ }

	P0(int *a, int *b, int *c)
	{
		WRITE_ONCE(*a, 1);
		//smp_mb();
		smp_store_release(b, 1);
		WRITE_ONCE(*c, 1);
	}

	P1(int *a, int *b, int *c)
	{
		r3 = READ_ONCE(*c);
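[Editor's note: the distinction Peter draws in point 4) can be restated with C11 atomics. This is a hypothetical userspace sketch; the helper names are illustrative, not the kernel functions. atomic_dec() implying no ordering corresponds to a relaxed RMW, while refcount_dec() gives the decrement itself release semantics -- which is not the same thing as an smp_mb() before the operation.]

```c
#include <stdatomic.h>

/* Illustrative analogue of atomic_dec(): no ordering implied at all. */
static void atomic_style_dec(atomic_int *v)
{
	atomic_fetch_sub_explicit(v, 1, memory_order_relaxed);
}

/*
 * Illustrative analogue of refcount_dec(): release ordering, i.e.
 * prior loads and stores complete before the decrement becomes
 * visible; later accesses are not ordered against it.
 */
static void refcount_style_dec(atomic_int *r)
{
	atomic_fetch_sub_explicit(r, 1, memory_order_release);
}
```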
RE: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
Sorry for delayed reply, but I was actually reading and trying to
understand all the involved notions, so it took a while...

> On Fri, Oct 27, 2017 at 06:49:55AM +, Reshetova, Elena wrote:
> > Could we possibly have a bit more elaborate discussion on this?
> >
> > Or alternatively, what then should be the correct way for a certain
> > variable (that behaves like a standard refcounter) to check if the
> > relaxed memory ordering is ok?
>
> The refcount_t includes sufficient ordering for normal reference
> counting. It provides a happens-before relation wrt. dropping the
> reference, such that all object accesses happen before we drop the
> reference; because at that point the object can go away and any
> further access would be UAF.
>
> So if the thing being replaced is purely a reference count, all should
> be well.

Yes, I understand this, but people are asking for a totally different
level of detail and inside analysis, and given that there are changes,
they have a point. So that's why I am trying to figure out the "whole
story" below.

> > This is what Thomas was asking me to do for core kernel conversions
> > and this is what I don't know how to do correctly.
>
> Thomas asked you to verify that nothing relies on the stronger
> ordering provided by atomic_dec_and_test(). This means you have to
> audit the code with an eye towards memory ordering.
>
> Now the futex one is actually OK. Thomas just wants you to mention
> that refcount_dec_and_test is weaker and explain that in this case
> that is fine since pi_state::refcount is in fact a pure refcount and
> put_pi_state() does not in fact rely on further ordering.
>
> Just mentioning the above gives the maintainer:
>
> 1) the information he needs to perform review, mentioning memory
> ordering changes means he needs to look at it.
>
> 2) you identify the place where it changes (put_pi_state) which again
> aids him in review by limiting the amount of code he needs to double
> check.
>
> 3) that you actually looked at the problem; giving more trust you
> actually know wth you're doing.

Well, refcount_dec_and_test() is not the only function that has
different memory ordering specifics. So the full answer then for any
arbitrary case according to your points above would be smth like:

for each substituted function (atomic_* --> refcount_*) that actually
has changes in memory ordering *** perform the following:

 - mention the difference
 - mention the actual code place where the change would affect
   (various put and etc. functions)
 - potentially try to understand if it would make a difference for
   this code (again here is where I am not sure how to do it properly)

*** the actual list of these functions to me looks like:

1) atomic_inc_not_zero -> refcount_inc_not_zero. Change is from
underlying atomic_cmpxchg() to atomic_try_cmpxchg_relaxed(). First one
implies SMP-conditional general memory barrier (smp_mb()) on each side
of the actual operation (at least according to documentation; in
reality it goes through so many changes, ifdefs and conditionals that
one gets lost easily in the process).

Second one according to the comment implies no memory barriers
whatsoever, BUT "provides a control dependency which will order future
stores against the inc". So, every store (not load) operation (I guess
we are talking here only about store operations that touch the same
object, but I wonder how it is defined in terms of memory location?
(overlapping?)) that comes inside the "if refcount_inc_not_zero(){}"
clause would only be executed if the function returns true.

So, practically what we might "loose" here is any updates on the object
protected by this refcounter done by another CPU. But for this purpose
we expect the developer to take some other lock/memory barrier into
use, right? We only care about incrementing the refcount atomically and
making sure we don't do anything with the object unless it is ready for
us to be used.

If yes, then I guess it might be a big change for the code that
previously relied on atomic-provided smp_mb() barriers and now instead
needs to take explicit locks/barriers by itself.

2) atomic_** -> refcount_add_not_zero. Fortunately these are super rare
and need to be seen per each case, dependent on the actual atomic
function substituted.

3) atomic_add() --> refcount_add(). This should not make any change
since both do not provide memory ordering at all, but there is a
comment in refcount.c that says that refcount_add "provide a control
dependency and thereby orders future stores". How is it done given that
refcount_add is a void returning function?? I am fully confused with
this one.

4) atomic_dec() --> refcount_dec(). This one we have discussed
extensively before. Again the first one implies SMP-conditional general
memory barrier (smp_mb()) on each side of the actual operation. Second
one only provides "release" ordering, meaning that prior loads and
stores must both be completed before the operation.
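[Editor's note: the refcount_inc_not_zero() shape discussed in point 1) -- a relaxed cmpxchg loop that refuses to increment from zero, relying on the control dependency to order stores in the taken branch -- can be sketched as a userspace C11 analogue. Hypothetical and illustrative only, not the kernel implementation.]

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace sketch of the inc_not_zero pattern (illustrative only). */
static bool inc_not_zero(atomic_int *r)
{
	int old = atomic_load_explicit(r, memory_order_relaxed);

	do {
		if (old == 0)
			return false;	/* object already dying: don't resurrect */
	} while (!atomic_compare_exchange_weak_explicit(
			r, &old, old + 1,
			memory_order_relaxed, memory_order_relaxed));

	/*
	 * No barrier here: stores inside "if (inc_not_zero(...)) { }"
	 * are ordered by the control dependency, since a CPU must not
	 * make speculative stores visible before committing to the
	 * branch.
	 */
	return true;
}
```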
RE: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
Sorry for the delayed reply, but I was actually reading and trying to understand all the involved notions, so it took a while...

> On Fri, Oct 27, 2017 at 06:49:55AM +, Reshetova, Elena wrote:
> > Could we possibly have a bit more elaborate discussion on this?
> >
> > Or alternatively, what then should be the correct way for a certain
> > variable (that behaves like a standard refcounter) to check if the
> > relaxed memory ordering is ok?
>
> The refcount_t includes sufficient ordering for normal reference
> counting. It provides a happens-before relation wrt. dropping the
> reference, such that all object accesses happen before we drop the
> reference; because at that point the object can go away and any further
> access would be UAF.
>
> So if the thing being replaced is purely a reference count, all should
> be well.

Yes, I understand this, but people are asking for a totally different
level of detail and inside analysis, and given that there are changes,
they have a point. So, that's why I am trying to figure out the "whole
story" below.

> > This is what Thomas was asking me to do for core kernel conversions
> > and this is what I don't know how to do correctly.
>
> Thomas asked you to verify that nothing relies on the stronger ordering
> provided by atomic_dec_and_test(). This means you have to audit the code
> with an eye towards memory ordering.
>
> Now the futex one is actually OK. Thomas just wants you to mention that
> refcount_dec_and_test is weaker and explain that in this case that is
> fine since pi_state::refcount is in fact a pure refcount and
> put_pi_state() does not in fact rely on further ordering.
>
> Just mentioning the above gives the maintainer:
>
> 1) the information he needs to perform review; mentioning memory
> ordering changes means he needs to look at it.
>
> 2) you identify the place where it changes (put_pi_state), which again
> aids him in review by limiting the amount of code he needs to double
> check.
>
> 3) that you actually looked at the problem; giving more trust that you
> actually know wth you're doing.

Well, refcount_dec_and_test() is not the only function that has
different memory ordering specifics. So, the full answer then, for any
arbitrary case, according to your points above, would be something like:
for each substituted function (atomic_* --> refcount_*) that actually
has changes in memory ordering ***, perform the following:

- mention the difference
- mention the actual code place where the change would have an effect
  (the various put etc. functions)
- potentially try to understand if it would make a difference for this
  code (again, here is where I am not sure how to do it properly)

*** The actual list of these functions to me looks like:

1) atomic_inc_not_zero -> refcount_inc_not_zero. The change is from the
underlying atomic_cmpxchg() to atomic_try_cmpxchg_relaxed(). The first
one implies an SMP-conditional general memory barrier (smp_mb()) on each
side of the actual operation (at least according to the documentation;
in reality it goes through so many changes, ifdefs and conditionals that
one gets lost easily in the process). The second one, according to the
comment, implies no memory barriers whatsoever, BUT "provides a control
dependency which will order future stores against the inc".

So, every store (not load) operation (I guess we are talking here only
about store operations that touch the same object, but I wonder how that
is defined in terms of memory location? overlapping?) that comes inside
the "if (refcount_inc_not_zero()) {}" clause would only be executed if
the function returns true. So, practically what we might "lose" here is
any updates on the object protected by this refcounter done by another
CPU. But for this purpose we expect the developer to take some other
lock/memory barrier into use, right? We only care about incrementing the
refcount atomically and making sure we don't do anything with the object
unless it is ready for us to use.

If yes, then I guess it might be a big change for code that previously
relied on the atomic-provided smp_mb() barriers and now instead needs to
take explicit locks/barriers by itself.

2) atomic_** -> refcount_add_not_zero. Fortunately these are super rare,
and each needs to be looked at per case, depending on the actual atomic
function substituted.

3) atomic_add() --> refcount_add(). This should not make any change,
since both do not provide memory ordering at all, but there is a comment
in refcount.c that says that refcount_add "provide a control dependency
and thereby orders future stores". How is that done, given that
refcount_add is a void-returning function?? I am fully confused with
this one.

4) atomic_dec() --> refcount_dec(). This one we have discussed
extensively before. Again, the first one implies an SMP-conditional
general memory barrier (smp_mb()) on each side of the actual operation.
The second one only provides "release" ordering, meaning that both prior
loads and stores must be completed before the operation.
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Fri, Oct 27, 2017 at 06:49:55AM +, Reshetova, Elena wrote:
> Could we possibly have a bit more elaborate discussion on this?
>
> Or alternatively, what then should be the correct way for a certain
> variable (that behaves like a standard refcounter) to check if the
> relaxed memory ordering is ok?

The refcount_t includes sufficient ordering for normal reference
counting. It provides a happens-before relation wrt. dropping the
reference, such that all object accesses happen before we drop the
reference; because at that point the object can go away and any further
access would be UAF.

So if the thing being replaced is purely a reference count, all should
be well.

> This is what Thomas was asking me to do for core kernel conversions
> and this is what I don't know how to do correctly.

Thomas asked you to verify that nothing relies on the stronger ordering
provided by atomic_dec_and_test(). This means you have to audit the code
with an eye towards memory ordering.

Now the futex one is actually OK. Thomas just wants you to mention that
refcount_dec_and_test is weaker and explain that in this case that is
fine since pi_state::refcount is in fact a pure refcount and
put_pi_state() does not in fact rely on further ordering.

Just mentioning the above gives the maintainer:

1) the information he needs to perform review; mentioning memory
ordering changes means he needs to look at it.

2) you identify the place where it changes (put_pi_state), which again
aids him in review by limiting the amount of code he needs to double
check.

3) that you actually looked at the problem; giving more trust that you
actually know wth you're doing.

> Also, I got exactly the same question from the xfs maintainer,
> so if we provide an API that we hope will be used correctly in the
> future, we should have a way for people to verify it.

I actually replied to Dave (but have not checked if there's further
email on the thread). I object to the claim that the code is correct if
he cannot explain the memory ordering.

Aside from that: if/when he explains the ordering requirements of those
sites, and those do indeed require more than refcount_dec_and_test()
provides, we should add a comment exactly explaining the ordering, along
with additional barriers such as smp_mb__after_atomic().

> Maybe it is just me, but I would think that having a way to verify
> that your code is ok with this refcounter-specific relaxed memory
> ordering applies to any memory ordering requirements; refcounters are
> just a subset with certain properties, so is then the full answer to
> figure out how to verify any memory ordering requirement of your
> code?

Yes; and this can be quite tricky, but undocumented ordering
requirements are a bug that needs to be fixed.

Mostly the code is 'simple' and refcount really does the right and
sufficient thing. If the maintainer knows or suspects more ordering is
required he can help out; but for that to happen you have to, at the
very least, do 1 & 2 above.

> We can also document this logic in more docs or even maybe try to
> create better documentation for the current memory ordering bits,
> since it is not the easiest read at the moment. Otherwise this might
> be just bad enough reason for many people to avoid the refcount_t
> type, if it is just easier to say "let's take just the old atomic, we
> knew it somehow worked before"

That's just voodoo programming; and while that is rife I have absolutely
no sympathy for it. Memory ordering is hard, but that doesn't mean we
should just blindly sprinkle memory barriers around until it 'works'.

So while memory-barriers.txt is a longish document, it is readable with
a bit of time and effort. There are also tools being developed that can
help people validate their code.

There are enough people around who actually get this stuff (it really is
not _that_ hard, but it does require a functional brain), so if you can
describe the problem with sufficient detail we can help out getting the
ordering right (or state it's broken :-).
RE: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
> On Mon, Oct 23, 2017 at 02:09:44PM +0300, Elena Reshetova wrote:
> > Currently the arch-independent implementation of refcount_t in
> > lib/refcount.c provides weak memory ordering guarantees
> > compared to its analog atomic_t implementations.
> > While it should not be a problem for most of the actual
> > cases of refcounters, it is more understandable for everyone
> > (and less error-prone for future users) to provide exactly
> > the same memory ordering guarantees as atomics.
> >
> > If speed is of a concern, then either a more efficient
> > arch-dependent refcount_t implementation should be used or, if there
> > are enough users in the future, we might need to provide both
> > strict and relaxed refcount_t APIs.
> >
> > Suggested-by: Kees Cook
>
> NAK

Could we possibly have a bit more elaborate discussion on this?

Or alternatively, what then should be the correct way for a certain
variable (that behaves like a standard refcounter) to check if the
relaxed memory ordering is ok?

This is what Thomas was asking me to do for core kernel conversions and
this is what I don't know how to do correctly.

Also, I got exactly the same question from the xfs maintainer, so if we
provide an API that we hope will be used correctly in the future, we
should have a way for people to verify it.

Maybe it is just me, but I would think that having a way to verify that
your code is ok with this refcounter-specific relaxed memory ordering
applies to any memory ordering requirements; refcounters are just a
subset with certain properties, so is then the full answer to figure out
how to verify any memory ordering requirement of your code?

We can also document this logic in more docs or even maybe try to create
better documentation for the current memory ordering bits, since it is
not the easiest read at the moment. Otherwise this might be just bad
enough reason for many people to avoid the refcount_t type, if it is
just easier to say "let's take just the old atomic, we knew it somehow
worked before".

Best Regards,
Elena.
Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
On Mon, Oct 23, 2017 at 02:09:44PM +0300, Elena Reshetova wrote:
> Currently the arch-independent implementation of refcount_t in
> lib/refcount.c provides weak memory ordering guarantees
> compared to its analog atomic_t implementations.
> While it should not be a problem for most of the actual
> cases of refcounters, it is more understandable for everyone
> (and less error-prone for future users) to provide exactly
> the same memory ordering guarantees as atomics.
>
> If speed is of a concern, then either a more efficient
> arch-dependent refcount_t implementation should be used or, if there
> are enough users in the future, we might need to provide both
> strict and relaxed refcount_t APIs.
>
> Suggested-by: Kees Cook

NAK