Re: [RFC] doc: rcu: remove note on smp_mb during synchronize_rcu

2018-11-04 Thread Joel Fernandes
On Sun, Nov 04, 2018 at 07:43:30PM -0800, Paul E. McKenney wrote:
[...]
> > > > > > > > Also about GP memory ordering and RCU-tree-locking, I think you 
> > > > > > > > mentioned to
> > > > > > > > me that the RCU reader-sections are virtually extended both 
> > > > > > > > forward and
> > > > > > > > backward and wherever it ends, those paths do heavy-weight 
> > > > > > > > synchronization
> > > > > > > > that should be sufficient to prevent memory ordering issues 
> > > > > > > > (such as those
> > > > > > > > you mentioned in the Requirements document). That is exactly 
> > > > > > > > why we don't
> > > > > > > > need explicit barriers during rcu_read_unlock. If I recall I 
> > > > > > > > asked you why
> > > > > > > > those are not needed. So that answer made sense, but now, on going 
> > > > > > > > through the 'Memory Ordering' document, I see that you 
> > > > > > > > mentioned there is
> > > > > > > > reliance on the locking. Is that reliance on locking necessary 
> > > > > > > > to maintain
> > > > > > > > ordering then?
> > > > > > > 
> > > > > > > There is a "network" of locking augmented by 
> > > > > > > smp_mb__after_unlock_lock()
> > > > > > > that implements the all-to-all memory ordering mentioned above.  
> > > > > > > But it
> > > > > > > also needs to handle all the possible 
> > > > > > > complete()/wait_for_completion()
> > > > > > > races, even those assisted by hypervisor vCPU preemption.
> > > > > > 
> > > > > > I see, so it sounds like the lock network is just a partial 
> > > > > > solution. For
> > > > > > some reason I thought that before complete() was even called on the 
> > > > > > CPU
> > > > > > executing the callback, all the CPUs would have acquired and 
> > > > > > released a lock
> > > > > > in the "lock network" atleast once thus ensuring the ordering (due 
> > > > > > to the
> > > > > > fact that the quiescent state reporting has to travel up the tree 
> > > > > > starting
> > > > > > from the leaves), but I think that's not necessarily true so I see 
> > > > > > your point
> > > > > > now.
> > > > > 
> > > > > There is indeed a lock that is unconditionally acquired and released 
> > > > > by
> > > > > wait_for_completion(), but it lacks the smp_mb__after_unlock_lock() 
> > > > > that
> > > > > is required to get full-up any-to-any ordering.  And unfortunate 
> > > > > timing
> > > > > (as well as spurious wakeups) allows the interaction to have only 
> > > > > normal
> > > > > lock-release/acquire ordering, which does not suffice in all cases.
> > > > 
> > > > Sorry to be so persistent, but I did spend some time on this and I still
> > > > don't get why every CPU would _not_ have executed 
> > > > smp_mb__after_unlock_lock at least
> > > > once before the wait_for_completion() returns, because every CPU should 
> > > > have
> > > > at least called rcu_report_qs_rdp() -> rcu_report_qs_rnp() once to
> > > > report its QS up the tree, right? Before that procedure, the complete()
> > > > cannot happen because the complete() itself is in an RCU callback which 
> > > > is
> > > > executed only once all the QS(s) have been reported.
> > > > 
> > > > So I still couldn't see how synchronize_rcu() can return without
> > > > rcu_report_qs_rnp() being called at least once on the CPU reporting its QS 
> > > > during a
> > > > grace period.
> > > > 
> > > > Would it be possible to provide a small example showing this in the least 
> > > > number
> > > > of steps? I appreciate your time and it would be really helpful. If you 
> > > > feel
> > > > it's too complicated, then feel free to keep this for LPC discussion :)
> > > 
> > > The key point is that "at least once" does not suffice, other than for the
> > > CPU that detects the end of the grace period.  The rest of the CPUs must
> > > do at least -two- full barriers, which could of course be either smp_mb()
> > > on the one hand or smp_mb__after_unlock_lock() after a lock on the other.
> > 
> > I thought I'd at least get an understanding of the "at least two full
> > barriers" point and ask you any questions at LPC, because that's what I'm
> > missing, I think. I'm trying to understand what can go wrong without two full
> > barriers. I'm sure an RCU-implementation BoF could really help in this regard.
> > 
> > I guess it's also documented somewhere in Tree-RCU-Memory-Ordering.html, but a
> > quick search through that document didn't show a mention of the need for two
> > full barriers. I think it's also a great idea for us to document it there
> > and/or discuss it during the conference.
> > 
> > I went through the litmus test here for some hints on the two barriers but
> > couldn't find any:
> > https://lkml.org/lkml/2017/10/5/636
> > 
> > At least this commit made me think no extra memory barrier is needed for
> > tree RCU:  :-\
> > https://lore.kernel.org/patchwork/patch/813386/
> > 
> > I'm sure your last email will be useful to me in the future once I can make
> > more sense of the ordering and the need for two 
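
For reference, the "two full barriers" gap discussed above can be illustrated
with a small herd7 litmus test against the Linux-kernel memory model (a sketch
only; the variable names x, y and the lock s are mine, not from the thread).
An unlock+lock chain without smp_mb__after_unlock_lock() is only required to
order everything except a prior store against a later load, so the
store-buffering-style outcome below can be reached; adding the commented-out
barrier on P1 rules it out:

C rcu-unlock-lock-sb

{}

P0(int *x, spinlock_t *s)
{
        spin_lock(s);
        WRITE_ONCE(*x, 1);
        spin_unlock(s);
}

P1(int *x, int *y, spinlock_t *s)
{
        int r0;
        int r1;

        spin_lock(s);
        /* smp_mb__after_unlock_lock(); */
        r0 = READ_ONCE(*x);     /* r0 == 1: P1's section follows P0's. */
        r1 = READ_ONCE(*y);
        spin_unlock(s);
}

P2(int *x, int *y)
{
        int r2;

        WRITE_ONCE(*y, 1);
        smp_mb();
        r2 = READ_ONCE(*x);
}

exists (1:r0=1 /\ 1:r1=0 /\ 2:r2=0)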

Re: [RFC] doc: rcu: remove note on smp_mb during synchronize_rcu

2018-11-04 Thread Paul E. McKenney
On Sat, Nov 03, 2018 at 08:49:56PM -0700, Joel Fernandes wrote:
> On Sat, Nov 03, 2018 at 04:22:59PM -0700, Paul E. McKenney wrote:
> > On Fri, Nov 02, 2018 at 10:12:26PM -0700, Joel Fernandes wrote:
> > > On Thu, Nov 01, 2018 at 09:13:07AM -0700, Paul E. McKenney wrote:
> > > > On Wed, Oct 31, 2018 at 10:00:19PM -0700, Joel Fernandes wrote:
> > > > > On Wed, Oct 31, 2018 at 11:17:48AM -0700, Paul E. McKenney wrote:
> > > > > > On Tue, Oct 30, 2018 at 06:11:19PM -0700, Joel Fernandes wrote:
> > > > > > > Hi Paul,
> > > > > > > 
> > > > > > > On Tue, Oct 30, 2018 at 04:43:36PM -0700, Paul E. McKenney wrote:
> > > > > > > > On Tue, Oct 30, 2018 at 03:26:49PM -0700, Joel Fernandes wrote:
> > > > > > > > > Hi Paul,
> > > > > > > > > 
> > > > > > > > > On Sat, Oct 27, 2018 at 09:30:46PM -0700, Joel Fernandes 
> > > > > > > > > (Google) wrote:
> > > > > > > > > > As per this thread [1], it seems this smp_mb isn't needed 
> > > > > > > > > > anymore:
> > > > > > > > > > "So the smp_mb() that I was trying to add doesn't need to 
> > > > > > > > > > be there."
> > > > > > > > > > 
> > > > > > > > > > So let us remove this part from the memory ordering 
> > > > > > > > > > documentation.
> > > > > > > > > > 
> > > > > > > > > > [1] https://lkml.org/lkml/2017/10/6/707
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Joel Fernandes (Google) 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > I was just checking about this patch. Do you feel it is 
> > > > > > > > > correct to remove
> > > > > > > > > this part from the docs? Are you satisfied that a barrier 
> > > > > > > > > isn't needed there
> > > > > > > > > now? Or did I miss something?
> > > > > > > > 
> > > > > > > > Apologies, it got lost in the shuffle.  I have now applied it 
> > > > > > > > with a
> > > > > > > > bit of rework to the commit log, thank you!
> > > > > > > 
> > > > > > > No worries, thanks for taking it!
> > > > > > > 
> > > > > > > Just wanted to update you on my progress reading/correcting the 
> > > > > > > docs. The
> > > > > > > 'Memory Ordering' is taking a bit of time so I paused that and 
> > > > > > > I'm focusing
> > > > > > > on finishing all the other low-hanging fruit. This activity is 
> > > > > > > mostly during
> > > > > > > night hours after the baby is asleep but sometimes I also manage 
> > > > > > > to sneak it
> > > > > > > into the day job ;-)
> > > > > > 
> > > > > > If there is anything I can do to make this a more sustainable task 
> > > > > > for
> > > > > > you, please do not keep it a secret!!!
> > > > > 
> > > > > Thanks a lot, that means a lot to me! Will do!
> > > > > 
> > > > > > > BTW I do want to discuss about this smp_mb patch above with you 
> > > > > > > at LPC if you
> > > > > > > had time, even though we are removing it from the documentation. 
> > > > > > > I thought
> > > > > > > about it a few times, and I was not able to fully appreciate the 
> > > > > > > need for the
> > > > > > > barrier (that is even assuming that complete() etc did not do the 
> > > > > > > right
> > > > > > > thing).  Specifically, I was wondering the same thing Peter said in 
> > > > > > > the above
> > > > > > > thread, I think: if that rcu_read_unlock() triggered all the 
> > > > > > > spin
> > > > > > > locking up the tree of nodes, then why is that locking not 
> > > > > > > sufficient to
> > > > > > > prevent reads from the read-side section from bleeding out? That 
> > > > > > > would
> > > > > > > prevent the reader that just unlocked from seeing anything that 
> > > > > > > happens
> > > > > > > _after_ the synchronize_rcu().
> > > > > > 
> > > > > > Actually, I recall an smp_mb() being added, but am not seeing it 
> > > > > > anywhere
> > > > > > relevant to wait_for_completion().  So I might need to add the 
> > > > > > smp_mb()
> > > > > > to synchronize_rcu() and remove the patch (retaining the typo fix). 
> > > > > >  :-/
> > > > > 
> > > > > No problem, I'm glad the patch at least resurfaced the topic of the 
> > > > > potential
> > > > > issue :-)
> > > > 
> > > > And an smp_mb() is needed in Tree RCU's __wait_rcu_gp().  This is
> > > > because wait_for_completion() might get a "fly-by" wakeup, which would
> > > > mean no ordering for code naively thinking that it was ordered after a
> > > > grace period.
> > > > 
> > > > > > The short form answer is that anything before a grace period on any 
> > > > > > CPU
> > > > > > must be seen by any CPU as being before anything on any CPU after 
> > > > > > that
> > > > > > same grace period.  This guarantee requires a rather big hammer.
> > > > > > 
> > > > > > But yes, let's talk at LPC!
> > > > > 
> > > > > Sounds great, looking forward to discussing this.
> > > > 
> > > > Would it make sense to have an RCU-implementation BoF?
> > > > 
> > > > > > > Also about GP memory ordering and RCU-tree-locking, I think you 
> > > > > > > mentioned to
> > > > > > > me that the RCU reader-sections are virtually extended both 
> > > > > > > forward and
> > > > > > > 
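
The smp_mb() Paul says is needed in Tree RCU's __wait_rcu_gp() would sit at
the end of the grace-period wait.  A simplified sketch of that function
(paraphrased from kernel/rcu/update.c of that era, with the checktiny
handling and callback deduplication elided) shows where the barrier guards
against a "fly-by" wakeup that provides only bare lock-release/acquire
ordering:

/* Simplified sketch of __wait_rcu_gp() from kernel/rcu/update.c. */
void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
                   struct rcu_synchronize *rs_array)
{
        int i;

        /* Post one callback per flavor; each fires after its grace period. */
        for (i = 0; i < n; i++) {
                init_rcu_head_on_stack(&rs_array[i].head);
                init_completion(&rs_array[i].completion);
                (crcu_array[i])(&rs_array[i].head, wakeme_after_rcu);
        }

        /* Wait for each callback's grace period to elapse. */
        for (i = 0; i < n; i++) {
                wait_for_completion(&rs_array[i].completion);
                destroy_rcu_head_on_stack(&rs_array[i].head);
        }

        /*
         * A fly-by wakeup can let wait_for_completion() return with only
         * release/acquire ordering, so (per the discussion above) a full
         * barrier is needed before the caller's post-grace-period code.
         */
        smp_mb();
}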

Re: [RFC] doc: rcu: remove note on smp_mb during synchronize_rcu

2018-11-03 Thread Joel Fernandes
On Sat, Nov 03, 2018 at 04:22:59PM -0700, Paul E. McKenney wrote:
> On Fri, Nov 02, 2018 at 10:12:26PM -0700, Joel Fernandes wrote:
> > On Thu, Nov 01, 2018 at 09:13:07AM -0700, Paul E. McKenney wrote:
> > > On Wed, Oct 31, 2018 at 10:00:19PM -0700, Joel Fernandes wrote:
> > > > On Wed, Oct 31, 2018 at 11:17:48AM -0700, Paul E. McKenney wrote:
> > > > > On Tue, Oct 30, 2018 at 06:11:19PM -0700, Joel Fernandes wrote:
> > > > > > Hi Paul,
> > > > > > 
> > > > > > On Tue, Oct 30, 2018 at 04:43:36PM -0700, Paul E. McKenney wrote:
> > > > > > > On Tue, Oct 30, 2018 at 03:26:49PM -0700, Joel Fernandes wrote:
> > > > > > > > Hi Paul,
> > > > > > > > 
> > > > > > > > On Sat, Oct 27, 2018 at 09:30:46PM -0700, Joel Fernandes 
> > > > > > > > (Google) wrote:
> > > > > > > > > As per this thread [1], it seems this smp_mb isn't needed 
> > > > > > > > > anymore:
> > > > > > > > > "So the smp_mb() that I was trying to add doesn't need to be 
> > > > > > > > > there."
> > > > > > > > > 
> > > > > > > > > So let us remove this part from the memory ordering 
> > > > > > > > > documentation.
> > > > > > > > > 
> > > > > > > > > [1] https://lkml.org/lkml/2017/10/6/707
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Joel Fernandes (Google) 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > I was just checking about this patch. Do you feel it is correct 
> > > > > > > > to remove
> > > > > > > > this part from the docs? Are you satisfied that a barrier 
> > > > > > > > isn't needed there
> > > > > > > > now? Or did I miss something?
> > > > > > > 
> > > > > > > Apologies, it got lost in the shuffle.  I have now applied it 
> > > > > > > with a
> > > > > > > bit of rework to the commit log, thank you!
> > > > > > 
> > > > > > No worries, thanks for taking it!
> > > > > > 
> > > > > > Just wanted to update you on my progress reading/correcting the 
> > > > > > docs. The
> > > > > > 'Memory Ordering' is taking a bit of time so I paused that and I'm 
> > > > > > focusing
> > > > > > on finishing all the other low-hanging fruit. This activity is 
> > > > > > mostly during
> > > > > > night hours after the baby is asleep but sometimes I also manage 
> > > > > > to sneak it
> > > > > > into the day job ;-)
> > > > > 
> > > > > If there is anything I can do to make this a more sustainable task for
> > > > > you, please do not keep it a secret!!!
> > > > 
> > > > Thanks a lot, that means a lot to me! Will do!
> > > > 
> > > > > > BTW I do want to discuss about this smp_mb patch above with you at 
> > > > > > LPC if you
> > > > > > had time, even though we are removing it from the documentation. I 
> > > > > > thought
> > > > > > about it a few times, and I was not able to fully appreciate the 
> > > > > > need for the
> > > > > > barrier (that is even assuming that complete() etc did not do the 
> > > > > > right
> > > > > > thing).  Specifically, I was wondering the same thing Peter said in 
> > > > > > the above
> > > > > > thread, I think: if that rcu_read_unlock() triggered all the 
> > > > > > spin
> > > > > > locking up the tree of nodes, then why is that locking not 
> > > > > > sufficient to
> > > > > > prevent reads from the read-side section from bleeding out? That 
> > > > > > would
> > > > > > prevent the reader that just unlocked from seeing anything that 
> > > > > > happens
> > > > > > _after_ the synchronize_rcu().
> > > > > 
> > > > > Actually, I recall an smp_mb() being added, but am not seeing it 
> > > > > anywhere
> > > > > relevant to wait_for_completion().  So I might need to add the 
> > > > > smp_mb()
> > > > > to synchronize_rcu() and remove the patch (retaining the typo fix).  
> > > > > :-/
> > > > 
> > > > No problem, I'm glad the patch at least resurfaced the topic of the 
> > > > potential
> > > > issue :-)
> > > 
> > > And an smp_mb() is needed in Tree RCU's __wait_rcu_gp().  This is
> > > because wait_for_completion() might get a "fly-by" wakeup, which would
> > > mean no ordering for code naively thinking that it was ordered after a
> > > grace period.
> > > 
> > > > > The short form answer is that anything before a grace period on any 
> > > > > CPU
> > > > > must be seen by any CPU as being before anything on any CPU after that
> > > > > same grace period.  This guarantee requires a rather big hammer.
> > > > > 
> > > > > But yes, let's talk at LPC!
> > > > 
> > > > Sounds great, looking forward to discussing this.
> > > 
> > > Would it make sense to have an RCU-implementation BoF?
> > > 
> > > > > > Also about GP memory ordering and RCU-tree-locking, I think you 
> > > > > > mentioned to
> > > > > > me that the RCU reader-sections are virtually extended both forward 
> > > > > > and
> > > > > > backward and wherever it ends, those paths do heavy-weight 
> > > > > > synchronization
> > > > > > that should be sufficient to prevent memory ordering issues (such 
> > > > > > as those
> > > > > > you mentioned in the Requirements document). That is exactly why we 
> > > > > > don't
> > > > > > need 
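
The "reliance on the locking" being asked about here is the rcu_node locking
discipline: every acquisition of an rcu_node lock goes through wrappers that
add smp_mb__after_unlock_lock(), upgrading each unlock+lock pair in the tree
into a full memory barrier.  A sketch of the pattern (paraphrased from the
kernel sources of that era, kernel/rcu/tree.h):

#define raw_spin_lock_rcu_node(p)                                       \
do {                                                                    \
        raw_spin_lock(&ACCESS_PRIVATE(p, lock));                        \
        smp_mb__after_unlock_lock(); /* Full barrier after lock. */     \
} while (0)

#define raw_spin_unlock_rcu_node(p)                                     \
        raw_spin_unlock(&ACCESS_PRIVATE(p, lock))

#define raw_spin_lock_irqsave_rcu_node(p, flags)                        \
do {                                                                    \
        raw_spin_lock_irqsave(&ACCESS_PRIVATE(p, lock), flags);         \
        smp_mb__after_unlock_lock(); /* Full barrier after lock. */     \
} while (0)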

Re: [RFC] doc: rcu: remove note on smp_mb during synchronize_rcu

2018-11-03 Thread Paul E. McKenney
On Fri, Nov 02, 2018 at 10:12:26PM -0700, Joel Fernandes wrote:
> On Thu, Nov 01, 2018 at 09:13:07AM -0700, Paul E. McKenney wrote:
> > On Wed, Oct 31, 2018 at 10:00:19PM -0700, Joel Fernandes wrote:
> > > On Wed, Oct 31, 2018 at 11:17:48AM -0700, Paul E. McKenney wrote:
> > > > On Tue, Oct 30, 2018 at 06:11:19PM -0700, Joel Fernandes wrote:
> > > > > Hi Paul,
> > > > > 
> > > > > On Tue, Oct 30, 2018 at 04:43:36PM -0700, Paul E. McKenney wrote:
> > > > > > On Tue, Oct 30, 2018 at 03:26:49PM -0700, Joel Fernandes wrote:
> > > > > > > Hi Paul,
> > > > > > > 
> > > > > > > On Sat, Oct 27, 2018 at 09:30:46PM -0700, Joel Fernandes (Google) 
> > > > > > > wrote:
> > > > > > > > As per this thread [1], it seems this smp_mb isn't needed 
> > > > > > > > anymore:
> > > > > > > > "So the smp_mb() that I was trying to add doesn't need to be 
> > > > > > > > there."
> > > > > > > > 
> > > > > > > > So let us remove this part from the memory ordering 
> > > > > > > > documentation.
> > > > > > > > 
> > > > > > > > [1] https://lkml.org/lkml/2017/10/6/707
> > > > > > > > 
> > > > > > > > Signed-off-by: Joel Fernandes (Google) 
> > > > > > > 
> > > > > > > I was just checking about this patch. Do you feel it is correct 
> > > > > > > to remove
> > > > > > > this part from the docs? Are you satisfied that a barrier isn't 
> > > > > > > needed there
> > > > > > > now? Or did I miss something?
> > > > > > 
> > > > > > Apologies, it got lost in the shuffle.  I have now applied it with a
> > > > > > bit of rework to the commit log, thank you!
> > > > > 
> > > > > No worries, thanks for taking it!
> > > > > 
> > > > > Just wanted to update you on my progress reading/correcting the docs. 
> > > > > The
> > > > > 'Memory Ordering' is taking a bit of time so I paused that and I'm 
> > > > > focusing
> > > > > on finishing all the other low-hanging fruit. This activity is mostly 
> > > > > during
> > > > > night hours after the baby is asleep but sometimes I also manage to 
> > > > > sneak it
> > > > > into the day job ;-)
> > > > 
> > > > If there is anything I can do to make this a more sustainable task for
> > > > you, please do not keep it a secret!!!
> > > 
> > > Thanks a lot, that means a lot to me! Will do!
> > > 
> > > > > BTW I do want to discuss about this smp_mb patch above with you at 
> > > > > LPC if you
> > > > > had time, even though we are removing it from the documentation. I 
> > > > > thought
> > > > > about it a few times, and I was not able to fully appreciate the need 
> > > > > for the
> > > > > barrier (that is even assuming that complete() etc did not do the 
> > > > > right
> > > > > thing).  Specifically, I was wondering the same thing Peter said in the 
> > > > > above
> > > > > thread, I think: if that rcu_read_unlock() triggered all the spin
> > > > > locking up the tree of nodes, then why is that locking not sufficient 
> > > > > to
> > > > > prevent reads from the read-side section from bleeding out? That would
> > > > > prevent the reader that just unlocked from seeing anything that 
> > > > > happens
> > > > > _after_ the synchronize_rcu().
> > > > 
> > > > Actually, I recall an smp_mb() being added, but am not seeing it 
> > > > anywhere
> > > > relevant to wait_for_completion().  So I might need to add the smp_mb()
> > > > to synchronize_rcu() and remove the patch (retaining the typo fix).  :-/
> > > 
> > > No problem, I'm glad the patch at least resurfaced the topic of the 
> > > potential
> > > issue :-)
> > 
> > And an smp_mb() is needed in Tree RCU's __wait_rcu_gp().  This is
> > because wait_for_completion() might get a "fly-by" wakeup, which would
> > mean no ordering for code naively thinking that it was ordered after a
> > grace period.
> > 
> > > > The short form answer is that anything before a grace period on any CPU
> > > > must be seen by any CPU as being before anything on any CPU after that
> > > > same grace period.  This guarantee requires a rather big hammer.
> > > > 
> > > > But yes, let's talk at LPC!
> > > 
> > > Sounds great, looking forward to discussing this.
> > 
> > Would it make sense to have an RCU-implementation BoF?
> > 
> > > > > Also about GP memory ordering and RCU-tree-locking, I think you 
> > > > > mentioned to
> > > > > me that the RCU reader-sections are virtually extended both forward 
> > > > > and
> > > > > backward and wherever it ends, those paths do heavy-weight 
> > > > > synchronization
> > > > > that should be sufficient to prevent memory ordering issues (such as 
> > > > > those
> > > > > you mentioned in the Requirements document). That is exactly why we 
> > > > > don't
> > > > > need explicit barriers during rcu_read_unlock. If I recall I asked 
> > > > > you why
> > > > > those are not needed. So that answer made sense, but now, on going
> > > > > through the 'Memory Ordering' document, I see that you mentioned 
> > > > > there is
> > > > > reliance on the locking. Is that reliance on locking necessary to 
> > > > > maintain
> > > > > 
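
For context, the complete()/wait_for_completion() pairing under discussion is
the one at the heart of synchronize_rcu(): the completion is signalled from an
RCU callback, so it fires only once a full grace period has elapsed.  A
minimal sketch of the pattern (the helper name synchronize_rcu_sketch() is
mine; the real path is synchronize_rcu() -> wait_rcu_gp() -> __wait_rcu_gp()
in kernel/rcu/update.c):

/* Ties an RCU callback to a completion (cf. struct rcu_synchronize). */
struct rcu_synchronize {
        struct rcu_head head;
        struct completion completion;
};

/* RCU callback: runs once a full grace period has elapsed. */
static void wakeme_after_rcu(struct rcu_head *head)
{
        struct rcu_synchronize *rcu;

        rcu = container_of(head, struct rcu_synchronize, head);
        complete(&rcu->completion);
}

/* Post a callback, then block until its grace period has ended. */
static void synchronize_rcu_sketch(void)
{
        struct rcu_synchronize rs;

        init_rcu_head_on_stack(&rs.head);
        init_completion(&rs.completion);
        call_rcu(&rs.head, wakeme_after_rcu);
        wait_for_completion(&rs.completion);
        destroy_rcu_head_on_stack(&rs.head);
}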

Re: [RFC] doc: rcu: remove note on smp_mb during synchronize_rcu

2018-11-02 Thread Joel Fernandes
On Thu, Nov 01, 2018 at 09:13:07AM -0700, Paul E. McKenney wrote:
> On Wed, Oct 31, 2018 at 10:00:19PM -0700, Joel Fernandes wrote:
> > On Wed, Oct 31, 2018 at 11:17:48AM -0700, Paul E. McKenney wrote:
> > > On Tue, Oct 30, 2018 at 06:11:19PM -0700, Joel Fernandes wrote:
> > > > Hi Paul,
> > > > 
> > > > On Tue, Oct 30, 2018 at 04:43:36PM -0700, Paul E. McKenney wrote:
> > > > > On Tue, Oct 30, 2018 at 03:26:49PM -0700, Joel Fernandes wrote:
> > > > > > Hi Paul,
> > > > > > 
> > > > > > On Sat, Oct 27, 2018 at 09:30:46PM -0700, Joel Fernandes (Google) 
> > > > > > wrote:
> > > > > > > As per this thread [1], it seems this smp_mb isn't needed anymore:
> > > > > > > "So the smp_mb() that I was trying to add doesn't need to be 
> > > > > > > there."
> > > > > > > 
> > > > > > > So let us remove this part from the memory ordering documentation.
> > > > > > > 
> > > > > > > [1] https://lkml.org/lkml/2017/10/6/707
> > > > > > > 
> > > > > > > Signed-off-by: Joel Fernandes (Google) 
> > > > > > 
> > > > > > I was just checking about this patch. Do you feel it is correct to 
> > > > > > remove
> > > > > > this part from the docs? Are you satisfied that a barrier isn't 
> > > > > > needed there
> > > > > > now? Or did I miss something?
> > > > > 
> > > > > Apologies, it got lost in the shuffle.  I have now applied it with a
> > > > > bit of rework to the commit log, thank you!
> > > > 
> > > > No worries, thanks for taking it!
> > > > 
> > > > Just wanted to update you on my progress reading/correcting the docs. 
> > > > The
> > > > 'Memory Ordering' is taking a bit of time so I paused that and I'm 
> > > > focusing
> > > > on finishing all the other low-hanging fruit. This activity is mostly 
> > > > during
> > > > night hours after the baby is asleep but sometimes I also manage to 
> > > > sneak it
> > > > into the day job ;-)
> > > 
> > > If there is anything I can do to make this a more sustainable task for
> > > you, please do not keep it a secret!!!
> > 
> > Thanks a lot, that means a lot to me! Will do!
> > 
> > > > BTW I do want to discuss about this smp_mb patch above with you at LPC 
> > > > if you
> > > > had time, even though we are removing it from the documentation. I 
> > > > thought
> > > > about it a few times, and I was not able to fully appreciate the need 
> > > > for the
> > > > barrier (that is even assuming that complete() etc did not do the right
> > > > thing).  Specifically, I was wondering the same thing Peter said in the above
> > > > thread, I think: if that rcu_read_unlock() triggered all the spin
> > > > locking up the tree of nodes, then why is that locking not sufficient to
> > > > prevent reads from the read-side section from bleeding out? That would
> > > > prevent the reader that just unlocked from seeing anything that happens
> > > > _after_ the synchronize_rcu().
> > > 
> > > Actually, I recall an smp_mb() being added, but am not seeing it anywhere
> > > relevant to wait_for_completion().  So I might need to add the smp_mb()
> > > to synchronize_rcu() and remove the patch (retaining the typo fix).  :-/
> > 
> > No problem, I'm glad the patch at least resurfaced the topic of the potential
> > issue :-)
> 
> And an smp_mb() is needed in Tree RCU's __wait_rcu_gp().  This is
> because wait_for_completion() might get a "fly-by" wakeup, which would
> mean no ordering for code naively thinking that it was ordered after a
> grace period.
> 
> > > The short form answer is that anything before a grace period on any CPU
> > > must be seen by any CPU as being before anything on any CPU after that
> > > same grace period.  This guarantee requires a rather big hammer.
> > > 
> > > But yes, let's talk at LPC!
> > 
> > Sounds great, looking forward to discussing this.
> 
> Would it make sense to have an RCU-implementation BoF?
> 
> > > > Also about GP memory ordering and RCU-tree-locking, I think you 
> > > > mentioned to
> > > > me that the RCU reader-sections are virtually extended both forward and
> > > > backward and wherever it ends, those paths do heavy-weight 
> > > > synchronization
> > > > that should be sufficient to prevent memory ordering issues (such as 
> > > > those
> > > > you mentioned in the Requirements document). That is exactly why we 
> > > > don't
> > > > need explicit barriers during rcu_read_unlock. If I recall I asked you 
> > > > why
> > > > those are not needed. So that answer made sense, but now, on going
> > > > through the 'Memory Ordering' document, I see that you mentioned there 
> > > > is
> > > > reliance on the locking. Is that reliance on locking necessary to 
> > > > maintain
> > > > ordering then?
> > > 
> > > There is a "network" of locking augmented by smp_mb__after_unlock_lock()
> > > that implements the all-to-all memory ordering mentioned above.  But it
> > > also needs to handle all the possible complete()/wait_for_completion()
> > > races, even those assisted by hypervisor vCPU preemption.
> > 
> > I see, so it sounds like the lock network is 
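
Joel's mental model here, quiescent-state reports climbing the rcu_node tree
while taking locks at each level, corresponds to a loop of roughly the shape
below (a heavily simplified sketch of rcu_report_qs_rnp() from
kernel/rcu/tree.c; the helper name is mine, and the mask and grace-period
bookkeeping is elided):

/*
 * Climb the rcu_node tree, clearing this report's bit at each level.
 * Each raw_spin_lock_irqsave_rcu_node() includes
 * smp_mb__after_unlock_lock(), so every step supplies a full barrier.
 */
static void report_qs_up_tree_sketch(struct rcu_node *rnp, unsigned long mask)
{
        unsigned long flags;

        for (;;) {
                raw_spin_lock_irqsave_rcu_node(rnp, flags);
                rnp->qsmask &= ~mask;   /* Record the report at this level. */
                if (rnp->qsmask != 0 || !rnp->parent) {
                        /* Still waiting on others, or reached the root. */
                        raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
                        break;
                }
                mask = rnp->grpmask;    /* This node's bit in its parent. */
                raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
                rnp = rnp->parent;
        }
}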

Re: [RFC] doc: rcu: remove note on smp_mb during synchronize_rcu

2018-11-02 Thread Paul E. McKenney
On Fri, Nov 02, 2018 at 03:14:29PM -0700, Joel Fernandes wrote:
> On Fri, Nov 02, 2018 at 01:00:00PM -0700, Paul E. McKenney wrote:
> [..] 
> > > I think it would also make sense to combine it with other memory-ordering
> > > topics like the memory model and rseq/cpu-opv things that Mathieu was 
> > > doing
> > > (if it makes sense to combine). But yes, I am definitely interested in an
> > > RCU-implementation BoF session.
> > 
> > There is an LKMM kernel summit track presentation.  I believe that
> > Mathieu's rseq/cpu-opv would be a good one as well, but Mathieu needs
> > to lead this up and it should be a separate BoF.  Please do feel free
> > to reach out to him.  I am sure that he would be particularly interested
> > in potential uses of rseq and especially cpu-opv.
> 
> Cool! Looking forward to the LKMM talk, and I'll keep in mind to reach out to
> Mathieu about rseq use cases.

Sounds good!

> > > > > > > Also about GP memory ordering and RCU-tree-locking, I think you 
> > > > > > > mentioned to
> > > > > > > me that the RCU reader-sections are virtually extended both 
> > > > > > > forward and
> > > > > > > backward and wherever it ends, those paths do heavy-weight 
> > > > > > > synchronization
> > > > > > > that should be sufficient to prevent memory ordering issues (such 
> > > > > > > as those
> > > > > > > you mentioned in the Requirements document). That is exactly why 
> > > > > > > we don't
> > > > > > > need explicit barriers during rcu_read_unlock. If I recall I 
> > > > > > > asked you why
> > > > > > > those are not needed. So that answer made sense, but now, on going 
> > > > > > > through the 'Memory Ordering' document, I see that you mentioned 
> > > > > > > there is
> > > > > > > reliance on the locking. Is that reliance on locking necessary to 
> > > > > > > maintain
> > > > > > > ordering then?
> > > > > > 
> > > > > > There is a "network" of locking augmented by 
> > > > > > smp_mb__after_unlock_lock()
> > > > > > that implements the all-to-all memory ordering mentioned above.  
> > > > > > But it
> > > > > > also needs to handle all the possible 
> > > > > > complete()/wait_for_completion()
> > > > > > races, even those assisted by hypervisor vCPU preemption.
> > > > > 
> > > > > I see, so it sounds like the lock network is just a partial solution. 
> > > > > For
> > > > > some reason I thought that before complete() was even called on the 
> > > > > CPU
> > > > > executing the callback, all the CPUs would have acquired and released 
> > > > > a lock
> > > > > in the "lock network" atleast once thus ensuring the ordering (due to 
> > > > > the
> > > > > fact that the quiescent state reporting has to travel up the tree 
> > > > > starting
> > > > > from the leaves), but I think that's not necessarily true so I see 
> > > > > your point
> > > > > now.
> > > > 
> > > > There is indeed a lock that is unconditionally acquired and released by
> > > > wait_for_completion(), but it lacks the smp_mb__after_unlock_lock() that
> > > > is required to get full-up any-to-any ordering.  And unfortunate timing
> > > > (as well as spurious wakeups) allows the interaction to have only normal
> > > > lock-release/acquire ordering, which does not suffice in all cases.
> > > > 
> > > > SRCU and expedited RCU grace periods handle this correctly.  Only the
> > > > normal grace periods are missing the needed barrier.  The probability of
> > > > failure is extremely low in the common case, which involves all sorts
> > > > of synchronization on the wakeup path.  It would be quite strange (but
> > > > not impossible) for the wait_for_completion() exit path to -not- do
> > > > a full wakeup.  Plus the bug requires a reader before the grace period
> > > > to do a store to some location that post-grace-period code loads from,
> > > > which is a very rare use case.
> > > > 
> > > > But it still should be fixed.  ;-)
> > > > 
> > > > > Did you feel this will violate condition 1. or condition 2. in 
> > > > > "Memory-Barrier
> > > > > Guarantees"? Or both?
> > > > > https://www.kernel.org/doc/Documentation/RCU/Design/Requirements/Requirements.html#Memory-Barrier%20Guarantees
> > > > 
> > > > Condition 1.  There might be some strange combination of events that
> > > > could also cause it to violate condition 2, but I am not 
> > > > immediately
> > > > seeing it.
> > > 
> > > In the previous paragraph, you mentioned the bug "requires a reader before
> > > the GP to do a store". However, condition 1 is really different - it is a
> > > reader holding a reference to a pointer that is used *after* the
> > > synchronize_rcu() returns. So that reader's load of the pointer should have
> > > completed by the time GP ends, otherwise the reader can look at kfree'd 
> > > data.
> > > That's different, right?
> > 
> > More specifically, the fix prevents a prior reader's -store- within its
> > critical section from being seen as happening after a load that follows the
> > end of the grace period.  So I stand by 
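
The condition-1 guarantee that the fix preserves can be written as a small
herd7 litmus test (a sketch; the variable names are mine).  If the reader did
not see the pre-grace-period store to y (r0 == 0), its critical section must
end before the grace period does, so its store to x must be visible to the
post-grace-period load; the Linux-kernel memory model forbids the outcome
below:

C rcu-gp-guarantee

{}

P0(int *x, int *y)
{
        int r0;

        rcu_read_lock();
        WRITE_ONCE(*x, 1);      /* Store within the critical section. */
        r0 = READ_ONCE(*y);
        rcu_read_unlock();
}

P1(int *x, int *y)
{
        int r1;

        WRITE_ONCE(*y, 1);
        synchronize_rcu();
        r1 = READ_ONCE(*x);     /* Post-grace-period load. */
}

exists (0:r0=0 /\ 1:r1=0)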

Re: [RFC] doc: rcu: remove note on smp_mb during synchronize_rcu

2018-11-02 Thread Paul E. McKenney
On Fri, Nov 02, 2018 at 03:14:29PM -0700, Joel Fernandes wrote:
> On Fri, Nov 02, 2018 at 01:00:00PM -0700, Paul E. McKenney wrote:
> [..] 
> > > I think it would make sense also to combine it with other memory-ordering
> > > topics like the memory model and rseq/cpu-opv things that Mathieu was 
> > > doing
> > > (if it makes sense to combine). But yes, I am definitely interested in an
> > > RCU-implementation BoF session.
> > 
> > There is an LKMM kernel summit track presentation.  I believe that
> > Mathieu's rseq/cpu-opv would be a good one as well, but Mathieu needs
> > to lead this up and it should be a separate BoF.  Please do feel free
> > to reach out to him.  I am sure that he would be particularly interested
> > in potential uses of rseq and especially cpu-opv.
> 
> Cool! Looking forward to the LKMM talk, and I'll keep in mind to reach
> out to Mathieu about rseq use cases.

Sounds good!

> > > > > > > Also about GP memory ordering and RCU-tree-locking, I think
> > > > > > > you mentioned to me that the RCU reader-sections are virtually
> > > > > > > extended both forward and backward, and wherever they end,
> > > > > > > those paths do heavy-weight synchronization that should be
> > > > > > > sufficient to prevent memory ordering issues (such as those
> > > > > > > you mentioned in the Requirements document). That is exactly
> > > > > > > why we don't need explicit barriers during rcu_read_unlock.
> > > > > > > If I recall, I asked you why those are not needed. So that
> > > > > > > answer made sense, but now on going through the 'Memory
> > > > > > > Ordering' document, I see that you mentioned there is reliance
> > > > > > > on the locking. Is that reliance on locking necessary to
> > > > > > > maintain ordering then?
> > > > > > 
> > > > > > There is a "network" of locking augmented by 
> > > > > > smp_mb__after_unlock_lock()
> > > > > > that implements the all-to-all memory ordering mentioned above.  
> > > > > > But it
> > > > > > also needs to handle all the possible 
> > > > > > complete()/wait_for_completion()
> > > > > > races, even those assisted by hypervisor vCPU preemption.
> > > > > 
> > > > > I see, so it sounds like the lock network is just a partial
> > > > > solution. For some reason I thought that before complete() was
> > > > > even called on the CPU executing the callback, all the CPUs would
> > > > > have acquired and released a lock in the "lock network" at least
> > > > > once, thus ensuring the ordering (due to the fact that the
> > > > > quiescent-state reporting has to travel up the tree starting from
> > > > > the leaves), but I think that's not necessarily true, so I see
> > > > > your point now.
> > > > 
> > > > There is indeed a lock that is unconditionally acquired and released by
> > > > wait_for_completion(), but it lacks the smp_mb__after_unlock_lock() that
> > > > is required to get full-up any-to-any ordering.  And unfortunate timing
> > > > (as well as spurious wakeups) allow the interaction to have only normal
> > > > lock-release/acquire ordering, which does not suffice in all cases.
> > > > 
> > > > SRCU and expedited RCU grace periods handle this correctly.  Only
> > > > the normal grace periods are missing the needed barrier.  The
> > > > probability of failure is extremely low in the common case, which
> > > > involves all sorts of synchronization on the wakeup path.  It would
> > > > be quite strange (but not impossible) for the wait_for_completion()
> > > > exit path to -not- do a full wakeup.  Plus the bug requires a reader
> > > > before the grace period to do a store to some location that
> > > > post-grace-period code loads from, which is a very rare use case.
> > > > 
> > > > But it still should be fixed.  ;-)
> > > > 
> > > > > Did you feel this will violate condition 1. or condition 2. in
> > > > > "Memory-Barrier Guarantees"? Or both?
> > > > > https://www.kernel.org/doc/Documentation/RCU/Design/Requirements/Requirements.html#Memory-Barrier%20Guarantees
> > > > 
> > > > Condition 1.  There might be some strange combination of events
> > > > that could cause it to also violate condition 2, but I am not
> > > > immediately seeing it.
> > > 
> > > In the previous paragraph, you mentioned the bug "requires a reader
> > > before the GP to do a store". However, condition 1 is really
> > > different: it is a reader holding a reference to a pointer that is
> > > used *after* synchronize_rcu returns. So that reader's load of the
> > > pointer should have completed by the time the GP ends, otherwise the
> > > reader can look at kfree'd data. That's different, right?
> > 
> > More specifically, the fix prevents a prior reader's -store- within its
> > critical section from being seen as happening after a load that follows
> > the end of the grace period.  So I stand by Condition 1.  ;-)
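
For reference, the "lock network" idiom under discussion looks roughly like
the following sketch, where rnp is a hypothetical pointer to an rcu_node
structure (the naming follows kernel conventions, but this is illustrative,
not the actual tree.c code):

	unsigned long flags;

	raw_spin_lock_irqsave(&rnp->lock, flags);
	smp_mb__after_unlock_lock();  /* Upgrade to full ordering against
				       * all prior critical sections. */
	/* ... record and propagate the quiescent state up the tree ... */
	raw_spin_unlock_irqrestore(&rnp->lock, flags);

Plain unlock-then-lock hands off only release/acquire ordering; the extra
barrier is what lets the chain of rcu_node lock acquisitions provide the
all-to-all ordering mentioned above.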

Re: [RFC] doc: rcu: remove note on smp_mb during synchronize_rcu

2018-11-02 Thread Joel Fernandes
On Fri, Nov 02, 2018 at 01:00:00PM -0700, Paul E. McKenney wrote:
[..] 
> > I think it would make sense also to combine it with other memory-ordering
> > topics like the memory model and rseq/cpu-opv things that Mathieu was doing
> > (if it makes sense to combine). But yes, I am definitely interested in an
> > RCU-implementation BoF session.
> 
> There is an LKMM kernel summit track presentation.  I believe that
> Mathieu's rseq/cpu-opv would be a good one as well, but Mathieu needs
> to lead this up and it should be a separate BoF.  Please do feel free
> to reach out to him.  I am sure that he would be particularly interested
> in potential uses of rseq and especially cpu-opv.

Cool! Looking forward to the LKMM talk, and I'll keep in mind to reach out
to Mathieu about rseq use cases.

> > > > > > Also about GP memory ordering and RCU-tree-locking, I think you
> > > > > > mentioned to me that the RCU reader-sections are virtually
> > > > > > extended both forward and backward, and wherever they end, those
> > > > > > paths do heavy-weight synchronization that should be sufficient
> > > > > > to prevent memory ordering issues (such as those you mentioned
> > > > > > in the Requirements document). That is exactly why we don't need
> > > > > > explicit barriers during rcu_read_unlock. If I recall, I asked
> > > > > > you why those are not needed. So that answer made sense, but now
> > > > > > on going through the 'Memory Ordering' document, I see that you
> > > > > > mentioned there is reliance on the locking. Is that reliance on
> > > > > > locking necessary to maintain ordering then?
> > > > > 
> > > > > There is a "network" of locking augmented by 
> > > > > smp_mb__after_unlock_lock()
> > > > > that implements the all-to-all memory ordering mentioned above.  But 
> > > > > it
> > > > > also needs to handle all the possible complete()/wait_for_completion()
> > > > > races, even those assisted by hypervisor vCPU preemption.
> > > > 
> > > > I see, so it sounds like the lock network is just a partial
> > > > solution. For some reason I thought that before complete() was even
> > > > called on the CPU executing the callback, all the CPUs would have
> > > > acquired and released a lock in the "lock network" at least once,
> > > > thus ensuring the ordering (due to the fact that the quiescent-state
> > > > reporting has to travel up the tree starting from the leaves), but I
> > > > think that's not necessarily true, so I see your point now.
> > > 
> > > There is indeed a lock that is unconditionally acquired and released by
> > > wait_for_completion(), but it lacks the smp_mb__after_unlock_lock() that
> > > is required to get full-up any-to-any ordering.  And unfortunate timing
> > > (as well as spurious wakeups) allow the interaction to have only normal
> > > lock-release/acquire ordering, which does not suffice in all cases.
> > > 
> > > SRCU and expedited RCU grace periods handle this correctly.  Only the
> > > normal grace periods are missing the needed barrier.  The probability
> > > of failure is extremely low in the common case, which involves all
> > > sorts of synchronization on the wakeup path.  It would be quite
> > > strange (but not impossible) for the wait_for_completion() exit path
> > > to -not- do a full wakeup.  Plus the bug requires a reader before the
> > > grace period to do a store to some location that post-grace-period
> > > code loads from, which is a very rare use case.
> > > 
> > > But it still should be fixed.  ;-)
> > > 
> > > > Did you feel this will violate condition 1. or condition 2. in
> > > > "Memory-Barrier Guarantees"? Or both?
> > > > https://www.kernel.org/doc/Documentation/RCU/Design/Requirements/Requirements.html#Memory-Barrier%20Guarantees
> > > 
> > > Condition 1.  There might be some strange combination of events that
> > > could cause it to also violate condition 2, but I am not immediately
> > > seeing it.
> > 
> > In the previous paragraph, you mentioned the bug "requires a reader
> > before the GP to do a store". However, condition 1 is really different:
> > it is a reader holding a reference to a pointer that is used *after*
> > synchronize_rcu returns. So that reader's load of the pointer should
> > have completed by the time the GP ends, otherwise the reader can look
> > at kfree'd data. That's different, right?
> 
> More specifically, the fix prevents a prior reader's -store- within its
> critical section from being seen as happening after a load that follows
> the end of the grace period.  So I stand by Condition 1.  ;-)
> And again, a store within an RCU read-side critical section is a bit
> on the strange side, but this sort of thing is perfectly legal and
> is used, albeit rather rarely.

Cool :) I never thought about condition 1 this way, but good to know that's
possible :)

> > For condition 2, I 
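
As Paul notes above, a store inside an RCU read-side critical section is
perfectly legal, just uncommon. A sketch of such a reader (gp and the hits
field are hypothetical, for illustration only):

	struct foo *p;

	rcu_read_lock();
	p = rcu_dereference(gp);  /* gp: hypothetical RCU-protected pointer */
	if (p)
		/* The rare case: a store inside the read-side section. */
		WRITE_ONCE(p->hits, READ_ONCE(p->hits) + 1);
	rcu_read_unlock();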

Re: [RFC] doc: rcu: remove note on smp_mb during synchronize_rcu

2018-11-02 Thread Paul E. McKenney
On Thu, Nov 01, 2018 at 11:15:18PM -0700, Joel Fernandes wrote:
> On Thu, Nov 01, 2018 at 09:13:07AM -0700, Paul E. McKenney wrote:
> > > > > BTW I do want to discuss this smp_mb patch above with you at LPC
> > > > > if you had time, even though we are removing it from the
> > > > > documentation. I thought about it a few times, and I was not able
> > > > > to fully appreciate the need for the barrier (that is, even
> > > > > assuming that complete() etc. did not do the right thing).
> > > > > Specifically, I was wondering the same thing Peter said in the
> > > > > above thread, I think: if that rcu_read_unlock() triggered all the
> > > > > spin locking up the tree of nodes, then why is that locking not
> > > > > sufficient to prevent reads from the read-side section from
> > > > > bleeding out? That would prevent the reader that just unlocked
> > > > > from seeing anything that happens _after_ the synchronize_rcu.
> > > > 
> > > > Actually, I recall an smp_mb() being added, but am not seeing it
> > > > anywhere relevant to wait_for_completion().  So I might need to add
> > > > the smp_mb() to synchronize_rcu() and remove the patch (retaining
> > > > the typo fix).  :-/
> > > 
> > > No problem, I'm glad at least the patch resurfaced the topic of the
> > > potential issue :-)
> > 
> > And an smp_mb() is needed in Tree RCU's __wait_rcu_gp().  This is
> > because wait_for_completion() might get a "fly-by" wakeup, which would
> > mean no ordering for code naively thinking that it was ordered after a
> > grace period.
> 
> Makes sense.
> 
> > > > The short form answer is that anything before a grace period on any CPU
> > > > must be seen by any CPU as being before anything on any CPU after that
> > > > same grace period.  This guarantee requires a rather big hammer.
> > > > 
> > > > But yes, let's talk at LPC!
> > > 
> > > Sounds great, looking forward to discussing this.
> > 
> > Would it make sense to have an RCU-implementation BoF?
> 
> Yes, I would very much like that. I also spoke with my colleague Daniel
> Colascione and he said he would be interested too.

Sounds good!

> I think it would make sense also to combine it with other memory-ordering
> topics like the memory model and rseq/cpu-opv things that Mathieu was doing
> (if it makes sense to combine). But yes, I am definitely interested in an
> RCU-implementation BoF session.

There is an LKMM kernel summit track presentation.  I believe that
Mathieu's rseq/cpu-opv would be a good one as well, but Mathieu needs
to lead this up and it should be a separate BoF.  Please do feel free
to reach out to him.  I am sure that he would be particularly interested
in potential uses of rseq and especially cpu-opv.

> > > > > Also about GP memory ordering and RCU-tree-locking, I think you
> > > > > mentioned to me that the RCU reader-sections are virtually
> > > > > extended both forward and backward, and wherever they end, those
> > > > > paths do heavy-weight synchronization that should be sufficient to
> > > > > prevent memory ordering issues (such as those you mentioned in the
> > > > > Requirements document). That is exactly why we don't need explicit
> > > > > barriers during rcu_read_unlock. If I recall, I asked you why
> > > > > those are not needed. So that answer made sense, but now on going
> > > > > through the 'Memory Ordering' document, I see that you mentioned
> > > > > there is reliance on the locking. Is that reliance on locking
> > > > > necessary to maintain ordering then?
> > > > 
> > > > There is a "network" of locking augmented by smp_mb__after_unlock_lock()
> > > > that implements the all-to-all memory ordering mentioned above.  But it
> > > > also needs to handle all the possible complete()/wait_for_completion()
> > > > races, even those assisted by hypervisor vCPU preemption.
> > > 
> > > I see, so it sounds like the lock network is just a partial solution.
> > > For some reason I thought that before complete() was even called on
> > > the CPU executing the callback, all the CPUs would have acquired and
> > > released a lock in the "lock network" at least once, thus ensuring the
> > > ordering (due to the fact that the quiescent-state reporting has to
> > > travel up the tree starting from the leaves), but I think that's not
> > > necessarily true, so I see your point now.
> > 
> > There is indeed a lock that is unconditionally acquired and released by
> > wait_for_completion(), but it lacks the smp_mb__after_unlock_lock() that
> > is required to get full-up any-to-any ordering.  And unfortunate timing
> > (as well as spurious wakeups) allow the interaction to have only normal
> > lock-release/acquire ordering, which does not suffice in all cases.
> > 
> > SRCU and expedited RCU grace periods handle this correctly.  Only the
> > normal grace periods are missing the needed barrier.
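
A sketch of where the needed barrier would go, per Paul's note above that
an smp_mb() is needed in Tree RCU's __wait_rcu_gp() (the rs_array field
names follow kernel/rcu/update.c conventions, but this is illustrative,
not the actual patch):

	/* In __wait_rcu_gp(), after the grace-period callback fires: */
	wait_for_completion(&rs_array[i].completion);
	smp_mb();  /* A "fly-by" wakeup supplies only release/acquire
		    * ordering, so force full ordering here before
		    * returning to the caller of synchronize_rcu(). */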

Re: [RFC] doc: rcu: remove note on smp_mb during synchronize_rcu

2018-11-02 Thread Joel Fernandes
On Thu, Nov 01, 2018 at 09:13:07AM -0700, Paul E. McKenney wrote:
> > > > BTW I do want to discuss this smp_mb patch above with you at LPC if
> > > > you had time, even though we are removing it from the documentation.
> > > > I thought about it a few times, and I was not able to fully
> > > > appreciate the need for the barrier (that is, even assuming that
> > > > complete() etc. did not do the right thing).  Specifically, I was
> > > > wondering the same thing Peter said in the above thread, I think: if
> > > > that rcu_read_unlock() triggered all the spin locking up the tree of
> > > > nodes, then why is that locking not sufficient to prevent reads from
> > > > the read-side section from bleeding out? That would prevent the
> > > > reader that just unlocked from seeing anything that happens _after_
> > > > the synchronize_rcu.
> > > 
> > > Actually, I recall an smp_mb() being added, but am not seeing it
> > > anywhere relevant to wait_for_completion().  So I might need to add
> > > the smp_mb() to synchronize_rcu() and remove the patch (retaining the
> > > typo fix).  :-/
> > 
> > No problem, I'm glad at least the patch resurfaced the topic of the
> > potential issue :-)
> 
> And an smp_mb() is needed in Tree RCU's __wait_rcu_gp().  This is
> because wait_for_completion() might get a "fly-by" wakeup, which would
> mean no ordering for code naively thinking that it was ordered after a
> grace period.

Makes sense.

> > > The short form answer is that anything before a grace period on any CPU
> > > must be seen by any CPU as being before anything on any CPU after that
> > > same grace period.  This guarantee requires a rather big hammer.
> > > 
> > > But yes, let's talk at LPC!
> > 
> > Sounds great, looking forward to discussing this.
> 
> Would it make sense to have an RCU-implementation BoF?

Yes, I would very much like that. I also spoke with my colleague Daniel
Colascione and he said he would be interested too.

I think it would make sense also to combine it with other memory-ordering
topics like the memory model and rseq/cpu-opv things that Mathieu was doing
(if it makes sense to combine). But yes, I am definitely interested in an
RCU-implementation BoF session.

> > > > Also about GP memory ordering and RCU-tree-locking, I think you
> > > > mentioned to me that the RCU reader-sections are virtually extended
> > > > both forward and backward, and wherever they end, those paths do
> > > > heavy-weight synchronization that should be sufficient to prevent
> > > > memory ordering issues (such as those you mentioned in the
> > > > Requirements document). That is exactly why we don't need explicit
> > > > barriers during rcu_read_unlock. If I recall, I asked you why those
> > > > are not needed. So that answer made sense, but now on going through
> > > > the 'Memory Ordering' document, I see that you mentioned there is
> > > > reliance on the locking. Is that reliance on locking necessary to
> > > > maintain ordering then?
> > > 
> > > There is a "network" of locking augmented by smp_mb__after_unlock_lock()
> > > that implements the all-to-all memory ordering mentioned above.  But it
> > > also needs to handle all the possible complete()/wait_for_completion()
> > > races, even those assisted by hypervisor vCPU preemption.
> > 
> > I see, so it sounds like the lock network is just a partial solution.
> > For some reason I thought that before complete() was even called on the
> > CPU executing the callback, all the CPUs would have acquired and
> > released a lock in the "lock network" at least once, thus ensuring the
> > ordering (due to the fact that the quiescent-state reporting has to
> > travel up the tree starting from the leaves), but I think that's not
> > necessarily true, so I see your point now.
> 
> There is indeed a lock that is unconditionally acquired and released by
> wait_for_completion(), but it lacks the smp_mb__after_unlock_lock() that
> is required to get full-up any-to-any ordering.  And unfortunate timing
> (as well as spurious wakeups) allow the interaction to have only normal
> lock-release/acquire ordering, which does not suffice in all cases.
> 
> SRCU and expedited RCU grace periods handle this correctly.  Only the
> normal grace periods are missing the needed barrier.  The probability
> of failure is extremely low in the common case, which involves all
> sorts of synchronization on the wakeup path.  It would be quite strange
> (but not impossible) for the wait_for_completion() exit path to -not-
> do a full wakeup.  Plus the bug requires a reader before the grace
> period to do a store to some location that post-grace-period code loads
> from, which is a very rare use case.
> 
> But it still should be fixed.  ;-)
> 
> > Did you feel this will violate condition 1. or condition 2. in
> > "Memory-Barrier Guarantees"? Or both?
> > https://www.kernel.org/doc/Documentation/RCU/Design/Requirements/Requirements.html#Memory-Barrier%20Guarantees
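
For reference, the complete()/wait_for_completion() interaction discussed
above reduces to a lock handoff on the completion's wait-queue lock; a
simplified sketch (gp_done is hypothetical and assumed to have been set up
with init_completion()):

	static struct completion gp_done;  /* hypothetical completion */

	static void waker(void)  /* runs at grace-period end */
	{
		complete(&gp_done);  /* lock; done++; wake waiter; unlock */
	}

	static void waiter(void)  /* runs inside synchronize_rcu() */
	{
		wait_for_completion(&gp_done);  /* lock; wait; unlock */
		/* The unlock/lock handoff orders waiter after waker only
		 * as release/acquire, not the any-to-any ordering RCU
		 * guarantees; hence the explicit full barrier: */
		smp_mb();
	}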

Re: [RFC] doc: rcu: remove note on smp_mb during synchronize_rcu

2018-11-01 Thread Paul E. McKenney
On Wed, Oct 31, 2018 at 10:00:19PM -0700, Joel Fernandes wrote:
> On Wed, Oct 31, 2018 at 11:17:48AM -0700, Paul E. McKenney wrote:
> > On Tue, Oct 30, 2018 at 06:11:19PM -0700, Joel Fernandes wrote:
> > > Hi Paul,
> > > 
> > > On Tue, Oct 30, 2018 at 04:43:36PM -0700, Paul E. McKenney wrote:
> > > > On Tue, Oct 30, 2018 at 03:26:49PM -0700, Joel Fernandes wrote:
> > > > > Hi Paul,
> > > > > 
> > > > > On Sat, Oct 27, 2018 at 09:30:46PM -0700, Joel Fernandes (Google) wrote:
> > > > > > As per this thread [1], it seems this smp_mb isn't needed anymore:
> > > > > > "So the smp_mb() that I was trying to add doesn't need to be there."
> > > > > > 
> > > > > > So let us remove this part from the memory ordering documentation.
> > > > > > 
> > > > > > [1] https://lkml.org/lkml/2017/10/6/707
> > > > > > 
> > > > > > Signed-off-by: Joel Fernandes (Google) 
> > > > > 
> > > > > I was just checking about this patch. Do you feel it is correct
> > > > > to remove this part from the docs? Are you satisfied that a
> > > > > barrier isn't needed there now? Or did I miss something?
> > > > 
> > > > Apologies, it got lost in the shuffle.  I have now applied it with a
> > > > bit of rework to the commit log, thank you!
> > > 
> > > No worries, thanks for taking it!
> > > 
> > > Just wanted to update you on my progress reading/correcting the docs.
> > > The 'Memory Ordering' one is taking a bit of time, so I paused that
> > > and I'm focusing on finishing all the other low-hanging fruit. This
> > > activity is mostly during night hours after the baby is asleep, but
> > > sometimes I also manage to sneak it into the day job ;-)
> > 
> > If there is anything I can do to make this a more sustainable task for
> > you, please do not keep it a secret!!!
> 
> Thanks a lot, that means a lot to me! Will do!
> 
> > > BTW I do want to discuss this smp_mb patch above with you at LPC if
> > > you had time, even though we are removing it from the documentation.
> > > I thought about it a few times, and I was not able to fully appreciate
> > > the need for the barrier (that is, even assuming that complete() etc.
> > > did not do the right thing).  Specifically, I was wondering the same
> > > thing Peter said in the above thread, I think: if that
> > > rcu_read_unlock() triggered all the spin locking up the tree of nodes,
> > > then why is that locking not sufficient to prevent reads from the
> > > read-side section from bleeding out? That would prevent the reader
> > > that just unlocked from seeing anything that happens _after_ the
> > > synchronize_rcu.
> > 
> > Actually, I recall an smp_mb() being added, but am not seeing it anywhere
> > relevant to wait_for_completion().  So I might need to add the smp_mb()
> > to synchronize_rcu() and remove the patch (retaining the typo fix).  :-/
> 
> No problem, I'm glad at least the patch resurfaced the topic of the
> potential issue :-)

And an smp_mb() is needed in Tree RCU's __wait_rcu_gp().  This is
because wait_for_completion() might get a "fly-by" wakeup, which would
mean no ordering for code naively thinking that it was ordered after a
grace period.

> > The short form answer is that anything before a grace period on any CPU
> > must be seen by any CPU as being before anything on any CPU after that
> > same grace period.  This guarantee requires a rather big hammer.
> > 
> > But yes, let's talk at LPC!
> 
> Sounds great, looking forward to discussing this.

Would it make sense to have an RCU-implementation BoF?

> > > Also about GP memory ordering and RCU-tree-locking, I think you
> > > mentioned to me that the RCU reader-sections are virtually extended
> > > both forward and backward, and wherever they end, those paths do
> > > heavy-weight synchronization that should be sufficient to prevent
> > > memory ordering issues (such as those you mentioned in the
> > > Requirements document). That is exactly why we don't need explicit
> > > barriers during rcu_read_unlock. If I recall, I asked you why those
> > > are not needed. So that answer made sense, but now on going through
> > > the 'Memory Ordering' document, I see that you mentioned there is
> > > reliance on the locking. Is that reliance on locking necessary to
> > > maintain ordering then?
> > 
> > There is a "network" of locking augmented by smp_mb__after_unlock_lock()
> > that implements the all-to-all memory ordering mentioned above.  But it
> > also needs to handle all the possible complete()/wait_for_completion()
> > races, even those assisted by hypervisor vCPU preemption.
> 
> I see, so it sounds like the lock network is just a partial solution.
> For some reason I thought that before complete() was even called on the
> CPU executing the callback, all the CPUs would have acquired and
> released a lock in the "lock network" at least once, thus ensuring the
> ordering (due to the fact that the quiescent-state reporting has to
> travel up the tree starting from the leaves), but I think that's not
> necessarily true, so I see your point now.
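
Paul's "short form answer" above is the guarantee behind the classic update
pattern; a sketch (gp, gp_lock, and struct foo are hypothetical, and new is
assumed to point to an initialized replacement):

	struct foo *old, *new;

	spin_lock(&gp_lock);
	old = rcu_dereference_protected(gp, lockdep_is_held(&gp_lock));
	rcu_assign_pointer(gp, new);
	spin_unlock(&gp_lock);

	synchronize_rcu();  /* All pre-existing readers are done, and
			     * everything they did is ordered before
			     * everything after this point. */
	kfree(old);  /* Safe: no reader can still reference 'old'. */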

Re: [RFC] doc: rcu: remove note on smp_mb during synchronize_rcu

2018-10-31 Thread Joel Fernandes
On Wed, Oct 31, 2018 at 11:17:48AM -0700, Paul E. McKenney wrote:
> On Tue, Oct 30, 2018 at 06:11:19PM -0700, Joel Fernandes wrote:
> > Hi Paul,
> > 
> > On Tue, Oct 30, 2018 at 04:43:36PM -0700, Paul E. McKenney wrote:
> > > On Tue, Oct 30, 2018 at 03:26:49PM -0700, Joel Fernandes wrote:
> > > > Hi Paul,
> > > > 
> > > > On Sat, Oct 27, 2018 at 09:30:46PM -0700, Joel Fernandes (Google) wrote:
> > > > > As per this thread [1], it seems this smp_mb isn't needed anymore:
> > > > > "So the smp_mb() that I was trying to add doesn't need to be there."
> > > > > 
> > > > > So let us remove this part from the memory ordering documentation.
> > > > > 
> > > > > [1] https://lkml.org/lkml/2017/10/6/707
> > > > > 
> > > > > Signed-off-by: Joel Fernandes (Google) 
> > > > 
> > > > I was just checking about this patch. Do you feel it is correct to
> > > > remove this part from the docs? Are you satisfied that a barrier
> > > > isn't needed there now? Or did I miss something?
> > > 
> > > Apologies, it got lost in the shuffle.  I have now applied it with a
> > > bit of rework to the commit log, thank you!
> > 
> > No worries, thanks for taking it!
> > 
> > Just wanted to update you on my progress reading/correcting the docs.
> > The 'Memory Ordering' one is taking a bit of time, so I paused that and
> > I'm focusing on finishing all the other low-hanging fruit. This
> > activity is mostly during night hours after the baby is asleep, but
> > sometimes I also manage to sneak it into the day job ;-)
> 
> If there is anything I can do to make this a more sustainable task for
> you, please do not keep it a secret!!!

Thanks a lot, that means a lot to me! Will do!

> > BTW I do want to discuss this smp_mb patch above with you at LPC if you
> > had time, even though we are removing it from the documentation. I
> > thought about it a few times, and I was not able to fully appreciate
> > the need for the barrier (that is, even assuming that complete() etc.
> > did not do the right thing).  Specifically, I was wondering the same
> > thing Peter said in the above thread, I think: if that rcu_read_unlock()
> > triggered all the spin locking up the tree of nodes, then why is that
> > locking not sufficient to prevent reads from the read-side section from
> > bleeding out? That would prevent the reader that just unlocked from
> > seeing anything that happens _after_ the synchronize_rcu.
> 
> Actually, I recall an smp_mb() being added, but am not seeing it anywhere
> relevant to wait_for_completion().  So I might need to add the smp_mb()
> to synchronize_rcu() and remove the patch (retaining the typo fix).  :-/

No problem, I'm glad at least the patch resurfaced the topic of the
potential issue :-)

> The short form answer is that anything before a grace period on any CPU
> must be seen by any CPU as being before anything on any CPU after that
> same grace period.  This guarantee requires a rather big hammer.
> 
> But yes, let's talk at LPC!

Sounds great, looking forward to discussing this.

> > Also about GP memory ordering and RCU-tree-locking, I think you
> > mentioned to me that the RCU reader-sections are virtually extended
> > both forward and backward, and wherever they end, those paths do
> > heavy-weight synchronization that should be sufficient to prevent
> > memory ordering issues (such as those you mentioned in the Requirements
> > document). That is exactly why we don't need explicit barriers during
> > rcu_read_unlock. If I recall, I asked you why those are not needed. So
> > that answer made sense, but now on going through the 'Memory Ordering'
> > document, I see that you mentioned there is reliance on the locking. Is
> > that reliance on locking necessary to maintain ordering then?
> 
> There is a "network" of locking augmented by smp_mb__after_unlock_lock()
> that implements the all-to-all memory ordering mentioned above.  But it
> also needs to handle all the possible complete()/wait_for_completion()
> races, even those assisted by hypervisor vCPU preemption.

I see, so it sounds like the lock network is just a partial solution. For
some reason I thought that before complete() was even called on the CPU
executing the callback, all the CPUs would have acquired and released a
lock in the "lock network" at least once, thus ensuring the ordering (due
to the fact that the quiescent-state reporting has to travel up the tree
starting from the leaves), but I think that's not necessarily true, so I
see your point now.

Did you feel this will violate condition 1. or condition 2. in "Memory-Barrier
Guarantees"? Or both?
https://www.kernel.org/doc/Documentation/RCU/Design/Requirements/Requirements.html#Memory-Barrier%20Guarantees

> > --
> > TODO list of the index file marking which ones I have finished perusing:
> > 
> > arrayRCU.txt    DONE
> > checklist.txt   DONE
> > listRCU.txt     DONE
> > lockdep.txt     DONE

Re: [RFC] doc: rcu: remove note on smp_mb during synchronize_rcu

2018-10-31 Thread Paul E. McKenney
On Tue, Oct 30, 2018 at 06:11:19PM -0700, Joel Fernandes wrote:
> Hi Paul,
> 
> On Tue, Oct 30, 2018 at 04:43:36PM -0700, Paul E. McKenney wrote:
> > On Tue, Oct 30, 2018 at 03:26:49PM -0700, Joel Fernandes wrote:
> > > Hi Paul,
> > > 
> > > On Sat, Oct 27, 2018 at 09:30:46PM -0700, Joel Fernandes (Google) wrote:
> > > > As per this thread [1], it seems this smp_mb isn't needed anymore:
> > > > "So the smp_mb() that I was trying to add doesn't need to be there."
> > > > 
> > > > So let us remove this part from the memory ordering documentation.
> > > > 
> > > > [1] https://lkml.org/lkml/2017/10/6/707
> > > > 
> > > > Signed-off-by: Joel Fernandes (Google) 
> > > 
> > > I was just checking about this patch. Do you feel it is correct to
> > > remove this part from the docs? Are you satisfied that a barrier
> > > isn't needed there now? Or did I miss something?
> > 
> > Apologies, it got lost in the shuffle.  I have now applied it with a
> > bit of rework to the commit log, thank you!
> 
> No worries, thanks for taking it!
> 
> Just wanted to update you on my progress reading/correcting the docs. The
> 'Memory Ordering' one is taking a bit of time, so I paused that and I'm
> focusing on finishing all the other low-hanging fruit. This activity is
> mostly during night hours after the baby is asleep, but sometimes I also
> manage to sneak it into the day job ;-)

If there is anything I can do to make this a more sustainable task for
you, please do not keep it a secret!!!

> BTW I do want to discuss this smp_mb patch above with you at LPC if you
> had time, even though we are removing it from the documentation. I
> thought about it a few times, and I was not able to fully appreciate the
> need for the barrier (that is, even assuming that complete() etc. did not
> do the right thing).  Specifically, I was wondering the same thing Peter
> said in the above thread, I think: if that rcu_read_unlock() triggered
> all the spin locking up the tree of nodes, then why is that locking not
> sufficient to prevent reads from the read-side section from bleeding out?
> That would prevent the reader that just unlocked from seeing anything
> that happens _after_ the synchronize_rcu.

Actually, I recall an smp_mb() being added, but am not seeing it anywhere
relevant to wait_for_completion().  So I might need to add the smp_mb()
to synchronize_rcu() and remove the patch (retaining the typo fix).  :-/
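
If I do, the result would look roughly like the sketch below (simplified;
the in-tree synchronize_rcu() has more moving parts, so the sketch_ names
and the structure are illustrative only):

/* Simplified sketch of synchronize_rcu()'s completion-based wait path. */
static void sketch_wakeme_after_rcu(struct rcu_head *head)
{
	struct rcu_synchronize *rcu =
		container_of(head, struct rcu_synchronize, head);

	complete(&rcu->completion);	/* Invoked from an RCU callback,
					 * hence only after a full grace
					 * period has elapsed. */
}

static void sketch_synchronize_rcu(void)
{
	struct rcu_synchronize rcu;

	init_completion(&rcu.completion);
	call_rcu(&rcu.head, sketch_wakeme_after_rcu);
	wait_for_completion(&rcu.completion);
	smp_mb();	/* Order the caller's later accesses after the grace
			 * period, even when the completion handoff provides
			 * only ordinary lock-release/acquire ordering. */
}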

The short form answer is that anything before a grace period on any CPU
must be seen by any CPU as being before anything on any CPU after that
same grace period.  This guarantee requires a rather big hammer.
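
In litmus-test form (LKMM style, as in tools/memory-model, with x and y as
hypothetical shared variables), that guarantee forbids the following outcome:

C RCU-gp-guarantee-sketch

{}

P0(int *x, int *y)
{
	int r1;
	int r2;

	rcu_read_lock();
	r1 = READ_ONCE(*x);
	r2 = READ_ONCE(*y);
	rcu_read_unlock();
}

P1(int *x, int *y)
{
	WRITE_ONCE(*y, 1);
	synchronize_rcu();
	WRITE_ONCE(*x, 1);
}

exists (0:r1=1 /\ 0:r2=0)

If P0's critical section sees the post-grace-period write to x, the whole
section must be ordered after the start of P1's grace period, so it must
also see the pre-grace-period write to y.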

But yes, let's talk at LPC!

> Also about GP memory ordering and RCU-tree-locking, I think you mentioned to
> me that the RCU reader-sections are virtually extended both forward and
> backward and, wherever they end, those paths do heavy-weight synchronization
> that should be sufficient to prevent memory ordering issues (such as those
> you mentioned in the Requirements document). That is exactly why we don't
> need explicit barriers during rcu_read_unlock. If I recall, I asked you why
> those are not needed. That answer made sense, but now, on going through
> the 'Memory Ordering' document, I see that you mentioned there is
> reliance on the locking. Is that reliance on locking necessary to maintain
> ordering, then?

There is a "network" of locking augmented by smp_mb__after_unlock_lock()
that implements the all-to-all memory ordering mentioned above.  But it
also needs to handle all the possible complete()/wait_for_completion()
races, even those assisted by hypervisor vCPU preemption.
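
The acquisition side of that network looks roughly like the sketch below
(modeled on the _rcu_node() locking convention in kernel/rcu/rcu.h; the
real macros differ in detail, for example by using ACCESS_PRIVATE()):

/* Sketch: acquiring an rcu_node lock with full-barrier semantics. */
#define sketch_raw_spin_lock_rcu_node(rnp)				\
do {									\
	raw_spin_lock(&(rnp)->lock);					\
	/* Upgrade the prior unlock plus this lock to a full barrier. */\
	smp_mb__after_unlock_lock();					\
} while (0)

smp_mb__after_unlock_lock() is a no-op on architectures where an
unlock+lock pair already provides full ordering, and a full smp_mb()
elsewhere.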

> Or did I miss the points completely? :(

A question for the ages for both of us!  ;-)

> --
> TODO list of the index file marking which ones I have finished perusing:
> 
> arrayRCU.txt  DONE
> checklist.txt DONE
> listRCU.txt   DONE
> lockdep.txt   DONE
> lockdep-splat.txt DONE
> NMI-RCU.txt
> rcu_dereference.txt
> rcubarrier.txt
> rculist_nulls.txt
> rcuref.txt
> rcu.txt
> RTFP.txt  DONE
> stallwarn.txt DONE
> torture.txt
> UP.txt
> whatisRCU.txt DONE
> 
> Design
>  - Data-Structures  DONE
>  - Requirements   DONE
>  - Expedited-Grace-Periods
>  - Memory Ordering  next

Great progress, and again, thank you!!!

Thanx, Paul



Re: [RFC] doc: rcu: remove note on smp_mb during synchronize_rcu

2018-10-30 Thread Joel Fernandes
Hi Paul,

On Tue, Oct 30, 2018 at 04:43:36PM -0700, Paul E. McKenney wrote:
> On Tue, Oct 30, 2018 at 03:26:49PM -0700, Joel Fernandes wrote:
> > Hi Paul,
> > 
> > On Sat, Oct 27, 2018 at 09:30:46PM -0700, Joel Fernandes (Google) wrote:
> > > As per this thread [1], it seems this smp_mb isn't needed anymore:
> > > "So the smp_mb() that I was trying to add doesn't need to be there."
> > > 
> > > So let us remove this part from the memory ordering documentation.
> > > 
> > > [1] https://lkml.org/lkml/2017/10/6/707
> > > 
> > > Signed-off-by: Joel Fernandes (Google) 
> > 
> > I was just checking about this patch. Do you feel it is correct to remove
> > this part from the docs? Are you satisfied that a barrier isn't needed there
> > now? Or did I miss something?
> 
> Apologies, it got lost in the shuffle.  I have now applied it with a
> bit of rework to the commit log, thank you!

No worries, thanks for taking it!

Just wanted to update you on my progress reading/correcting the docs. The
'Memory Ordering' document is taking a bit of time, so I paused that and I'm
focusing on finishing all the other low-hanging fruit. This activity is
mostly during night hours after the baby is asleep, but sometimes I also
manage to sneak it into the day job ;-)

BTW I do want to discuss this smp_mb patch above with you at LPC if you
have time, even though we are removing it from the documentation. I have
thought about it a few times, and I was not able to fully appreciate the
need for the barrier (even assuming that complete() etc. did not do the
right thing). Specifically, I was wondering the same thing Peter asked in
the above thread, I think: if that rcu_read_unlock() triggered all the spin
locking up the tree of nodes, then why is that locking not sufficient to
prevent reads from the read-side section from bleeding out? That would
prevent the reader that just unlocked from seeing anything that happens
_after_ the synchronize_rcu.

Also about GP memory ordering and RCU-tree-locking, I think you mentioned to
me that the RCU reader-sections are virtually extended both forward and
backward and, wherever they end, those paths do heavy-weight synchronization
that should be sufficient to prevent memory ordering issues (such as those
you mentioned in the Requirements document). That is exactly why we don't
need explicit barriers during rcu_read_unlock. If I recall, I asked you why
those are not needed. That answer made sense, but now, on going through
the 'Memory Ordering' document, I see that you mentioned there is
reliance on the locking. Is that reliance on locking necessary to maintain
ordering, then?

Or did I miss the points completely? :(

--
TODO list of the index file marking which ones I have finished perusing:

arrayRCU.txt    DONE
checklist.txt   DONE
listRCU.txt DONE
lockdep.txt DONE
lockdep-splat.txt   DONE
NMI-RCU.txt
rcu_dereference.txt
rcubarrier.txt
rculist_nulls.txt
rcuref.txt
rcu.txt
RTFP.txt        DONE
stallwarn.txt   DONE
torture.txt
UP.txt
whatisRCU.txt   DONE

Design
 - Data-Structures  DONE
 - Requirements DONE
 - Expedited-Grace-Periods
 - Memory Ordering  next



Re: [RFC] doc: rcu: remove note on smp_mb during synchronize_rcu

2018-10-30 Thread Paul E. McKenney
On Tue, Oct 30, 2018 at 03:26:49PM -0700, Joel Fernandes wrote:
> Hi Paul,
> 
> On Sat, Oct 27, 2018 at 09:30:46PM -0700, Joel Fernandes (Google) wrote:
> > As per this thread [1], it seems this smp_mb isn't needed anymore:
> > "So the smp_mb() that I was trying to add doesn't need to be there."
> > 
> > So let us remove this part from the memory ordering documentation.
> > 
> > [1] https://lkml.org/lkml/2017/10/6/707
> > 
> > Signed-off-by: Joel Fernandes (Google) 
> 
> I was just checking about this patch. Do you feel it is correct to remove
> this part from the docs? Are you satisfied that a barrier isn't needed there
> now? Or did I miss something?

Apologies, it got lost in the shuffle.  I have now applied it with a
bit of rework to the commit log, thank you!

Thanx, Paul

> thanks,
> 
> - Joel
> 
> 
> > ---
> >  .../Tree-RCU-Memory-Ordering.html | 32 +--
> >  1 file changed, 1 insertion(+), 31 deletions(-)
> > 
> > diff --git 
> > a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html 
> > b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html
> > index a346ce0116eb..0fb1511763d4 100644
> > --- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html
> > +++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html
> > @@ -77,7 +77,7 @@ The key point is that the lock-acquisition functions, 
> > including
> >  smp_mb__after_unlock_lock() immediately after successful
> >  acquisition of the lock.
> >  
> > -Therefore, for any given rcu_node struction, any access
> > +Therefore, for any given rcu_node structure, any access
> >  happening before one of the above lock-release functions will be seen
> >  by all CPUs as happening before any access happening after a later
> >  one of the above lock-acquisition functions.
> > @@ -162,36 +162,6 @@ an atomic_add_return() of zero) to detect 
> > idle CPUs.
> >  
> >  
> >  
> > -The approach must be extended to handle one final case, that
> > -of waking a task blocked in synchronize_rcu().
> > -This task might be affinitied to a CPU that is not yet aware that
> > -the grace period has ended, and thus might not yet be subject to
> > -the grace period's memory ordering.
> > -Therefore, there is an smp_mb() after the return from
> > -wait_for_completion() in the synchronize_rcu()
> > -code path.
> > -
> > -
> > -
> > -Quick Quiz:
> > -
> > -   What?  Where???
> > -   I don't see any smp_mb() after the return from
> > -   wait_for_completion()!!!
> > -
> > -Answer:
> > -
> > -   That would be because I spotted the need for that
> > -   smp_mb() during the creation of this documentation,
> > -   and it is therefore unlikely to hit mainline before v4.14.
> > -   Kudos to Lance Roy, Will Deacon, Peter Zijlstra, and
> > -   Jonathan Cameron for asking questions that sensitized me
> > -   to the rather elaborate sequence of events that demonstrate
> > -   the need for this memory barrier.
> > -
> > -
> > -
> > -
> >  Tree RCU's grace--period memory-ordering guarantees rely most
> >  heavily on the rcu_node structure's -lock
> >  field, so much so that it is necessary to abbreviate this pattern
> > -- 
> > 2.19.1.568.g152ad8e336-goog
> > 
> 



Re: [RFC] doc: rcu: remove note on smp_mb during synchronize_rcu

2018-10-30 Thread Joel Fernandes
Hi Paul,

On Sat, Oct 27, 2018 at 09:30:46PM -0700, Joel Fernandes (Google) wrote:
> As per this thread [1], it seems this smp_mb isn't needed anymore:
> "So the smp_mb() that I was trying to add doesn't need to be there."
> 
> So let us remove this part from the memory ordering documentation.
> 
> [1] https://lkml.org/lkml/2017/10/6/707
> 
> Signed-off-by: Joel Fernandes (Google) 

I was just checking about this patch. Do you feel it is correct to remove
this part from the docs? Are you satisfied that a barrier isn't needed there
now? Or did I miss something?

thanks,

- Joel


> ---
>  .../Tree-RCU-Memory-Ordering.html | 32 +--
>  1 file changed, 1 insertion(+), 31 deletions(-)
> 
> diff --git 
> a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html 
> b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html
> index a346ce0116eb..0fb1511763d4 100644
> --- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html
> +++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html
> @@ -77,7 +77,7 @@ The key point is that the lock-acquisition functions, 
> including
>  smp_mb__after_unlock_lock() immediately after successful
>  acquisition of the lock.
>  
> -Therefore, for any given rcu_node struction, any access
> +Therefore, for any given rcu_node structure, any access
>  happening before one of the above lock-release functions will be seen
>  by all CPUs as happening before any access happening after a later
>  one of the above lock-acquisition functions.
> @@ -162,36 +162,6 @@ an atomic_add_return() of zero) to detect idle 
> CPUs.
>  
>  
>  
> -The approach must be extended to handle one final case, that
> -of waking a task blocked in synchronize_rcu().
> -This task might be affinitied to a CPU that is not yet aware that
> -the grace period has ended, and thus might not yet be subject to
> -the grace period's memory ordering.
> -Therefore, there is an smp_mb() after the return from
> -wait_for_completion() in the synchronize_rcu()
> -code path.
> -
> -
> -
> -Quick Quiz:
> -
> - What?  Where???
> - I don't see any smp_mb() after the return from
> - wait_for_completion()!!!
> -
> -Answer:
> -
> - That would be because I spotted the need for that
> - smp_mb() during the creation of this documentation,
> - and it is therefore unlikely to hit mainline before v4.14.
> - Kudos to Lance Roy, Will Deacon, Peter Zijlstra, and
> - Jonathan Cameron for asking questions that sensitized me
> - to the rather elaborate sequence of events that demonstrate
> - the need for this memory barrier.
> -
> -
> -
> -
>  Tree RCU's grace--period memory-ordering guarantees rely most
>  heavily on the rcu_node structure's -lock
>  field, so much so that it is necessary to abbreviate this pattern
> -- 
> 2.19.1.568.g152ad8e336-goog
> 

