Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-16 Thread Joel Fernandes
On Wed, May 16, 2018 at 08:45:43AM -0700, Paul E. McKenney wrote:
[...]
> > > > The code would be correct then, but one issue is it would shout out the
> > > > 'Prestarted' tracepoint for 'c' when that's not really true..
> > > > 
> > > >rcu_seq_done(_root->gp_seq, c)
> > > > 
> > > > translates to ULONG_CMP_GE(_root->gp_seq, c)
> > > > 
> > > > which translates to the fact that c-1 completed.
> > > > 
> > > > So in this case if rcu_seq_done returns true, then saying that c has 
> > > > been
> > > > 'Prestarted' seems a bit off to me. It should be 'Startedleaf' or 
> > > > something
> > > > since what we really are doing is just marking the leaf as you 
> > > > mentioned in
> > > > the unlock_out part for a future start.
> > > 
> > > Indeed, some of the tracing is not all that accurate.  But the trace
> > > message itself contains the information needed to work out why the
> > > loop was exited, so perhaps something like 'EarlyExit'?
> > 
> > I think since you're now using rcu_seq_start to determine if c has started 
> > or
> > completed since, the current 'Prestarted' trace will cover it.
> 
> "My work is done!"  ;-)

:-D Its cool how a conversation can turn into a code improvement.

> > > 
> > > 
> > > commit 59a4f38edcffbef1521852fe3b26ed4ed85af16e
> > > Author: Paul E. McKenney 
> > > Date:   Tue May 15 11:53:41 2018 -0700
> > > 
> > > rcu: Make rcu_start_this_gp() check for grace period already started
> > > 
> > > In the old days of ->gpnum and ->completed, the code requesting a new
> > > grace period checked to see if that grace period had already started,
> > > bailing early if so.  The new-age ->gp_seq approach instead checks
> > > whether the grace period has already finished.  A compensating change
> > > pushed the requested grace period down to the bottom of the tree, thus
> > > reducing lock contention and even eliminating it in some cases.  But 
> > > why
> > > not further reduce contention, especially on large systems, by doing 
> > > both,
> > > especially given that the cost of doing both is extremely small?
> > > 
> > > This commit therefore adds a new rcu_seq_started() function that 
> > > checks
> > > whether a specified grace period has already started.  It then uses
> > > this new function in place of rcu_seq_done() in the 
> > > rcu_start_this_gp()
> > > function's funnel locking code.
> > > 
> > > Reported-by: Joel Fernandes 
> > > Signed-off-by: Paul E. McKenney 
> > > 
> > > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > > index 003671825d62..1c5cbd9d7c97 100644
> > > --- a/kernel/rcu/rcu.h
> > > +++ b/kernel/rcu/rcu.h
> > > @@ -108,6 +108,15 @@ static inline unsigned long rcu_seq_current(unsigned 
> > > long *sp)
> > >  }
> > >  
> > >  /*
> > > + * Given a snapshot from rcu_seq_snap(), determine whether or not the
> > > + * corresponding update-side operation has started.
> > > + */
> > > +static inline bool rcu_seq_started(unsigned long *sp, unsigned long s)
> > > +{
> > > + return ULONG_CMP_LT((s - 1) & ~RCU_SEQ_STATE_MASK, READ_ONCE(*sp));
> > > +}
> > > +
> > > +/*
> > >   * Given a snapshot from rcu_seq_snap(), determine whether or not a
> > >   * full update-side operation has occurred.
> > >   */
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index 9e900c5926cc..ed69f49b7054 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -1580,7 +1580,7 @@ static bool rcu_start_this_gp(struct rcu_node *rnp, 
> > > struct rcu_data *rdp,
> > >   if (rnp_root != rnp)
> > >   raw_spin_lock_rcu_node(rnp_root);
> > >   if (ULONG_CMP_GE(rnp_root->gp_seq_needed, c) ||
> > > - rcu_seq_done(_root->gp_seq, c) ||
> > > + rcu_seq_started(_root->gp_seq, c) ||
> > 
> > Yes, this does exactly what I was wanting, thanks! I think this puts our
> > discussion about this to rest :-)
> > 
> > By the way I was starting to beautify this loop like below last week, with
> > code comments.  I felt it would be easier to parse this loop in the future
> > for whoever was reading it.  Are you interested in such a patch? If not, let
> > me know and I'll drop this and focus on the other changes you requested.
> > 
> > Something like...  (just an example , actual code would be different)
> > 
> >for (rnp_node = rnp; 1; rnp_node = rnp_node->parent) {
> >int prestarted = 0;
> > 
> >/* Acquire lock if non-leaf node */
> >if (rnp_node != rnp)
> >raw_spin_lock_rcu_node(rnp_node);
> > 
> >/* Has the GP asked been recorded as a future need */
> >if (ULONG_CMP_GE(rnp_node->gp_seq_needed, gp_seq_start))
> >prestarted = 1;
> > 
> >/* Has 

Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-16 Thread Joel Fernandes
On Wed, May 16, 2018 at 08:45:43AM -0700, Paul E. McKenney wrote:
[...]
> > > > The code would be correct then, but one issue is it would shout out the
> > > > 'Prestarted' tracepoint for 'c' when that's not really true..
> > > > 
> > > >rcu_seq_done(_root->gp_seq, c)
> > > > 
> > > > translates to ULONG_CMP_GE(_root->gp_seq, c)
> > > > 
> > > > which translates to the fact that c-1 completed.
> > > > 
> > > > So in this case if rcu_seq_done returns true, then saying that c has 
> > > > been
> > > > 'Prestarted' seems a bit off to me. It should be 'Startedleaf' or 
> > > > something
> > > > since what we really are doing is just marking the leaf as you 
> > > > mentioned in
> > > > the unlock_out part for a future start.
> > > 
> > > Indeed, some of the tracing is not all that accurate.  But the trace
> > > message itself contains the information needed to work out why the
> > > loop was exited, so perhaps something like 'EarlyExit'?
> > 
> > I think since you're now using rcu_seq_start to determine if c has started 
> > or
> > completed since, the current 'Prestarted' trace will cover it.
> 
> "My work is done!"  ;-)

:-D Its cool how a conversation can turn into a code improvement.

> > > 
> > > 
> > > commit 59a4f38edcffbef1521852fe3b26ed4ed85af16e
> > > Author: Paul E. McKenney 
> > > Date:   Tue May 15 11:53:41 2018 -0700
> > > 
> > > rcu: Make rcu_start_this_gp() check for grace period already started
> > > 
> > > In the old days of ->gpnum and ->completed, the code requesting a new
> > > grace period checked to see if that grace period had already started,
> > > bailing early if so.  The new-age ->gp_seq approach instead checks
> > > whether the grace period has already finished.  A compensating change
> > > pushed the requested grace period down to the bottom of the tree, thus
> > > reducing lock contention and even eliminating it in some cases.  But 
> > > why
> > > not further reduce contention, especially on large systems, by doing 
> > > both,
> > > especially given that the cost of doing both is extremely small?
> > > 
> > > This commit therefore adds a new rcu_seq_started() function that 
> > > checks
> > > whether a specified grace period has already started.  It then uses
> > > this new function in place of rcu_seq_done() in the 
> > > rcu_start_this_gp()
> > > function's funnel locking code.
> > > 
> > > Reported-by: Joel Fernandes 
> > > Signed-off-by: Paul E. McKenney 
> > > 
> > > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > > index 003671825d62..1c5cbd9d7c97 100644
> > > --- a/kernel/rcu/rcu.h
> > > +++ b/kernel/rcu/rcu.h
> > > @@ -108,6 +108,15 @@ static inline unsigned long rcu_seq_current(unsigned 
> > > long *sp)
> > >  }
> > >  
> > >  /*
> > > + * Given a snapshot from rcu_seq_snap(), determine whether or not the
> > > + * corresponding update-side operation has started.
> > > + */
> > > +static inline bool rcu_seq_started(unsigned long *sp, unsigned long s)
> > > +{
> > > + return ULONG_CMP_LT((s - 1) & ~RCU_SEQ_STATE_MASK, READ_ONCE(*sp));
> > > +}
> > > +
> > > +/*
> > >   * Given a snapshot from rcu_seq_snap(), determine whether or not a
> > >   * full update-side operation has occurred.
> > >   */
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index 9e900c5926cc..ed69f49b7054 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -1580,7 +1580,7 @@ static bool rcu_start_this_gp(struct rcu_node *rnp, 
> > > struct rcu_data *rdp,
> > >   if (rnp_root != rnp)
> > >   raw_spin_lock_rcu_node(rnp_root);
> > >   if (ULONG_CMP_GE(rnp_root->gp_seq_needed, c) ||
> > > - rcu_seq_done(_root->gp_seq, c) ||
> > > + rcu_seq_started(_root->gp_seq, c) ||
> > 
> > Yes, this does exactly what I was wanting, thanks! I think this puts our
> > discussion about this to rest :-)
> > 
> > By the way I was starting to beautify this loop like below last week, with
> > code comments.  I felt it would be easier to parse this loop in the future
> > for whoever was reading it.  Are you interested in such a patch? If not, let
> > me know and I'll drop this and focus on the other changes you requested.
> > 
> > Something like...  (just an example , actual code would be different)
> > 
> >for (rnp_node = rnp; 1; rnp_node = rnp_node->parent) {
> >int prestarted = 0;
> > 
> >/* Acquire lock if non-leaf node */
> >if (rnp_node != rnp)
> >raw_spin_lock_rcu_node(rnp_node);
> > 
> >/* Has the GP asked been recorded as a future need */
> >if (ULONG_CMP_GE(rnp_node->gp_seq_needed, gp_seq_start))
> >prestarted = 1;
> > 
> >/* Has the GP requested for already been completed */
> >if (!prestarted 

Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-16 Thread Paul E. McKenney
On Tue, May 15, 2018 at 03:55:09PM -0700, Joel Fernandes wrote:
> On Tue, May 15, 2018 at 12:08:01PM -0700, Paul E. McKenney wrote:
> > On Tue, May 15, 2018 at 11:41:15AM -0700, Joel Fernandes wrote:
> > > On Tue, May 15, 2018 at 05:55:07AM -0700, Paul E. McKenney wrote:
> > > > On Tue, May 15, 2018 at 12:02:43AM -0700, Joel Fernandes wrote:
> > > > > Hi Paul,
> > > > > Good morning, hope you're having a great Tuesday. I managed to find 
> > > > > some
> > > > > evening hours today to dig into this a bit more.
> > > > > 
> > > > > On Mon, May 14, 2018 at 08:59:52PM -0700, Paul E. McKenney wrote:
> > > > > > On Mon, May 14, 2018 at 06:51:33PM -0700, Joel Fernandes wrote:
> > > > > > > On Mon, May 14, 2018 at 10:38:16AM -0700, Paul E. McKenney wrote:
> > > > > > > > On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes 
> > > > > > > > (Google) wrote:
> > > > > > > > > rcu_seq_snap may be tricky for someone looking at it for the 
> > > > > > > > > first time.
> > > > > > > > > Lets document how it works with an example to make it easier.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Joel Fernandes (Google) 
> > > > > > > > > 
> > > > > > > > > ---
> > > > > > > > >  kernel/rcu/rcu.h | 24 +++-
> > > > > > > > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > > > > > > > > 
> > > > > > > > > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > > > > > > > > index 003671825d62..fc3170914ac7 100644
> > > > > > > > > --- a/kernel/rcu/rcu.h
> > > > > > > > > +++ b/kernel/rcu/rcu.h
> > > > > > > > > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned 
> > > > > > > > > long *sp)
> > > > > > > > >   WRITE_ONCE(*sp, rcu_seq_endval(sp));
> > > > > > > > >  }
> > > > > > > > > 
> > > > > > > > > -/* Take a snapshot of the update side's sequence number. */
> > > > > > > > > +/*
> > > > > > > > > + * Take a snapshot of the update side's sequence number.
> > > > > > > > > + *
> > > > > > > > > + * This function predicts what the grace period number will 
> > > > > > > > > be the next
> > > > > > > > > + * time an RCU callback will be executed, given the current 
> > > > > > > > > grace period's
> > > > > > > > > + * number. This can be gp+1 if RCU is idle, or gp+2 if a 
> > > > > > > > > grace period is
> > > > > > > > > + * already in progress.
> > > > > > > > 
> > > > > > > > How about something like this?
> > > > > > > > 
> > > > > > > > This function returns the earliest value of the 
> > > > > > > > grace-period
> > > > > > > > sequence number that will indicate that a full grace 
> > > > > > > > period has
> > > > > > > > elapsed since the current time.  Once the grace-period 
> > > > > > > > sequence
> > > > > > > > number has reached this value, it will be safe to 
> > > > > > > > invoke all
> > > > > > > > callbacks that have been registered prior to the 
> > > > > > > > current time.
> > > > > > > > This value is the current grace-period number plus two 
> > > > > > > > to the
> > > > > > > > power of the number of low-order bits reserved for 
> > > > > > > > state, then
> > > > > > > > rounded up to the next value in which the state bits 
> > > > > > > > are all zero.
> > > > > > > 
> > > > > > > This makes sense too, but do you disagree with what I said?
> > > > > > 
> > > > > > In a pedantic sense, definitely.  RCU callbacks are being executed 
> > > > > > pretty
> > > > > > much all the time on a busy system, so it is only the recently 
> > > > > > queued
> > > > > > ones that are guaranteed to be deferred that long.  And my 
> > > > > > experience
> > > > > > indicates that someone really will get confused by that distinction,
> > > > > > so I feel justified in being pedantic in this case.
> > > > > 
> > > > > Ok I agree, I'll include your comment above.
> > > > > 
> > > > > > > Also just to let you know, thanks so much for elaborately 
> > > > > > > providing an
> > > > > > > example on the other thread where we are discussing the 
> > > > > > > rcu_seq_done check. I
> > > > > > > will take some time to trace this down and see if I can zero in 
> > > > > > > on the same
> > > > > > > understanding as yours.
> > > > > > > 
> > > > > > > I get why we use rcu_seq_snap there in rcu_start_this_gp but the 
> > > > > > > way it its
> > > > > > > used is 'c' is the requested GP obtained from _snap, and we are 
> > > > > > > comparing that with the existing
> > > > > > > rnp->gp_seq in rcu_seq_done.  When that rnp->gp_seq reaches 'c', 
> > > > > > > it only
> > > > > > > means rnp->gp_seq is done, it doesn't tell us if 'c' is done 
> > > > > > > which is what
> > > > > > > we were trying to check in that loop... that's why I felt that 
> > > > > > > check wasn't
> > > > > > > correct - that's my (most likely wrong) take on the matter, and 
> > > > > > > I'll get back
> > > > > > > once I trace this a bit more hopefully today :-P
> > > > > > 
> > > > > > If your point is that 

Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-16 Thread Paul E. McKenney
On Tue, May 15, 2018 at 03:55:09PM -0700, Joel Fernandes wrote:
> On Tue, May 15, 2018 at 12:08:01PM -0700, Paul E. McKenney wrote:
> > On Tue, May 15, 2018 at 11:41:15AM -0700, Joel Fernandes wrote:
> > > On Tue, May 15, 2018 at 05:55:07AM -0700, Paul E. McKenney wrote:
> > > > On Tue, May 15, 2018 at 12:02:43AM -0700, Joel Fernandes wrote:
> > > > > Hi Paul,
> > > > > Good morning, hope you're having a great Tuesday. I managed to find 
> > > > > some
> > > > > evening hours today to dig into this a bit more.
> > > > > 
> > > > > On Mon, May 14, 2018 at 08:59:52PM -0700, Paul E. McKenney wrote:
> > > > > > On Mon, May 14, 2018 at 06:51:33PM -0700, Joel Fernandes wrote:
> > > > > > > On Mon, May 14, 2018 at 10:38:16AM -0700, Paul E. McKenney wrote:
> > > > > > > > On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes 
> > > > > > > > (Google) wrote:
> > > > > > > > > rcu_seq_snap may be tricky for someone looking at it for the 
> > > > > > > > > first time.
> > > > > > > > > Lets document how it works with an example to make it easier.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Joel Fernandes (Google) 
> > > > > > > > > 
> > > > > > > > > ---
> > > > > > > > >  kernel/rcu/rcu.h | 24 +++-
> > > > > > > > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > > > > > > > > 
> > > > > > > > > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > > > > > > > > index 003671825d62..fc3170914ac7 100644
> > > > > > > > > --- a/kernel/rcu/rcu.h
> > > > > > > > > +++ b/kernel/rcu/rcu.h
> > > > > > > > > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned 
> > > > > > > > > long *sp)
> > > > > > > > >   WRITE_ONCE(*sp, rcu_seq_endval(sp));
> > > > > > > > >  }
> > > > > > > > > 
> > > > > > > > > -/* Take a snapshot of the update side's sequence number. */
> > > > > > > > > +/*
> > > > > > > > > + * Take a snapshot of the update side's sequence number.
> > > > > > > > > + *
> > > > > > > > > + * This function predicts what the grace period number will 
> > > > > > > > > be the next
> > > > > > > > > + * time an RCU callback will be executed, given the current 
> > > > > > > > > grace period's
> > > > > > > > > + * number. This can be gp+1 if RCU is idle, or gp+2 if a 
> > > > > > > > > grace period is
> > > > > > > > > + * already in progress.
> > > > > > > > 
> > > > > > > > How about something like this?
> > > > > > > > 
> > > > > > > > This function returns the earliest value of the 
> > > > > > > > grace-period
> > > > > > > > sequence number that will indicate that a full grace 
> > > > > > > > period has
> > > > > > > > elapsed since the current time.  Once the grace-period 
> > > > > > > > sequence
> > > > > > > > number has reached this value, it will be safe to 
> > > > > > > > invoke all
> > > > > > > > callbacks that have been registered prior to the 
> > > > > > > > current time.
> > > > > > > > This value is the current grace-period number plus two 
> > > > > > > > to the
> > > > > > > > power of the number of low-order bits reserved for 
> > > > > > > > state, then
> > > > > > > > rounded up to the next value in which the state bits 
> > > > > > > > are all zero.
> > > > > > > 
> > > > > > > This makes sense too, but do you disagree with what I said?
> > > > > > 
> > > > > > In a pedantic sense, definitely.  RCU callbacks are being executed 
> > > > > > pretty
> > > > > > much all the time on a busy system, so it is only the recently 
> > > > > > queued
> > > > > > ones that are guaranteed to be deferred that long.  And my 
> > > > > > experience
> > > > > > indicates that someone really will get confused by that distinction,
> > > > > > so I feel justified in being pedantic in this case.
> > > > > 
> > > > > Ok I agree, I'll include your comment above.
> > > > > 
> > > > > > > Also just to let you know, thanks so much for elaborately 
> > > > > > > providing an
> > > > > > > example on the other thread where we are discussing the 
> > > > > > > rcu_seq_done check. I
> > > > > > > will take some time to trace this down and see if I can zero in 
> > > > > > > on the same
> > > > > > > understanding as yours.
> > > > > > > 
> > > > > > > I get why we use rcu_seq_snap there in rcu_start_this_gp but the 
> > > > > > > way it its
> > > > > > > used is 'c' is the requested GP obtained from _snap, and we are 
> > > > > > > comparing that with the existing
> > > > > > > rnp->gp_seq in rcu_seq_done.  When that rnp->gp_seq reaches 'c', 
> > > > > > > it only
> > > > > > > means rnp->gp_seq is done, it doesn't tell us if 'c' is done 
> > > > > > > which is what
> > > > > > > we were trying to check in that loop... that's why I felt that 
> > > > > > > check wasn't
> > > > > > > correct - that's my (most likely wrong) take on the matter, and 
> > > > > > > I'll get back
> > > > > > > once I trace this a bit more hopefully today :-P
> > > > > > 
> > > > > > If your point is that interrupts are disabled 

Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-15 Thread Joel Fernandes
On Tue, May 15, 2018 at 12:08:01PM -0700, Paul E. McKenney wrote:
> On Tue, May 15, 2018 at 11:41:15AM -0700, Joel Fernandes wrote:
> > On Tue, May 15, 2018 at 05:55:07AM -0700, Paul E. McKenney wrote:
> > > On Tue, May 15, 2018 at 12:02:43AM -0700, Joel Fernandes wrote:
> > > > Hi Paul,
> > > > Good morning, hope you're having a great Tuesday. I managed to find some
> > > > evening hours today to dig into this a bit more.
> > > > 
> > > > On Mon, May 14, 2018 at 08:59:52PM -0700, Paul E. McKenney wrote:
> > > > > On Mon, May 14, 2018 at 06:51:33PM -0700, Joel Fernandes wrote:
> > > > > > On Mon, May 14, 2018 at 10:38:16AM -0700, Paul E. McKenney wrote:
> > > > > > > On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes (Google) 
> > > > > > > wrote:
> > > > > > > > rcu_seq_snap may be tricky for someone looking at it for the 
> > > > > > > > first time.
> > > > > > > > Lets document how it works with an example to make it easier.
> > > > > > > > 
> > > > > > > > Signed-off-by: Joel Fernandes (Google) 
> > > > > > > > ---
> > > > > > > >  kernel/rcu/rcu.h | 24 +++-
> > > > > > > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > > > > > > > 
> > > > > > > > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > > > > > > > index 003671825d62..fc3170914ac7 100644
> > > > > > > > --- a/kernel/rcu/rcu.h
> > > > > > > > +++ b/kernel/rcu/rcu.h
> > > > > > > > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long 
> > > > > > > > *sp)
> > > > > > > > WRITE_ONCE(*sp, rcu_seq_endval(sp));
> > > > > > > >  }
> > > > > > > > 
> > > > > > > > -/* Take a snapshot of the update side's sequence number. */
> > > > > > > > +/*
> > > > > > > > + * Take a snapshot of the update side's sequence number.
> > > > > > > > + *
> > > > > > > > + * This function predicts what the grace period number will be 
> > > > > > > > the next
> > > > > > > > + * time an RCU callback will be executed, given the current 
> > > > > > > > grace period's
> > > > > > > > + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace 
> > > > > > > > period is
> > > > > > > > + * already in progress.
> > > > > > > 
> > > > > > > How about something like this?
> > > > > > > 
> > > > > > >   This function returns the earliest value of the grace-period
> > > > > > >   sequence number that will indicate that a full grace period has
> > > > > > >   elapsed since the current time.  Once the grace-period sequence
> > > > > > >   number has reached this value, it will be safe to invoke all
> > > > > > >   callbacks that have been registered prior to the current time.
> > > > > > >   This value is the current grace-period number plus two to the
> > > > > > >   power of the number of low-order bits reserved for state, then
> > > > > > >   rounded up to the next value in which the state bits are all 
> > > > > > > zero.
> > > > > > 
> > > > > > This makes sense too, but do you disagree with what I said?
> > > > > 
> > > > > In a pedantic sense, definitely.  RCU callbacks are being executed 
> > > > > pretty
> > > > > much all the time on a busy system, so it is only the recently queued
> > > > > ones that are guaranteed to be deferred that long.  And my experience
> > > > > indicates that someone really will get confused by that distinction,
> > > > > so I feel justified in being pedantic in this case.
> > > > 
> > > > Ok I agree, I'll include your comment above.
> > > > 
> > > > > > Also just to let you know, thanks so much for elaborately providing 
> > > > > > an
> > > > > > example on the other thread where we are discussing the 
> > > > > > rcu_seq_done check. I
> > > > > > will take some time to trace this down and see if I can zero in on 
> > > > > > the same
> > > > > > understanding as yours.
> > > > > > 
> > > > > > I get why we use rcu_seq_snap there in rcu_start_this_gp but the 
> > > > > > way it its
> > > > > > used is 'c' is the requested GP obtained from _snap, and we are 
> > > > > > comparing that with the existing
> > > > > > rnp->gp_seq in rcu_seq_done.  When that rnp->gp_seq reaches 'c', it 
> > > > > > only
> > > > > > means rnp->gp_seq is done, it doesn't tell us if 'c' is done which 
> > > > > > is what
> > > > > > we were trying to check in that loop... that's why I felt that 
> > > > > > check wasn't
> > > > > > correct - that's my (most likely wrong) take on the matter, and 
> > > > > > I'll get back
> > > > > > once I trace this a bit more hopefully today :-P
> > > > > 
> > > > > If your point is that interrupts are disabled throughout, so there 
> > > > > isn't
> > > > > much chance of the grace period completing during that time, you are
> > > > > mostly right.  The places you might not be right are the idle loop and
> > > > > offline CPUs.  And yes, call_rcu() doesn't like queuing callbacks onto
> > > > > offline CPUs, but IIRC it is just fine in the case where callbacks 
> > > > > have
> > > > > been offloaded from that CPU.
> > > > > 
> > > > > And if you 

Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-15 Thread Joel Fernandes
On Tue, May 15, 2018 at 12:08:01PM -0700, Paul E. McKenney wrote:
> On Tue, May 15, 2018 at 11:41:15AM -0700, Joel Fernandes wrote:
> > On Tue, May 15, 2018 at 05:55:07AM -0700, Paul E. McKenney wrote:
> > > On Tue, May 15, 2018 at 12:02:43AM -0700, Joel Fernandes wrote:
> > > > Hi Paul,
> > > > Good morning, hope you're having a great Tuesday. I managed to find some
> > > > evening hours today to dig into this a bit more.
> > > > 
> > > > On Mon, May 14, 2018 at 08:59:52PM -0700, Paul E. McKenney wrote:
> > > > > On Mon, May 14, 2018 at 06:51:33PM -0700, Joel Fernandes wrote:
> > > > > > On Mon, May 14, 2018 at 10:38:16AM -0700, Paul E. McKenney wrote:
> > > > > > > On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes (Google) 
> > > > > > > wrote:
> > > > > > > > rcu_seq_snap may be tricky for someone looking at it for the 
> > > > > > > > first time.
> > > > > > > > Lets document how it works with an example to make it easier.
> > > > > > > > 
> > > > > > > > Signed-off-by: Joel Fernandes (Google) 
> > > > > > > > ---
> > > > > > > >  kernel/rcu/rcu.h | 24 +++-
> > > > > > > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > > > > > > > 
> > > > > > > > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > > > > > > > index 003671825d62..fc3170914ac7 100644
> > > > > > > > --- a/kernel/rcu/rcu.h
> > > > > > > > +++ b/kernel/rcu/rcu.h
> > > > > > > > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long 
> > > > > > > > *sp)
> > > > > > > > WRITE_ONCE(*sp, rcu_seq_endval(sp));
> > > > > > > >  }
> > > > > > > > 
> > > > > > > > -/* Take a snapshot of the update side's sequence number. */
> > > > > > > > +/*
> > > > > > > > + * Take a snapshot of the update side's sequence number.
> > > > > > > > + *
> > > > > > > > + * This function predicts what the grace period number will be 
> > > > > > > > the next
> > > > > > > > + * time an RCU callback will be executed, given the current 
> > > > > > > > grace period's
> > > > > > > > + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace 
> > > > > > > > period is
> > > > > > > > + * already in progress.
> > > > > > > 
> > > > > > > How about something like this?
> > > > > > > 
> > > > > > >   This function returns the earliest value of the grace-period
> > > > > > >   sequence number that will indicate that a full grace period has
> > > > > > >   elapsed since the current time.  Once the grace-period sequence
> > > > > > >   number has reached this value, it will be safe to invoke all
> > > > > > >   callbacks that have been registered prior to the current time.
> > > > > > >   This value is the current grace-period number plus two to the
> > > > > > >   power of the number of low-order bits reserved for state, then
> > > > > > >   rounded up to the next value in which the state bits are all 
> > > > > > > zero.
> > > > > > 
> > > > > > This makes sense too, but do you disagree with what I said?
> > > > > 
> > > > > In a pedantic sense, definitely.  RCU callbacks are being executed 
> > > > > pretty
> > > > > much all the time on a busy system, so it is only the recently queued
> > > > > ones that are guaranteed to be deferred that long.  And my experience
> > > > > indicates that someone really will get confused by that distinction,
> > > > > so I feel justified in being pedantic in this case.
> > > > 
> > > > Ok I agree, I'll include your comment above.
> > > > 
> > > > > > Also just to let you know, thanks so much for elaborately providing 
> > > > > > an
> > > > > > example on the other thread where we are discussing the 
> > > > > > rcu_seq_done check. I
> > > > > > will take some time to trace this down and see if I can zero in on 
> > > > > > the same
> > > > > > understanding as yours.
> > > > > > 
> > > > > > I get why we use rcu_seq_snap there in rcu_start_this_gp but the 
> > > > > > way it its
> > > > > > used is 'c' is the requested GP obtained from _snap, and we are 
> > > > > > comparing that with the existing
> > > > > > rnp->gp_seq in rcu_seq_done.  When that rnp->gp_seq reaches 'c', it 
> > > > > > only
> > > > > > means rnp->gp_seq is done, it doesn't tell us if 'c' is done which 
> > > > > > is what
> > > > > > we were trying to check in that loop... that's why I felt that 
> > > > > > check wasn't
> > > > > > correct - that's my (most likely wrong) take on the matter, and 
> > > > > > I'll get back
> > > > > > once I trace this a bit more hopefully today :-P
> > > > > 
> > > > > If your point is that interrupts are disabled throughout, so there 
> > > > > isn't
> > > > > much chance of the grace period completing during that time, you are
> > > > > mostly right.  The places you might not be right are the idle loop and
> > > > > offline CPUs.  And yes, call_rcu() doesn't like queuing callbacks onto
> > > > > offline CPUs, but IIRC it is just fine in the case where callbacks 
> > > > > have
> > > > > been offloaded from that CPU.
> > > > > 
> > > > > And if you instead say that "c" is the 

Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-15 Thread Paul E. McKenney
On Tue, May 15, 2018 at 11:41:15AM -0700, Joel Fernandes wrote:
> On Tue, May 15, 2018 at 05:55:07AM -0700, Paul E. McKenney wrote:
> > On Tue, May 15, 2018 at 12:02:43AM -0700, Joel Fernandes wrote:
> > > Hi Paul,
> > > Good morning, hope you're having a great Tuesday. I managed to find some
> > > evening hours today to dig into this a bit more.
> > > 
> > > On Mon, May 14, 2018 at 08:59:52PM -0700, Paul E. McKenney wrote:
> > > > On Mon, May 14, 2018 at 06:51:33PM -0700, Joel Fernandes wrote:
> > > > > On Mon, May 14, 2018 at 10:38:16AM -0700, Paul E. McKenney wrote:
> > > > > > On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes (Google) 
> > > > > > wrote:
> > > > > > > rcu_seq_snap may be tricky for someone looking at it for the 
> > > > > > > first time.
> > > > > > > Lets document how it works with an example to make it easier.
> > > > > > > 
> > > > > > > Signed-off-by: Joel Fernandes (Google) 
> > > > > > > ---
> > > > > > >  kernel/rcu/rcu.h | 24 +++-
> > > > > > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > > > > > > 
> > > > > > > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > > > > > > index 003671825d62..fc3170914ac7 100644
> > > > > > > --- a/kernel/rcu/rcu.h
> > > > > > > +++ b/kernel/rcu/rcu.h
> > > > > > > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long 
> > > > > > > *sp)
> > > > > > >   WRITE_ONCE(*sp, rcu_seq_endval(sp));
> > > > > > >  }
> > > > > > > 
> > > > > > > -/* Take a snapshot of the update side's sequence number. */
> > > > > > > +/*
> > > > > > > + * Take a snapshot of the update side's sequence number.
> > > > > > > + *
> > > > > > > + * This function predicts what the grace period number will be 
> > > > > > > the next
> > > > > > > + * time an RCU callback will be executed, given the current 
> > > > > > > grace period's
> > > > > > > + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace 
> > > > > > > period is
> > > > > > > + * already in progress.
> > > > > > 
> > > > > > How about something like this?
> > > > > > 
> > > > > > This function returns the earliest value of the grace-period
> > > > > > sequence number that will indicate that a full grace period has
> > > > > > elapsed since the current time.  Once the grace-period sequence
> > > > > > number has reached this value, it will be safe to invoke all
> > > > > > callbacks that have been registered prior to the current time.
> > > > > > This value is the current grace-period number plus two to the
> > > > > > power of the number of low-order bits reserved for state, then
> > > > > > rounded up to the next value in which the state bits are all 
> > > > > > zero.
> > > > > 
> > > > > This makes sense too, but do you disagree with what I said?
> > > > 
> > > > In a pedantic sense, definitely.  RCU callbacks are being executed 
> > > > pretty
> > > > much all the time on a busy system, so it is only the recently queued
> > > > ones that are guaranteed to be deferred that long.  And my experience
> > > > indicates that someone really will get confused by that distinction,
> > > > so I feel justified in being pedantic in this case.
> > > 
> > > Ok I agree, I'll include your comment above.
> > > 
> > > > > Also just to let you know, thanks so much for elaborately providing an
> > > > > example on the other thread where we are discussing the rcu_seq_done 
> > > > > check. I
> > > > > will take some time to trace this down and see if I can zero in on 
> > > > > the same
> > > > > understanding as yours.
> > > > > 
> > > > > I get why we use rcu_seq_snap there in rcu_start_this_gp but the way 
> > > > > it its
> > > > > used is 'c' is the requested GP obtained from _snap, and we are 
> > > > > comparing that with the existing
> > > > > rnp->gp_seq in rcu_seq_done.  When that rnp->gp_seq reaches 'c', it 
> > > > > only
> > > > > means rnp->gp_seq is done, it doesn't tell us if 'c' is done which is 
> > > > > what
> > > > > we were trying to check in that loop... that's why I felt that check 
> > > > > wasn't
> > > > > correct - that's my (most likely wrong) take on the matter, and I'll 
> > > > > get back
> > > > > once I trace this a bit more hopefully today :-P
> > > > 
> > > > If your point is that interrupts are disabled throughout, so there isn't
> > > > much chance of the grace period completing during that time, you are
> > > > mostly right.  The places you might not be right are the idle loop and
> > > > offline CPUs.  And yes, call_rcu() doesn't like queuing callbacks onto
> > > > offline CPUs, but IIRC it is just fine in the case where callbacks have
> > > > been offloaded from that CPU.
> > > > 
> > > > And if you instead say that "c" is the requested final ->gp_seq value
> > > > obtained from _snap(), the thought process might go more easily.
> > > 
> > > Yes I agree with c being the requested final value which is the GP for 
> > > which
> > > the callbacks will be queued. At the end of the 

Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-15 Thread Paul E. McKenney
On Tue, May 15, 2018 at 11:41:15AM -0700, Joel Fernandes wrote:
> On Tue, May 15, 2018 at 05:55:07AM -0700, Paul E. McKenney wrote:
> > On Tue, May 15, 2018 at 12:02:43AM -0700, Joel Fernandes wrote:
> > > Hi Paul,
> > > Good morning, hope you're having a great Tuesday. I managed to find some
> > > evening hours today to dig into this a bit more.
> > > 
> > > On Mon, May 14, 2018 at 08:59:52PM -0700, Paul E. McKenney wrote:
> > > > On Mon, May 14, 2018 at 06:51:33PM -0700, Joel Fernandes wrote:
> > > > > On Mon, May 14, 2018 at 10:38:16AM -0700, Paul E. McKenney wrote:
> > > > > > On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes (Google) 
> > > > > > wrote:
> > > > > > > rcu_seq_snap may be tricky for someone looking at it for the 
> > > > > > > first time.
> > > > > > > Lets document how it works with an example to make it easier.
> > > > > > > 
> > > > > > > Signed-off-by: Joel Fernandes (Google) 
> > > > > > > ---
> > > > > > >  kernel/rcu/rcu.h | 24 +++-
> > > > > > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > > > > > > 
> > > > > > > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > > > > > > index 003671825d62..fc3170914ac7 100644
> > > > > > > --- a/kernel/rcu/rcu.h
> > > > > > > +++ b/kernel/rcu/rcu.h
> > > > > > > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long 
> > > > > > > *sp)
> > > > > > >   WRITE_ONCE(*sp, rcu_seq_endval(sp));
> > > > > > >  }
> > > > > > > 
> > > > > > > -/* Take a snapshot of the update side's sequence number. */
> > > > > > > +/*
> > > > > > > + * Take a snapshot of the update side's sequence number.
> > > > > > > + *
> > > > > > > + * This function predicts what the grace period number will be 
> > > > > > > the next
> > > > > > > + * time an RCU callback will be executed, given the current 
> > > > > > > grace period's
> > > > > > > + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace 
> > > > > > > period is
> > > > > > > + * already in progress.
> > > > > > 
> > > > > > How about something like this?
> > > > > > 
> > > > > > This function returns the earliest value of the grace-period
> > > > > > sequence number that will indicate that a full grace period has
> > > > > > elapsed since the current time.  Once the grace-period sequence
> > > > > > number has reached this value, it will be safe to invoke all
> > > > > > callbacks that have been registered prior to the current time.
> > > > > > This value is the current grace-period number plus two to the
> > > > > > power of the number of low-order bits reserved for state, then
> > > > > > rounded up to the next value in which the state bits are all 
> > > > > > zero.
> > > > > 
> > > > > This makes sense too, but do you disagree with what I said?
> > > > 
> > > > In a pedantic sense, definitely.  RCU callbacks are being executed 
> > > > pretty
> > > > much all the time on a busy system, so it is only the recently queued
> > > > ones that are guaranteed to be deferred that long.  And my experience
> > > > indicates that someone really will get confused by that distinction,
> > > > so I feel justified in being pedantic in this case.
> > > 
> > > Ok I agree, I'll include your comment above.
> > > 
> > > > > Also just to let you know, thanks so much for elaborately providing an
> > > > > example on the other thread where we are discussing the rcu_seq_done 
> > > > > check. I
> > > > > will take some time to trace this down and see if I can zero in on 
> > > > > the same
> > > > > understanding as yours.
> > > > > 
> > > > > I get why we use rcu_seq_snap there in rcu_start_this_gp but the way 
> > > > > it its
> > > > > used is 'c' is the requested GP obtained from _snap, and we are 
> > > > > comparing that with the existing
> > > > > rnp->gp_seq in rcu_seq_done.  When that rnp->gp_seq reaches 'c', it 
> > > > > only
> > > > > means rnp->gp_seq is done, it doesn't tell us if 'c' is done which is 
> > > > > what
> > > > > we were trying to check in that loop... that's why I felt that check 
> > > > > wasn't
> > > > > correct - that's my (most likely wrong) take on the matter, and I'll 
> > > > > get back
> > > > > once I trace this a bit more hopefully today :-P
> > > > 
> > > > If your point is that interrupts are disabled throughout, so there isn't
> > > > much chance of the grace period completing during that time, you are
> > > > mostly right.  The places you might not be right are the idle loop and
> > > > offline CPUs.  And yes, call_rcu() doesn't like queuing callbacks onto
> > > > offline CPUs, but IIRC it is just fine in the case where callbacks have
> > > > been offloaded from that CPU.
> > > > 
> > > > And if you instead say that "c" is the requested final ->gp_seq value
> > > > obtained from _snap(), the thought process might go more easily.
> > > 
> > > Yes I agree with c being the requested final value which is the GP for 
> > > which
> > > the callbacks will be queued. At the end of the GP c, the callbacks 

Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-15 Thread Joel Fernandes
On Tue, May 15, 2018 at 05:55:07AM -0700, Paul E. McKenney wrote:
> On Tue, May 15, 2018 at 12:02:43AM -0700, Joel Fernandes wrote:
> > Hi Paul,
> > Good morning, hope you're having a great Tuesday. I managed to find some
> > evening hours today to dig into this a bit more.
> > 
> > On Mon, May 14, 2018 at 08:59:52PM -0700, Paul E. McKenney wrote:
> > > On Mon, May 14, 2018 at 06:51:33PM -0700, Joel Fernandes wrote:
> > > > On Mon, May 14, 2018 at 10:38:16AM -0700, Paul E. McKenney wrote:
> > > > > On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes (Google) 
> > > > > wrote:
> > > > > > rcu_seq_snap may be tricky for someone looking at it for the first 
> > > > > > time.
> > > > > > Lets document how it works with an example to make it easier.
> > > > > > 
> > > > > > Signed-off-by: Joel Fernandes (Google) 
> > > > > > ---
> > > > > >  kernel/rcu/rcu.h | 24 +++-
> > > > > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > > > > > index 003671825d62..fc3170914ac7 100644
> > > > > > --- a/kernel/rcu/rcu.h
> > > > > > +++ b/kernel/rcu/rcu.h
> > > > > > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
> > > > > > WRITE_ONCE(*sp, rcu_seq_endval(sp));
> > > > > >  }
> > > > > > 
> > > > > > -/* Take a snapshot of the update side's sequence number. */
> > > > > > +/*
> > > > > > + * Take a snapshot of the update side's sequence number.
> > > > > > + *
> > > > > > + * This function predicts what the grace period number will be the 
> > > > > > next
> > > > > > + * time an RCU callback will be executed, given the current grace 
> > > > > > period's
> > > > > > + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace 
> > > > > > period is
> > > > > > + * already in progress.
> > > > > 
> > > > > How about something like this?
> > > > > 
> > > > >   This function returns the earliest value of the grace-period
> > > > >   sequence number that will indicate that a full grace period has
> > > > >   elapsed since the current time.  Once the grace-period sequence
> > > > >   number has reached this value, it will be safe to invoke all
> > > > >   callbacks that have been registered prior to the current time.
> > > > >   This value is the current grace-period number plus two to the
> > > > >   power of the number of low-order bits reserved for state, then
> > > > >   rounded up to the next value in which the state bits are all 
> > > > > zero.
> > > > 
> > > > This makes sense too, but do you disagree with what I said?
> > > 
> > > In a pedantic sense, definitely.  RCU callbacks are being executed pretty
> > > much all the time on a busy system, so it is only the recently queued
> > > ones that are guaranteed to be deferred that long.  And my experience
> > > indicates that someone really will get confused by that distinction,
> > > so I feel justified in being pedantic in this case.
> > 
> > Ok I agree, I'll include your comment above.
> > 
> > > > Also just to let you know, thanks so much for elaborately providing an
> > > > example on the other thread where we are discussing the rcu_seq_done 
> > > > check. I
> > > > will take some time to trace this down and see if I can zero in on the 
> > > > same
> > > > understanding as yours.
> > > > 
> > > > I get why we use rcu_seq_snap there in rcu_start_this_gp but the way it 
> > > > its
> > > > used is 'c' is the requested GP obtained from _snap, and we are 
> > > > comparing that with the existing
> > > > rnp->gp_seq in rcu_seq_done.  When that rnp->gp_seq reaches 'c', it only
> > > > means rnp->gp_seq is done, it doesn't tell us if 'c' is done which is 
> > > > what
> > > > we were trying to check in that loop... that's why I felt that check 
> > > > wasn't
> > > > correct - that's my (most likely wrong) take on the matter, and I'll 
> > > > get back
> > > > once I trace this a bit more hopefully today :-P
> > > 
> > > If your point is that interrupts are disabled throughout, so there isn't
> > > much chance of the grace period completing during that time, you are
> > > mostly right.  The places you might not be right are the idle loop and
> > > offline CPUs.  And yes, call_rcu() doesn't like queuing callbacks onto
> > > offline CPUs, but IIRC it is just fine in the case where callbacks have
> > > been offloaded from that CPU.
> > > 
> > > And if you instead say that "c" is the requested final ->gp_seq value
> > > obtained from _snap(), the thought process might go more easily.
> > 
> > Yes I agree with c being the requested final value which is the GP for which
> > the callbacks will be queued. At the end of the GP c, the callbacks will 
> > have
> > executed.
> > 
> > About the rcu_seq_done check and why I believe its not right to use it in
> > that funnel locking loop, if you could allow me to try argument my point 
> > from
> > a different angle...
> > 
> > We agreed that the way 

Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-15 Thread Joel Fernandes
On Tue, May 15, 2018 at 05:55:07AM -0700, Paul E. McKenney wrote:
> On Tue, May 15, 2018 at 12:02:43AM -0700, Joel Fernandes wrote:
> > Hi Paul,
> > Good morning, hope you're having a great Tuesday. I managed to find some
> > evening hours today to dig into this a bit more.
> > 
> > On Mon, May 14, 2018 at 08:59:52PM -0700, Paul E. McKenney wrote:
> > > On Mon, May 14, 2018 at 06:51:33PM -0700, Joel Fernandes wrote:
> > > > On Mon, May 14, 2018 at 10:38:16AM -0700, Paul E. McKenney wrote:
> > > > > On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes (Google) 
> > > > > wrote:
> > > > > > rcu_seq_snap may be tricky for someone looking at it for the first 
> > > > > > time.
> > > > > > Lets document how it works with an example to make it easier.
> > > > > > 
> > > > > > Signed-off-by: Joel Fernandes (Google) 
> > > > > > ---
> > > > > >  kernel/rcu/rcu.h | 24 +++-
> > > > > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > > > > > index 003671825d62..fc3170914ac7 100644
> > > > > > --- a/kernel/rcu/rcu.h
> > > > > > +++ b/kernel/rcu/rcu.h
> > > > > > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
> > > > > > WRITE_ONCE(*sp, rcu_seq_endval(sp));
> > > > > >  }
> > > > > > 
> > > > > > -/* Take a snapshot of the update side's sequence number. */
> > > > > > +/*
> > > > > > + * Take a snapshot of the update side's sequence number.
> > > > > > + *
> > > > > > + * This function predicts what the grace period number will be the 
> > > > > > next
> > > > > > + * time an RCU callback will be executed, given the current grace 
> > > > > > period's
> > > > > > + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace 
> > > > > > period is
> > > > > > + * already in progress.
> > > > > 
> > > > > How about something like this?
> > > > > 
> > > > >   This function returns the earliest value of the grace-period
> > > > >   sequence number that will indicate that a full grace period has
> > > > >   elapsed since the current time.  Once the grace-period sequence
> > > > >   number has reached this value, it will be safe to invoke all
> > > > >   callbacks that have been registered prior to the current time.
> > > > >   This value is the current grace-period number plus two to the
> > > > >   power of the number of low-order bits reserved for state, then
> > > > >   rounded up to the next value in which the state bits are all 
> > > > > zero.
> > > > 
> > > > This makes sense too, but do you disagree with what I said?
> > > 
> > > In a pedantic sense, definitely.  RCU callbacks are being executed pretty
> > > much all the time on a busy system, so it is only the recently queued
> > > ones that are guaranteed to be deferred that long.  And my experience
> > > indicates that someone really will get confused by that distinction,
> > > so I feel justified in being pedantic in this case.
> > 
> > Ok I agree, I'll include your comment above.
> > 
> > > > Also just to let you know, thanks so much for elaborately providing an
> > > > example on the other thread where we are discussing the rcu_seq_done 
> > > > check. I
> > > > will take some time to trace this down and see if I can zero in on the 
> > > > same
> > > > understanding as yours.
> > > > 
> > > > I get why we use rcu_seq_snap there in rcu_start_this_gp but the way it 
> > > > its
> > > > used is 'c' is the requested GP obtained from _snap, and we are 
> > > > comparing that with the existing
> > > > rnp->gp_seq in rcu_seq_done.  When that rnp->gp_seq reaches 'c', it only
> > > > means rnp->gp_seq is done, it doesn't tell us if 'c' is done which is 
> > > > what
> > > > we were trying to check in that loop... that's why I felt that check 
> > > > wasn't
> > > > correct - that's my (most likely wrong) take on the matter, and I'll 
> > > > get back
> > > > once I trace this a bit more hopefully today :-P
> > > 
> > > If your point is that interrupts are disabled throughout, so there isn't
> > > much chance of the grace period completing during that time, you are
> > > mostly right.  The places you might not be right are the idle loop and
> > > offline CPUs.  And yes, call_rcu() doesn't like queuing callbacks onto
> > > offline CPUs, but IIRC it is just fine in the case where callbacks have
> > > been offloaded from that CPU.
> > > 
> > > And if you instead say that "c" is the requested final ->gp_seq value
> > > obtained from _snap(), the thought process might go more easily.
> > 
> > Yes I agree with c being the requested final value which is the GP for which
> > the callbacks will be queued. At the end of the GP c, the callbacks will 
> > have
> > executed.
> > 
> > About the rcu_seq_done check and why I believe its not right to use it in
> > that funnel locking loop, if you could allow me to try argument my point 
> > from
> > a different angle...
> > 
> > We agreed that the way gp_seq numbers work and 

Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-15 Thread Paul E. McKenney
On Tue, May 15, 2018 at 12:02:43AM -0700, Joel Fernandes wrote:
> Hi Paul,
> Good morning, hope you're having a great Tuesday. I managed to find some
> evening hours today to dig into this a bit more.
> 
> On Mon, May 14, 2018 at 08:59:52PM -0700, Paul E. McKenney wrote:
> > On Mon, May 14, 2018 at 06:51:33PM -0700, Joel Fernandes wrote:
> > > On Mon, May 14, 2018 at 10:38:16AM -0700, Paul E. McKenney wrote:
> > > > On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes (Google) wrote:
> > > > > rcu_seq_snap may be tricky for someone looking at it for the first 
> > > > > time.
> > > > > Lets document how it works with an example to make it easier.
> > > > > 
> > > > > Signed-off-by: Joel Fernandes (Google) 
> > > > > ---
> > > > >  kernel/rcu/rcu.h | 24 +++-
> > > > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > > > > index 003671825d62..fc3170914ac7 100644
> > > > > --- a/kernel/rcu/rcu.h
> > > > > +++ b/kernel/rcu/rcu.h
> > > > > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
> > > > >   WRITE_ONCE(*sp, rcu_seq_endval(sp));
> > > > >  }
> > > > > 
> > > > > -/* Take a snapshot of the update side's sequence number. */
> > > > > +/*
> > > > > + * Take a snapshot of the update side's sequence number.
> > > > > + *
> > > > > + * This function predicts what the grace period number will be the 
> > > > > next
> > > > > + * time an RCU callback will be executed, given the current grace 
> > > > > period's
> > > > > + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace 
> > > > > period is
> > > > > + * already in progress.
> > > > 
> > > > How about something like this?
> > > > 
> > > > This function returns the earliest value of the grace-period
> > > > sequence number that will indicate that a full grace period has
> > > > elapsed since the current time.  Once the grace-period sequence
> > > > number has reached this value, it will be safe to invoke all
> > > > callbacks that have been registered prior to the current time.
> > > > This value is the current grace-period number plus two to the
> > > > power of the number of low-order bits reserved for state, then
> > > > rounded up to the next value in which the state bits are all 
> > > > zero.
> > > 
> > > This makes sense too, but do you disagree with what I said?
> > 
> > In a pedantic sense, definitely.  RCU callbacks are being executed pretty
> > much all the time on a busy system, so it is only the recently queued
> > ones that are guaranteed to be deferred that long.  And my experience
> > indicates that someone really will get confused by that distinction,
> > so I feel justified in being pedantic in this case.
> 
> Ok I agree, I'll include your comment above.
> 
> > > Also just to let you know, thanks so much for elaborately providing an
> > > example on the other thread where we are discussing the rcu_seq_done 
> > > check. I
> > > will take some time to trace this down and see if I can zero in on the 
> > > same
> > > understanding as yours.
> > > 
> > > I get why we use rcu_seq_snap there in rcu_start_this_gp but the way it 
> > > its
> > > used is 'c' is the requested GP obtained from _snap, and we are comparing 
> > > that with the existing
> > > rnp->gp_seq in rcu_seq_done.  When that rnp->gp_seq reaches 'c', it only
> > > means rnp->gp_seq is done, it doesn't tell us if 'c' is done which is what
> > > we were trying to check in that loop... that's why I felt that check 
> > > wasn't
> > > correct - that's my (most likely wrong) take on the matter, and I'll get 
> > > back
> > > once I trace this a bit more hopefully today :-P
> > 
> > If your point is that interrupts are disabled throughout, so there isn't
> > much chance of the grace period completing during that time, you are
> > mostly right.  The places you might not be right are the idle loop and
> > offline CPUs.  And yes, call_rcu() doesn't like queuing callbacks onto
> > offline CPUs, but IIRC it is just fine in the case where callbacks have
> > been offloaded from that CPU.
> > 
> > And if you instead say that "c" is the requested final ->gp_seq value
> > obtained from _snap(), the thought process might go more easily.
> 
> Yes I agree with c being the requested final value which is the GP for which
> the callbacks will be queued. At the end of the GP c, the callbacks will have
> executed.
> 
> About the rcu_seq_done check and why I believe its not right to use it in
> that funnel locking loop, if you could allow me to try argument my point from
> a different angle...
> 
> We agreed that the way gp_seq numbers work and are compared with each other
> to identify if a GP is elapsed or not, is different from the way the previous
> numbers (gp_num) were compared.
> 
> Most notably, before the gp_seq conversions - inorder to start a GP, we were
> doing gp_num += 1, and 

Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-15 Thread Paul E. McKenney
On Tue, May 15, 2018 at 12:02:43AM -0700, Joel Fernandes wrote:
> Hi Paul,
> Good morning, hope you're having a great Tuesday. I managed to find some
> evening hours today to dig into this a bit more.
> 
> On Mon, May 14, 2018 at 08:59:52PM -0700, Paul E. McKenney wrote:
> > On Mon, May 14, 2018 at 06:51:33PM -0700, Joel Fernandes wrote:
> > > On Mon, May 14, 2018 at 10:38:16AM -0700, Paul E. McKenney wrote:
> > > > On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes (Google) wrote:
> > > > > rcu_seq_snap may be tricky for someone looking at it for the first 
> > > > > time.
> > > > > Lets document how it works with an example to make it easier.
> > > > > 
> > > > > Signed-off-by: Joel Fernandes (Google) 
> > > > > ---
> > > > >  kernel/rcu/rcu.h | 24 +++-
> > > > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > > > > index 003671825d62..fc3170914ac7 100644
> > > > > --- a/kernel/rcu/rcu.h
> > > > > +++ b/kernel/rcu/rcu.h
> > > > > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
> > > > >   WRITE_ONCE(*sp, rcu_seq_endval(sp));
> > > > >  }
> > > > > 
> > > > > -/* Take a snapshot of the update side's sequence number. */
> > > > > +/*
> > > > > + * Take a snapshot of the update side's sequence number.
> > > > > + *
> > > > > + * This function predicts what the grace period number will be the 
> > > > > next
> > > > > + * time an RCU callback will be executed, given the current grace 
> > > > > period's
> > > > > + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace 
> > > > > period is
> > > > > + * already in progress.
> > > > 
> > > > How about something like this?
> > > > 
> > > > This function returns the earliest value of the grace-period
> > > > sequence number that will indicate that a full grace period has
> > > > elapsed since the current time.  Once the grace-period sequence
> > > > number has reached this value, it will be safe to invoke all
> > > > callbacks that have been registered prior to the current time.
> > > > This value is the current grace-period number plus two to the
> > > > power of the number of low-order bits reserved for state, then
> > > > rounded up to the next value in which the state bits are all 
> > > > zero.
> > > 
> > > This makes sense too, but do you disagree with what I said?
> > 
> > In a pedantic sense, definitely.  RCU callbacks are being executed pretty
> > much all the time on a busy system, so it is only the recently queued
> > ones that are guaranteed to be deferred that long.  And my experience
> > indicates that someone really will get confused by that distinction,
> > so I feel justified in being pedantic in this case.
> 
> Ok I agree, I'll include your comment above.
> 
> > > Also just to let you know, thanks so much for elaborately providing an
> > > example on the other thread where we are discussing the rcu_seq_done 
> > > check. I
> > > will take some time to trace this down and see if I can zero in on the 
> > > same
> > > understanding as yours.
> > > 
> > > I get why we use rcu_seq_snap there in rcu_start_this_gp but the way it 
> > > its
> > > used is 'c' is the requested GP obtained from _snap, and we are comparing 
> > > that with the existing
> > > rnp->gp_seq in rcu_seq_done.  When that rnp->gp_seq reaches 'c', it only
> > > means rnp->gp_seq is done, it doesn't tell us if 'c' is done which is what
> > > we were trying to check in that loop... that's why I felt that check 
> > > wasn't
> > > correct - that's my (most likely wrong) take on the matter, and I'll get 
> > > back
> > > once I trace this a bit more hopefully today :-P
> > 
> > If your point is that interrupts are disabled throughout, so there isn't
> > much chance of the grace period completing during that time, you are
> > mostly right.  The places you might not be right are the idle loop and
> > offline CPUs.  And yes, call_rcu() doesn't like queuing callbacks onto
> > offline CPUs, but IIRC it is just fine in the case where callbacks have
> > been offloaded from that CPU.
> > 
> > And if you instead say that "c" is the requested final ->gp_seq value
> > obtained from _snap(), the thought process might go more easily.
> 
> Yes I agree with c being the requested final value which is the GP for which
> the callbacks will be queued. At the end of the GP c, the callbacks will have
> executed.
> 
> About the rcu_seq_done check and why I believe its not right to use it in
> that funnel locking loop, if you could allow me to try argument my point from
> a different angle...
> 
> We agreed that the way gp_seq numbers work and are compared with each other
> to identify if a GP is elapsed or not, is different from the way the previous
> numbers (gp_num) were compared.
> 
> Most notably, before the gp_seq conversions - inorder to start a GP, we were
> doing gp_num += 1, and completed had to catch 

Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-15 Thread Joel Fernandes
Hi Paul,
Good morning, hope you're having a great Tuesday. I managed to find some
evening hours today to dig into this a bit more.

On Mon, May 14, 2018 at 08:59:52PM -0700, Paul E. McKenney wrote:
> On Mon, May 14, 2018 at 06:51:33PM -0700, Joel Fernandes wrote:
> > On Mon, May 14, 2018 at 10:38:16AM -0700, Paul E. McKenney wrote:
> > > On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes (Google) wrote:
> > > > rcu_seq_snap may be tricky for someone looking at it for the first time.
> > > > Lets document how it works with an example to make it easier.
> > > > 
> > > > Signed-off-by: Joel Fernandes (Google) 
> > > > ---
> > > >  kernel/rcu/rcu.h | 24 +++-
> > > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > > > index 003671825d62..fc3170914ac7 100644
> > > > --- a/kernel/rcu/rcu.h
> > > > +++ b/kernel/rcu/rcu.h
> > > > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
> > > > WRITE_ONCE(*sp, rcu_seq_endval(sp));
> > > >  }
> > > > 
> > > > -/* Take a snapshot of the update side's sequence number. */
> > > > +/*
> > > > + * Take a snapshot of the update side's sequence number.
> > > > + *
> > > > + * This function predicts what the grace period number will be the next
> > > > + * time an RCU callback will be executed, given the current grace 
> > > > period's
> > > > + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period 
> > > > is
> > > > + * already in progress.
> > > 
> > > How about something like this?
> > > 
> > >   This function returns the earliest value of the grace-period
> > >   sequence number that will indicate that a full grace period has
> > >   elapsed since the current time.  Once the grace-period sequence
> > >   number has reached this value, it will be safe to invoke all
> > >   callbacks that have been registered prior to the current time.
> > >   This value is the current grace-period number plus two to the
> > >   power of the number of low-order bits reserved for state, then
> > >   rounded up to the next value in which the state bits are all zero.
> > 
> > This makes sense too, but do you disagree with what I said?
> 
> In a pedantic sense, definitely.  RCU callbacks are being executed pretty
> much all the time on a busy system, so it is only the recently queued
> ones that are guaranteed to be deferred that long.  And my experience
> indicates that someone really will get confused by that distinction,
> so I feel justified in being pedantic in this case.

Ok I agree, I'll include your comment above.

> > Also just to let you know, thanks so much for elaborately providing an
> > example on the other thread where we are discussing the rcu_seq_done check. 
> > I
> > will take some time to trace this down and see if I can zero in on the same
> > understanding as yours.
> > 
> > I get why we use rcu_seq_snap there in rcu_start_this_gp but the way it its
> > used is 'c' is the requested GP obtained from _snap, and we are comparing 
> > that with the existing
> > rnp->gp_seq in rcu_seq_done.  When that rnp->gp_seq reaches 'c', it only
> > means rnp->gp_seq is done, it doesn't tell us if 'c' is done which is what
> > we were trying to check in that loop... that's why I felt that check wasn't
> > correct - that's my (most likely wrong) take on the matter, and I'll get 
> > back
> > once I trace this a bit more hopefully today :-P
> 
> If your point is that interrupts are disabled throughout, so there isn't
> much chance of the grace period completing during that time, you are
> mostly right.  The places you might not be right are the idle loop and
> offline CPUs.  And yes, call_rcu() doesn't like queuing callbacks onto
> offline CPUs, but IIRC it is just fine in the case where callbacks have
> been offloaded from that CPU.
> 
> And if you instead say that "c" is the requested final ->gp_seq value
> obtained from _snap(), the thought process might go more easily.

Yes I agree with c being the requested final value which is the GP for which
the callbacks will be queued. At the end of the GP c, the callbacks will have
executed.

About the rcu_seq_done check and why I believe its not right to use it in
that funnel locking loop, if you could allow me to try argument my point from
a different angle...

We agreed that the way gp_seq numbers work and are compared with each other
to identify if a GP is elapsed or not, is different from the way the previous
numbers (gp_num) were compared.

Most notably, before the gp_seq conversions - inorder to start a GP, we were
doing gp_num += 1, and completed had to catch up to gp_num + 1 to mark the
end.

Now with gp_seq, for a gp to start, we don't do the "+1", we just set the
state bits. To mark the end, we clear the state bits and increment the gp_num
part of gp_seq.

However, in the below commit 12d6c129fd0a ("rcu: Convert grace-period
requests to ->gp_seq"). You did a one-to-one replacement 

Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-15 Thread Joel Fernandes
Hi Paul,
Good morning, hope you're having a great Tuesday. I managed to find some
evening hours today to dig into this a bit more.

On Mon, May 14, 2018 at 08:59:52PM -0700, Paul E. McKenney wrote:
> On Mon, May 14, 2018 at 06:51:33PM -0700, Joel Fernandes wrote:
> > On Mon, May 14, 2018 at 10:38:16AM -0700, Paul E. McKenney wrote:
> > > On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes (Google) wrote:
> > > > rcu_seq_snap may be tricky for someone looking at it for the first time.
> > > > Lets document how it works with an example to make it easier.
> > > > 
> > > > Signed-off-by: Joel Fernandes (Google) 
> > > > ---
> > > >  kernel/rcu/rcu.h | 24 +++-
> > > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > > > index 003671825d62..fc3170914ac7 100644
> > > > --- a/kernel/rcu/rcu.h
> > > > +++ b/kernel/rcu/rcu.h
> > > > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
> > > > WRITE_ONCE(*sp, rcu_seq_endval(sp));
> > > >  }
> > > > 
> > > > -/* Take a snapshot of the update side's sequence number. */
> > > > +/*
> > > > + * Take a snapshot of the update side's sequence number.
> > > > + *
> > > > + * This function predicts what the grace period number will be the next
> > > > + * time an RCU callback will be executed, given the current grace 
> > > > period's
> > > > + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period 
> > > > is
> > > > + * already in progress.
> > > 
> > > How about something like this?
> > > 
> > >   This function returns the earliest value of the grace-period
> > >   sequence number that will indicate that a full grace period has
> > >   elapsed since the current time.  Once the grace-period sequence
> > >   number has reached this value, it will be safe to invoke all
> > >   callbacks that have been registered prior to the current time.
> > >   This value is the current grace-period number plus two to the
> > >   power of the number of low-order bits reserved for state, then
> > >   rounded up to the next value in which the state bits are all zero.
> > 
> > This makes sense too, but do you disagree with what I said?
> 
> In a pedantic sense, definitely.  RCU callbacks are being executed pretty
> much all the time on a busy system, so it is only the recently queued
> ones that are guaranteed to be deferred that long.  And my experience
> indicates that someone really will get confused by that distinction,
> so I feel justified in being pedantic in this case.

Ok I agree, I'll include your comment above.

> > Also just to let you know, thanks so much for elaborately providing an
> > example on the other thread where we are discussing the rcu_seq_done check. 
> > I
> > will take some time to trace this down and see if I can zero in on the same
> > understanding as yours.
> > 
> > I get why we use rcu_seq_snap there in rcu_start_this_gp but the way it its
> > used is 'c' is the requested GP obtained from _snap, and we are comparing 
> > that with the existing
> > rnp->gp_seq in rcu_seq_done.  When that rnp->gp_seq reaches 'c', it only
> > means rnp->gp_seq is done, it doesn't tell us if 'c' is done which is what
> > we were trying to check in that loop... that's why I felt that check wasn't
> > correct - that's my (most likely wrong) take on the matter, and I'll get 
> > back
> > once I trace this a bit more hopefully today :-P
> 
> If your point is that interrupts are disabled throughout, so there isn't
> much chance of the grace period completing during that time, you are
> mostly right.  The places you might not be right are the idle loop and
> offline CPUs.  And yes, call_rcu() doesn't like queuing callbacks onto
> offline CPUs, but IIRC it is just fine in the case where callbacks have
> been offloaded from that CPU.
> 
> And if you instead say that "c" is the requested final ->gp_seq value
> obtained from _snap(), the thought process might go more easily.

Yes I agree with c being the requested final value which is the GP for which
the callbacks will be queued. At the end of the GP c, the callbacks will have
executed.

About the rcu_seq_done check and why I believe its not right to use it in
that funnel locking loop, if you could allow me to try argument my point from
a different angle...

We agreed that the way gp_seq numbers work and are compared with each other
to identify if a GP is elapsed or not, is different from the way the previous
numbers (gp_num) were compared.

Most notably, before the gp_seq conversions - inorder to start a GP, we were
doing gp_num += 1, and completed had to catch up to gp_num + 1 to mark the
end.

Now with gp_seq, for a gp to start, we don't do the "+1", we just set the
state bits. To mark the end, we clear the state bits and increment the gp_num
part of gp_seq.

However, in the below commit 12d6c129fd0a ("rcu: Convert grace-period
requests to ->gp_seq"). You did a one-to-one replacement of the ULONG_CMP_GE

Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-14 Thread Paul E. McKenney
On Mon, May 14, 2018 at 06:51:33PM -0700, Joel Fernandes wrote:
> On Mon, May 14, 2018 at 10:38:16AM -0700, Paul E. McKenney wrote:
> > On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes (Google) wrote:
> > > rcu_seq_snap may be tricky for someone looking at it for the first time.
> > > Lets document how it works with an example to make it easier.
> > > 
> > > Signed-off-by: Joel Fernandes (Google) 
> > > ---
> > >  kernel/rcu/rcu.h | 24 +++-
> > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > > index 003671825d62..fc3170914ac7 100644
> > > --- a/kernel/rcu/rcu.h
> > > +++ b/kernel/rcu/rcu.h
> > > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
> > >   WRITE_ONCE(*sp, rcu_seq_endval(sp));
> > >  }
> > > 
> > > -/* Take a snapshot of the update side's sequence number. */
> > > +/*
> > > + * Take a snapshot of the update side's sequence number.
> > > + *
> > > + * This function predicts what the grace period number will be the next
> > > + * time an RCU callback will be executed, given the current grace 
> > > period's
> > > + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
> > > + * already in progress.
> > 
> > How about something like this?
> > 
> > This function returns the earliest value of the grace-period
> > sequence number that will indicate that a full grace period has
> > elapsed since the current time.  Once the grace-period sequence
> > number has reached this value, it will be safe to invoke all
> > callbacks that have been registered prior to the current time.
> > This value is the current grace-period number plus two to the
> > power of the number of low-order bits reserved for state, then
> > rounded up to the next value in which the state bits are all zero.
> 
> This makes sense too, but do you disagree with what I said?

In a pedantic sense, definitely.  RCU callbacks are being executed pretty
much all the time on a busy system, so it is only the recently queued
ones that are guaranteed to be deferred that long.  And my experience
indicates that someone really will get confused by that distinction,
so I feel justified in being pedantic in this case.

> I was kind of thinking of snap along the lines of how the previous code
> worked. Where you were calling rcu_cbs_completed() or a function with a
> similar name. Now we call _snap.

You are quite correct that rcu_seq_snap() is the new analog of
rcu_cbs_completed(), though there are differences due to the old ->gpnum
and ->completed now being represented by a single ->gp_seq value.

> So basically I connected these 2 facts together to mean that rcu_seq_snap
> also does that same thing as rcu_cbs_completed - which is basically it gives
> the "next GP" where existing callbacks have already run and new callbacks
> will run at the end of this "next GP".

Ah, but you didn't have the qualifier "new" in your original.  And even
then, the definition of "new" might be different for different readers.

> > > + *
> > > + * We do this with a single addition and masking.
> > 
> > Please either fold this sentence into rest of the paragraph or add a
> > blank line after it.
> > 
> > > + * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit 
> > > (LSB) of
> > > + * the seq is used to track if a GP is in progress or not, its 
> > > sufficient if we
> > > + * add (2+1) and mask with ~1. Let's see why with an example:
> > > + *
> > > + * Say the current seq is 6 which is 0b110 (gp is 3 and state bit is 0).
> > > + * To get the next GP number, we have to at least add 0b10 to this (0x1 
> > > << 1)
> > > + * to account for the state bit. However, if the current seq is 7 (gp is 
> > > 3 and
> > > + * state bit is 1), then it means the current grace period is already in
> > > + * progress so the next time the callback will run is at the end of grace
> > > + * period number gp+2. To account for the extra +1, we just overflow the 
> > > LSB by
> > > + * adding another 0x1 and masking with ~0x1. In case no GP was in 
> > > progress (RCU
> > > + * is idle), then the addition of the extra 0x1 and masking will have no
> > > + * effect. This is calculated as below.
> > > + */
> > 
> > Having the explicit numbers is good, but please use RCU_SEQ_STATE_MASK=3,
> > since that is the current value.  One alternative (or perhaps addition)
> > is to have a short table of numbers showing the mapping from *sp to the
> > return value.  (I started from such a table when writing this function,
> > for whatever that is worth.)
> 
> Ok I'll try to give a better example above. thanks,

Sounds good!

> Also just to let you know, thanks so much for elaborately providing an
> example on the other thread where we are discussing the rcu_seq_done check. I
> will take some time to trace this down and see if I can zero in on the same
> understanding as yours.
> 
> I get why we use 

Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-14 Thread Paul E. McKenney
On Mon, May 14, 2018 at 06:51:33PM -0700, Joel Fernandes wrote:
> On Mon, May 14, 2018 at 10:38:16AM -0700, Paul E. McKenney wrote:
> > On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes (Google) wrote:
> > > rcu_seq_snap may be tricky for someone looking at it for the first time.
> > > Lets document how it works with an example to make it easier.
> > > 
> > > Signed-off-by: Joel Fernandes (Google) 
> > > ---
> > >  kernel/rcu/rcu.h | 24 +++-
> > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > > index 003671825d62..fc3170914ac7 100644
> > > --- a/kernel/rcu/rcu.h
> > > +++ b/kernel/rcu/rcu.h
> > > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
> > >   WRITE_ONCE(*sp, rcu_seq_endval(sp));
> > >  }
> > > 
> > > -/* Take a snapshot of the update side's sequence number. */
> > > +/*
> > > + * Take a snapshot of the update side's sequence number.
> > > + *
> > > + * This function predicts what the grace period number will be the next
> > > + * time an RCU callback will be executed, given the current grace 
> > > period's
> > > + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
> > > + * already in progress.
> > 
> > How about something like this?
> > 
> > This function returns the earliest value of the grace-period
> > sequence number that will indicate that a full grace period has
> > elapsed since the current time.  Once the grace-period sequence
> > number has reached this value, it will be safe to invoke all
> > callbacks that have been registered prior to the current time.
> > This value is the current grace-period number plus two to the
> > power of the number of low-order bits reserved for state, then
> > rounded up to the next value in which the state bits are all zero.
> 
> This makes sense too, but do you disagree with what I said?

In a pedantic sense, definitely.  RCU callbacks are being executed pretty
much all the time on a busy system, so it is only the recently queued
ones that are guaranteed to be deferred that long.  And my experience
indicates that someone really will get confused by that distinction,
so I feel justified in being pedantic in this case.

> I was kind of thinking of snap along the lines of how the previous code
> worked. Where you were calling rcu_cbs_completed() or a function with a
> similar name. Now we call _snap.

You are quite correct that rcu_seq_snap() is the new analog of
rcu_cbs_completed(), though there are differences due to the old ->gpnum
and ->completed now being represented by a single ->gp_seq value.

> So basically I connected these 2 facts together to mean that rcu_seq_snap
> also does that same thing as rcu_cbs_completed - which is basically it gives
> the "next GP" where existing callbacks have already run and new callbacks
> will run at the end of this "next GP".

Ah, but you didn't have the qualifier "new" in your original.  And even
then, the definition of "new" might be different for different readers.

> > > + *
> > > + * We do this with a single addition and masking.
> > 
> > Please either fold this sentence into rest of the paragraph or add a
> > blank line after it.
> > 
> > > + * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit 
> > > (LSB) of
> > > + * the seq is used to track if a GP is in progress or not, its 
> > > sufficient if we
> > > + * add (2+1) and mask with ~1. Let's see why with an example:
> > > + *
> > > + * Say the current seq is 6 which is 0b110 (gp is 3 and state bit is 0).
> > > + * To get the next GP number, we have to at least add 0b10 to this (0x1 
> > > << 1)
> > > + * to account for the state bit. However, if the current seq is 7 (gp is 
> > > 3 and
> > > + * state bit is 1), then it means the current grace period is already in
> > > + * progress so the next time the callback will run is at the end of grace
> > > + * period number gp+2. To account for the extra +1, we just overflow the 
> > > LSB by
> > > + * adding another 0x1 and masking with ~0x1. In case no GP was in 
> > > progress (RCU
> > > + * is idle), then the addition of the extra 0x1 and masking will have no
> > > + * effect. This is calculated as below.
> > > + */
> > 
> > Having the explicit numbers is good, but please use RCU_SEQ_STATE_MASK=3,
> > since that is the current value.  One alternative (or perhaps addition)
> > is to have a short table of numbers showing the mapping from *sp to the
> > return value.  (I started from such a table when writing this function,
> > for whatever that is worth.)
> 
> Ok I'll try to give a better example above. thanks,

Sounds good!

> Also just to let you know, thanks so much for elaborately providing an
> example on the other thread where we are discussing the rcu_seq_done check. I
> will take some time to trace this down and see if I can zero in on the same
> understanding as yours.
> 
> I get why we use rcu_seq_snap there in 

Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-14 Thread Joel Fernandes
On Mon, May 14, 2018 at 10:38:16AM -0700, Paul E. McKenney wrote:
> On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes (Google) wrote:
> > rcu_seq_snap may be tricky for someone looking at it for the first time.
> > Lets document how it works with an example to make it easier.
> > 
> > Signed-off-by: Joel Fernandes (Google) 
> > ---
> >  kernel/rcu/rcu.h | 24 +++-
> >  1 file changed, 23 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > index 003671825d62..fc3170914ac7 100644
> > --- a/kernel/rcu/rcu.h
> > +++ b/kernel/rcu/rcu.h
> > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
> > WRITE_ONCE(*sp, rcu_seq_endval(sp));
> >  }
> > 
> > -/* Take a snapshot of the update side's sequence number. */
> > +/*
> > + * Take a snapshot of the update side's sequence number.
> > + *
> > + * This function predicts what the grace period number will be the next
> > + * time an RCU callback will be executed, given the current grace period's
> > + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
> > + * already in progress.
> 
> How about something like this?
> 
>   This function returns the earliest value of the grace-period
>   sequence number that will indicate that a full grace period has
>   elapsed since the current time.  Once the grace-period sequence
>   number has reached this value, it will be safe to invoke all
>   callbacks that have been registered prior to the current time.
>   This value is the current grace-period number plus two to the
>   power of the number of low-order bits reserved for state, then
>   rounded up to the next value in which the state bits are all zero.

This makes sense too, but do you disagree with what I said?

I was kind of thinking of snap along the lines of how the previous code
worked. Where you were calling rcu_cbs_completed() or a function with a
similar name. Now we call _snap.

So basically I connected these 2 facts together to mean that rcu_seq_snap
also does that same thing as rcu_cbs_completed - which is basically it gives
the "next GP" where existing callbacks have already run and new callbacks
will run at the end of this "next GP".

> > + *
> > + * We do this with a single addition and masking.
> 
> Please either fold this sentence into rest of the paragraph or add a
> blank line after it.
> 
> > + * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit 
> > (LSB) of
> > + * the seq is used to track if a GP is in progress or not, its sufficient 
> > if we
> > + * add (2+1) and mask with ~1. Let's see why with an example:
> > + *
> > + * Say the current seq is 6 which is 0b110 (gp is 3 and state bit is 0).
> > + * To get the next GP number, we have to at least add 0b10 to this (0x1 << 
> > 1)
> > + * to account for the state bit. However, if the current seq is 7 (gp is 3 
> > and
> > + * state bit is 1), then it means the current grace period is already in
> > + * progress so the next time the callback will run is at the end of grace
> > + * period number gp+2. To account for the extra +1, we just overflow the 
> > LSB by
> > + * adding another 0x1 and masking with ~0x1. In case no GP was in progress 
> > (RCU
> > + * is idle), then the addition of the extra 0x1 and masking will have no
> > + * effect. This is calculated as below.
> > + */
> 
> Having the explicit numbers is good, but please use RCU_SEQ_STATE_MASK=3,
> since that is the current value.  One alternative (or perhaps addition)
> is to have a short table of numbers showing the mapping from *sp to the
> return value.  (I started from such a table when writing this function,
> for whatever that is worth.)

Ok I'll try to give a better example above. thanks,

Also just to let you know, thanks so much for elaborately providing an
example on the other thread where we are discussing the rcu_seq_done check. I
will take some time to trace this down and see if I can zero in on the same
understanding as yours.

I get why we use rcu_seq_snap there in rcu_start_this_gp but the way it its
used is 'c' is the requested GP obtained from _snap, and we are comparing that 
with the existing
rnp->gp_seq in rcu_seq_done.  When that rnp->gp_seq reaches 'c', it only
means rnp->gp_seq is done, it doesn't tell us if 'c' is done which is what
we were trying to check in that loop... that's why I felt that check wasn't
correct - that's my (most likely wrong) take on the matter, and I'll get back
once I trace this a bit more hopefully today :-P

thanks!

- Joel
 


Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-14 Thread Joel Fernandes
On Mon, May 14, 2018 at 10:38:16AM -0700, Paul E. McKenney wrote:
> On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes (Google) wrote:
> > rcu_seq_snap may be tricky for someone looking at it for the first time.
> > Lets document how it works with an example to make it easier.
> > 
> > Signed-off-by: Joel Fernandes (Google) 
> > ---
> >  kernel/rcu/rcu.h | 24 +++-
> >  1 file changed, 23 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > index 003671825d62..fc3170914ac7 100644
> > --- a/kernel/rcu/rcu.h
> > +++ b/kernel/rcu/rcu.h
> > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
> > WRITE_ONCE(*sp, rcu_seq_endval(sp));
> >  }
> > 
> > -/* Take a snapshot of the update side's sequence number. */
> > +/*
> > + * Take a snapshot of the update side's sequence number.
> > + *
> > + * This function predicts what the grace period number will be the next
> > + * time an RCU callback will be executed, given the current grace period's
> > + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
> > + * already in progress.
> 
> How about something like this?
> 
>   This function returns the earliest value of the grace-period
>   sequence number that will indicate that a full grace period has
>   elapsed since the current time.  Once the grace-period sequence
>   number has reached this value, it will be safe to invoke all
>   callbacks that have been registered prior to the current time.
>   This value is the current grace-period number plus two to the
>   power of the number of low-order bits reserved for state, then
>   rounded up to the next value in which the state bits are all zero.

This makes sense too, but do you disagree with what I said?

I was kind of thinking of snap along the lines of how the previous code
worked. Where you were calling rcu_cbs_completed() or a function with a
similar name. Now we call _snap.

So basically I connected these 2 facts together to mean that rcu_seq_snap
also does that same thing as rcu_cbs_completed - which is basically it gives
the "next GP" where existing callbacks have already run and new callbacks
will run at the end of this "next GP".

> > + *
> > + * We do this with a single addition and masking.
> 
> Please either fold this sentence into rest of the paragraph or add a
> blank line after it.
> 
> > + * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit 
> > (LSB) of
> > + * the seq is used to track if a GP is in progress or not, its sufficient 
> > if we
> > + * add (2+1) and mask with ~1. Let's see why with an example:
> > + *
> > + * Say the current seq is 6 which is 0b110 (gp is 3 and state bit is 0).
> > + * To get the next GP number, we have to at least add 0b10 to this (0x1 << 
> > 1)
> > + * to account for the state bit. However, if the current seq is 7 (gp is 3 
> > and
> > + * state bit is 1), then it means the current grace period is already in
> > + * progress so the next time the callback will run is at the end of grace
> > + * period number gp+2. To account for the extra +1, we just overflow the 
> > LSB by
> > + * adding another 0x1 and masking with ~0x1. In case no GP was in progress 
> > (RCU
> > + * is idle), then the addition of the extra 0x1 and masking will have no
> > + * effect. This is calculated as below.
> > + */
> 
> Having the explicit numbers is good, but please use RCU_SEQ_STATE_MASK=3,
> since that is the current value.  One alternative (or perhaps addition)
> is to have a short table of numbers showing the mapping from *sp to the
> return value.  (I started from such a table when writing this function,
> for whatever that is worth.)

Ok I'll try to give a better example above. thanks,

Also just to let you know, thanks so much for elaborately providing an
example on the other thread where we are discussing the rcu_seq_done check. I
will take some time to trace this down and see if I can zero in on the same
understanding as yours.

I get why we use rcu_seq_snap there in rcu_start_this_gp but the way it its
used is 'c' is the requested GP obtained from _snap, and we are comparing that 
with the existing
rnp->gp_seq in rcu_seq_done.  When that rnp->gp_seq reaches 'c', it only
means rnp->gp_seq is done, it doesn't tell us if 'c' is done which is what
we were trying to check in that loop... that's why I felt that check wasn't
correct - that's my (most likely wrong) take on the matter, and I'll get back
once I trace this a bit more hopefully today :-P

thanks!

- Joel
 


Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-14 Thread Paul E. McKenney
On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes (Google) wrote:
> rcu_seq_snap may be tricky for someone looking at it for the first time.
> Lets document how it works with an example to make it easier.
> 
> Signed-off-by: Joel Fernandes (Google) 
> ---
>  kernel/rcu/rcu.h | 24 +++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> index 003671825d62..fc3170914ac7 100644
> --- a/kernel/rcu/rcu.h
> +++ b/kernel/rcu/rcu.h
> @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
>   WRITE_ONCE(*sp, rcu_seq_endval(sp));
>  }
> 
> -/* Take a snapshot of the update side's sequence number. */
> +/*
> + * Take a snapshot of the update side's sequence number.
> + *
> + * This function predicts what the grace period number will be the next
> + * time an RCU callback will be executed, given the current grace period's
> + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
> + * already in progress.

How about something like this?

This function returns the earliest value of the grace-period
sequence number that will indicate that a full grace period has
elapsed since the current time.  Once the grace-period sequence
number has reached this value, it will be safe to invoke all
callbacks that have been registered prior to the current time.
This value is the current grace-period number plus two to the
power of the number of low-order bits reserved for state, then
rounded up to the next value in which the state bits are all zero.

> + *
> + * We do this with a single addition and masking.

Please either fold this sentence into rest of the paragraph or add a
blank line after it.

> + * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit (LSB) 
> of
> + * the seq is used to track if a GP is in progress or not, its sufficient if 
> we
> + * add (2+1) and mask with ~1. Let's see why with an example:
> + *
> + * Say the current seq is 6 which is 0b110 (gp is 3 and state bit is 0).
> + * To get the next GP number, we have to at least add 0b10 to this (0x1 << 1)
> + * to account for the state bit. However, if the current seq is 7 (gp is 3 
> and
> + * state bit is 1), then it means the current grace period is already in
> + * progress so the next time the callback will run is at the end of grace
> + * period number gp+2. To account for the extra +1, we just overflow the LSB 
> by
> + * adding another 0x1 and masking with ~0x1. In case no GP was in progress 
> (RCU
> + * is idle), then the addition of the extra 0x1 and masking will have no
> + * effect. This is calculated as below.
> + */

Having the explicit numbers is good, but please use RCU_SEQ_STATE_MASK=3,
since that is the current value.  One alternative (or perhaps addition)
is to have a short table of numbers showing the mapping from *sp to the
return value.  (I started from such a table when writing this function,
for whatever that is worth.)

Thanx, Paul

>  static inline unsigned long rcu_seq_snap(unsigned long *sp)
>  {
>   unsigned long s;
> -- 
> 2.17.0.441.gb46fe60e1d-goog
> 



Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-14 Thread Paul E. McKenney
On Sun, May 13, 2018 at 08:15:34PM -0700, Joel Fernandes (Google) wrote:
> rcu_seq_snap may be tricky for someone looking at it for the first time.
> Lets document how it works with an example to make it easier.
> 
> Signed-off-by: Joel Fernandes (Google) 
> ---
>  kernel/rcu/rcu.h | 24 +++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> index 003671825d62..fc3170914ac7 100644
> --- a/kernel/rcu/rcu.h
> +++ b/kernel/rcu/rcu.h
> @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
>   WRITE_ONCE(*sp, rcu_seq_endval(sp));
>  }
> 
> -/* Take a snapshot of the update side's sequence number. */
> +/*
> + * Take a snapshot of the update side's sequence number.
> + *
> + * This function predicts what the grace period number will be the next
> + * time an RCU callback will be executed, given the current grace period's
> + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
> + * already in progress.

How about something like this?

This function returns the earliest value of the grace-period
sequence number that will indicate that a full grace period has
elapsed since the current time.  Once the grace-period sequence
number has reached this value, it will be safe to invoke all
callbacks that have been registered prior to the current time.
This value is the current grace-period number plus two to the
power of the number of low-order bits reserved for state, then
rounded up to the next value in which the state bits are all zero.

> + *
> + * We do this with a single addition and masking.

Please either fold this sentence into rest of the paragraph or add a
blank line after it.

> + * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit (LSB) 
> of
> + * the seq is used to track if a GP is in progress or not, its sufficient if 
> we
> + * add (2+1) and mask with ~1. Let's see why with an example:
> + *
> + * Say the current seq is 6 which is 0b110 (gp is 3 and state bit is 0).
> + * To get the next GP number, we have to at least add 0b10 to this (0x1 << 1)
> + * to account for the state bit. However, if the current seq is 7 (gp is 3 
> and
> + * state bit is 1), then it means the current grace period is already in
> + * progress so the next time the callback will run is at the end of grace
> + * period number gp+2. To account for the extra +1, we just overflow the LSB 
> by
> + * adding another 0x1 and masking with ~0x1. In case no GP was in progress 
> (RCU
> + * is idle), then the addition of the extra 0x1 and masking will have no
> + * effect. This is calculated as below.
> + */

Having the explicit numbers is good, but please use RCU_SEQ_STATE_MASK=3,
since that is the current value.  One alternative (or perhaps addition)
is to have a short table of numbers showing the mapping from *sp to the
return value.  (I started from such a table when writing this function,
for whatever that is worth.)

Thanx, Paul

>  static inline unsigned long rcu_seq_snap(unsigned long *sp)
>  {
>   unsigned long s;
> -- 
> 2.17.0.441.gb46fe60e1d-goog
> 



Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-13 Thread Joel Fernandes
On Sun, May 13, 2018 at 08:47:24PM -0700, Randy Dunlap wrote:
> On 05/13/2018 08:15 PM, Joel Fernandes (Google) wrote:
> > rcu_seq_snap may be tricky for someone looking at it for the first time.
> > Lets document how it works with an example to make it easier.
> > 
> > Signed-off-by: Joel Fernandes (Google) 
> > ---
> >  kernel/rcu/rcu.h | 24 +++-
> >  1 file changed, 23 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > index 003671825d62..fc3170914ac7 100644
> > --- a/kernel/rcu/rcu.h
> > +++ b/kernel/rcu/rcu.h
> > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
> > WRITE_ONCE(*sp, rcu_seq_endval(sp));
> >  }
> >  
> > -/* Take a snapshot of the update side's sequence number. */
> > +/*
> > + * Take a snapshot of the update side's sequence number.
> > + *
> > + * This function predicts what the grace period number will be the next
> > + * time an RCU callback will be executed, given the current grace period's
> > + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
> > + * already in progress.
> > + *
> > + * We do this with a single addition and masking.
> > + * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit 
> > (LSB) of
> > + * the seq is used to track if a GP is in progress or not, its sufficient 
> > if we
> 
>   it's

Pardon my english. Fixed, thanks,

- Joel



Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-13 Thread Joel Fernandes
On Sun, May 13, 2018 at 08:47:24PM -0700, Randy Dunlap wrote:
> On 05/13/2018 08:15 PM, Joel Fernandes (Google) wrote:
> > rcu_seq_snap may be tricky for someone looking at it for the first time.
> > Lets document how it works with an example to make it easier.
> > 
> > Signed-off-by: Joel Fernandes (Google) 
> > ---
> >  kernel/rcu/rcu.h | 24 +++-
> >  1 file changed, 23 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > index 003671825d62..fc3170914ac7 100644
> > --- a/kernel/rcu/rcu.h
> > +++ b/kernel/rcu/rcu.h
> > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
> > WRITE_ONCE(*sp, rcu_seq_endval(sp));
> >  }
> >  
> > -/* Take a snapshot of the update side's sequence number. */
> > +/*
> > + * Take a snapshot of the update side's sequence number.
> > + *
> > + * This function predicts what the grace period number will be the next
> > + * time an RCU callback will be executed, given the current grace period's
> > + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
> > + * already in progress.
> > + *
> > + * We do this with a single addition and masking.
> > + * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit 
> > (LSB) of
> > + * the seq is used to track if a GP is in progress or not, its sufficient 
> > if we
> 
>   it's

Pardon my english. Fixed, thanks,

- Joel



Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-13 Thread Randy Dunlap
On 05/13/2018 08:15 PM, Joel Fernandes (Google) wrote:
> rcu_seq_snap may be tricky for someone looking at it for the first time.
> Lets document how it works with an example to make it easier.
> 
> Signed-off-by: Joel Fernandes (Google) 
> ---
>  kernel/rcu/rcu.h | 24 +++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> index 003671825d62..fc3170914ac7 100644
> --- a/kernel/rcu/rcu.h
> +++ b/kernel/rcu/rcu.h
> @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
>   WRITE_ONCE(*sp, rcu_seq_endval(sp));
>  }
>  
> -/* Take a snapshot of the update side's sequence number. */
> +/*
> + * Take a snapshot of the update side's sequence number.
> + *
> + * This function predicts what the grace period number will be the next
> + * time an RCU callback will be executed, given the current grace period's
> + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
> + * already in progress.
> + *
> + * We do this with a single addition and masking.
> + * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit (LSB) 
> of
> + * the seq is used to track if a GP is in progress or not, its sufficient if 
> we

  it's

> + * add (2+1) and mask with ~1. Let's see why with an example:
> + *
> + * Say the current seq is 6 which is 0b110 (gp is 3 and state bit is 0).
> + * To get the next GP number, we have to at least add 0b10 to this (0x1 << 1)
> + * to account for the state bit. However, if the current seq is 7 (gp is 3 
> and
> + * state bit is 1), then it means the current grace period is already in
> + * progress so the next time the callback will run is at the end of grace
> + * period number gp+2. To account for the extra +1, we just overflow the LSB 
> by
> + * adding another 0x1 and masking with ~0x1. In case no GP was in progress 
> (RCU
> + * is idle), then the addition of the extra 0x1 and masking will have no
> + * effect. This is calculated as below.
> + */
>  static inline unsigned long rcu_seq_snap(unsigned long *sp)
>  {
>   unsigned long s;
> 


-- 
~Randy


Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-13 Thread Randy Dunlap
On 05/13/2018 08:15 PM, Joel Fernandes (Google) wrote:
> rcu_seq_snap may be tricky for someone looking at it for the first time.
> Lets document how it works with an example to make it easier.
> 
> Signed-off-by: Joel Fernandes (Google) 
> ---
>  kernel/rcu/rcu.h | 24 +++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> index 003671825d62..fc3170914ac7 100644
> --- a/kernel/rcu/rcu.h
> +++ b/kernel/rcu/rcu.h
> @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
>   WRITE_ONCE(*sp, rcu_seq_endval(sp));
>  }
>  
> -/* Take a snapshot of the update side's sequence number. */
> +/*
> + * Take a snapshot of the update side's sequence number.
> + *
> + * This function predicts what the grace period number will be the next
> + * time an RCU callback will be executed, given the current grace period's
> + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
> + * already in progress.
> + *
> + * We do this with a single addition and masking.
> + * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit (LSB) 
> of
> + * the seq is used to track if a GP is in progress or not, its sufficient if 
> we

  it's

> + * add (2+1) and mask with ~1. Let's see why with an example:
> + *
> + * Say the current seq is 6 which is 0b110 (gp is 3 and state bit is 0).
> + * To get the next GP number, we have to at least add 0b10 to this (0x1 << 1)
> + * to account for the state bit. However, if the current seq is 7 (gp is 3 
> and
> + * state bit is 1), then it means the current grace period is already in
> + * progress so the next time the callback will run is at the end of grace
> + * period number gp+2. To account for the extra +1, we just overflow the LSB 
> by
> + * adding another 0x1 and masking with ~0x1. In case no GP was in progress 
> (RCU
> + * is idle), then the addition of the extra 0x1 and masking will have no
> + * effect. This is calculated as below.
> + */
>  static inline unsigned long rcu_seq_snap(unsigned long *sp)
>  {
>   unsigned long s;
> 


-- 
~Randy


[PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-13 Thread Joel Fernandes (Google)
rcu_seq_snap may be tricky for someone looking at it for the first time.
Lets document how it works with an example to make it easier.

Signed-off-by: Joel Fernandes (Google) 
---
 kernel/rcu/rcu.h | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 003671825d62..fc3170914ac7 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
WRITE_ONCE(*sp, rcu_seq_endval(sp));
 }
 
-/* Take a snapshot of the update side's sequence number. */
+/*
+ * Take a snapshot of the update side's sequence number.
+ *
+ * This function predicts what the grace period number will be the next
+ * time an RCU callback will be executed, given the current grace period's
+ * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
+ * already in progress.
+ *
+ * We do this with a single addition and masking.
+ * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit (LSB) of
+ * the seq is used to track if a GP is in progress or not, its sufficient if we
+ * add (2+1) and mask with ~1. Let's see why with an example:
+ *
+ * Say the current seq is 6 which is 0b110 (gp is 3 and state bit is 0).
+ * To get the next GP number, we have to at least add 0b10 to this (0x1 << 1)
+ * to account for the state bit. However, if the current seq is 7 (gp is 3 and
+ * state bit is 1), then it means the current grace period is already in
+ * progress so the next time the callback will run is at the end of grace
+ * period number gp+2. To account for the extra +1, we just overflow the LSB by
+ * adding another 0x1 and masking with ~0x1. In case no GP was in progress (RCU
+ * is idle), then the addition of the extra 0x1 and masking will have no
+ * effect. This is calculated as below.
+ */
 static inline unsigned long rcu_seq_snap(unsigned long *sp)
 {
unsigned long s;
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-13 Thread Joel Fernandes (Google)
rcu_seq_snap may be tricky for someone looking at it for the first time.
Lets document how it works with an example to make it easier.

Signed-off-by: Joel Fernandes (Google) 
---
 kernel/rcu/rcu.h | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 003671825d62..fc3170914ac7 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
WRITE_ONCE(*sp, rcu_seq_endval(sp));
 }
 
-/* Take a snapshot of the update side's sequence number. */
+/*
+ * Take a snapshot of the update side's sequence number.
+ *
+ * This function predicts what the grace period number will be the next
+ * time an RCU callback will be executed, given the current grace period's
+ * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
+ * already in progress.
+ *
+ * We do this with a single addition and masking.
+ * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit (LSB) of
+ * the seq is used to track if a GP is in progress or not, its sufficient if we
+ * add (2+1) and mask with ~1. Let's see why with an example:
+ *
+ * Say the current seq is 6 which is 0b110 (gp is 3 and state bit is 0).
+ * To get the next GP number, we have to at least add 0b10 to this (0x1 << 1)
+ * to account for the state bit. However, if the current seq is 7 (gp is 3 and
+ * state bit is 1), then it means the current grace period is already in
+ * progress so the next time the callback will run is at the end of grace
+ * period number gp+2. To account for the extra +1, we just overflow the LSB by
+ * adding another 0x1 and masking with ~0x1. In case no GP was in progress (RCU
+ * is idle), then the addition of the extra 0x1 and masking will have no
+ * effect. This is calculated as below.
+ */
 static inline unsigned long rcu_seq_snap(unsigned long *sp)
 {
unsigned long s;
-- 
2.17.0.441.gb46fe60e1d-goog