Re: [PATCH 3/3] rcu/trace: Add name of the source for gp_seq to prevent confusion
On Sat, Jun 20, 2020 at 01:36:08AM -0400, Joel Fernandes wrote: > On Fri, Jun 19, 2020 at 10:40:01AM -0700, Paul E. McKenney wrote: > > On Thu, Jun 18, 2020 at 10:07:18PM -0400, Joel Fernandes wrote: > > > On Thu, Jun 18, 2020 at 09:36:41PM -0400, Joel Fernandes (Google) wrote: > > > [...] > > > > @@ -2019,7 +2019,7 @@ static int __noreturn rcu_gp_kthread(void *unused) > > > > cond_resched_tasks_rcu_qs(); > > > > WRITE_ONCE(rcu_state.gp_activity, jiffies); > > > > WARN_ON(signal_pending(current)); > > > > - trace_rcu_grace_period(rcu_state.name, > > > > rcu_state.gp_seq, > > > > + trace_rcu_grace_period(rcu_state.name, > > > > TPS("rsp"), rcu_state.gp_seq, > > > >TPS("reqwaitsig")); > > > > } > > > > > > > > @@ -2263,7 +2263,7 @@ int rcutree_dying_cpu(unsigned int cpu) > > > > return 0; > > > > > > > > blkd = !!(rnp->qsmask & rdp->grpmask); > > > > - trace_rcu_grace_period(rcu_state.name, READ_ONCE(rnp->gp_seq), > > > > + trace_rcu_grace_period(rcu_state.name, TPS("rsp"), > > > > READ_ONCE(rnp->gp_seq), > > > > > > This should be: TPS("rnp") :-( > > > > > > Happy to fix it up and resend if you'd like. Thanks! > > > > I queued and pushed 1/2 and 2/2. > > Thanks! > > > but again, I am still not at all > > convinced by 3/3. If you want to make RCU trace output human > > readable, post-processing will be needed. > > Or I could post-process the code before building it since the pattern seems > easy to parse ;-) For this one thing, perhaps. For most other information of interest, doing so in-kernel would not be so good, for example, from a lock-contention viewpoint. Thanx, Paul
Re: [PATCH 3/3] rcu/trace: Add name of the source for gp_seq to prevent confusion
On Fri, Jun 19, 2020 at 10:40:01AM -0700, Paul E. McKenney wrote: > On Thu, Jun 18, 2020 at 10:07:18PM -0400, Joel Fernandes wrote: > > On Thu, Jun 18, 2020 at 09:36:41PM -0400, Joel Fernandes (Google) wrote: > > [...] > > > @@ -2019,7 +2019,7 @@ static int __noreturn rcu_gp_kthread(void *unused) > > > cond_resched_tasks_rcu_qs(); > > > WRITE_ONCE(rcu_state.gp_activity, jiffies); > > > WARN_ON(signal_pending(current)); > > > - trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, > > > + trace_rcu_grace_period(rcu_state.name, TPS("rsp"), > > > rcu_state.gp_seq, > > > TPS("reqwaitsig")); > > > } > > > > > > @@ -2263,7 +2263,7 @@ int rcutree_dying_cpu(unsigned int cpu) > > > return 0; > > > > > > blkd = !!(rnp->qsmask & rdp->grpmask); > > > - trace_rcu_grace_period(rcu_state.name, READ_ONCE(rnp->gp_seq), > > > + trace_rcu_grace_period(rcu_state.name, TPS("rsp"), > > > READ_ONCE(rnp->gp_seq), > > > > This should be: TPS("rnp") :-( > > > > Happy to fix it up and resend if you'd like. Thanks! > > I queued and pushed 1/2 and 2/2. Thanks! > but again, I am still not at all > convinced by 3/3. If you want to make RCU trace output human > readable, post-processing will be needed. Or I could post-process the code before building it since the pattern seems easy to parse ;-) - Joel
Re: [PATCH 3/3] rcu/trace: Add name of the source for gp_seq to prevent confusion
On Thu, Jun 18, 2020 at 10:07:18PM -0400, Joel Fernandes wrote: > On Thu, Jun 18, 2020 at 09:36:41PM -0400, Joel Fernandes (Google) wrote: > [...] > > @@ -2019,7 +2019,7 @@ static int __noreturn rcu_gp_kthread(void *unused) > > cond_resched_tasks_rcu_qs(); > > WRITE_ONCE(rcu_state.gp_activity, jiffies); > > WARN_ON(signal_pending(current)); > > - trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, > > + trace_rcu_grace_period(rcu_state.name, TPS("rsp"), > > rcu_state.gp_seq, > >TPS("reqwaitsig")); > > } > > > > @@ -2263,7 +2263,7 @@ int rcutree_dying_cpu(unsigned int cpu) > > return 0; > > > > blkd = !!(rnp->qsmask & rdp->grpmask); > > - trace_rcu_grace_period(rcu_state.name, READ_ONCE(rnp->gp_seq), > > + trace_rcu_grace_period(rcu_state.name, TPS("rsp"), > > READ_ONCE(rnp->gp_seq), > > This should be: TPS("rnp") :-( > > Happy to fix it up and resend if you'd like. Thanks! I queued and pushed 1/2 and 2/2, but again, I am still not at all convinced by 3/3. If you want to make RCU trace output human readable, post-processing will be needed. Thanx, Paul
Re: [PATCH 3/3] rcu/trace: Add name of the source for gp_seq to prevent confusion
On Thu, Jun 18, 2020 at 09:36:41PM -0400, Joel Fernandes (Google) wrote: [...] > @@ -2019,7 +2019,7 @@ static int __noreturn rcu_gp_kthread(void *unused) > cond_resched_tasks_rcu_qs(); > WRITE_ONCE(rcu_state.gp_activity, jiffies); > WARN_ON(signal_pending(current)); > - trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, > + trace_rcu_grace_period(rcu_state.name, TPS("rsp"), > rcu_state.gp_seq, > TPS("reqwaitsig")); > } > > @@ -2263,7 +2263,7 @@ int rcutree_dying_cpu(unsigned int cpu) > return 0; > > blkd = !!(rnp->qsmask & rdp->grpmask); > - trace_rcu_grace_period(rcu_state.name, READ_ONCE(rnp->gp_seq), > + trace_rcu_grace_period(rcu_state.name, TPS("rsp"), > READ_ONCE(rnp->gp_seq), This should be: TPS("rnp") :-( Happy to fix it up and resend if you'd like. Thanks!
[PATCH 3/3] rcu/trace: Add name of the source for gp_seq to prevent confusion
The gp_seq value can come from either of rdp, rsp or rnp. This makes the rcu_grace_period tracepoint extremely confusing since once if left constantly applying effort to reason about what a gp_seq means. Only the rsp is the global source of truth (most accurate GP info). The rnp can be off by ~1 and the rdp can be off by way more. Add some more context to traces to clarify where it comes from. While this can be resolved using some trace post-processing, such post- processing does need extra maintenance of scripts. Just make it easy by adding more context in the trace itself. Signed-off-by: Joel Fernandes (Google) --- include/trace/events/rcu.h | 12 kernel/rcu/tree.c | 32 kernel/rcu/tree_plugin.h | 4 ++-- 3 files changed, 26 insertions(+), 22 deletions(-) diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h index 155b5cb43cfd3..bf2d9f2e25f85 100644 --- a/include/trace/events/rcu.h +++ b/include/trace/events/rcu.h @@ -68,24 +68,28 @@ TRACE_EVENT(rcu_utilization, */ TRACE_EVENT_RCU(rcu_grace_period, - TP_PROTO(const char *rcuname, unsigned long gp_seq, const char *gpevent), + TP_PROTO(const char *rcuname, const char *gp_seq_src, + unsigned long gp_seq, const char *gpevent), - TP_ARGS(rcuname, gp_seq, gpevent), + TP_ARGS(rcuname, gp_seq_src, gp_seq, gpevent), TP_STRUCT__entry( __field(const char *, rcuname) + __field(const char *, gp_seq_src) __field(long, gp_seq) __field(const char *, gpevent) ), TP_fast_assign( __entry->rcuname = rcuname; + __entry->gp_seq_src = gp_seq_src; __entry->gp_seq = (long)gp_seq; __entry->gpevent = gpevent; ), - TP_printk("%s %ld %s", - __entry->rcuname, __entry->gp_seq, __entry->gpevent) + TP_printk("%s %s_gp_seq=%ld %s", + __entry->rcuname, __entry->gp_seq_src, + __entry->gp_seq, __entry->gpevent) ); /* diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index d0988a1c1079d..b625811f529fa 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1367,7 +1367,7 @@ static bool rcu_start_this_gp(struct rcu_node *rnp_start, struct rcu_data *rdp, trace_rcu_this_gp(rnp, rdp, gp_seq_req, TPS("NoGPkthread")); goto unlock_out; } - trace_rcu_grace_period(rcu_state.name, data_race(rcu_state.gp_seq), TPS("newreq")); + trace_rcu_grace_period(rcu_state.name, TPS("rsp"), data_race(rcu_state.gp_seq), TPS("newreq")); ret = true; /* Caller must wake GP kthread. */ unlock_out: /* Push furthest requested GP to leaf node and rcu_data structure. */ @@ -1464,9 +1464,9 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp) /* Trace depending on how much we were able to accelerate. */ if (rcu_segcblist_restempty(>cblist, RCU_WAIT_TAIL)) - trace_rcu_grace_period(rcu_state.name, gp_seq_req, TPS("AccWaitCB")); + trace_rcu_grace_period(rcu_state.name, TPS("req"), gp_seq_req, TPS("AccWaitCB")); else - trace_rcu_grace_period(rcu_state.name, gp_seq_req, TPS("AccReadyCB")); + trace_rcu_grace_period(rcu_state.name, TPS("req"), gp_seq_req, TPS("AccReadyCB")); return ret; } @@ -1566,7 +1566,7 @@ static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp) if (!offloaded) ret = rcu_advance_cbs(rnp, rdp); /* Advance CBs. */ rdp->core_needs_qs = false; - trace_rcu_grace_period(rcu_state.name, rdp->gp_seq, TPS("cpuend")); + trace_rcu_grace_period(rcu_state.name, TPS("rdp"), rdp->gp_seq, TPS("cpuend")); } else { if (!offloaded) ret = rcu_accelerate_cbs(rnp, rdp); /* Recent CBs. */ @@ -1582,7 +1582,7 @@ static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp) * set up to detect a quiescent state, otherwise don't * go looking for one. */ - trace_rcu_grace_period(rcu_state.name, rnp->gp_seq, TPS("cpustart")); + trace_rcu_grace_period(rcu_state.name, TPS("rnp"), rnp->gp_seq, TPS("cpustart")); need_qs = !!(rnp->qsmask & rdp->grpmask); rdp->cpu_no_qs.b.norm = need_qs; rdp->core_needs_qs = need_qs; @@ -1683,7 +1683,7 @@ static bool rcu_gp_init(void) /* Record GP times before starting GP, hence rcu_seq_start(). */ rcu_seq_start(_state.gp_seq); ASSERT_EXCLUSIVE_WRITER(rcu_state.gp_seq); - trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, TPS("start")); + trace_rcu_grace_period(rcu_state.name, TPS("rsp"), rcu_state.gp_seq, TPS("start"));