Re: RCU NOHZ, tsc, and clock_gettime

2012-11-13 Thread Prarit Bhargava
On 11/12/2012 06:27 PM, John Stultz wrote: Hey Prarit, Just back from being on leave, and wanted to check in on this. Did you ever get to run with an increased sample size to see how that affected things? It's exactly your point that the non-NOHZ case could align the execution of

Re: RCU NOHZ, tsc, and clock_gettime

2012-11-12 Thread John Stultz
On 10/12/2012 08:40 AM, Prarit Bhargava wrote: One possibility is that if the cpu we're doing our timekeeping accumulation on is different than the one running the test, we might go into deeper idle for longer periods of time. Then when we accumulate time, we have more than a single tick to

Re: RCU NOHZ, tsc, and clock_gettime

2012-10-15 Thread Paul E. McKenney
On Fri, Oct 12, 2012 at 11:40:44AM -0400, Prarit Bhargava wrote: On 10/11/2012 04:21 PM, Paul E. McKenney wrote: On Thu, Oct 11, 2012 at 12:51:44PM -0700, John Stultz wrote: [ . . . ] One possibility is that if the cpu we're doing our timekeeping accumulation on is different than

Re: RCU NOHZ, tsc, and clock_gettime

2012-10-14 Thread Paul E. McKenney
On Fri, Oct 12, 2012 at 02:27:01PM -0400, Prarit Bhargava wrote: The effect of removing the two functions you noted (on 3.6 and earlier) is to prevent RCU from checking for dyntick-idle CPUs, likely incurring a cache miss for each CPU with interrupts disabled. If you have a lot

Re: RCU NOHZ, tsc, and clock_gettime

2012-10-12 Thread Prarit Bhargava
The effect of removing the two functions you noted (on 3.6 and earlier) is to prevent RCU from checking for dyntick-idle CPUs, likely incurring a cache miss for each CPU with interrupts disabled. If you have a lot of CPUs (or even if NR_CPUS is large and you have a smaller number of

Re: RCU NOHZ, tsc, and clock_gettime

2012-10-12 Thread Prarit Bhargava
On 10/11/2012 04:21 PM, Paul E. McKenney wrote: On Thu, Oct 11, 2012 at 12:51:44PM -0700, John Stultz wrote: On 10/11/2012 11:52 AM, Prarit Bhargava wrote: I've been tracking an odd bug that may involve the RCU NOHZ code and just want to know if you have any ideas on debugging

Re: RCU NOHZ, tsc, and clock_gettime

2012-10-11 Thread Paul E. McKenney
On Thu, Oct 11, 2012 at 12:51:44PM -0700, John Stultz wrote: On 10/11/2012 11:52 AM, Prarit Bhargava wrote: I've been tracking an odd bug that may involve the RCU NOHZ code and just want to know if you have any ideas on debugging and/or what might be wrong. Note the bug happens on

Re: RCU NOHZ, tsc, and clock_gettime

2012-10-11 Thread John Stultz
On 10/11/2012 11:52 AM, Prarit Bhargava wrote: I've been tracking an odd bug that may involve the RCU NOHZ code and just want to know if you have any ideas on debugging and/or what might be wrong. Note the bug happens on *BOTH* upstream and the current RHEL6 tree. The data in this email is from

RCU NOHZ, tsc, and clock_gettime

2012-10-11 Thread Prarit Bhargava
I've been tracking an odd bug that may involve the RCU NOHZ code and just want to know if you have any ideas on debugging and/or what might be wrong. Note the bug happens on *BOTH* upstream and the current RHEL6 tree. The data in this email is from running on RHEL6 because that's what I happen to
