Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-06 Thread Peter Zijlstra
On Fri, Jul 05, 2013 at 11:50:05PM +0200, Thomas Gleixner wrote:
> On Fri, 5 Jul 2013, Peter Zijlstra wrote:
> > On Fri, Jul 05, 2013 at 05:24:09PM +0200, Thomas Gleixner wrote:
> > > See arch/x86/kernel/tsc.c
> > > 
> > > We disable the watchdog for the TSC when tsc_clocksource_reliable is
> > > set.
> > > 
> > > tsc_clocksource_reliable is set when:
> > > 
> > >  - you add tsc=reliable to the kernel command line
> > 
> > Ah, I didn't know about that one, useful.
> > 
> > >  - boot_cpu_has(X86_FEATURE_TSC_RELIABLE)
> > >  
> > >    X86_FEATURE_TSC_RELIABLE is a software flag, set by vmware and
> > >    moorestown. So all other machines keep the watchdog enabled.
> > 
> > Right.. I knew it was enabled on my machines even though they normally
> > have usable TSC.
> 
> Yeah, but our well justified paranoia still prevents us from trusting
> these CPU flags. Maybe some day BIOS is going to be replaced by
> something useful. You know: Hope springs eternal

Oh quite agreed. It's just that several times I've wanted to disable the
thing. Now I know you can do so using the kernel cmdline. Previously I had to
wreck code -- not that much harder really :-)
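
After booting with tsc=reliable, the clocksource the kernel actually selected
can be checked through sysfs. A minimal sketch in C, assuming the usual sysfs
file /sys/devices/system/clocksource/clocksource0/current_clocksource; the
helper name read_first_line() is made up for this illustration and is not
kernel code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read the first line of `path` into `buf` and strip
 * the trailing newline. Returns 0 on success, -1 if the file cannot be
 * opened or is empty.
 */
static int read_first_line(const char *path, char *buf, size_t len)
{
	FILE *f;

	assert(path != NULL && buf != NULL && len > 0);

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* Cut the string at the first newline, if there is one. */
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

Calling read_first_line() with the sysfs path above and a small buffer should
yield a name such as "tsc" or "hpet", depending on what the kernel picked.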
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-06 Thread Borislav Petkov
On Sat, Jul 06, 2013 at 12:17:46AM +0200, Thomas Gleixner wrote:
> Good news! 10 years is way less than eternity and just before
> retirement :)

You know that after the 10 years they'll come up with an even uglier
platform-differentiation-fiddle-with-dong-while-smoking-crack-crap which
will even replace the OS, right?

Hmm, I'm wondering what would be faster: wait *at least* 10 more years
or get an old mainboard and start experimenting with coreboot...

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-05 Thread Thomas Gleixner
On Fri, 5 Jul 2013, Borislav Petkov wrote:
> On Fri, Jul 05, 2013 at 11:50:05PM +0200, Thomas Gleixner wrote:
> > Yeah, but our well justified paranoia still prevents us from trusting
> > these CPU flags. Maybe some day BIOS is going to be replaced by
> > something useful. You know: Hope springs eternal
> 
> Not in the next 10 yrs at least if one took a look at the
> overengineered, obese at birth and braindead crap by the name of UEFI.

Good news! 10 years is way less than eternity and just before
retirement :)


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-05 Thread Borislav Petkov
On Fri, Jul 05, 2013 at 11:50:05PM +0200, Thomas Gleixner wrote:
> Yeah, but our well justified paranoia still prevents us from trusting
> these CPU flags. Maybe some day BIOS is going to be replaced by
> something useful. You know: Hope springs eternal

Not in the next 10 yrs at least if one took a look at the
overengineered, obese at birth and braindead crap by the name of UEFI.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-05 Thread Thomas Gleixner
On Fri, 5 Jul 2013, Peter Zijlstra wrote:
> On Fri, Jul 05, 2013 at 05:24:09PM +0200, Thomas Gleixner wrote:
> > See arch/x86/kernel/tsc.c
> > 
> > We disable the watchdog for the TSC when tsc_clocksource_reliable is
> > set.
> > 
> > tsc_clocksource_reliable is set when:
> > 
> >  - you add tsc=reliable to the kernel command line
> 
> Ah, I didn't know about that one, useful.
> 
> >  - boot_cpu_has(X86_FEATURE_TSC_RELIABLE)
> >  
> >    X86_FEATURE_TSC_RELIABLE is a software flag, set by vmware and
> >    moorestown. So all other machines keep the watchdog enabled.
> 
> Right.. I knew it was enabled on my machines even though they normally
> have usable TSC.

Yeah, but our well justified paranoia still prevents us from trusting
these CPU flags. Maybe some day BIOS is going to be replaced by
something useful. You know: Hope springs eternal


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-05 Thread Peter Zijlstra
On Fri, Jul 05, 2013 at 05:24:09PM +0200, Thomas Gleixner wrote:
> See arch/x86/kernel/tsc.c
> 
> We disable the watchdog for the TSC when tsc_clocksource_reliable is
> set.
> 
> tsc_clocksource_reliable is set when:
> 
>  - you add tsc=reliable to the kernel command line

Ah, I didn't know about that one, useful.

>  - boot_cpu_has(X86_FEATURE_TSC_RELIABLE)
>  
>    X86_FEATURE_TSC_RELIABLE is a software flag, set by vmware and
>    moorestown. So all other machines keep the watchdog enabled.

Right.. I knew it was enabled on my machines even though they normally
have usable TSC.


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-05 Thread Thomas Gleixner
On Fri, 5 Jul 2013, Peter Zijlstra wrote:

> On Fri, Jul 05, 2013 at 04:23:33PM +0200, Frederic Weisbecker wrote:
> > Nope, I haven't touched that. I prefer not to fiddle with unstable
> > clocksource for now :)
> > 
> > As for unstable TSCs, if sched_clock_tick() needs to be fed, we simply
> > don't stop the tick.
> 
> Not entirely the same thing; I thought the clocksource watchdog was run
> even when we have a 'stable' TSC, just to make sure it stays stable.
> There's known cases where the BIOS f*cks us over and wrecks TSC sync.

See arch/x86/kernel/tsc.c

We disable the watchdog for the TSC when tsc_clocksource_reliable is
set.

tsc_clocksource_reliable is set when:

 - you add tsc=reliable to the kernel command line

 - boot_cpu_has(X86_FEATURE_TSC_RELIABLE)
 
   X86_FEATURE_TSC_RELIABLE is a software flag, set by vmware and
   moorestown. So all other machines keep the watchdog enabled.

 - On Geode LX (OLPC)

Thanks,

tglx
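
The conditions listed above can be modelled in a few lines of plain C. This
is a simplified userspace sketch, not the code in arch/x86/kernel/tsc.c: the
struct tsc_inputs type and both function names are hypothetical, and the
Geode LX case is reduced to a single boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the three inputs named in the mail. */
struct tsc_inputs {
	const char *tsc_param;   /* value given as tsc=... on the cmdline, or NULL */
	bool cpu_flag_reliable;  /* models boot_cpu_has(X86_FEATURE_TSC_RELIABLE) */
	bool geode_lx;           /* models the Geode LX (OLPC) special case */
};

/* Returns true when this model would set tsc_clocksource_reliable. */
static bool tsc_reliable(const struct tsc_inputs *in)
{
	assert(in != NULL);

	if (in->tsc_param && strcmp(in->tsc_param, "reliable") == 0)
		return true;
	if (in->cpu_flag_reliable)
		return true;
	return in->geode_lx;
}

/* The watchdog keeps checking the TSC unless it is marked reliable. */
static bool tsc_watchdog_enabled(const struct tsc_inputs *in)
{
	return !tsc_reliable(in);
}
```

With all three inputs unset, tsc_watchdog_enabled() returns true, which
matches the statement that all other machines keep the watchdog enabled.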


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-05 Thread Peter Zijlstra
On Fri, Jul 05, 2013 at 04:23:33PM +0200, Frederic Weisbecker wrote:
> Nope, I haven't touched that. I prefer not to fiddle with unstable
> clocksource for now :)
> 
> As for unstable TSCs, if sched_clock_tick() needs to be fed, we simply
> don't stop the tick.

Not entirely the same thing; I thought the clocksource watchdog was run
even when we have a 'stable' TSC, just to make sure it stays stable.
There's known cases where the BIOS f*cks us over and wrecks TSC sync.




Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-05 Thread Frederic Weisbecker
2013/7/4 Peter Zijlstra <pet...@infradead.org>:
> On Thu, Jul 04, 2013 at 01:34:13PM +0800, Alex Shi wrote:
>
>> If the tsc is marked as constant and nonstop, could we set it as system
>> clocksource when do tsc register? w/o checking it on clocksource_watchdog?
>
> I'd not do that; the BIOS can still screw you over, we need some validation.
>
> That said; we do need means to disable the clocksource watchdog -- although I
> suppose Frederic might already have provided this for this NOHZ efforts when I
> wasn't looking.

Nope, I haven't touched that. I prefer not to fiddle with unstable
clocksource for now :)

As for unstable TSCs, if sched_clock_tick() needs to be fed, we simply
don't stop the tick.


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-05 Thread Alex Shi
On 07/05/2013 01:58 PM, Thomas Gleixner wrote:
>> > 
>> > Ingo had merged your branch into sched/core. :)
>> > 
>> > commit f9bed7021710a3e45c331f7d7781de914cc1b939
>> > Merge: 7e76057 67dd331
>> > Author: Ingo Molnar 
>> > Date:   Wed May 29 11:21:59 2013 +0200
>> > 
>> > Merge branch 'timers/urgent'
> Not really.
> 
> tip$ git branch --contains f9bed70217
> * master
> 
> tip$ git branch --contains 5d33b883ae
> * master
>   timers/core
> 
> So you are testing tip/master not tip/sched/core.

You're right. I mixed them up, sorry.

-- 
Thanks
Alex


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-04 Thread Thomas Gleixner
On Fri, 5 Jul 2013, Alex Shi wrote:
> On 07/05/2013 04:27 AM, Thomas Gleixner wrote:
> > > > > We find some benchmarks drop a lot on tip/sched/core on many Intel boxes,
> > And that branch does NOT have that commit included. So how can you see
> > a regression on a branch caused by a commit NOT included into that
> > branch?
> > 
> > The offending commit is in tip timers/core and not in tip
> > sched/core. What I wanted to say is that we need a proper
> > description of problems and not some random association.
> > 
> > It tricked you to assume, that I'm not able to figure it out myself :)
> > 
> > See? These things are complex and subtle, so we need precise
> > descriptions and not some sloppy semi correct data.
> > 
> > I'm well aware of the issue and with Peter's help I got a reasonable
> > explanation for it. A proper fix is about to be sent out.
> > 
> 
> Ingo had merged your branch into sched/core. :)
> 
> commit f9bed7021710a3e45c331f7d7781de914cc1b939
> Merge: 7e76057 67dd331
> Author: Ingo Molnar 
> Date:   Wed May 29 11:21:59 2013 +0200
> 
> Merge branch 'timers/urgent'

Not really.

tip$ git branch --contains f9bed70217
* master

tip$ git branch --contains 5d33b883ae
* master
  timers/core

So you are testing tip/master not tip/sched/core.

Thanks,

tglx


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-04 Thread Alex Shi
On 07/05/2013 04:27 AM, Thomas Gleixner wrote:
> > > > We find some benchmarks drop a lot on tip/sched/core on many Intel boxes,
> And that branch does NOT have that commit included. So how can you see
> a regression on a branch caused by a commit NOT included into that
> branch?
> 
> The offending commit is in tip timers/core and not in tip
> sched/core. What I wanted to say is that we need a proper
> description of problems and not some random association.
> 
> It tricked you to assume, that I'm not able to figure it out myself :)
> 
> See? These things are complex and subtle, so we need precise
> descriptions and not some sloppy semi correct data.
> 
> I'm well aware of the issue and with Peter's help I got a reasonable
> explanation for it. A proper fix is about to be sent out.
> 

Ingo had merged your branch into sched/core. :)

commit f9bed7021710a3e45c331f7d7781de914cc1b939
Merge: 7e76057 67dd331
Author: Ingo Molnar 
Date:   Wed May 29 11:21:59 2013 +0200

Merge branch 'timers/urgent'

...

commit 7d194f78bde64ec813c1ed8291181bdd61515e78
Merge: 0298bf7 1eaff67
Author: Ingo Molnar 
Date:   Tue May 28 09:53:41 2013 +0200

Merge branch 'timers/core'

commit 0298bf70644d7334bec16ae47f3aa58f4f883b59
Merge: cc662fa 2938d27
Author: Ingo Molnar 
Date:   Tue May 28 09:49:51 2013 +0200

Merge branch 'timers/urgent'

-- 
Thanks
Alex


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-04 Thread Thomas Gleixner
On Thu, 4 Jul 2013, Davidlohr Bueso wrote:

> On Thu, 2013-07-04 at 13:00 +0200, Thomas Gleixner wrote:
> > On Thu, 4 Jul 2013, Alex Shi wrote:
> > 
> > > We find some benchmarks drop a lot on tip/sched/core on many Intel boxes,
> > > like oltp, tbench, hackbench etc. and bisected the commit 5d33b883ae
> > > cause this regression. Due to this commit, the clocksource was changed
> > > to hpet from tsc even tsc will be set CLOCK_SOURCE_VALID_FOR_HRES later
> > > in clocksource_watchdog. 
> > 
> > 5d33b883ae is not in tip/sched/core. So what are you testing and
> > bisecting?
> 
> I think he's referring to:
> 
> commit 5d33b883aed81c6fbcd09c6f7c3619eee850a7e2

I know what he is referring to. He explicitly mentions this commit:

> > > like oltp, tbench, hackbench etc. and bisected the commit 5d33b883ae

What I was pointing out that he was referring to tip sched/core at the
same time

> > > We find some benchmarks drop a lot on tip/sched/core on many Intel boxes,

And that branch does NOT have that commit included. So how can you see
a regression on a branch caused by a commit NOT included into that
branch?

The offending commit is in tip timers/core and not in tip
sched/core. What I wanted to say is that we need a proper
description of problems and not some random association.

It tricked you to assume, that I'm not able to figure it out myself :)

See? These things are complex and subtle, so we need precise
descriptions and not some sloppy semi correct data.

I'm well aware of the issue and with Peter's help I got a reasonable
explanation for it. A proper fix is about to be sent out.

Thanks,

tglx


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-04 Thread Davidlohr Bueso
On Thu, 2013-07-04 at 13:00 +0200, Thomas Gleixner wrote:
> On Thu, 4 Jul 2013, Alex Shi wrote:
> 
> > We find some benchmarks drop a lot on tip/sched/core on many Intel boxes,
> > like oltp, tbench, hackbench etc. and bisected the commit 5d33b883ae
> > cause this regression. Due to this commit, the clocksource was changed
> > to hpet from tsc even tsc will be set CLOCK_SOURCE_VALID_FOR_HRES later
> > in clocksource_watchdog. 
> 
> 5d33b883ae is not in tip/sched/core. So what are you testing and
> bisecting?

I think he's referring to:

commit 5d33b883aed81c6fbcd09c6f7c3619eee850a7e2
Author: Thomas Gleixner <t...@linutronix.de>
Date:   Thu Apr 25 20:31:43 2013 +

    clocksource: Always verify highres capability

    If a clocksource has a (wrong) high rating, but can't be used as a
    timebase for oneshot tick mode, it is unconditionally selected even
    when the system is already in oneshot tick mode. This causes full
    system failure.

    Verify the clocksource selection against the oneshot mode.

    Signed-off-by: Thomas Gleixner <t...@linutronix.de>
    Acked-by: John Stultz <john.stu...@linaro.org>
    Cc: Magnus Damm <magnus.d...@gmail.com>
    Link: http://lkml.kernel.org/r/20130425143435.635040...@linutronix.de
    Signed-off-by: Thomas Gleixner <t...@linutronix.de>
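
The effect of that check, and why it can leave a machine on hpet until the
watchdog has verified the tsc, can be sketched with a small selection
routine. This is an illustrative model only, not the code from
kernel/time/clocksource.c: struct src, SRC_VALID_FOR_HRES and pick_best() are
made-up names, with SRC_VALID_FOR_HRES standing in for
CLOCK_SOURCE_VALID_FOR_HRES.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SRC_VALID_FOR_HRES 0x1u  /* models CLOCK_SOURCE_VALID_FOR_HRES */

/* Hypothetical, stripped-down clocksource descriptor. */
struct src {
	const char *name;
	int rating;          /* higher is preferred */
	unsigned int flags;
};

/*
 * Pick the highest-rated source. In oneshot mode, skip every source that
 * is not (yet) marked valid for high resolution; this models the
 * verification the commit message describes.
 */
static const struct src *pick_best(const struct src *list, size_t n,
				   bool oneshot)
{
	const struct src *best = NULL;
	size_t i;

	assert(list != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (oneshot && !(list[i].flags & SRC_VALID_FOR_HRES))
			continue;
		if (!best || list[i].rating > best->rating)
			best = &list[i];
	}
	return best;
}
```

In this model, a tsc entry with the higher rating but without the flag loses
to a flagged hpet entry while oneshot is true, and wins again once the flag
is set, which is the sequence described in the original report.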




Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-04 Thread Thomas Gleixner
On Thu, 4 Jul 2013, Alex Shi wrote:

> We find some benchmarks drop a lot on tip/sched/core on many Intel boxes,
> like oltp, tbench, hackbench etc. and bisected the commit 5d33b883ae
> cause this regression. Due to this commit, the clocksource was changed
> to hpet from tsc even tsc will be set CLOCK_SOURCE_VALID_FOR_HRES later
> in clocksource_watchdog. 

5d33b883ae is not in tip/sched/core. So what are you testing and
bisecting?
 
Thanks,

tglx


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-04 Thread Alex Shi
On 07/04/2013 03:58 PM, Peter Zijlstra wrote:
> On Thu, Jul 04, 2013 at 01:34:13PM +0800, Alex Shi wrote:
> 
>> If the tsc is marked as constant and nonstop, could we set it as system
>> clocksource when do tsc register? w/o checking it on clocksource_watchdog?
> 
> I'd not do that; the BIOS can still screw you over, we need some validation.

I see. thanks!
> 
> That said; we do need means to disable the clocksource watchdog -- although I
> suppose Frederic might already have provided this for this NOHZ efforts when I
> wasn't looking.
> 


-- 
Thanks
Alex


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-04 Thread Peter Zijlstra
On Thu, Jul 04, 2013 at 01:34:13PM +0800, Alex Shi wrote:

> If the tsc is marked as constant and nonstop, could we set it as system
> clocksource when do tsc register? w/o checking it on clocksource_watchdog?

I'd not do that; the BIOS can still screw you over, we need some validation.

That said; we do need means to disable the clocksource watchdog -- although I
suppose Frederic might already have provided this for this NOHZ efforts when I
wasn't looking.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-04 Thread Peter Zijlstra
On Thu, Jul 04, 2013 at 01:34:13PM +0800, Alex Shi wrote:

 If the tsc is marked as constant and nonstop, could we set it as system
 clocksource when do tsc register? w/o checking it on clocksource_watchdog?

I'd not do that; the BIOS can still screw you over, we need some validation.

That said; we do need means to disable the clocksource watchdog -- although I
suppose Frederic might already have provided this for this NOHZ efforts when I
wasn't looking.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-04 Thread Alex Shi
On 07/04/2013 03:58 PM, Peter Zijlstra wrote:
 On Thu, Jul 04, 2013 at 01:34:13PM +0800, Alex Shi wrote:
 
 If the tsc is marked as constant and nonstop, could we set it as system
 clocksource when do tsc register? w/o checking it on clocksource_watchdog?
 
 I'd not do that; the BIOS can still screw you over, we need some validation.

I see. thanks!
 
 That said; we do need means to disable the clocksource watchdog -- although I
 suppose Frederic might already have provided this for this NOHZ efforts when I
 wasn't looking.
 


-- 
Thanks
Alex
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-04 Thread Thomas Gleixner
On Thu, 4 Jul 2013, Alex Shi wrote:

 We find some benchmarks drop a lot on tip/sched/core on many Intel boxes,
 like oltp, tbench, hackbench etc. and bisected the commit 5d33b883ae
 cause this regression. Due to this commit, the clocksource was changed
 to hpet from tsc even tsc will be set CLOCK_SOURCE_VALID_FOR_HRES later
 in clocksource_watchdog. 

5d33b883ae is not in tip/sched/core. So what are you testing and
bisecting?
 
Thanks,

tglx
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-04 Thread Davidlohr Bueso
On Thu, 2013-07-04 at 13:00 +0200, Thomas Gleixner wrote:
> On Thu, 4 Jul 2013, Alex Shi wrote:
>
> > We find some benchmarks drop a lot on tip/sched/core on many Intel boxes,
> > like oltp, tbench, hackbench etc. and bisected the commit 5d33b883ae
> > as the cause of this regression. Due to this commit, the clocksource was
> > changed from tsc to hpet, even though tsc will be set
> > CLOCK_SOURCE_VALID_FOR_HRES later in clocksource_watchdog.
>
> 5d33b883ae is not in tip/sched/core. So what are you testing and
> bisecting?

I think he's referring to:

commit 5d33b883aed81c6fbcd09c6f7c3619eee850a7e2
Author: Thomas Gleixner t...@linutronix.de
Date:   Thu Apr 25 20:31:43 2013 +

    clocksource: Always verify highres capability

    If a clocksource has a (wrong) high rating, but can't be used as a
    timebase for oneshot tick mode, it is unconditionally selected even
    when the system is already in oneshot tick mode. This causes full
    system failure.

    Verify the clocksource selection against the oneshot mode.

    Signed-off-by: Thomas Gleixner t...@linutronix.de
    Acked-by: John Stultz john.stu...@linaro.org
    Cc: Magnus Damm magnus.d...@gmail.com
    Link: http://lkml.kernel.org/r/20130425143435.635040...@linutronix.de
    Signed-off-by: Thomas Gleixner t...@linutronix.de
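
The selection rule described in the commit message, and the way it interacts
with the regression reported in this thread, can be modelled in a few lines.
This is a toy sketch in Python, not the kernel code: the Clocksource class,
select_clocksource() and the rating values are illustrative assumptions, and
valid_for_hres merely stands in for CLOCK_SOURCE_VALID_FOR_HRES. It assumes
the tsc has the higher rating but has not yet been validated by the watchdog
when the first selection happens.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Clocksource:
    name: str
    rating: int            # higher is preferred (illustrative values below)
    valid_for_hres: bool   # stand-in for CLOCK_SOURCE_VALID_FOR_HRES


def select_clocksource(sources: List[Clocksource],
                       oneshot: bool) -> Optional[Clocksource]:
    """Pick the best-rated source, skipping non-highres ones in oneshot mode."""
    for cs in sorted(sources, key=lambda c: c.rating, reverse=True):
        if oneshot and not cs.valid_for_hres:
            continue  # the check the commit adds, in simplified form
        return cs
    return None


# Assumed starting state: tsc is better rated but not yet verified.
tsc = Clocksource("tsc", rating=300, valid_for_hres=False)
hpet = Clocksource("hpet", rating=250, valid_for_hres=True)
sources = [tsc, hpet]

before = select_clocksource(sources, oneshot=True)

# Later the watchdog validates the tsc. Unless selection runs again,
# the system keeps using whatever was chosen before.
tsc.valid_for_hres = True
after = select_clocksource(sources, oneshot=True)

print(before.name, after.name)
```

In this model the first selection lands on hpet and only a second selection,
made after the flag is set, returns tsc. That is the gap the patchset in this
thread aims to close by re-selecting the clocksource once the flag is set.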




Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-04 Thread Thomas Gleixner
On Thu, 4 Jul 2013, Davidlohr Bueso wrote:

> On Thu, 2013-07-04 at 13:00 +0200, Thomas Gleixner wrote:
> > On Thu, 4 Jul 2013, Alex Shi wrote:
> >
> > > We find some benchmarks drop a lot on tip/sched/core on many Intel boxes,
> > > like oltp, tbench, hackbench etc. and bisected the commit 5d33b883ae
> > > as the cause of this regression. Due to this commit, the clocksource was
> > > changed from tsc to hpet, even though tsc will be set
> > > CLOCK_SOURCE_VALID_FOR_HRES later in clocksource_watchdog.
> >
> > 5d33b883ae is not in tip/sched/core. So what are you testing and
> > bisecting?
>
> I think he's referring to:
>
> commit 5d33b883aed81c6fbcd09c6f7c3619eee850a7e2

I know what he is referring to. He explicitly mentions this commit:

> > > like oltp, tbench, hackbench etc. and bisected the commit 5d33b883ae

What I was pointing out is that he was referring to tip sched/core at the
same time:

> > > We find some benchmarks drop a lot on tip/sched/core on many Intel boxes,

And that branch does NOT have that commit included. So how can you see
a regression on a branch caused by a commit NOT included in that
branch?

The offending commit is in tip timers/core and not in tip
sched/core. What I wanted to say is that we need a proper
description of problems and not some random association.

It tricked you into assuming that I'm not able to figure it out myself :)

See? These things are complex and subtle, so we need precise
descriptions and not some sloppy semi-correct data.

I'm well aware of the issue and with Peter's help I got a reasonable
explanation for it. A proper fix is about to be sent out.

Thanks,

tglx


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-04 Thread Alex Shi
On 07/05/2013 04:27 AM, Thomas Gleixner wrote:
> > > > We find some benchmarks drop a lot on tip/sched/core on many Intel boxes,
> And that branch does NOT have that commit included. So how can you see
> a regression on a branch caused by a commit NOT included in that
> branch?
>
> The offending commit is in tip timers/core and not in tip
> sched/core. What I wanted to say is that we need a proper
> description of problems and not some random association.
>
> It tricked you into assuming that I'm not able to figure it out myself :)
>
> See? These things are complex and subtle, so we need precise
> descriptions and not some sloppy semi-correct data.
>
> I'm well aware of the issue and with Peter's help I got a reasonable
> explanation for it. A proper fix is about to be sent out.
>

Ingo had merged your branch into sched/core. :)

commit f9bed7021710a3e45c331f7d7781de914cc1b939
Merge: 7e76057 67dd331
Author: Ingo Molnar mi...@kernel.org
Date:   Wed May 29 11:21:59 2013 +0200

    Merge branch 'timers/urgent'

...

commit 7d194f78bde64ec813c1ed8291181bdd61515e78
Merge: 0298bf7 1eaff67
Author: Ingo Molnar mi...@kernel.org
Date:   Tue May 28 09:53:41 2013 +0200

    Merge branch 'timers/core'

commit 0298bf70644d7334bec16ae47f3aa58f4f883b59
Merge: cc662fa 2938d27
Author: Ingo Molnar mi...@kernel.org
Date:   Tue May 28 09:49:51 2013 +0200

    Merge branch 'timers/urgent'

-- 
Thanks
Alex


Re: [URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-04 Thread Thomas Gleixner
On Fri, 5 Jul 2013, Alex Shi wrote:
> On 07/05/2013 04:27 AM, Thomas Gleixner wrote:
> > > > > We find some benchmarks drop a lot on tip/sched/core on many Intel boxes,
> > And that branch does NOT have that commit included. So how can you see
> > a regression on a branch caused by a commit NOT included in that
> > branch?
> >
> > The offending commit is in tip timers/core and not in tip
> > sched/core. What I wanted to say is that we need a proper
> > description of problems and not some random association.
> >
> > It tricked you into assuming that I'm not able to figure it out myself :)
> >
> > See? These things are complex and subtle, so we need precise
> > descriptions and not some sloppy semi-correct data.
> >
> > I'm well aware of the issue and with Peter's help I got a reasonable
> > explanation for it. A proper fix is about to be sent out.
> >
>
> Ingo had merged your branch into sched/core. :)
>
> commit f9bed7021710a3e45c331f7d7781de914cc1b939
> Merge: 7e76057 67dd331
> Author: Ingo Molnar mi...@kernel.org
> Date:   Wed May 29 11:21:59 2013 +0200
>
>     Merge branch 'timers/urgent'

Not really.

tip$ git branch --contains f9bed70217
* master

tip$ git branch --contains 5d33b883ae
* master
  timers/core

So you are testing tip/master not tip/sched/core.

Thanks,

tglx
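
The point of the two `git branch --contains` calls above is reachability: a
branch contains a commit only if that commit can be reached from the branch
tip by following parent links. The sketch below is a toy model in Python with
a made-up, heavily simplified history. The commit names "base", "sched-work"
and "merge", the PARENTS table and both helper functions are hypothetical and
do not reflect the real tip tree.

```python
from typing import Dict, List, Set

# Hypothetical history: each commit maps to its parents.
PARENTS: Dict[str, List[str]] = {
    "base": [],
    "5d33b883ae": ["base"],                  # clocksource commit on timers/core
    "sched-work": ["base"],                  # stand-in for work on sched/core
    "merge": ["sched-work", "5d33b883ae"],   # master merges both topic branches
}

# Hypothetical branch tips.
BRANCHES: Dict[str, str] = {
    "master": "merge",
    "timers/core": "5d33b883ae",
    "sched/core": "sched-work",
}


def ancestors(tip: str) -> Set[str]:
    """Return every commit reachable from tip, including tip itself."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen


def branches_containing(commit: str) -> List[str]:
    """Model the idea behind `git branch --contains <commit>`."""
    return sorted(name for name, tip in BRANCHES.items()
                  if commit in ancestors(tip))


print(branches_containing("5d33b883ae"))
```

In this model the commit is reachable from master and timers/core but not
from sched/core, which mirrors the shape of the output shown above: a
regression caused by that commit can only show up when testing a branch that
actually reaches it.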


[URGENT rfc patch 0/3] tsc clocksource bug fix

2013-07-03 Thread Alex Shi
We find some benchmarks drop a lot on tip/sched/core on many Intel boxes,
like oltp, tbench, hackbench etc. and bisected the commit 5d33b883ae
as the cause of this regression. Due to this commit, the clocksource was
changed from tsc to hpet, even though tsc will be set
CLOCK_SOURCE_VALID_FOR_HRES later in clocksource_watchdog.

Tim Chen said that reading the hpet is costly. That causes this regression.

This patchset fixes the bug by re-selecting the clocksource after this flag
is set on tsc.

BTW,
If the tsc is marked as constant and nonstop, could we set it as the system
clocksource when the tsc is registered, without checking it in
clocksource_watchdog?

regards!
Alex
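
For anyone who wants to check whether a machine shows the symptom described
above, the active clocksource is exposed through sysfs on Linux. The Python
sketch below reads it and gives a rough per-call cost of a monotonic clock
read. The helper names read_sysfs() and ns_per_clock_read() are made up for
this example, and the timing includes interpreter overhead, so it is only
meaningful when comparing runs on the same machine with different
clocksources.

```python
import time
from pathlib import Path
from typing import Optional

# Standard Linux sysfs location; absent on other systems and some containers.
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_sysfs(name: str) -> Optional[str]:
    """Return the stripped contents of a clocksource sysfs file, or None."""
    try:
        return (SYSFS_DIR / name).read_text().strip()
    except OSError:
        return None


def ns_per_clock_read(calls: int = 200_000) -> float:
    """Average cost of one time.monotonic_ns() call, in nanoseconds."""
    start = time.perf_counter_ns()
    for _ in range(calls):
        time.monotonic_ns()
    return (time.perf_counter_ns() - start) / calls


if __name__ == "__main__":
    print("current:  ", read_sysfs("current_clocksource"))
    print("available:", read_sysfs("available_clocksource"))
    print(f"~{ns_per_clock_read():.0f} ns per monotonic clock read")
```

If the first line reports hpet on a box whose available list includes tsc,
that matches the situation reported in this thread.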
