[ANNOUNCE] 3.6.4-rt10

2012-10-29 Thread Thomas Gleixner
Dear RT Folks,

I'm pleased to announce the 3.6.4-rt10 release. This is just an update
to 3.6.4 with no RT related changes.

The RT patch against 3.6.4 can be found here:

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patch-3.6.4-rt10.patch.xz

The split quilt queue is available at:

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patches-3.6.4-rt10.tar.xz

Enjoy,

tglx
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[GIT pull] Futex fix for 3.7

2012-11-12 Thread Thomas Gleixner
Linus,

please pull the latest core-urgent-for-linus git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git core-urgent-for-linus

Single fix for a long-standing futex race when taking over a futex
whose owner died. You can end up with two owners, which violates quite
a few rules.

Thanks,

tglx

--
Thomas Gleixner (1):
  futex: Handle futex_pi OWNER_DIED take over correctly


 kernel/futex.c |   41 ++++++++++++++++++++++-------------------
 1 files changed, 22 insertions(+), 19 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index 3717e7b..20ef219 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -716,7 +716,7 @@ static int futex_lock_pi_atomic(u32 __user *uaddr, struct futex_hash_bucket *hb,
struct futex_pi_state **ps,
struct task_struct *task, int set_waiters)
 {
-   int lock_taken, ret, ownerdied = 0;
+   int lock_taken, ret, force_take = 0;
u32 uval, newval, curval, vpid = task_pid_vnr(task);
 
 retry:
@@ -755,17 +755,15 @@ retry:
newval = curval | FUTEX_WAITERS;
 
/*
-* There are two cases, where a futex might have no owner (the
-* owner TID is 0): OWNER_DIED. We take over the futex in this
-* case. We also do an unconditional take over, when the owner
-* of the futex died.
-*
-* This is safe as we are protected by the hash bucket lock !
+* Should we force take the futex? See below.
 */
-   if (unlikely(ownerdied || !(curval & FUTEX_TID_MASK))) {
-   /* Keep the OWNER_DIED bit */
+   if (unlikely(force_take)) {
+   /*
+* Keep the OWNER_DIED and the WAITERS bit and set the
+* new TID value.
+*/
newval = (curval & ~FUTEX_TID_MASK) | vpid;
-   ownerdied = 0;
+   force_take = 0;
lock_taken = 1;
}
 
@@ -775,7 +773,7 @@ retry:
goto retry;
 
/*
-* We took the lock due to owner died take over.
+* We took the lock due to forced take over.
 */
if (unlikely(lock_taken))
return 1;
@@ -790,20 +788,25 @@ retry:
switch (ret) {
case -ESRCH:
/*
-* No owner found for this futex. Check if the
-* OWNER_DIED bit is set to figure out whether
-* this is a robust futex or not.
+* We failed to find an owner for this
+* futex. So we have no pi_state to block
+* on. This can happen in two cases:
+*
+* 1) The owner died
+* 2) A stale FUTEX_WAITERS bit
+*
+* Re-read the futex value.
 */
if (get_futex_value_locked(&curval, uaddr))
return -EFAULT;
 
/*
-* We simply start over in case of a robust
-* futex. The code above will take the futex
-* and return happy.
+* If the owner died or we have a stale
+* WAITERS bit the owner TID in the user space
+* futex is 0.
 */
-   if (curval & FUTEX_OWNER_DIED) {
-   ownerdied = 1;
+   if (!(curval & FUTEX_TID_MASK)) {
+   force_take = 1;
goto retry;
}
default:
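
A minimal user-space sketch of the take-over rule in the patch above,
assuming the futex word layout from the kernel's uapi linux/futex.h (TID
in the low 30 bits, FUTEX_OWNER_DIED in bit 30, FUTEX_WAITERS in bit 31).
The helper names futex_needs_force_take() and futex_force_take_value()
are hypothetical and exist only for illustration; the real code runs
under the hash bucket lock and updates the user-space value atomically.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit layout of the user-space futex word, as in uapi linux/futex.h. */
#define FUTEX_WAITERS    0x80000000u
#define FUTEX_OWNER_DIED 0x40000000u
#define FUTEX_TID_MASK   0x3fffffffu

/* Hypothetical helper: after -ESRCH, take over only if the owner TID is 0. */
static bool futex_needs_force_take(uint32_t curval)
{
	return !(curval & FUTEX_TID_MASK);
}

/* Hypothetical helper: keep OWNER_DIED and WAITERS, install the new TID. */
static uint32_t futex_force_take_value(uint32_t curval, uint32_t vpid)
{
	return (curval & ~FUTEX_TID_MASK) | vpid;
}
```

A value that still carries FUTEX_OWNER_DIED but already has a non-zero
TID is not taken over in this model. As far as the diff shows, that is
the case the removed ownerdied check did not distinguish.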


Re: [REGRESSION] 3.7-rc3+git hard lockup on CPU after inserting/removing USB stick

2012-11-12 Thread Thomas Gleixner
On Mon, 12 Nov 2012, Martin Steigerwald wrote:
> On Sunday, 11 November 2012, Liu, Chuansheng wrote:
> > > The first bad commit is:
> > >
> > > commit 73d4066055e0e2830533041f4b91df8e6e5976ff
> > > Author: Chuansheng Liu <chuansheng@intel.com>
> > > Date:   Tue Sep 11 16:00:30 2012 +0800
> > >
> > >     USB/host: Cleanup unneccessary irq disable code
> > >
> > >     Because the IRQF_DISABLED as the flag is now a NOOP and has been
> > >     deprecated and in hardirq context the interrupt is disabled.
> > >
> > >     so in usb/host code:
> > >     Removing the usage of flag IRQF_DISABLED;
> > >     Removing the calling local_irq save/restore actions in irq
> > >     handler usb_hcd_irq();
> > >
> > >     Signed-off-by: liu chuansheng <chuansheng@intel.com>
> > >     Acked-by: Alan Stern <st...@rowland.harvard.edu>
> > >     Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>
> > >
> > >
> > > But:
> > >
> > > This only happens with threadirqs option!
> > >
> > > When I remove threadirqs from kernel command line and reboot with this
> > > last bisect kernel USB sticks work.
> > >
> > > That may explain why nobody else has seen this.
> > >
> > > So I will try a 3.7-rc4 now, but without threadirqs enabled.
> > >
> > Thanks for pointing this out. The USB HCD irq handler is designed to
> > execute in the irq handler with irqs disabled.  When threadirqs is on
> > the command line, it will be executed in thread context with local irqs
> > enabled, which causes this hard lockup.

No. The problem is caused by the commit above. USB with threaded
interrupt handlers worked perfectly fine in the past.
 
> > --- a/drivers/usb/core/hcd.c
> > +++ b/drivers/usb/core/hcd.c
> > @@ -2349,7 +2349,7 @@ static int usb_hcd_request_irqs(struct usb_hcd *hcd,
> >  	if (hcd->driver->irq) {
> >  		snprintf(hcd->irq_descr, sizeof(hcd->irq_descr), "%s:usb%d",
> >  				hcd->driver->description, hcd->self.busnum);
> > -		retval = request_irq(irqnum, usb_hcd_irq, irqflags,
> > +		retval = request_irq(irqnum, usb_hcd_irq, irqflags|IRQF_NO_THREAD,
> >  				hcd->irq_descr, hcd);

NAK. This is exactly the wrong thing to do.

We want to be able to run that code in a handler thread. So you
removed the local_irq_save/restore() in the driver code and with
forced threaded irqs this breaks. Now setting IRQF_NO_THREAD is just
working around the problem that the above commit broke it.

There is no hard requirement to run USB interrupts in hard interrupt
context. I'd rather see the above commit reverted and then a proper
analysis done why removing local_irq_save/restore() breaks forced
threaded interrupt handlers.

Thanks,

tglx


Re: [RFC PATCH v1 02/31] ARC: irqflags

2012-11-12 Thread Thomas Gleixner
On Wed, 7 Nov 2012, Vineet Gupta wrote:
 + **
 + *  Inline ASM macros to read/write AUX Regs
 + *  Essentially invocation of lr/sr insns from C
 + */
 +
 +#if 1

Leftover ???

 +#define read_aux_reg(reg)	__builtin_arc_lr(reg)
 +
 +/* gcc builtin sr needs reg param to be long immediate */
 +#define write_aux_reg(reg_immed, val)\
 + __builtin_arc_sr((unsigned int)val, reg_immed)
 +
 +#else
 +/*
 + * Conditionally Enable IRQs

  Unconditionally methinks


The following two functions are related to irq chips I guess. So why
would you want them here ?

 +static inline void arch_mask_irq(unsigned int irq)
 +{
 + unsigned int ienb;
 +
 + ienb = read_aux_reg(AUX_IENABLE);
 + ienb &= ~(1 << irq);
 + write_aux_reg(AUX_IENABLE, ienb);
 +}
 +
 +static inline void arch_unmask_irq(unsigned int irq)
 +{
 + unsigned int ienb;
 +
 + ienb = read_aux_reg(AUX_IENABLE);
 + ienb |= (1 << irq);
 + write_aux_reg(AUX_IENABLE, ienb);
 +}

The only user is the interrupt controller code, right?
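
For reference, the read-modify-write pattern in the two quoted helpers
can be exercised in user space. This is a sketch only: aux_ienable is a
plain variable standing in for the AUX_IENABLE auxiliary register
(assuming one enable bit per interrupt), and model_mask_irq() and
model_unmask_irq() are hypothetical names, not part of the ARC port.

```c
#include <stdint.h>

/* Stand-in for the AUX_IENABLE register (assumption: one enable bit per irq). */
static uint32_t aux_ienable;

/* Clear the enable bit of @irq; all other bits stay untouched. */
static void model_mask_irq(unsigned int irq)
{
	aux_ienable &= ~(UINT32_C(1) << irq);
}

/* Set the enable bit of @irq; all other bits stay untouched. */
static void model_unmask_irq(unsigned int irq)
{
	aux_ienable |= UINT32_C(1) << irq;
}
```

In the kernel, helpers like these would typically sit behind the
irq_mask and irq_unmask callbacks of a struct irq_chip, which is why
they look out of place in a generic irqflags header.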

 diff --git a/arch/arc/kernel/irq.c b/arch/arc/kernel/irq.c
 new file mode 100644
 index 0000000..16fcbe8
 --- /dev/null
 +++ b/arch/arc/kernel/irq.c
 @@ -0,0 +1,32 @@
 +/*
 + * Copyright (C) 2011-12 Synopsys, Inc. (www.synopsys.com)
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License version 2 as
 + * published by the Free Software Foundation.
 + *
 + */
 +
 +#include <linux/interrupt.h>
 +#include <linux/module.h>
 +#include <asm/irqflags.h>
 +#include <asm/arcregs.h>
 +
 +void arch_local_irq_enable(void)
 +{
 +
 + unsigned long flags;
 + flags = arch_local_save_flags();
 + flags |= (STATUS_E1_MASK | STATUS_E2_MASK);
 +
 + /*
 +  * If called from hard ISR (between irq_enter and irq_exit)
 +  * don't allow Level 1. In Soft ISR we allow further Level 1s
 +  */
 +
 + if (in_irq())
 + flags &= ~(STATUS_E1_MASK | STATUS_E2_MASK);

Hmm. This looks weird and the comment is not very helpful. So using my
crystal ball you want to enforce, that nothing enables interrupts
while a hard interrupt handler is running, right?

Is there a chip limitation which you have to enforce here? If yes,
then please explain it.

Btw, all hard interrupt handlers in Linux run with interrupts disabled and
they are not supposed to reenable interrupts, which is true for almost
all drivers except for a few archaic IDE drivers. In fact you might
even WARN about it at least once, so the offending code gets fixed.

Also the code flow is backwards. What about:

 unsigned long flags;

 if (in_irq())
return;

 flags =    


 + arch_local_irq_restore(flags);
 +}
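
The reordered flow suggested above could look like the following
user-space model. Everything here is an assumption made for
illustration: status_reg stands in for the CPU status register,
in_hard_irq for in_irq(), and the two mask values are placeholders
rather than the real ARC bit positions.

```c
#include <stdbool.h>

#define STATUS_E1_MASK 0x02ul	/* placeholder value */
#define STATUS_E2_MASK 0x04ul	/* placeholder value */

static unsigned long status_reg;	/* models the status register */
static bool in_hard_irq;		/* models in_irq() */

static void model_local_irq_enable(void)
{
	unsigned long flags;

	/* Bail out early: hard interrupt context never enables interrupts. */
	if (in_hard_irq)
		return;

	flags = status_reg;
	flags |= STATUS_E1_MASK | STATUS_E2_MASK;
	status_reg = flags;
}
```

The early return leaves the register untouched in hard interrupt
context, so the set-then-clear sequence of the quoted version is not
needed.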

Thanks,

tglx


Re: [RFC PATCH v1 12/31] ARC: Interrupt Handling

2012-11-12 Thread Thomas Gleixner
On Wed, 7 Nov 2012, Vineet Gupta wrote:
 +void __init init_IRQ(void)
 +{
 + const int irq = TIMER0_IRQ;
 +
 + /*
 +  * Each CPU needs to register irq of it's private TIMER0.
 +  * The APIs request_percpu_irq()/enable_percpu_irq() will not be
 +  * functional, if we don't prep the generic IRQ sub-system with
 +  * the following:
 +  * -Ensure that devid passed to request_percpu_irq() is indeed per cpu
 +  * -disable NOAUTOEN, w/o which the device handler never gets called

What sets NOAUTOEN in the first place? The core code definitely does
not.

 +  */
 + irq_set_percpu_devid(irq);
 + irq_modify_status(irq, IRQ_NOAUTOEN, 0);

Aside from that, we have irq_clear_status_flags() for this.

 + plat_init_IRQ();
 +}

 +int __init get_hw_config_num_irq(void)

How is that function used ?

 +{
 + uint32_t val = read_aux_reg(ARC_REG_VECBASE_BCR);
 +
 + switch (val & 0x03) {
 + case 0:
 + return 16;
 + case 1:
 + return 32;
 + case 2:
 + return 8;
 + default:
 + return 0;
 + }
 +
 + return 0;
 +}

Thanks,

tglx


Re: [RFC PATCH v1 15/31] ARC: Process/scheduling/clock/Timers/Delay Management

2012-11-12 Thread Thomas Gleixner
On Wed, 7 Nov 2012, Vineet Gupta wrote:
 +void cpu_idle(void)
 +{
 + /* Since we SLEEP in idle loop, TIF_POLLING_NRFLAG can't be set */
 +
 + /* endless idle loop with no priority at all */
 + while (1) {
 + tick_nohz_idle_enter();
 +
 + while (!need_resched())
 + arch_idle();
 +
 + tick_nohz_idle_exit();
 +
 + preempt_enable_no_resched();
 + schedule();
 + preempt_disable();

schedule_preempt_disabled() please

 + }

 diff --git a/arch/arc/kernel/time.c b/arch/arc/kernel/time.c
 
 +static void arc_periodic_timer_setup(unsigned int limit)
 +{
 + /* setup start and end markers */
 + write_aux_reg(ARC_REG_TIMER0_LIMIT, limit);
 + write_aux_reg(ARC_REG_TIMER0_CNT, 0);   /* start from 0 */
 +
 + /* IE: Interrupt on count = limit,
 +  * NH: Count cycles only when CPU running (NOT Halted)
 +  */
 + write_aux_reg(ARC_REG_TIMER0_CTRL, TIMER_CTRL_IE | TIMER_CTRL_NH);
 +}
 +
 +/*
 + * Acknowledge the interrupt & enable/disable the interrupt
 + */
 +static void arc_periodic_timer_ack(unsigned int irq_reenable)
 +{
 + /* 1. Ack the interrupt by writing to CTRL reg.
 +  *Any write will cause intr to be ack, however it has to be one of
 +  *writable bits (NH: Count when not halted)
 +  * 2. If required by caller, re-arm timer to Interrupt at the end of
 +  *next cycle.
 +  *
 +  * Small optimisation:
 +  * Normal code would have been
 +  *  if (irq_reenable) CTRL_REG = (IE | NH); else CTRL_REG = NH;
 +  * However since IE is BIT0 we can fold the branch
 +  */
 + write_aux_reg(ARC_REG_TIMER0_CTRL, irq_reenable | TIMER_CTRL_NH);
 +}
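
The branch folding described in the quoted comment can be checked in
isolation. The sketch below assumes TIMER_CTRL_IE is bit 0 (as the
comment states) and TIMER_CTRL_NH is bit 1 (an assumption, not taken
from the patch); both function names are hypothetical. Note that the
fold is only valid when irq_reenable is exactly 0 or 1.

```c
#define TIMER_CTRL_IE (1u << 0)	/* interrupt enable, bit 0 per the comment */
#define TIMER_CTRL_NH (1u << 1)	/* count only when not halted (assumed bit) */

/* Straightforward version with a branch. */
static unsigned int ctrl_value_branchy(unsigned int irq_reenable)
{
	if (irq_reenable)
		return TIMER_CTRL_IE | TIMER_CTRL_NH;
	return TIMER_CTRL_NH;
}

/* Folded version: relies on irq_reenable being 0 or 1 and IE being bit 0. */
static unsigned int ctrl_value_folded(unsigned int irq_reenable)
{
	return irq_reenable | TIMER_CTRL_NH;
}
```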



 +/** Clock Event Device */
 +
 +static int arc_clkevent_set_next_event(unsigned long delta,
 + struct clock_event_device *dev)
 +{
 + arc_periodic_timer_setup(delta);

This is confusing. Is arc_periodic_timer_setup() setting up a periodic
timer or a oneshot timer? It looks like you use it for both and the
differentiation happens in arc_periodic_timer_ack(). So I assume the
timer only knows about periodic mode, but you trick it into oneshot
with the ack function, right? So it's just me being confused about
the function names, but that could do with some explanatory comments.

 + return 0;
 +}
 +
 +static void arc_clkevent_set_mode(enum clock_event_mode mode,
 +struct clock_event_device *dev)
 +{
 + pr_info("Device [%s] clockevent mode now [%d]\n", dev->name, mode);

Please remove the debug leftover.

 + switch (mode) {
 + case CLOCK_EVT_MODE_PERIODIC:
 + arc_periodic_timer_setup(CONFIG_ARC_PLAT_CLK / HZ);
 + break;
 + case CLOCK_EVT_MODE_ONESHOT:
 + break;
 + default:
 + break;
 + }
 +
 + return;
 +}
 +
 +static DEFINE_PER_CPU(struct clock_event_device, arc_clockevent_device) = {
 + .name   = "ARC Timer0",
 + .features   = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_PERIODIC,
 + .mode   = CLOCK_EVT_MODE_UNUSED,
 + .rating = 300,
 + .irq    = TIMER0_IRQ,   /* hardwired, no need for resources */
 + .set_next_event = arc_clkevent_set_next_event,
 + .set_mode   = arc_clkevent_set_mode,
 +};
 +
 +irqreturn_t timer_irq_handler(int irq, void *dev_id)

static please

 +static int arc_finished_booting;
 +
 +/*
 + * Scheduler clock - returns current time in nanosec units.
 + * It's return value must NOT wrap around.
 + *
 + * Although the return value is nanosec units based, what's more important
 + * is whats the source of this value. The orig jiffies based computation
 + * was only as granular as jiffies itself (10ms on ARC).
 + * We need something that is more granular, so use the same mechanism as
 + * gettimeofday(), which uses ARC Timer T1 wrapped as a clocksource.
 + * Unfortunately the first call to sched_clock( ) is way before that subsys
 + * is initialiased, thus use the jiffies based value in the interim.
 + */
 +unsigned long long sched_clock(void)
 +{
 + if (!arc_finished_booting) {
 + return (unsigned long long)(jiffies - INITIAL_JIFFIES)
 + * (NSEC_PER_SEC / HZ);
 + } else {
 + struct timespec ts;
 + getrawmonotonic(&ts);

This can live lock. sched_clock() is used by the tracer. So assume you
are function tracing and you trace a function called from within the
timekeeping seqcount write locked region. You spin forever in
getrawmonotonic(). Not what you want, right ?
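
The live lock can be illustrated with a toy sequence counter in user
space. This is a simplified model, not the kernel's seqcount
implementation: all names are hypothetical, there are no memory
barriers, and the reader gives up after a bounded number of spins so
that the example terminates. In the kernel the reader would keep
spinning.

```c
#include <stdbool.h>

struct toy_seqcount {
	unsigned int sequence;	/* odd while a write is in progress */
};

static void toy_write_begin(struct toy_seqcount *s) { s->sequence++; }
static void toy_write_end(struct toy_seqcount *s)   { s->sequence++; }

/*
 * Try to read *src consistently. Returns false if a write was in
 * progress for all max_spins attempts, which models a reader that is
 * called from inside the write section and can never succeed.
 */
static bool toy_read(const struct toy_seqcount *s, const long *src,
		     long *out, unsigned int max_spins)
{
	while (max_spins--) {
		unsigned int start = s->sequence;

		if (start & 1)
			continue;	/* writer active: retry */
		*out = *src;
		if (s->sequence == start)
			return true;	/* no writer interfered */
	}
	return false;
}
```

A tracer hook that ends up calling the reader between toy_write_begin()
and toy_write_end() on the same CPU corresponds to the failing case: the
writer cannot finish until the reader returns, and the reader cannot
return until the writer finishes.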

Thanks,

tglx


[ANNOUNCE] 3.6.6-rt17

2012-11-12 Thread Thomas Gleixner
Dear RT Folks,

I'm pleased to announce the 3.6.6-rt17 release. 3.6.6-rt16 was just an
unannounced update release to 3.6.6.

Changes since 3.6.6-rt16:

   * Finally make the NOHZ softirq pending detection work with the new
     softirq scheme.

   * Remove the WARN_ON from __raise_softirq_irqoff(). I got the
     information I want for now.

The delta patch against 3.6.6-rt16 is appended below and can be found
here:

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/incr/patch-3.6.6-rt16-rt17.patch.xz

The RT patch against 3.6.6 can be found here:

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patch-3.6.6-rt17.patch.xz

The split quilt queue is available at:

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patches-3.6.6-rt17.tar.xz

Enjoy,

tglx

-
Index: linux-stable/kernel/softirq.c
===
--- linux-stable.orig/kernel/softirq.c
+++ linux-stable/kernel/softirq.c
@@ -100,20 +100,15 @@ void softirq_check_pending_idle(void)
 {
static int rate_limit;
struct softirq_runner *sr = &__get_cpu_var(softirq_runners);
-   u32 warnpending, pending = local_softirq_pending();
+   u32 warnpending = local_softirq_pending();
+   int i;
 
if (rate_limit >= 10)
return;
 
-   warnpending = pending;
-
-   while (pending) {
-   struct task_struct *tsk;
-   int i = __ffs(pending);
-
-   pending &= ~(1 << i);
+   for (i = 0; i < NR_SOFTIRQS; i++) {
+   struct task_struct *tsk = sr->runner[i];
 
-   tsk = sr->runner[i];
/*
 * The wakeup code in rtmutex.c wakes up the task
 * _before_ it sets pi_blocked_on to NULL under
@@ -638,7 +633,7 @@ static void do_raise_softirq_irqoff(unsi
 void __raise_softirq_irqoff(unsigned int nr)
 {
do_raise_softirq_irqoff(nr);
-   if (WARN_ON_ONCE(!in_irq() && !current->softirq_nestcnt))
+   if (!in_irq() && !current->softirq_nestcnt)
wakeup_softirqd();
 }
 
Index: linux-stable/localversion-rt
===
--- linux-stable.orig/localversion-rt
+++ linux-stable/localversion-rt
@@ -1 +1 @@
--rt16
+-rt17
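
The loop change in softirq_check_pending_idle() swaps a walk over the
set bits of the pending mask for a walk over all softirq slots. A
user-space sketch of the two iteration styles, with hypothetical names,
NR_SLOTS standing in for NR_SOFTIRQS (the value 10 is an assumption) and
the GCC/Clang builtin __builtin_ctz() standing in for __ffs():

```c
#define NR_SLOTS 10	/* stand-in for NR_SOFTIRQS (assumed value) */

/* Old style: visit only the indices whose bit is set in @pending. */
static unsigned int visit_set_bits(unsigned int pending)
{
	unsigned int visited = 0;

	while (pending) {
		unsigned int i = (unsigned int)__builtin_ctz(pending);

		pending &= ~(1u << i);
		visited++;
	}
	return visited;
}

/* New style: visit every slot, regardless of the pending mask. */
static unsigned int visit_all_slots(void)
{
	unsigned int i, visited = 0;

	for (i = 0; i < NR_SLOTS; i++)
		visited++;
	return visited;
}
```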


Re: [STABLE REQUEST] add: e1000: fix lockdep splat in shutdown handler

2012-10-14 Thread Thomas Gleixner
On Thu, 11 Oct 2012, Steven Rostedt wrote:
 commit 3a3847e007aae732d64d8fd1374126393e9879a3
 Author: Jesse Brandeburg jesse.brandeb...@intel.com
 Date:   Wed Jan 4 20:23:33 2012 +
 
 e1000: fix lockdep splat in shutdown handler

as I discussed with Jesse on IRC, there is another possible deadlock
lurking in the e1000 code.

static void e1000_reinit_safe(struct e1000_adapter *adapter)
{
	while (test_and_set_bit(__E1000_RESETTING, &adapter->flags))
		msleep(1);
	mutex_lock(&adapter->mutex);
	e1000_down(adapter);

e1000_down() waits on the various work tasks to shut down, but those
work functions might be blocked on the adapter mutex.

I have no idea how I managed to trigger that one, but it's real. The
task dump I got out of the machine shows stuff waiting on each other
forever.

I can't give you a recipe to reproduce. Looking at the code this is
not very surprising. It takes quite some coincidence of having
e1000_reinit_safe() being invoked and the delayed work timer bringing
the work on right after e1000_reinit_safe() took the adapter mutex.

Thanks,

tglx


Re: [RFC PATCH] posix timers: allocate timer id per task

2012-10-15 Thread Thomas Gleixner
On Mon, 15 Oct 2012, Stanislav Kinsbursky wrote:

 This patch is required by the CRIU project (www.criu.org).
 To migrate processes with posix timers we have to make sure that we can
 restore a posix timer with the proper id.
 Currently, this is not true, because timer ids are allocated globally.
 So, this is a precursor patch and its purpose is to make posix timer ids
 allocated per task.

You can't allocate them per task. posix timers are process wide.

What's the reason why you did not make the posix timer ids per name
space instead of going down to the per process level ?

 Patch replaces global idr with global hash table for posix timers and
 makes timer ids unique not globally, but per task. Next free timer id is type
 of integer and stored on signal struct (posix_timer_id). If free timer id
 reaches negative value on timer creation, it will be dropped to zero and
 -EAGAIN will be returned to user.

So you want to allow 2^31 posix timers created for a single process?

 +static struct k_itimer *__posix_timers_find(struct hlist_head *head, struct signal_struct *sig, timer_t id)
 +{
 +	struct hlist_node *node;
 +	struct k_itimer *timer;
 +
 +	hlist_for_each_entry(timer, node, head, t_hash) {
 +		if ((timer->it_signal == sig) && (timer->it_id == id))
 +			return timer;
 +	}
 +	return NULL;
 +}
 +
 +static struct k_itimer *posix_timer_find(timer_t id, unsigned long *flags)
 +{
 +	struct k_itimer *timer;
 +	struct signal_struct *sig = current->signal;
 +	struct hlist_head *head = &posix_timers_hashtable[hash(sig, id)];
 +
 +	spin_lock_irqsave(&hash_lock, *flags);

This is not going to fly. You just reintroduced a massive scalability
problem. See commit 8af08871

Thanks,

tglx


[ANNOUNCE] 3.6.1-rt2

2012-10-16 Thread Thomas Gleixner
Dear RT Folks,

I'm pleased to announce the 3.6.1-rt2 release.

Changes since 3.6.1-rt1:

* Picked up Paul's git friendly quilt queue

* Compile fix for !RT_FULL (Paul Gortmaker)

* Crypto init order fix

* Tiny RCU fix which affects UP and is a long-standing bug
  affecting 3.2 and 3.4-rt as well.

The delta patch against 3.6.1-rt1 is appended below and can be found here:

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/incr/patch-3.6.1-rt1-rt2.patch.xz

The RT patch against 3.6.1 can be found here:

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patch-3.6.1-rt2.patch.xz

The split quilt queue is available at:

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patches-3.6.1-rt2.tar.xz

Enjoy,

tglx


Re: [ANNOUNCE] 3.6.1-rt2

2012-10-16 Thread Thomas Gleixner
On Tue, 16 Oct 2012, Javier Sanz wrote:

 Hello,
 
 Testing, and FYI
 
 $uname -a
 Linux darkstar 3.6.1-rt2 #1 SMP PREEMPT RT Tue Oct 16 22:47:06 CEST 2012
 i686 i686 i386 GNU/Linux
 
 shows all time  ...
 
 [   30.543233] fuse init (API version 7.20)
 [   33.262077] Crap, ksoftirqd/0 looping forever in softirq
 [   33.344865] Crap, ksoftirqd/2 looping forever in softirq
 [   33.401736] Crap, ksoftirqd/0 looping forever in softirq
 [   33.409743] Crap, ksoftirqd/0 looping forever in softirq
 [   33.421658] Crap, ksoftirqd/0 looping forever in softirq
 [   33.428628] Crap, ksoftirqd/0 looping forever in softirq
 [   33.496468] Crap, ksoftirqd/2 looping forever in softirq

Grrr. Forgot to remove that printk. Will do in the next spin.



Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
Rafael,

On Thu, 2007-09-20 at 22:39 +0200, Rafael J. Wysocki wrote:
> > Works as well. What's the difference between this and the real thing ?
>
> The real thing also calls device_power_down(PMSG_FREEZE), which is a
> counterpart of sysdev_shutdown(), more or less, and I think that's what goes
> belly up.
>
> You can use the patch below (on top of -rc6-mm1), which just disables the image
> creation (that should be irrelevant anyway) and see what happens.

In the meantime I figured out what's happening. The ordering in
hibernation_snapshot() is wrong. It does:

swsusp_shrink_memory();
suspend_console();
device_suspend(PMSG_FREEZE);
platform_prepare(platform_mode);

disable_nonboot_cpus();

swsusp_suspend();

enable_nonboot_cpus();

platform_finish(platform_mode);
device_resume();
resume_console();

We disable everything in device_suspend() including timekeeping, so any
code which is depending on working timekeeping and timer functionality
(which is suspended in timekeeping_suspend() as well) is busted.

enable_nonboot_cpus() definitely relies on working timekeeping and
timers depending on the codepath. It's just a surprise that this did not
blow up earlier (also before clock events).

I changed the ordering of the above to:

disable_nonboot_cpus();

swsusp_shrink_memory();
suspend_console();
device_suspend(PMSG_FREEZE);
platform_prepare(platform_mode);
swsusp_suspend();
platform_finish(platform_mode);
device_resume();
resume_console();

enable_nonboot_cpus();

and non-surprisingly the "my VAIO needs help from keyboard" problem went
away immediately. See patch below. (on top of rc7-hrt1, -mm1 does not
work at all on my VAIO due to some not yet identified wreckage)

I did not yet look into the suspend to ram code, but I guess that there
is an equivalent problem.

But I have no idea why this affects Andrew's jinxed VAIO (UP machine),
though I suspect that we have more timekeeping/timer depending code
somewhere waiting to bite us.

Also I still need to debug why the HIBERNATION_TEST code path (which has
a msleep(5000) in it) does not fail, but I postpone this until tomorrow
morning. I'm dead tired after hunting this Heisenbug which changes with
every other printk added to the code. I'm going to add some really noisy
messages for everything which accesses timekeeping / timers _after_
those systems have been shut down.

We really need to fix this once and forever _before_ 2.6.23 final, even
if it requires a -rc8.

Thanks,

tglx

--- a/kernel/power/disk.c   2007-09-11 09:25:24.0 +0200
+++ b/kernel/power/disk.c   2007-09-20 22:47:30.0 +0200
@@ -130,10 +130,14 @@ int hibernation_snapshot(int platform_mo
 {
int error;
 
+   error = disable_nonboot_cpus();
+   if (error)
+   goto resume_cpus;
+
/* Free memory before shutting down devices. */
error = swsusp_shrink_memory();
if (error)
-   return error;
+   goto resume_cpus;
 
suspend_console();
error = device_suspend(PMSG_FREEZE);
@@ -144,23 +148,22 @@ int hibernation_snapshot(int platform_mo
if (error)
goto Resume_devices;
 
-   error = disable_nonboot_cpus();
-   if (!error) {
-   if (hibernation_mode != HIBERNATION_TEST) {
-   in_suspend = 1;
-   error = swsusp_suspend();
-   /* Control returns here after successful restore */
-   } else {
-   printk("swsusp debug: Waiting for 5 seconds.\n");
-   mdelay(5000);
-   }
+   if (hibernation_mode != HIBERNATION_TEST) {
+   in_suspend = 1;
+   error = swsusp_suspend();
+   /* Control returns here after successful restore */
+   } else {
+   printk("swsusp debug: Waiting for 5 seconds.\n");
+   mdelay(5000);
}
-   enable_nonboot_cpus();
+
  Resume_devices:
platform_finish(platform_mode);
device_resume();
  Resume_console:
resume_console();
+resume_cpus:
+   enable_nonboot_cpus();
return error;
 }
 





Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
Rafael,

On Thu, 2007-09-20 at 23:45 +0200, Rafael J. Wysocki wrote:
> > We disable everything in device_suspend()
>
> No, we don't.  sysdevs are _not_ suspended in device_suspend().
> They are suspended in device_power_down(), which is called
> _after_ disable_nonboot_cpus() (from swsusp_suspend()).
>
> > including timekeeping,
>
> No, the timekeeping is suspended in device_power_down() (or at least it should
> be).

Damn, you are right. Reading through 30 different logs confused me.

> > enable_nonboot_cpus();
>
> Actually, we can't do this here, because of ACPI and some interrupt handling
> related problems.  Unfortunately, platform_finish() needs to go _after_
> enable_nonboot_cpus() and device_resume() needs to go after platform_finish().
> Analogously, disable_nonboot_cpus() has to go after platform_prepare().
>
> Otherwise, some systems will break.

Well, I don't buy this one. The system would break in the same way, when
I take CPU#1 offline before I initiate the suspend.

> > and non-surprisingly the "my VAIO needs help from keyboard" problem went
> > away immediately. See patch below. (on top of rc7-hrt1, -mm1 does not
> > work at all on my VAIO due to some not yet identified wreckage)
>
> Hm, I really don't know why it helps, but that's not because of the timekeeping
> suspend, IMO.

It is related. We rely on some subtle thing which is not up when we
resume the non boot cpu.

> > I did not yet look into the suspend to ram code, but I guess that there
> > is an equivalent problem.
>
> Yes, the code ordering is the same, but it's not totally wrong, IMHO.
>
> > But I have no idea why this affects Andrew's jinxed VAIO (UP machine),
> > though I suspect that we have more timekeeping/timer depending code
> > somewhere waiting to bite us.
>
> That's possible.
>
> > Also I still need to debug why the HIBERNATION_TEST code path (which has
> > a msleep(5000) in it) does not fail,
>
> See above. :-)

Yes. It makes sense. When I change the TEST code path to:

-   printk("swsusp debug: Waiting for 5 seconds.\n");
-   msleep(5000);
+   printk("swsusp debug: before swsusp_suspend\n");
+   error = swsusp_suspend();

then I have the same effect as I get from real hibernation. And we
actually shut down time keeping somewhere in that code path.

ACPI: PCI interrupt for device 0000:00:1b.0 disabled
swsusp debug: before swsusp_suspend
Suspend timekeeping
swsusp: critical section: 
swsusp: Need to copy 112429 pages
swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
swsusp: critical section: done (112429 pages copied)
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Resume timekeeping
ACPI: PCI Interrupt 0000:00:02.0[A] -> GSI 16 (level, low) -> IRQ 16
-> works fine

This is with my patch applied. Without that I get:

CPU1 is down
swsusp debug: before swsusp_suspend
Suspend timekeeping
swsusp: critical section: 
swsusp: Need to copy 112429 pages
swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
swsusp: critical section: done (112429 pages copied)
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Resume timekeeping
Enabling non-boot CPUs
--> Waits forever until a key is pressed

Thanks,

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
Rafael,

On Thu, 2007-09-20 at 23:54 +0200, Rafael J. Wysocki wrote:
> > Hmm. This is close to the ordering we have in STR too.
> >
> > I have some dim memory of there being some ACPI reason why it had to be
> > done that way.
>
> Yes.  We're executing _INI from the CPU initialization code and that shouldn't
> be done after _WAK, which is called from platform_finish().

If I tear down CPU#1 right before I tell the kernel to hibernate, then
the box must explode in the same way. It does not. On none of 4 tested
laptops. 

Of course only the jinxed VAIO one exposes the "please press a key"
problem.

I need to follow down the swsusp_suspend() code path to figure out, why
this breaks the box.

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Thomas Gleixner
Linus,

On Thu, 2007-09-20 at 14:55 -0700, Linus Torvalds wrote:
> And I think that's a damn reasonable thing to agree on: timers (and
> anything else that CPU shutdown/bringup could *possibly* care about)
> should be considered core enough that they had better be on the
> suspend_late/resume_early list.
>
> Thomas, Rafael, can you verify that at least STR is ok in this respect?

-ETOOTIRED led me to a wrong conclusion, but it is still a valuable
hint that this change is making things work again. I need to go down
into the details of the swsusp_suspend() code path to figure out what
the root cause is.

Sorry for the noise, but I'm zooming in.

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner
On Thu, 2007-09-20 at 19:35 -0400, Len Brown wrote:
   (Btw, the above commit message points to just my response with a testing 
   patch to the real email: the actual explanation of the INSANE ordering is 
   from Len Brown in
   
 
   https://lists.linux-foundation.org/pipermail/linux-pm/2006-November/004161.html
   
   and there Len claims that we *must* wake up CPU's early).
  
  ..and points to commit 1a38416cea8ac801ae8f261074721f35317613dc which in 
  turn talks about http://bugzilla.kernel.org/show_bug.cgi?id=5651 
  
  Howerver, it seems that bugzilla entry may just be bogus. It talks about 
  it appears that some firmware in the future may depend on that sequence 
  for correction operation
  
  Len, Shaohua, what are the real issues here? 
 
 Intel's reference BIOS for Core Duo performs some re-initialization
 in _WAK that will get blow away if INIT follows _WAK.
 IIR, it is related to re-initializing the thermal sensors.
 I opened bug 5651 when the BIOS team informed me of this issue.
 
 Yes, bringing a processor offline and then online again w/o
 an intervening suspend or reset would not evaluate _WAK,
 and thus may still run into the issue.

If this is true, then we should disable the sys//cpu/online entry
right away.

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner
On Fri, 2007-09-21 at 14:51 +1000, Paul Mackerras wrote:
 Linus Torvalds writes:
 
  It would indeed be nice if we could just take CPU's down early (while 
  everything is working), and run the whole suspend code with just one CPU, 
  rather than having to worry about the ordering between CPU and device 
  takedown.
 
 That is certainly what we want to do on powerpc.

I would have expected that we do it exactly this way, and it took me by
surprise that we do not.

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner
Rafael,

On Fri, 2007-09-21 at 00:30 +0200, Rafael J. Wysocki wrote:
  -ETOOTIRED led me too a wrong conclusion, but still it is a valuable
  hint that this change is making things work again.
 
 Yes, it is.
 
  I need to go down into the details of the swsusp_suspend() code path to
  figure out, what's the root cause. 
 
 If you need any help from me with that, please let me know.

I'm zooming in. It seems that the ACPI idle code plays tricks with us.
After debugging the swsusp_suspend() code path I figured out that we
end up in C2 or deeper power states while we run the suspend code. The
same happens when we come back on resume. It looks like we disable stuff
in the ACPI BIOS, which makes the C2 and deeper power states misbehave.
I hacked the idle loop arch code to use halt() right before we call
device_suspend() and switch back to the acpi idle code right after
device_resume(). This solves the problem as well.
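
The idle-routine switch described above can be illustrated with a small
user-space sketch. It only models the idea of swapping the idle handler
around the suspend window; every name in it (idle_fn_t, halt_idle,
acpi_deep_idle, idle_force_halt, idle_restore) is made up for
illustration and is not the hack that was actually tested.

```c
#include <stddef.h>

/* Hypothetical stand-ins: none of these names come from the kernel. */
typedef void (*idle_fn_t)(void);

static int last_state_entered;	/* records which C-state idle used last */

static void halt_idle(void)      { last_state_entered = 1; } /* C1: plain halt */
static void acpi_deep_idle(void) { last_state_entered = 3; } /* C2/C3 via ACPI */

static idle_fn_t current_idle = acpi_deep_idle;
static idle_fn_t saved_idle;

/* Would be called right before the device suspend step. */
static void idle_force_halt(void)
{
	saved_idle = current_idle;
	current_idle = halt_idle;
}

/* Would be called right after the device resume step. */
static void idle_restore(void)
{
	if (saved_idle) {
		current_idle = saved_idle;
		saved_idle = NULL;
	}
}

/* Run one idle iteration and report the C-state that was used. */
static int run_idle_once(void)
{
	current_idle();
	return last_state_entered;
}
```

Between idle_force_halt() and idle_restore() every idle iteration stays
in the halt variant, which is the behaviour the hack relies on.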

Len, any opinion on this one ?

tglx




Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner
Rafael,

On Fri, 2007-09-21 at 16:20 +0200, Rafael J. Wysocki wrote:
   If you need any help from me with that, please let me know.
  
  I'm zooming in. It seems, that the ACPI idle code plays tricks with us.
  After debugging the swsusp_suspend() code path I figured out, that we
  end up in C2 or deeper power states while we run the suspend code. The
  same happens when we come back on resume. It looks like we disable stuff
  in the ACPI BIOS, which makes the C2 and deeper power states misbehave.
 
 Hm, can you please run the test I've suggested in another branch of the
 thread, ie.
 
 # echo shutdown > /sys/power/disk
 # echo disk > /sys/power/state
 
 without your debugging code in disk.c?
 
 This makes the hibernation code omit the major ACPI hooks, so if it works,
 we'll know that these hooks are responsible for the problem.

Yes, this works fine. We still go into C3, but this no longer seems to
brick the box.

  I hacked the idle loop arch code to use halt() right before we call
  device_suspend() and switch back to the acpi idle code right after
  device_resume(). This solves the problem as well.
 
 Well, that seems less intrusive than changing the code ordering right before
 the major kernel release, but I think we should do our best to understand what
 _exactly_ is happening here.

I found another subtle thinko in the clock events code while I was
heading down the swsusp_suspend code path. I am waiting for confirmation
that it does not brick some endangered boxen, though. Even with this
change in the clock events code, my VAIO goes into C2 or C3 and the box
waits for a helping keystroke.

The correct solution would be for the ACPI code to ignore the lower
C-states during suspend / resume. I simply rmmod'ed the processor module
before suspend and the problem is solved as well. The cpuidle patches
make this problem more prominent because they can switch more directly
into lower power states when we wait for a long time on something.

I think we really should not fiddle with the various cpu states during
the critical parts of suspend / resume. Let's keep it simple. We have
the same policy during boot and I think the suspend / resume critical
parts have similar constraints.

tglx








Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner
Rafael,

On Fri, 2007-09-21 at 21:20 +0200, Rafael J. Wysocki wrote:
 On Friday, 21 September 2007 18:27, Thomas Gleixner wrote:
  I simply rmmod'ed the processor module before suspend and the problem is
  solved as well. The cpuidle patches make this problem more prominent due
  to the possible more direct switch into lower power states, when we wait for
  a long time on something. 
 
 So, perhaps we can add a .suspend()/.resume() routines to the processor driver
 and use them to disable/enable the cpuidle functionality during a
 suspend/resume?

http://tglx.de/private/tglx/p.diff

untested yet, but I'm on the way to do that :)

tglx




Re: clockevents: fix resume logic

2007-09-22 Thread Thomas Gleixner
On Mon, 2007-09-17 at 18:37 +, Pavel Machek wrote:
  That's a bit tricky because hitting the keyboard is what unsticks things. 
  And the video is black after resume-from-RAM (has always been thus) and we
 
 Ok, can we try to fix the video issue for you? That should make the
 development easier... I assume you tried s2ram from suspend.sf.net,
 and no combination of switches helped?

I have the same issue. Blank screen after suspend to ram. Hibernate
works.

Do you have a debug patch or something ?

tglx




Re: RFC: A revised timerfd API

2007-09-22 Thread Thomas Gleixner
On Sat, 2007-09-22 at 18:07 +0200, Michael Kerrisk wrote:
 Hello Bernd,
 
 Please don't trim the CC list when replying!  I nearly did not see
 your reply, and others will have missed it also.

Yup.

 On 9/22/07, Bernd Eckenfels [EMAIL PROTECTED] wrote:
  In article [EMAIL PROTECTED] you wrote:
1. This design stretches the POSIX timers API in strange
   ways.
 
  Maybe it is possible to reimplement the POSIX API in usermode using the
  kernel's FD implementation?

Yikes.

 It's a clever idea...  Without thinking on it too long, I'm not sure
 whether or not there might be some details which would make this
 difficult.

You'd need to be quite masochistic to start such a project. The POSIX
timer API consists mostly of corner cases and I doubt that you would get
them even halfway under control in a pure user space implementation.

It would be a rather huge performance penalty as well. You need at least
two user space context switches to get even the simplest cases resolved.

  (and drop the posix support from kernel)
 
 However we couldn't drop POSIX support from the kernel, because that
 would break the ABI.

True. So there is no point in reinventing the wheel.

tglx




Re: RFC: A revised timerfd API

2007-09-22 Thread Thomas Gleixner
Michael,

On Sat, 2007-09-22 at 15:12 +0200, Michael Kerrisk wrote:
 Davide, Andrew, Linus, et al.
 
 At the start of this thread
 (http://thread.gmane.org/gmane.linux.kernel/581115 ), I proposed 4
 alternatives to Davide's original timerfd API.  Based on the feedback in
 that thread (and one or two earlier comments):
 
 Let's dismiss option (a), since it is an unlovely multiplexing interface.
 
 Option (b) seems a viable.  The most notable concern was from Thomas
 Gleixner, that we might end up duplicating code from the POSIX timers API
 within the timerfd API -- some eventual refactoring might mitigate this
 problem.

It should be possible to use the timerfd syscalls as wrappers for the
posix timer implementation and add the discussed SIGEV_TIMERFD only
internally in the kernel, to signal the new delivery mechanism to the
posix timer code.

tglx






[PATCH] usb-gadget-ether: Prevent oops caused by error interrupt race -V2 (comments update)

2007-09-22 Thread Thomas Gleixner
From: Benedikt Spranger [EMAIL PROTECTED]
 
eth_start_xmit() can race against a disconnect interrupt in the gadget
device driver, which nukes all pending requests. Right now we access the
pending request list unconditionally and dereference the request list
head itself in such a case, which results in an Oops.

Check whether the list is empty before actually dereferencing
dev->tx_reqs.next. Also add a comment for the second list_empty check
further down to avoid confusion.
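
Why the unconditional access is dangerous can be shown with a minimal
user-space sketch. It re-implements a tiny circular list in the style of
the kernel's list_head; struct request and take_request() are
hypothetical stand-ins for illustration, not the driver's code. On an
empty list, head->next points back at the head itself, so container_of()
would produce a pointer to a request that does not exist.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = e->prev = e;
}

/* Hypothetical stand-in for struct usb_request. */
struct request { int id; struct list_head list; };

/* Take the first request off the list, or return NULL if there is none. */
static struct request *take_request(struct list_head *reqs)
{
	struct request *req;

	if (list_empty(reqs))	/* the check this patch adds */
		return NULL;
	req = container_of(reqs->next, struct request, list);
	list_del(&req->list);
	return req;
}
```

Without the list_empty() test, the container_of() call would hand back
an address computed from the list head, which is what leads to the Oops.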

Long standing bug. Patch should be applied to stable as well.

Signed-off-by: Benedikt Spranger [EMAIL PROTECTED]
Signed-off-by: Thomas Gleixner [EMAIL PROTECTED]

diff --git a/drivers/usb/gadget/ether.c b/drivers/usb/gadget/ether.c
index 593e235..f2a7bd5 100644
--- a/drivers/usb/gadget/ether.c
+++ b/drivers/usb/gadget/ether.c
@@ -1989,8 +1989,21 @@ static int eth_start_xmit (struct sk_buff *skb, struct net_device *net)
}
 
spin_lock_irqsave(&dev->req_lock, flags);
+   /*
+* dev->tx_reqs may be empty. We raced against a disconnect
+* interrupt in the gadget device driver, which nuked all
+* pending requests.
+*/
+   if (list_empty(&dev->tx_reqs)) {
+   netif_stop_queue(net);
+   spin_unlock_irqrestore(&dev->req_lock, flags);
+   return 1;
+   }
+
req = container_of (dev->tx_reqs.next, struct usb_request, list);
list_del (&req->list);
+
+   /* last request in list: stop queue */
if (list_empty (&dev->tx_reqs))
netif_stop_queue (net);
spin_unlock_irqrestore(&dev->req_lock, flags);




Re: [PATCH] [4/50] x86: add cpu codenames for Kconfig.cpu

2007-09-22 Thread Thomas Gleixner
On Sat, 2007-09-22 at 00:32 +0200, Andi Kleen wrote:
 From: Oliver Pinter [EMAIL PROTECTED]
 
 add cpu core name for arch/i386/Kconfig.cpu:Pentium 4 sections help
 add Pentium D for arch/i386/Kconfig.cpu
 add Pentium D for arch/x86_64/Kconfig
 
 Signed-off-by: Oliver Pinter [EMAIL PROTECTED]
 Signed-off-by: Andi Kleen [EMAIL PROTECTED]
 Acked-by: Sam Ravnborg [EMAIL PROTECTED]
 Cc: Andi Kleen [EMAIL PROTECTED]
 Signed-off-by: Andrew Morton [EMAIL PROTECTED]
 ---
 
  arch/i386/Kconfig.cpu |   34 +++---
  arch/x86_64/Kconfig   |6 +++---
  2 files changed, 34 insertions(+), 6 deletions(-)
 
 Index: linux/arch/i386/Kconfig.cpu
 ===
 --- linux.orig/arch/i386/Kconfig.cpu
 +++ linux/arch/i386/Kconfig.cpu
 @@ -115,11 +115,39 @@ config MPENTIUM4
   bool "Pentium-4/Celeron(P4-based)/Pentium-4 M/older Xeon"
   help
 Select this for Intel Pentium 4 chips.  This includes the
 -   Pentium 4, P4-based Celeron and Xeon, and Pentium-4 M
 -   (not Pentium M) chips.  This option enables compile flags
 -   optimized for the chip, uses the correct cache shift, and
 +   Pentium 4, Pentium D, P4-based Celeron and Xeon, and
 +   Pentium-4 M (not Pentium M) chips.  This option enables compile
 +   flags optimized for the chip, uses the correct cache shift, and
 applies any applicable Pentium III optimizations.
  
 +   CPUIDs: F[0-6][1-A] (in /proc/cpuinfo show = cpu family : 15 )
 +
 +   Select this for:
 + Pentiums (Pentium 4, Pentium D, Celeron, Celeron D) corename:
 + -Willamette
 + -Northwood
 + -Mobile Pentium 4
 + -Mobile Pentium 4 M
 + -Extreme Edition (Gallatin)
 + -Prescott
 + -Prescott 2M
 + -Cedar Mill
 + -Presler
 + -Smithfiled
 + Xeons (Intel Xeon, Xeon MP, Xeon LV, Xeon MV) corename:
 + -Foster
 + -Prestonia
 + -Gallatin
 + -Nocona
 + -Irwindale
 + -Cranford
 + -Potomac
 + -Paxville
 + -Dempsey
 +
 +   more info: http://balusc.xs4all.nl/srv/har-cpu.html

This will never be up to date. Also the URL above is redirected to an
empty bye/bye page. Put this on one of the kernel related wikis if
you think it might be useful at all. 99% of the users do not even know
which CPU they have in their system.

tglx






Re: [PATCH] [9/50] i386: validate against ACPI motherboard resources

2007-09-22 Thread Thomas Gleixner
On Sat, 2007-09-22 at 10:28 -0600, Robert Hancock wrote:
 Yinghai Lu wrote:
  No!
  
  MMCONFIG will not work with acpi=off any more.
 
 I don't think this is unreasonable. The ACPI MCFG table is how we are 
 supposed to learn about the area in the first place. If we can't get the 
 table location via an approved mechanism, and can't validate it doesn't 
 overlap with another memory reservation or something, I really don't 
 think we should be using it.

We all know how correct ACPI tables are. Specifications are nice,
reality tells a different story.

 I don't think it's much of an issue anyway - the chances that somebody 
 will want to run without ACPI on a system with MCFG are pretty low given 
 that you'll end up losing a bunch of functionality (not least of which 
 is multi-cores).

acpi=off is an often used debug switch and it _is_ quite useful. Taking
away debug functionality is not a good idea.

tglx




Re: [PATCH] [19/50] Experimental: detect if SVM is disabled by BIOS

2007-09-22 Thread Thomas Gleixner
On Sat, 2007-09-22 at 00:32 +0200, Andi Kleen wrote:
 Also allow to set svm lock.

Please use two separate patches. The detection and cpuinfo display is
not related to setting the svm lock.

 TBD double check, documentation, i386 support

Yes, documentation would be useful. See below.

 Signed-off-by: Andi Kleen [EMAIL PROTECTED]
 
 ---
  arch/x86_64/kernel/setup.c|   25 +++--
  include/asm-i386/cpufeature.h |1 +
  include/asm-i386/msr-index.h  |3 +++
  3 files changed, 27 insertions(+), 2 deletions(-)
 
 Index: linux/arch/x86_64/kernel/setup.c
 ===
 --- linux.orig/arch/x86_64/kernel/setup.c
 +++ linux/arch/x86_64/kernel/setup.c
 @@ -565,7 +565,7 @@ static void __cpuinit early_init_amd(str
  
  static void __cpuinit init_amd(struct cpuinfo_x86 *c)
  {
 - unsigned level;
 + unsigned level, flags, dummy;
  
  #ifdef CONFIG_SMP
   unsigned long value;
 @@ -634,7 +634,28 @@ static void __cpuinit init_amd(struct cp
   /* Family 10 doesn't support C states in MWAIT so don't use it */
   if (c->x86 == 0x10 && !force_mwait)
   clear_bit(X86_FEATURE_MWAIT, &c->x86_capability);
 +
 + if (c->x86 >= 0xf && c->x86 <= 0x11 &&
 + !rdmsr_safe(MSR_VM_CR, &flags, &dummy) &&
 + (flags & 0x18))
 + set_bit(X86_FEATURE_VIRT_DISABLED, &c->x86_capability);

Why the check for 0x18? And please can we use understandable
constants for this.

bit 3 (SVM_LOCK) controls only the writeability of bit 4 (SVME_DISABLE),
which controls whether SVM is allowed to be enabled or not. 

bit 3   bit 4   meaning
  0       0     SVM can be enabled in EFER, SVME_DISABLE is writeable
  1       0     SVM can be enabled in EFER, SVME_DISABLE is not writeable
  0       1     SVM can not be enabled in EFER, SVME_DISABLE is writeable
  1       1     SVM can not be enabled in EFER, SVME_DISABLE is not writeable

So SVM is disabled, when bit 4 is set.
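
As a rough sketch of what named constants could look like, the two bits
can be tested separately. The macro and function names below are
hypothetical and chosen only for illustration; just the bit positions
(3 and 4) come from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the two VM_CR bits discussed above. */
#define VM_CR_SVM_LOCK		(1u << 3) /* bit 3: SVME_DISABLE is not writeable */
#define VM_CR_SVME_DISABLE	(1u << 4) /* bit 4: SVM can not be enabled in EFER */

/* SVM is unusable only when the disable bit itself is set. */
static bool svm_disabled(uint32_t vm_cr_low)
{
	return (vm_cr_low & VM_CR_SVME_DISABLE) != 0;
}

/* The lock bit only says whether SVME_DISABLE can still be changed. */
static bool svm_disable_locked(uint32_t vm_cr_low)
{
	return (vm_cr_low & VM_CR_SVM_LOCK) != 0;
}
```

With the mask 0x18, a value of 0x08 (lock set, disable clear) would be
reported as "disabled" although SVM can still be enabled; testing bit 4
alone avoids that.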

 +}
 +
 +static int enable_svm_lock(char *s)
 +{
 + if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD &&
 + boot_cpu_data.x86 >= 0xf && boot_cpu_data.x86 <= 0x11) {
 + unsigned a,b;
 + if (rdmsr_safe(MSR_VM_CR, &a, &b))
 + return 0;
 + a |= (1 << 3);  /* set SVM lock */

SVM_LOCK is read only according to the data sheet. You can set bit 4
(SVME_DISABLE) to prevent KVM or whatever else from using that feature.

tglx






Re: [PATCH] [20/50] x86_64: Fix some broken white space in arch/x86_64/mm/init.c

2007-09-22 Thread Thomas Gleixner

On Sat, 2007-09-22 at 00:32 +0200, Andi Kleen wrote:
 No functional changes
 Signed-off-by: Andi Kleen [EMAIL PROTECTED]

Can we please fix _ALL_ white space and coding style issues in this file
while we are at it?

Updated patch below.

tglx

diff --git a/arch/x86_64/mm/init.c b/arch/x86_64/mm/init.c
index 458893b..346c962 100644
--- a/arch/x86_64/mm/init.c
+++ b/arch/x86_64/mm/init.c
@@ -70,10 +70,11 @@ void show_mem(void)
 
printk(KERN_INFO "Mem-info:\n");
show_free_areas();
-   printk(KERN_INFO "Free swap:   %6ldkB\n", nr_swap_pages<<(PAGE_SHIFT-10));
+   printk(KERN_INFO "Free swap:   %6ldkB\n",
+  nr_swap_pages<<(PAGE_SHIFT-10));
 
for_each_online_pgdat(pgdat) {
-   for (i = 0; i < pgdat->node_spanned_pages; ++i) {
+   for (i = 0; i < pgdat->node_spanned_pages; ++i) {
/* this loop can take a while with 256 GB and 4k pages
   so update the NMI watchdog */
if (unlikely(i % MAX_ORDER_NR_PAGES == 0)) {
@@ -89,7 +90,7 @@ void show_mem(void)
cached++;
else if (page_count(page))
shared += page_count(page) - 1;
-   }
+   }
}
printk(KERN_INFO "%lu pages of RAM\n", total);
printk(KERN_INFO "%lu reserved pages\n",reserved);
@@ -100,21 +101,22 @@ void show_mem(void)
 int after_bootmem;
 
 static __init void *spp_getpage(void)
-{ 
+{
void *ptr;
if (after_bootmem)
-   ptr = (void *) get_zeroed_page(GFP_ATOMIC); 
+   ptr = (void *) get_zeroed_page(GFP_ATOMIC);
else
ptr = alloc_bootmem_pages(PAGE_SIZE);
if (!ptr || ((unsigned long)ptr & ~PAGE_MASK))
-   panic("set_pte_phys: cannot allocate page data %s\n", after_bootmem?"after bootmem":"");
+   panic("set_pte_phys: cannot allocate page data %s\n",
+ after_bootmem?"after bootmem":"");
 
Dprintk("spp_getpage %p\n", ptr);
return ptr;
-} 
+}
 
 static __init void set_pte_phys(unsigned long vaddr,
-unsigned long phys, pgprot_t prot)
+   unsigned long phys, pgprot_t prot)
 {
pgd_t *pgd;
pud_t *pud;
@@ -130,10 +132,11 @@ static __init void set_pte_phys(unsigned long vaddr,
}
pud = pud_offset(pgd, vaddr);
if (pud_none(*pud)) {
-   pmd = (pmd_t *) spp_getpage(); 
+   pmd = (pmd_t *) spp_getpage();
set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE | _PAGE_USER));
if (pmd != pmd_offset(pud, 0)) {
-   printk("PAGETABLE BUG #01! %p <-> %p\n", pmd, pmd_offset(pud,0));
+   printk("PAGETABLE BUG #01! %p <-> %p\n", pmd,
+  pmd_offset(pud,0));
return;
}
}
@@ -162,7 +165,7 @@ static __init void set_pte_phys(unsigned long vaddr,
 }
 
 /* NOTE: this is meant to be run only at boot */
-void __init 
+void __init
 __set_fixmap (enum fixed_addresses idx, unsigned long phys, pgprot_t prot)
 {
unsigned long address = __fix_to_virt(idx);
@@ -177,7 +180,7 @@ __set_fixmap (enum fixed_addresses idx, unsigned long phys, pgprot_t prot)
 unsigned long __meminitdata table_start, table_end;
 
 static __meminit void *alloc_low_page(unsigned long *phys)
-{ 
+{
unsigned long pfn = table_end++;
void *adr;
 
@@ -187,8 +190,8 @@ static __meminit void *alloc_low_page(unsigned long *phys)
return adr;
}
 
-   if (pfn >= end_pfn) 
-   panic("alloc_low_page: ran out of memory"); 
+   if (pfn >= end_pfn)
+   panic("alloc_low_page: ran out of memory");
 
adr = early_ioremap(pfn * PAGE_SIZE, PAGE_SIZE);
memset(adr, 0, PAGE_SIZE);
@@ -197,13 +200,13 @@ static __meminit void *alloc_low_page(unsigned long *phys)
 }
 
 static __meminit void unmap_low_page(void *adr)
-{ 
+{
 
if (after_bootmem)
return;
 
early_iounmap(adr, PAGE_SIZE);
-} 
+}
 
 /* Must run before zap_low_mappings */
 __meminit void *early_ioremap(unsigned long addr, unsigned long size)
@@ -224,7 +227,8 @@ __meminit void *early_ioremap(unsigned long addr, unsigned long size)
vaddr += addr & ~PMD_MASK;
addr &= PMD_MASK;
for (i = 0; i < pmds; i++, addr += PMD_SIZE)
-   set_pmd(pmd + i,__pmd(addr | _KERNPG_TABLE | _PAGE_PSE));
+   set_pmd(pmd + i,
+   __pmd(addr | _KERNPG_TABLE | _PAGE_PSE));
__flush_tlb();
return (void *)vaddr;
next:
@@ -284,8 +288,9 @@ phys_pmd_update(pud_t *pud, unsigned long address, unsigned long end)
__flush_tlb_all();
 }
 
-static void __meminit phys_pud_init(pud_t *pud_page, unsigned long addr, unsigned long end)
-{ 

Re: [PATCH] [31/50] x86_64: honor notify_die() returning NOTIFY_STOP

2007-09-22 Thread Thomas Gleixner
On Sat, 2007-09-22 at 00:32 +0200, Andi Kleen wrote:
 - notify_die(DIE_OOPS, str, regs, err, current->thread.trap_no, SIGSEGV);
 + if (notify_die(DIE_OOPS, str, regs, err, current->thread.trap_no, SIGSEGV) == NOTIFY_STOP)

80 chars please.

tglx




Re: [PATCH] usb-gadget-ether: Prevent oops caused by error interrupt race -V2 (comments update)

2007-09-22 Thread Thomas Gleixner

On Sat, 2007-09-22 at 12:18 -0700, David Brownell wrote:
 I think you misread my comment.  Those requests are **NOT** pending!!
 So this update has a *MORE* incorrect description of the issue. 
 
 That's just the freelist ... it's a fairly conventional model whereby
 there's a pool of free request slots which can be issued.  When the
 pool empties, the TX queue shuts down until one of the requests which
 is pending in the hardware completes, and makes a slot free.
 
 The problem you're addressing is that there's a small window where a
 disconnect IRQ can shut down the TX queue (and empty that freelist)
 after upper layers in the network stack started a transmission on
 an active (pre-disconnect) TX queue.
 
 That problem is *NOT* related to any pending requests at all!!

Sorry, I misunderstood your comment. Can you please add the correct
comment yourself before we play some more rounds of ping pong ? 

tglx




Re: [PATCH] [35/50] i386: Do cpuid_device_create() in CPU_UP_PREPARE instead of CPU_ONLINE.

2007-09-22 Thread Thomas Gleixner
On Sat, 2007-09-22 at 00:32 +0200, Andi Kleen wrote:
 From: Akinobu Mita [EMAIL PROTECTED]
 
 Do cpuid_device_create() in CPU_UP_PREPARE instead of CPU_ONLINE.
 
 Cc: H. Peter Anvin [EMAIL PROTECTED]
 Signed-off-by: Akinobu Mita [EMAIL PROTECTED]
 Signed-off-by: Andi Kleen [EMAIL PROTECTED]
 Cc: Gautham R Shenoy [EMAIL PROTECTED]
 Cc: Oleg Nesterov [EMAIL PROTECTED]
 Signed-off-by: Andrew Morton [EMAIL PROTECTED]
 ---
 
  arch/i386/kernel/cpuid.c |   32 +++-
  1 file changed, 19 insertions(+), 13 deletions(-)
 
 Index: linux/arch/i386/kernel/cpuid.c
 ===
 --- linux.orig/arch/i386/kernel/cpuid.c
 +++ linux/arch/i386/kernel/cpuid.c
 @@ -136,15 +136,18 @@ static const struct file_operations cpui
   .open = cpuid_open,
  };
  
 -static int __cpuinit cpuid_device_create(int i)
 +static int cpuid_device_create(int cpu)

__cpuinit please

Thanks,

tglx





Re: [PATCH] usb-gadget-ether: Prevent oops caused by error interrupt race -V2 (comments update)

2007-09-22 Thread Thomas Gleixner

On Sat, 2007-09-22 at 13:14 -0700, David Brownell wrote:
 How's this?  Note that the queue should already have been stopped,
 so I removed what should be an extra call (as well as fixing the
 comments).

Yeah, stopping the queue should not be necessary.

 - Dave
 
 
 From: Thomas Gleixner [EMAIL PROTECTED]

Please change to:

From: Benedikt Spranger [EMAIL PROTECTED]

He did all the grunt work of figuring out what's going wrong. I was just
the messenger.

 This patch fixes a longstanding race in the Ethernet gadget driver,
 which can cause an oops on device disconnect.  The fix is just to
 make the TX path check whether its freelist is empty.  That check
 is otherwise not necessary, since the queue is always stopped when
 that list empties (and restarted when request completion puts an
 entry back on that freelist).

Sigh. I need a really deep look inside that code to understand why
tx_reqs is not a request list but a freelist. Very intuitive naming :)

 The race window starts when the network code decides to transmit a
 packet, and ends when hard_start_xmit() grabs the freelist lock.
 If disconnect() is called inside that window, it shuts down the
 TX queue and breaks the otherwise-solid assumption that packets are
 never sent when the TX queue is stopped.

Please add our Signed-off-by lines as well:

Signed-off-by: Benedikt Spranger [EMAIL PROTECTED]
Signed-off-by: Thomas Gleixner [EMAIL PROTECTED]

 Signed-off-by: David Brownell [EMAIL PROTECTED]

Thanks,
tglx


 --- a/drivers/usb/gadget/ether.c
 +++ b/drivers/usb/gadget/ether.c
 @@ -1989,8 +1989,20 @@ static int eth_start_xmit (struct sk_buff *skb, struct net_device *net)
   }
  
   spin_lock_irqsave(&dev->req_lock, flags);
 + /*
 +  * the freelist can be empty if an interrupt triggered disconnect()
 +  * and reconfigured the gadget (shutting down this queue) after the
 +  * network stack decided to xmit but before we got the spinlock.
 +  */
 + if (list_empty(&dev->tx_reqs)) {
 + spin_unlock_irqrestore(&dev->req_lock, flags);
 + return 1;
 + }
 +
   req = container_of (dev->tx_reqs.next, struct usb_request, list);
   list_del (&req->list);
 +
 + /* temporarily stop TX queue when the freelist empties */
   if (list_empty (&dev->tx_reqs))
   netif_stop_queue (net);
   spin_unlock_irqrestore(&dev->req_lock, flags);
 
 



Re: RFC: A revised timerfd API

2007-09-22 Thread Thomas Gleixner
On Sat, 2007-09-22 at 14:07 -0700, Davide Libenzi wrote:
 On Sat, 22 Sep 2007, Michael Kerrisk wrote:
 
  So I'm inclined to implement option (b), unless someone has strong
  objections.  Davide, could I persuade you to help?
 
 I guess I better do, otherwise you'll continue to stress me ;)
 
 int timerfd_create(int clockid);
 int timerfd_settime(int ufd, int flags,
 const struct itimerspec *utmr,
 struct itimerspec *otmr);
 int timerfd_gettime(int ufd, struct itimerspec *otmr);
 
 Patch below. Builds, not tested yet (you need to remove the broken 
 status from CONFIG_TIMERFD in case you want to test - and plug the new 
 syscall to arch/xxx).
 May that work for you?
 Thomas-san, hrtimer_try_to_cancel() does not touch ->expires and I assume
 it'll never do, granted?

Davide-san, I have no intention to change that, but remember there is
this file Documentation/stable_api_nonsense.txt :)

tglx





[patch 0/2] suspend/resume regression fixes

2007-09-22 Thread Thomas Gleixner
Sorry, it took me quite a while to realize the real root cause of the
VAIO - and probably many other machines - suspend/resume regressions,
which were unearthed by the dyntick / clockevents patches.

We disable a lot of ACPI/BIOS functionality during suspend, but we
keep the lower idle C-states functionality active across
suspend/resume. It seems that this causes trouble with certain BIOSes,
but I assume that the problem is more widespread and just not
surfacing due to the various scenarios in which a machine goes into
suspend/resume. I spent some quality time to figure out a set of debug
mechanisms, which did not influence the problem. So it is quite likely
that a lot of machines might be affected by this, but due to the
configuration, interrupt scenarios,  the problem just does
not show up. 

My final enlightenment came when I removed the ACPI processor module,
which controls the lower idle C-states, right before resume; this
worked fine all the time even without all the workaround hacks.

I really hope that these two patches finally put an end to the jinxed
VAIO heisenbug series, which started when we removed the periodic
tick with the clockevents/dyntick patches.

Venki, can you please add the analogous fix to the cpuidle patch set ?

Thanks,

tglx
-- 



[patch 1/2] ACPI: disable lower idle C-states across suspend/resume

2007-09-22 Thread Thomas Gleixner
device_suspend() calls ACPI suspend functions, which seem to have undesired
side effects on lower idle C-states. It took me some time to realize that
especially the VAIO BIOSes (both Andrew's jinxed UP and my elfstruck SMP one)
show this effect. I'm quite sure that other bug reports against suspend/resume
about turning the system into a brick have the same root cause.

After fishing in the dark for quite some time, I realized that removing the ACPI
processor module before suspend (this removes the lower C-state functionality)
made the problem disappear. Interestingly enough, the probability of having a
bricked box is influenced by various factors (interrupts, size of the ram image,
...). Even adding a bunch of printks in the wrong places made the problem go
away. The previous periodic tick implementation simply papered over the
problem, which explains why the dyntick / clockevents changes made this more
prominent.

We avoid complex functionality during the boot process and we have to do the
same during suspend/resume. It is a similar scenario and equally fragile.

Add suspend / resume functions to the ACPI processor code and disable the lower
idle C-states across suspend/resume. Fall back to the default idle
implementation (halt) instead.

Signed-off-by: Thomas Gleixner [EMAIL PROTECTED]
Tested-by: Andrew Morton [EMAIL PROTECTED]
Cc: Len Brown [EMAIL PROTECTED]
Cc: Venkatesh Pallipadi [EMAIL PROTECTED]
Cc: Rafael J. Wysocki [EMAIL PROTECTED]

---
 drivers/acpi/processor_core.c |    2 ++
 drivers/acpi/processor_idle.c |   19 ++++++++++++++++++-
 include/acpi/processor.h      |    2 ++
 3 files changed, 22 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/acpi/processor_core.c
===
--- linux-2.6.orig/drivers/acpi/processor_core.c	2007-09-23 00:01:00.0 +0200
+++ linux-2.6/drivers/acpi/processor_core.c	2007-09-23 00:01:00.0 +0200
@@ -102,6 +102,8 @@ static struct acpi_driver acpi_processor
 		.add = acpi_processor_add,
 		.remove = acpi_processor_remove,
 		.start = acpi_processor_start,
+		.suspend = acpi_processor_suspend,
+		.resume = acpi_processor_resume,
 		},
 };
 
Index: linux-2.6/drivers/acpi/processor_idle.c
===
--- linux-2.6.orig/drivers/acpi/processor_idle.c	2007-09-23 00:01:00.0 +0200
+++ linux-2.6/drivers/acpi/processor_idle.c	2007-09-23 00:01:00.0 +0200
@@ -325,6 +325,23 @@ static void acpi_state_timer_broadcast(s
 
 #endif
 
+/*
+ * Suspend / resume control
+ */
+static int acpi_idle_suspend;
+
+int acpi_processor_suspend(struct acpi_device * device, pm_message_t state)
+{
+	acpi_idle_suspend = 1;
+	return 0;
+}
+
+int acpi_processor_resume(struct acpi_device * device)
+{
+	acpi_idle_suspend = 0;
+	return 0;
+}
+
 static void acpi_processor_idle(void)
 {
 	struct acpi_processor *pr = NULL;
@@ -355,7 +372,7 @@ static void acpi_processor_idle(void)
 	}
 
 	cx = pr->power.state;
-	if (!cx) {
+	if (!cx || acpi_idle_suspend) {
 		if (pm_idle_save)
 			pm_idle_save();
 		else
Index: linux-2.6/include/acpi/processor.h
===
--- linux-2.6.orig/include/acpi/processor.h	2007-09-23 00:01:00.0 +0200
+++ linux-2.6/include/acpi/processor.h	2007-09-23 00:01:00.0 +0200
@@ -320,6 +320,8 @@ int acpi_processor_power_init(struct acp
 int acpi_processor_cst_has_changed(struct acpi_processor *pr);
 int acpi_processor_power_exit(struct acpi_processor *pr,
 			      struct acpi_device *device);
+int acpi_processor_suspend(struct acpi_device * device, pm_message_t state);
+int acpi_processor_resume(struct acpi_device * device);
 
 /* in processor_thermal.c */
 int acpi_processor_get_limit_info(struct acpi_processor *pr);
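
For readers who want the effect of the patch without the kernel context,
here is a minimal userspace sketch of the decision it adds to the idle
path. All sketch_* names are made up for illustration; only the condition
"no C-state info or suspend in progress means plain halt" is taken from
the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_idle { SKETCH_IDLE_HALT, SKETCH_IDLE_CSTATE };

/* Plays the role of acpi_idle_suspend in the patch. */
static bool sketch_idle_suspend;

/* Suspend callback: from now on only the default idle (halt) is used. */
static void sketch_suspend(void)
{
	sketch_idle_suspend = true;
}

/* Resume callback: lower idle C-states are allowed again. */
static void sketch_resume(void)
{
	/* In this model a resume without a preceding suspend is a bug. */
	assert(sketch_idle_suspend);
	sketch_idle_suspend = false;
}

/* cx == NULL stands for "no C-state information available". */
static enum sketch_idle sketch_pick_idle(const void *cx)
{
	if (!cx || sketch_idle_suspend)
		return SKETCH_IDLE_HALT;
	return SKETCH_IDLE_CSTATE;
}
```

The flag is deliberately global and unlocked, as in the patch: it is a
plain hint read by the idle loop, not a synchronization point.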

-- 



[patch 2/2] clockevents: remove the suspend/resume workaround^Wthinko

2007-09-22 Thread Thomas Gleixner
In a desperate attempt to fix the suspend/resume problem on Andrew's
VAIO I added a workaround which enforced the broadcast of the oneshot
timer on resume. This actually resolved the problem on the VAIO,
but was just a stupid workaround which did not tackle the root
cause: the assignment of lower idle C-states in the ACPI processor_idle
code. The cpuidle patches, which utilize the dynamic tick feature and
go faster into deeper C-states, exposed the problem again. The correct
solution is the previous patch, which prevents lower C-states across
suspend/resume.

Remove the enforcement code, including the conditional broadcast timer
arming, which helped to paper over the real problem for quite a time.
The oneshot broadcast flag for the CPU which runs the resume code can
never be set at the time when this code is executed. It only gets set
when the CPU is entering a lower idle C-state.

Signed-off-by: Thomas Gleixner [EMAIL PROTECTED]
Tested-by: Andrew Morton [EMAIL PROTECTED]
Cc: Len Brown [EMAIL PROTECTED]
Cc: Venkatesh Pallipadi [EMAIL PROTECTED]
Cc: Rafael J. Wysocki [EMAIL PROTECTED]

---
 kernel/time/tick-broadcast.c |   17 +----------------
 1 file changed, 1 insertion(+), 16 deletions(-)

Index: linux-2.6/kernel/time/tick-broadcast.c
===
--- linux-2.6.orig/kernel/time/tick-broadcast.c	2007-09-23 00:00:59.0 +0200
+++ linux-2.6/kernel/time/tick-broadcast.c	2007-09-23 00:01:00.0 +0200
@@ -382,23 +382,8 @@ static int tick_broadcast_set_event(ktim
 
 int tick_resume_broadcast_oneshot(struct clock_event_device *bc)
 {
-	int cpu = smp_processor_id();
-
-	/*
-	 * If the CPU is marked for broadcast, enforce oneshot
-	 * broadcast mode. The jinxed VAIO does not resume otherwise.
-	 * No idea why it ends up in a lower C State during resume
-	 * without notifying the clock events layer.
-	 */
-	if (cpu_isset(cpu, tick_broadcast_mask))
-		cpu_set(cpu, tick_broadcast_oneshot_mask);
-
 	clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT);
-
-	if(!cpus_empty(tick_broadcast_oneshot_mask))
-		tick_broadcast_set_event(ktime_get(), 1);
-
-	return cpu_isset(cpu, tick_broadcast_oneshot_mask);
+	return 0;
 }
 
 /*

-- 



Re: [patch 0/2] suspend/resume regression fixes

2007-09-22 Thread Thomas Gleixner
Linus,

On Sat, 2007-09-22 at 15:59 -0700, Linus Torvalds wrote:
  My final enlightenment came when I removed the ACPI processor module,
  which controls the lower idle C-states, right before resume; this
  worked fine all the time even without all the workaround hacks.
  
  I really hope that these two patches finally put an end to the jinxed
  VAIO heisenbug series, which started when we removed the periodic
  tick with the clockevents/dyntick patches.
 
 Ok, so the patches look fine, but I somehow have this slight feeling that
 you gave up a bit too soon on the "*why* does this happen?" question.

Yeah, I gave up at the point where I was no longer able to dig
deeper :)

 I realize that the answer is easily "because ACPI screwed up", but I'm
 wondering if there's something we do to trigger that screw-up.

Fair enough.

 In particular, I also suspect that this may not really fix the problem - 
 maybe it just makes the window sufficiently small that it no longer 
 triggers. Because we don't necessarily understand what the real background 
 for the problem is, I'm not sure we can say that it is solved.
 
 The reason I say this is that I have a suspicion on what triggers it.
 
 I suspect that the problem is that we do
 
 	pm_ops->prepare();
 	disable_nonboot_cpus()
 	suspend_enter();
 	enable_nonboot_cpus()
 	pm_finish()
 
 and here the big thing to notice is that pm_ops->prepare() call, which
 sets the wakeup vector etc etc.
 
 So maybe the real problem here is that once we've done the ->prepare()
 call and ACPI has set up various stuff, we MUST NOT do any calls to any
 ACPI routines to set low-power states, because the stupid firmware isn't
 expecting it.

That's what I suspect and deduced from the various experiments, including
a "force the cpu into a lower c-state" one, which triggered the problem
in a fully reproducible way. Note that in the "force a lower c-state" case
I verified that the PIT was activated to avoid the "local apic stops in c3"
issue. But I never got a PIT interrupt. Either the box was completely
stuck or I was able to recover by hitting a key, which is also one of
the unexplained phenomena.

 Now, if this is the cause, then I think your patch should indeed fix it, 
 since you get called by the early-suspend code (which happens *before* the 
 ->prepare() call), but at the same time, I wonder if maybe it would be
 slightly more correct to instead of using the suspend/resume callbacks, 
 simply do this in the acpi_pm_prepare() stage, since that is likely the 
 thing that triggers it?

Yeah, probably that's the correct point, but I leave this to the ACPI
wizards.

 But hey, I think I'll apply the patches as-is. I'd just feel even better 
 if we actually understood *why* doing the CPU Cx states is not something 
 we can do around the suspend code!

That needs some explanation from the folks who can actually look beyond
the ACPI/BIOS internals.

tglx




Re: [PATCH] usb-gadget-ether: Prevent oops caused by error interrupt race -V2 (comments update)

2007-09-22 Thread Thomas Gleixner
On Sat, 2007-09-22 at 13:53 -0700, David Brownell wrote:
  Sigh. I need a real deep look inside that code to understand, why
  tx_reqs is not a requestlist but a freelist. Very intuitive naming :)
 
 It *is* a list of requests:  free ones -- the only kind this level of
 driver is allowed to remember!  ;)
 
 Yeah, I had to go back and read the driver again before I understood
 just what problem this patch was trying to fix.  Which is why I wanted
 to make sure the mismatch between comments and contents was resolved.

Fair enough. Thanks for sanitizing the comments.

tglx




Re: [PATCH] [35/50] i386: Do cpuid_device_create() in CPU_UP_PREPARE instead of CPU_ONLINE.

2007-09-23 Thread Thomas Gleixner
On Sun, 2007-09-23 at 10:52 +0900, Akinobu Mita wrote:
arch/i386/kernel/cpuid.c |   32 +++-
1 file changed, 19 insertions(+), 13 deletions(-)
  
   Index: linux/arch/i386/kernel/cpuid.c
   ===
   --- linux.orig/arch/i386/kernel/cpuid.c
   +++ linux/arch/i386/kernel/cpuid.c
   @@ -136,15 +136,18 @@ static const struct file_operations cpui
 .open = cpuid_open,
};
  
   -static int __cpuinit cpuid_device_create(int i)
   +static int cpuid_device_create(int cpu)
 
  __cpuinit please
 
 
 Yes. This eliminates earlier patch in this series.
 ([22/50] i386: Misc cpuinit annotation)

No, it's even worse:

#22 is applied before #35. 
#35 is reverting the __cpuinit annotation of #22 with its modifications
of cpuid_device_create()

tglx




Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-23 Thread Thomas Gleixner
On Sun, 2007-09-23 at 12:57 +0200, Rafael J. Wysocki wrote:
 Hi Thomas,
 
 Unfortunately, my observation that the patch series:
 
 http://tglx.de/projects/hrtimers/2.6.23-rc4/patch-2.6.23-rc4-hrt1.patches.tar.bz2
 
 worked with 2.6.23-rc4 was wrong.  It _sometimes_ works, but usually doesn't
 boot, just like 2.6.23-rc4-mm1, 2.6.23-rc6-mm1 and everything in between with
 the above patch series applied.  I've also tried:
 
 http://tglx.de/projects/hrtimers/2.6.23-rc5/patch-2.6.23-rc5-hrt1.patches.tar.bz2
 http://tglx.de/projects/hrtimers/2.6.23-rc6/patch-2.6.23-rc6-hrt2.patch
 
 with the same result.
 
 The problematic patch is x86_64-convert-to-clockevents.patch .
 
 Since the boot fails very early, before any messages reach the (VGA) console,
 I have no idea what to do next, except for digging in the code.

Ok, let's track it down. Is there any difference when you add:

nohz=off
highres=off
noapictimer

or any combinations of the above to the kernel command line ?

tglx




Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-23 Thread Thomas Gleixner
On Sun, 2007-09-23 at 22:08 +0200, Rafael J. Wysocki wrote:
   Since the boot fails very early, before any messages reach the (VGA) 
   console,
   I have no idea what to do next, except for digging in the code.
  
  Ok, lets track it down. Is there any difference when you add:
  
  nohz=off
  highres=off
  noapictimer
  
  or any combinations of the above to the kernel command line ?
 
 First, for now, I build all kernels with NO_HZ and HIGH_RES_TIMERS unset
 (.config for 2.6.23-rc6-mm1 is attached).
 
 Second, noacpitimer added to the command line makes all of the kernels, up to
 and including 2.6.23-rc6-mm1, boot (this seems to be 100% reproducible).

That's valuable information. Can you please provide a boot log of one of
those with an additional apic=verbose on the command line ?

Thanks,

tglx




Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-24 Thread Thomas Gleixner
On Sun, 2007-09-23 at 22:52 +0200, Rafael J. Wysocki wrote:
   Second, noacpitimer added to the command line makes all of the kernels, 
   up to
   and including 2.6.23-rc6-mm1, boot (this seems to be 100% reproducible).
  
  That's valuable information. Can you please provide a boot log of one of
  those with an additional apic=verbose on the command line ?
 
 Attached is the dmesg output from the 2.6.23-rc6 kernel with the patchset:
 
 http://tglx.de/projects/hrtimers/2.6.23-rc4/patch-2.6.23-rc4-hrt1.patches.tar.bz2
 
 applied.  I also have the 2.6.23-rc6-mm1 dmesg output ready, but there's some
 -mm-specific noise in it.  Please let me know if you want it, though.

Hmm:

 Command line: root=/dev/sda3 vga=792 resume=/dev/sda1 noacpitimer 
 apic=verbose 2
^^^

noacpitimer is not a valid commandline option.

I asked for: 
   noapictimer

So I really wonder, why noacpitimer on the kernel command line makes any
difference. I'm confused.

tglx




Re: [patch 1/3] new timerfd API - new timerfd API

2007-09-24 Thread Thomas Gleixner
Davide,

On Sun, 2007-09-23 at 15:49 -0700, Davide Libenzi wrote:
 This is the new timerfd API as it is implemented by the following patch:
 ---
  fs/compat.c  |   32 ++-
  fs/timerfd.c |  199 ++-
  include/linux/compat.h   |7 +
  include/linux/syscalls.h |7 +
  4 files changed, 168 insertions(+), 77 deletions(-)
 
 Index: linux-2.6.mod/fs/timerfd.c
 ===
 --- linux-2.6.mod.orig/fs/timerfd.c   2007-09-23 15:18:09.0 -0700
 +++ linux-2.6.mod/fs/timerfd.c2007-09-23 15:25:55.0 -0700
 @@ -23,15 +23,17 @@
  
  struct timerfd_ctx {
   struct hrtimer tmr;
 + int clockid;
   ktime_t tintv;
   wait_queue_head_t wqh;
   int expired;
 + u64 ticks;
  };

Can you please restructure the struct in a way which does not result in
padding by the compiler ?

struct timerfd_ctx {
	struct hrtimer tmr;
	ktime_t tintv;
	wait_queue_head_t wqh;
	u64 ticks;
	int expired;
	int clockid;
};
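
To make the padding argument concrete, here is a small userspace sketch.
The fake_* types are placeholders with 8-byte members, not the kernel's
real struct hrtimer or wait_queue_head_t, so the absolute sizes mean
nothing; only the difference between the two member orders is of interest.
On ABIs where 64-bit members are 8-byte aligned, each lone int in the first
layout is followed by a 4-byte hole.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholders for the kernel types; only their alignment matters here. */
struct fake_hrtimer   { uint64_t opaque[6]; };
struct fake_waitqueue { uint64_t opaque[3]; };

/* Order as in the posted patch: every int sits in front of a 64-bit member. */
struct ctx_interleaved {
	struct fake_hrtimer tmr;
	int clockid;
	int64_t tintv;
	struct fake_waitqueue wqh;
	int expired;
	uint64_t ticks;
};

/* Suggested order: 64-bit members first, the two ints share one slot. */
struct ctx_grouped {
	struct fake_hrtimer tmr;
	int64_t tintv;
	struct fake_waitqueue wqh;
	uint64_t ticks;
	int expired;
	int clockid;
};

/* Number of bytes the regrouping saves on the ABI this is compiled for. */
static size_t padding_saved(void)
{
	assert(sizeof(struct ctx_grouped) <= sizeof(struct ctx_interleaved));
	return sizeof(struct ctx_interleaved) - sizeof(struct ctx_grouped);
}
```

On 32-bit x86, where 64-bit members only need 4-byte alignment, both
layouts should come out the same size, so the saving is expected to show
up on 64-bit builds only.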

 + ticks += (u64)
   hrtimer_forward(&ctx->tmr,
   hrtimer_cb_get_time(&ctx->tmr),

You need to use ctx->tmr.base->get_time() here, otherwise you might read
a stale time value (in case that CONFIG_HIGH_RES_TIMERS is off).

 - ctx->tintv);
 + ctx->tintv) - 1;
   hrtimer_restart(&ctx->tmr);

 +asmlinkage long sys_timerfd_create(int clockid)
  {
 - int error;
 + int error, ufd;
   struct timerfd_ctx *ctx;
   struct file *file;
   struct inode *inode;
 - struct itimerspec ktmr;
 -
 - if (copy_from_user(&ktmr, utmr, sizeof(ktmr)))
 - return -EFAULT;
  
   if (clockid != CLOCK_MONOTONIC &&
   clockid != CLOCK_REALTIME)
   return -EINVAL;
 +
 + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
 + if (!ctx)
 + return -ENOMEM;
 +
 + init_waitqueue_head(&ctx->wqh);
 + ctx->clockid = clockid;
 + hrtimer_init(&ctx->tmr, clockid, HRTIMER_MODE_ABS);
 +
 + error = anon_inode_getfd(&ufd, &inode, &file, "[timerfd]",
 +  &timerfd_fops, ctx);
 + if (error)
 + goto err_kfree_ctx;
 +
 + return ufd;
 +
 +err_kfree_ctx:
 + kfree(ctx);
 + return error;

You really can avoid the goto here.
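
A sketch of what the goto-free variant could look like, written as plain
userspace C so it can be compiled standalone. sketch_getfd() is a made-up
stand-in for anon_inode_getfd() and calloc()/free() replace
kzalloc()/kfree(); the point is only the shape of the error path when there
is a single failure point after the allocation.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct sketch_ctx { int clockid; };

/* Remembers who owns ctx after a successful "fd install". */
static struct sketch_ctx *installed_ctx;

/* Hypothetical stand-in for anon_inode_getfd(): hands out fd 3 or fails. */
static int sketch_getfd(int *ufd, struct sketch_ctx *ctx, int simulate_failure)
{
	assert(ufd && ctx);
	if (simulate_failure)
		return -ENFILE;
	installed_ctx = ctx;	/* the file now owns ctx */
	*ufd = 3;
	return 0;
}

static long sketch_timerfd_create(int clockid, int simulate_failure)
{
	struct sketch_ctx *ctx;
	int error, ufd;

	ctx = calloc(1, sizeof(*ctx));
	if (!ctx)
		return -ENOMEM;
	ctx->clockid = clockid;

	error = sketch_getfd(&ufd, ctx, simulate_failure);
	if (error) {
		/* Only one thing to undo, so clean up right here. */
		free(ctx);
		return error;
	}
	return ufd;
}
```

With more than one resource to unwind the goto style would earn its keep
again; with a single kfree() the inline cleanup is shorter and easier to
follow.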

tglx




Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-24 Thread Thomas Gleixner
On Mon, 2007-09-24 at 14:57 +0200, Rafael J. Wysocki wrote:
   http://tglx.de/projects/hrtimers/2.6.23-rc4/patch-2.6.23-rc4-hrt1.patches.tar.bz2
   
   applied.  I also have the 2.6.23-rc6-mm1 dmesg output ready, but there's 
   some
   -mm-specific noise in it.  Please let me know if you want it, though.
  
  Hmm:
  
   Command line: root=/dev/sda3 vga=792 resume=/dev/sda1 noacpitimer 
   apic=verbose 2
  ^^^
  
  noacpitimer is not a valid commandline option.
  
  I asked for: 
 noapictimer
 
 I'm blind, sorry.
 
  So I really wonder, why noacpitimer on the kernel command line makes any
  difference. I'm confused.
 
 \metoo
 
 Well, it was probably read as noacpi. :-)

Hmm, ACPI is in the log all over the place.

 Fortunately, noapictimer helps as well, dmesg attached (I have the one
 from 2.6.23-rc6-mm1 ready, too).

Ok, at which point is the box stopping, when you omit noa* ? Is
earlyprintk giving you any useful info ?

tglx






Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-24 Thread Thomas Gleixner
On Mon, 2007-09-24 at 15:52 +0200, Rafael J. Wysocki wrote:
So I really wonder, why noacpitimer on the kernel command line makes any
difference. I'm confused.
   
   \metoo
   
   Well, it was probably read as noacpi. :-)
  
  Hmm, ACPI is in the log all over the place.
 
 Well, noacpi seems to be a synonym for pci=noacpi.
 
 Anyway, it causes acpi_disable_pci() to be executed, which according to
 Documentation/kernel-parameters.txt means Do not use ACPI for IRQ routing or
 for PCI scanning (it works like this on x86_64 too, although the doc says 
 it's
 x86_32-specific).

Hrm. The local apic timer calibration does not use anything which is
related to interrupts, but if we use the local APIC timer we switch off
PIT.

Can you boot Linus' latest (w/o hrt patches) and add apicmaintimer to
the kernel command line please ?

 And yes, it matches noacpiwhatever in the command line with noacpi.  Sigh.

Urgh.
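
The effect Rafael describes can be shown with a few lines of userspace C.
This is not the kernel's command line parser, just an illustration of the
difference between a prefix-only comparison and one that also checks where
the option name ends; both function names are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Prefix-only comparison: accepts anything that merely starts with opt. */
static bool matches_prefix(const char *arg, const char *opt)
{
	assert(arg && opt);
	return strncmp(arg, opt, strlen(opt)) == 0;
}

/* Stricter variant: the option name must end at the end of arg or at '='. */
static bool matches_whole_word(const char *arg, const char *opt)
{
	size_t n;

	assert(arg && opt);
	n = strlen(opt);
	if (strncmp(arg, opt, n) != 0)
		return false;
	return arg[n] == '\0' || arg[n] == '=';
}
```

With the prefix-only check, the mistyped "noacpitimer" is accepted as
"noacpi", while the intended "noapictimer" is not; the stricter check
rejects the typo.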

tglx




Re: [patch 1/3] new timerfd API - new timerfd API

2007-09-24 Thread Thomas Gleixner
On Mon, 2007-09-24 at 08:42 -0700, Davide Libenzi wrote:
   + ticks += (u64)
     hrtimer_forward(&ctx->tmr,
     hrtimer_cb_get_time(&ctx->tmr),
  
  You need to use ctx->tmr.base->get_time() here, otherwise you might read
  a stale time value (in case that CONFIG_HIGH_RES_TIMERS is off).
 
 Is the particular position of hrtimer_cb_get_time() in the code that would 
 break here? Because function was added by your patch ;)
 Did something change later?

For non high res systems we speed up the access to now by storing the
current time when we start to process the hrtimer softirq callbacks.

hrtimer_cb_get_time(timer) reads timer->base->now

For high resolution systems hrtimer_cb_get_time() resolves to
timer->base->get_time().

In the timerfd case we are not in softirq context and we read at any
given later time. Also on SMP the base->now variable might be changed by
the softirq running on the other CPU.
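
A toy model of the two read paths, for illustration only. The sketch_*
names and the fake clock are invented here; the field is called now to
match the description above. It shows why the cached value is fine inside
callback processing and stale everywhere else.

```c
#include <assert.h>
#include <stdint.h>

/* Stands in for the hardware clock so the example is deterministic. */
static int64_t fake_clock_ns;

static int64_t read_fake_clock(void)
{
	return fake_clock_ns;
}

struct sketch_base {
	int64_t now;			/* snapshot taken when callbacks start */
	int64_t (*get_time)(void);	/* always reads the clock */
};

/* What the softirq does before it runs the expired callbacks. */
static void sketch_softirq_begin(struct sketch_base *base)
{
	assert(base && base->get_time);
	base->now = base->get_time();
}

/* Cheap read of the snapshot: only meaningful inside callback processing. */
static int64_t sketch_cb_get_time(const struct sketch_base *base)
{
	return base->now;
}

/* Fresh read: usable from any context. */
static int64_t sketch_get_time(const struct sketch_base *base)
{
	return base->get_time();
}
```

A caller outside the softirq, such as the timerfd read path, would get
whatever snapshot happened to be stored last, which is why the fresh read
is the one to use there.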

tglx




Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-24 Thread Thomas Gleixner
On Mon, 2007-09-24 at 17:18 +0200, Rafael J. Wysocki wrote:
   Well, noacpi seems to be a synonym for pci=noacpi.
   
   Anyway, it causes acpi_disable_pci() to be executed, which according to
   Documentation/kernel-parameters.txt means Do not use ACPI for IRQ 
   routing or
   for PCI scanning (it works like this on x86_64 too, although the doc 
   says it's
   x86_32-specific).
  
  Hrm. The local apic timer calibration does not use anything which is
  related to interrupts, but if we use the local APIC timer we switch off
  PIT.
  
  Can you boot Linus latest (w/o hrt patches) and add apicmaintimer to
  the kernel command line please ?
 
 Works, dmesg attached.

/me scratches head

We know that
- disabling local apic timers works
- local apic timers (which turn off PIT) work when noacpiFSCKEDPARSING
  is given on the kernel command line.

I have no clue what difference noacpiFSCKEDPARSING might make. The
boot log is not giving any hint at all.

acpi_disable_pci() sets acpi_pci_disabled and acpi_noirq to 1.

What happens, if you set acpi=noirq instead ?

tglx










Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-24 Thread Thomas Gleixner
On Mon, 2007-09-24 at 21:11 +0200, Rafael J. Wysocki wrote:
  /me scratches head
 
 Retested.
 
  We know, that
  - disabling local apic timers work
 
 This works reproducibly accross the board.

Ok

  - local apic timers (which turn off PIT) work. when noacpiFSCKEDPARSING
 
 This stopped working, although it evidently worked yesterday (wtf?).
 
 There seems to be a history effect in the box, to make things more
 interesting.

Did you connect this box to Andrew's VAIO during KS ?

 I think the only solid data point so far is that noapictimer makes the box
 boot.

Ok. Can you add nmi_watchdog=1 to the command line please. This runs
through the calibration of APIC, but registers it as a dummy clock
source (the PIT must run to make the watchdog work).

If it boots, please provide the output of /proc/timer_list

Thanks, 

tglx




Re: 2.6.23-rc7-mm1

2007-09-24 Thread Thomas Gleixner
On Mon, 2007-09-24 at 12:34 -0700, Andrew Morton wrote:
  It prints twice 'System halted' and blinks the keyboard leds, but does
  not switch off. On all other kernel version I only see one keyboard
  blink before the power goes out.
 
 ok...
 
  I compared its dmesg to vanilla-rc7 and -rc4-mm1, but expect that rc-4
  assigns different IRQs I can't see any differences except the normal
  variation in BogoMips etc.

Can you check whether 2.6.23-rc7 +
http://tglx.de/projects/hrtimers/2.6.23-rc7/patch-2.6.23-rc7-hrt1.patch

works for you ?

 hm, dunno.  The only substantial patch which touches
 arch/x86_64/kernel/process.c (which is where cpu_idle lives) is
 x86_64-prep-idle-loop-for-dynticks.patch.
 
 The problem is, 2.6.23-rc6-mm1's git-acpi patch had all the new cpuidle
 code in it.  Len dropped all that code over the weekend (which is when I
 picked this copy of his tree), so 2.6.23-rc7-mm1 doesn't have the cpuidle
 code.  Len will be reapplying the cpuidle patches today(ish) so next -mm
 _will_ have the cpuidle code.
 
 So what we have in rc7-mm1 is this transient no-cpuidle state.  It could be
 that the x86_64 dynticks code (which was developed previously tested in
 conjunction with the cpuidle patches) has some dependency on cpuidle.

It should not. cpuidle makes use of dynticks not the other way round.

tglx




Re: [patch 1/4] new timerfd API v2 - introduce a new hrtimer_forward_now() function

2007-09-24 Thread Thomas Gleixner

On Mon, 2007-09-24 at 13:22 -0700, Davide Libenzi wrote:
 I think that advancing the timer against the timer's current now can
 be a pretty common usage, so, w/out exposing hrtimer's internals, we add
 a new hrtimer_forward_now() function.
 
 
 
 Signed-off-by: Davide Libenzi [EMAIL PROTECTED]

Reviewed-and-Acked-by: Thomas Gleixner [EMAIL PROTECTED]

 
 - Davide
 
 
 ---
  include/linux/hrtimer.h |7 +++
  1 file changed, 7 insertions(+)
 
 Index: linux-2.6.mod/include/linux/hrtimer.h
 ===
 --- linux-2.6.mod.orig/include/linux/hrtimer.h	2007-09-24 12:27:20.0 -0700
 +++ linux-2.6.mod/include/linux/hrtimer.h	2007-09-24 12:29:39.0 -0700
 @@ -298,6 +298,13 @@
  extern unsigned long
  hrtimer_forward(struct hrtimer *timer, ktime_t now, ktime_t interval);
  
 +/* Forward a hrtimer so it expires after the hrtimer's current now */
 +static inline unsigned long hrtimer_forward_now(struct hrtimer *timer,
 + ktime_t interval)
 +{
 + return hrtimer_forward(timer, timer->base->get_time(), interval);
 +}
 +
  /* Precise sleep: */
  extern long hrtimer_nanosleep(struct timespec *rqtp,
 struct timespec __user *rmtp,
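
For readers unfamiliar with the forward operation the helper wraps, a
simplified userspace model follows. It works on plain nanosecond integers
and ignores the resolution clamping and overflow handling of the real
hrtimer_forward(); the sketch_* names are invented. It moves the expiry
past now in whole intervals and returns how many intervals were added.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_timer {
	int64_t expires;	/* absolute expiry time in ns */
};

/*
 * Advance t->expires by whole intervals until it lies after now.
 * Returns the number of intervals added, 0 if the timer has not expired.
 */
static uint64_t sketch_forward(struct sketch_timer *t, int64_t now,
			       int64_t interval)
{
	int64_t delta;
	uint64_t overruns;

	assert(t && interval > 0);

	delta = now - t->expires;
	if (delta < 0)
		return 0;	/* still in the future, nothing to do */

	/* Intervals that fit completely into delta, plus the one that gets us past now. */
	overruns = (uint64_t)(delta / interval) + 1;
	t->expires += (int64_t)overruns * interval;
	return overruns;
}
```

For a timer due at 100 with an interval of 10, forwarding at now = 135
yields 4 and a new expiry of 140; a second call at the same now yields 0.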
 



Re: x86-64 sporadic hang in 2.6.23rc7 and 2.6.22

2007-09-24 Thread Thomas Gleixner

On Mon, 2007-09-24 at 23:08 +0200, Helge Hafting wrote:
 The two kernels mentioned hang occasionally.
 Typically when I compile something and pass the time
 by surfing the web.
 
 A few minutes and then I notice that the mouse (and everything else in X)
 stops.  kbd LEDs does not react to numlock/capslock.
 The only thing that still works is sysrq+B
 So far this has happened while running X, so no messages.
 
 I have gone back to 2.6.22rc4, which seems to work.
 
 This is a single opteron, although on a dual-slot board.

Can you switch to serial console, so we can get some information out of
that box? Sysrq-B is working, so we can get info from other sysrq
functions as well.

tglx




Re: 2.6.23-rc7-mm1

2007-09-25 Thread Thomas Gleixner

On Tue, 2007-09-25 at 09:32 +0200, Torsten Kaiser wrote:
 On 9/24/07, Thomas Gleixner [EMAIL PROTECTED] wrote:
  Can you check whether 2.6.23-rc7 +
  http://tglx.de/projects/hrtimers/2.6.23-rc7/patch-2.6.23-rc7-hrt1.patch
 
  works for you ?
 
 Yes, powers off normally.

Ok, so it's probably some merge artifact in -mm. We'll get this sorted
out once Len has his new tree available.

tglx




Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-25 Thread Thomas Gleixner
On Tue, 2007-09-25 at 10:14 +0400, Mikhail Kshevetskiy wrote:
 Hello Thomas, Rafael
 
  We know, that
  - disabling local apic timers work
 
 As i can see from the log, you are booting on computer with dualcore AMD
 processor. Do you have C1E feature enabled? 
 
 i386 kernel disable lapic on dualcore AMD with C1E support (see 
 http://lkml.org/lkml/2007/3/29/199). x86_64 kernel do not have this
 patch still (it's required for tickless kernel only).

Well it is required for non tickless mode as well.

  As result, if
 you run x86_64 kernel with hrt patch on such computer, the system
 will stall during boot on lapic timer calibration.

Thanks for the reminder. I'll have a look into this.

tglx




Re: Why do so many machines need noapic?

2007-09-25 Thread Thomas Gleixner
Chuck,

On Thu, 2007-09-13 at 12:38 -0400, Chuck Ebbert wrote:
 On 09/10/2007 03:44 PM, Andi Kleen wrote:
  Yes, it has an hpet. And I tried every combination of options I could
  think of.
  
  But, even stranger, x86_64 works (only i386 fails.)
  
  x86-64 has quite different time code (at least until the dyntick patches
  currently in mm) 
  
  Obvious thing would be to diff the boot messages and see if anything
  jumps out (e.g. in interrupt routing).  
  
  Or check with mm and if x86-64 is broken there too then it's likely
  the new time code.
 
 I reported too soon that x86_64 works. It does not work, it just takes
 a bit longer before it freezes. There are message threads all over the
 place discussing this problem with the HP Pavilion tx 1000, and it seems
 the best workaround is to use the nolapic option instead of noapic.
 Using that, it is totally stable _and_ there are no spurious interrupts
 that would otherwise break USB. Interrupt setup is a bit strange, though:

Can you please send me 32 and 64 bit boot logs of mainline and Fedora
kernels ?

tglx




Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-25 Thread Thomas Gleixner
Rafael,

On Tue, 2007-09-25 at 10:07 +0200, Thomas Gleixner wrote:
 On Tue, 2007-09-25 at 10:14 +0400, Mikhail Kshevetskiy wrote:
  Hello Thomas, Rafael
  
   We know, that
   - disabling local apic timers work
  
  As i can see from the log, you are booting on computer with dualcore AMD
  processor. Do you have C1E feature enabled? 
  
  i386 kernel disable lapic on dualcore AMD with C1E support (see 
  http://lkml.org/lkml/2007/3/29/199). x86_64 kernel do not have this
  patch still (it's required for tickless kernel only).
 
 Well it is required for non tickless mode as well.
 
   As result, if
  you run x86_64 kernel with hrt patch on such computer, the system
  will stall during boot on lapic timer calibration.
 
 Thanks for the reminder. I have a look into this.

Can you please boot mainline and provide the output of:

# cat /proc/interrupts; sleep 10; cat /proc/interrupts

Thanks,

tglx




Re: 2.6.23-rc8-mm1, -rc7-mm1 kill audio on HP nx6325

2007-09-25 Thread Thomas Gleixner
On Tue, 2007-09-25 at 14:08 +0200, Rafael J. Wysocki wrote:
 Hi,
 
 This patch from Andi:
 
 x86_64-mm-cpa-einval.patch
 
 makes the hda_intel audio driver stop working on my HP nx6325.
 
 The following line appears in dmesg (from 2.6.23-rc7-mm1):
 
 ALSA /home/rafael/src/mm/linux-2.6.23-rc7-mm1/sound/pci/hda/hda_intel.c:1755: hda-intel: ioremap error
 
 and the driver doesn't work afterwards.
 
 Still, I'm not sure if the patch above is wrong or rather it exposes a problem
 in the driver.

The patch is correct. Instead of returning success when lookup_address()
fails, it now returns -EINVAL, which in turn makes the ioremap fail.

OTOH, the driver's ioremap call looks straightforward. Can you please apply
the patch below and provide the resulting debug output?

Thanks,

tglx

Index: linux-2.6.23-rc8-mm/arch/x86_64/mm/pageattr.c
===
--- linux-2.6.23-rc8-mm.orig/arch/x86_64/mm/pageattr.c  2007-09-25 14:05:41.0 +0200
+++ linux-2.6.23-rc8-mm/arch/x86_64/mm/pageattr.c   2007-09-25 14:09:35.0 +0200
@@ -156,8 +156,10 @@ __change_page_attr(unsigned long address
 	pgprot_t ref_prot2;
 
 	kpte = lookup_address(address);
-	if (!kpte)
+	if (!kpte) {
+		printk("lookup failed for %lu\n", address);
 		return -EINVAL;
+	}
 
 	kpte_page = virt_to_page(((unsigned long)kpte) & PAGE_MASK);
 	BUG_ON(PageCompound(kpte_page));
Index: linux-2.6.23-rc8-mm/sound/pci/hda/hda_intel.c
===
--- linux-2.6.23-rc8-mm.orig/sound/pci/hda/hda_intel.c  2007-09-25 14:05:43.0 +0200
+++ linux-2.6.23-rc8-mm/sound/pci/hda/hda_intel.c   2007-09-25 14:09:28.0 +0200
@@ -1752,7 +1752,8 @@ static int __devinit azx_create(struct s
 	chip->addr = pci_resource_start(pci, 0);
 	chip->remap_addr = ioremap_nocache(chip->addr, pci_resource_len(pci,0));
 	if (chip->remap_addr == NULL) {
-		snd_printk(KERN_ERR SFX "ioremap error\n");
+		snd_printk(KERN_ERR SFX "ioremap error: %lu %lu\n",
+			   chip->addr, pci_resource_len(pci, 0));
 		err = -ENXIO;
 		goto errout;
 	}





Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-25 Thread Thomas Gleixner
On Tue, 2007-09-25 at 14:20 +0200, Rafael J. Wysocki wrote:
As i can see from the log, you are booting on computer with dualcore AMD
processor. Do you have C1E feature enabled? 
 
 That's possible, how to check?
 
i386 kernel disable lapic on dualcore AMD with C1E support (see 
http://lkml.org/lkml/2007/3/29/199). x86_64 kernel do not have this
patch still (it's required for tickless kernel only).
   
   Well it is required for non tickless mode as well.
   
 As result, if
you run x86_64 kernel with hrt patch on such computer, the system
will stall during boot on lapic timer calibration.
   
   Thanks for the reminder. I have a look into this.
  
  Can you please boot mainline and provide the output of:
  
  # cat /proc/interrupts; sleep 10; cat /proc/interrupts
 
 albercik:~ # cat /proc/interrupts; sleep 10; cat /proc/interrupts
            CPU0       CPU1
   0:    1159492          0  local-APIC-edge  timer
 LOC:          0    1158220  Local interrupts

   0:    1161996          0  local-APIC-edge  timer
 LOC:          0    1160723  Local interrupts

Hmm. That's strange. It looks like the local apic timer is not used, but
x86_64 definitely lacks the above check. Can you please remove/disable
the acpi processor module and recheck ?
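
As a rough cross-check of the two samples, the per-second rates can be
worked out from the counter deltas. This is a minimal sketch, not kernel
code: irq_rate() is a hypothetical helper, the 10 second interval comes
from the sleep in the command above, and it assumes the LOC row reads 0 on
CPU0 and 1158220 / 1160723 on CPU1.

```c
#include <stdint.h>

/*
 * Interrupts per second between two /proc/interrupts samples taken
 * `seconds` apart. Hypothetical helper for illustration only.
 */
static uint64_t irq_rate(uint64_t before, uint64_t after, unsigned int seconds)
{
	return (after - before) / seconds;
}
```

With the numbers above, IRQ 0 advances by 2504 and LOC on CPU1 by 2503, so
both come out at about 250 interrupts per second, while LOC on CPU0 stays
at 0.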

tglx






Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-25 Thread Thomas Gleixner
On Tue, 2007-09-25 at 15:16 +0200, Rafael J. Wysocki wrote:
   There seems to be a history effect in the box, to make things more
   interesting.
  
  Did you connect this box to Andrews VAIO during KS ?
 
 No, but it's famous for being interestingly broken nevertheless.

:)

   I think the only solid data point so far is that noapictimer makes the 
   box
   boot.
  
  Ok. Can you add nmi_watchdog=1 to the command line please. This runs
  through the calibration of APIC, but registers it as a dummy clock
  source (the PIT must run to make the watchdog work).
  
  If it boots, please provide the output of /proc/timer_list
 
 No, it doesn't.

I'm starting to get desperate. Below is a patch which moves the apic timer
disable check after the calibration routine. Can you please apply it on top
of -hrt and add noapictimer to the command line? Does it boot?

tglx

Index: linux-2.6.23-rc7/arch/x86_64/kernel/apic.c
===
--- linux-2.6.23-rc7.orig/arch/x86_64/kernel/apic.c 2007-09-24 20:30:00.0 +0200
+++ linux-2.6.23-rc7/arch/x86_64/kernel/apic.c  2007-09-25 15:05:32.0 +0200
@@ -927,6 +927,7 @@ static void __init calibrate_APIC_clock(
 
 void __init setup_boot_APIC_clock (void)
 {
+#if 0
 	/*
 	 * The local apic timer can be disabled via the kernel commandline.
 	 * Register the lapic timer as a dummy clock event source on SMP
@@ -940,7 +941,7 @@ void __init setup_boot_APIC_clock (void)
 		setup_APIC_timer();
 		return;
 	}
-
+#endif
 	printk(KERN_INFO "Using local APIC timer interrupts.\n");
 	calibrate_APIC_clock();
 
@@ -949,11 +950,13 @@ void __init setup_boot_APIC_clock (void)
 	 * PIT/HPET going.  Otherwise register lapic as a dummy
 	 * device.
 	 */
-	if (nmi_watchdog != NMI_IO_APIC)
+	if (!disable_apic_timer && nmi_watchdog != NMI_IO_APIC)
 		lapic_clockevent.features &= ~CLOCK_EVT_FEAT_DUMMY;
+#if 0
 	else
 		printk(KERN_WARNING "APIC timer registered as dummy,"
 			" due to nmi_watchdog=1!\n");
+#endif
 
 	setup_APIC_timer();
 }




Re: 2.6.23-rc8-mm1, -rc7-mm1 kill audio on HP nx6325

2007-09-25 Thread Thomas Gleixner
On Tue, 2007-09-25 at 15:20 +0200, Rafael J. Wysocki wrote:
  The patch is correct. Instead of returning Success in the case of a
  failure of lookup_address, it now returns -EINVAL, which in turn makes
  the ioremap fail.
  
  OTOH, the driver ioremap call looks straight forward. Can you apply the
  patch below and provide the resulting debug output please ?
 
 lookup failed for 18446604438082158592
 [--snipped some USB messages--]
 ALSA /home/rafael/src/mm/linux-2.6.23-rc8-mm1/sound/pci/hda/hda_intel.c:1756: 
 hda-intel: ioremap error: 2349334528 16384

Stupid me, hex formatting would have been easier to read :)

Lookup failed for 0x FFFF 8100 8C08 0000
ioremap:          0x 0000 0000 8C08 0000  length 16384

It seems, that this patch only reveals some other wreckage. The code is
called as part of ioremap, where it adjusts the caching attributes of
the mapping, which was setup right before change_page_attr_address() is
called.
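
The conversion to hex can be reproduced in user space. A minimal sketch,
assuming that 0xffff810000000000 is the base of the x86_64 kernel direct
mapping in kernels of this era; ASSUMED_PAGE_OFFSET and
direct_map_to_phys() are hypothetical names used only for illustration.

```c
#include <stdint.h>

/*
 * Assumption: base of the x86_64 kernel direct mapping (PAGE_OFFSET)
 * for kernels of this vintage. Not taken from the thread.
 */
#define ASSUMED_PAGE_OFFSET 0xffff810000000000ULL

/* Physical address behind a direct-mapping virtual address. */
static uint64_t direct_map_to_phys(uint64_t vaddr)
{
	return vaddr - ASSUMED_PAGE_OFFSET;
}
```

Under that assumption, 18446604438082158592 is 0xffff81008c080000 and maps
back to 2349334528 (0x8c080000), i.e. the physical address passed to
ioremap.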

tglx




Re: 2.6.23-rc8-mm1, -rc7-mm1 kill audio on HP nx6325

2007-09-25 Thread Thomas Gleixner
On Tue, 2007-09-25 at 16:29 +0200, Rafael J. Wysocki wrote: 
   lookup failed for 18446604438082158592
   [--snipped some USB messages--]
   ALSA 
   /home/rafael/src/mm/linux-2.6.23-rc8-mm1/sound/pci/hda/hda_intel.c:1756: 
   hda-intel: ioremap error: 2349334528 16384
  
  Stupid me, hex formatting would have been easier to read :)
  
  Lookup failed for 0x FFFF 8100 8C08 0000
  ioremap:          0x 0000 0000 8C08 0000  length 16384
  
  It seems, that this patch only reveals some other wreckage. The code is
  called as part of ioremap, where it adjusts the caching attributes of
  the mapping, which was setup right before change_page_attr_address() is
  called.
 
 Hm, it looks like the first address is a kernel one and the second one is
 physical, so they apparently match, which means that the lookup shouldn't 
 fail,
 if I understand this correctly.

Yes, the lookup address is virtual and it should be the one, which was
mapped right before the call to change_page_attr_address(). I'm looking
into that right now.

tglx




[PATCH] x86-64: Disable local APIC timer use on AMD systems with C1E

2007-09-25 Thread Thomas Gleixner
commit 3556ddfa9284a86a59a9b78fe5894430f6ab4eef titled

 [PATCH] x86-64: Disable local APIC timer use on AMD systems with C1E

solves a problem with AMD dual core laptops e.g. HP nx6325 (Turion 64
X2) with C1E enabled:

When both cores go into idle at the same time, then the system switches
into C1E state, which is basically the same as C3. This stops the local
apic timer.

This was debugged right after the dyntick merge on i386 and despite the
patch title it fixes only the 32 bit path.

x86_64 is still missing this fix. It seems that mainline is not really
affected by this issue, as the PIT is running and keeps jiffies
incrementing, but that's just waiting for trouble. 

-mm suffers from this problem due to the x86_64 high resolution timer
patches.

This is a quick and dirty port of the i386 code to x86_64. 

I spent quite a time with Rafael to debug the -mm / hrt wreckage until
someone pointed us to this. I really had forgotten that we debugged this
half a year ago already. 

Sigh, is it just me or is there something yelling arch/x86 into my ear?

Signed-off-by: Thomas Gleixner [EMAIL PROTECTED]

diff --git a/arch/x86_64/kernel/setup.c b/arch/x86_64/kernel/setup.c
index af838f6..32054bf 100644
--- a/arch/x86_64/kernel/setup.c
+++ b/arch/x86_64/kernel/setup.c
@@ -546,6 +546,37 @@ static void __init amd_detect_cmp(struct cpuinfo_x86 *c)
 #endif
 }
 
+#define ENABLE_C1E_MASK		0x18000000
+#define CPUID_PROCESSOR_SIGNATURE	1
+#define CPUID_XFAM		0x0ff00000
+#define CPUID_XFAM_K8		0x00000000
+#define CPUID_XFAM_10H		0x00100000
+#define CPUID_XFAM_11H		0x00200000
+#define CPUID_XMOD		0x000f0000
+#define CPUID_XMOD_REV_F	0x00040000
+
+/* AMD systems with C1E don't have a working lAPIC timer. Check for that. */
+static __cpuinit int amd_apic_timer_broken(void)
+{
+	u32 lo, hi;
+	u32 eax = cpuid_eax(CPUID_PROCESSOR_SIGNATURE);
+	switch (eax & CPUID_XFAM) {
+	case CPUID_XFAM_K8:
+		if ((eax & CPUID_XMOD) < CPUID_XMOD_REV_F)
+			break;
+	case CPUID_XFAM_10H:
+	case CPUID_XFAM_11H:
+		rdmsr(MSR_K8_ENABLE_C1E, lo, hi);
+		if (lo & ENABLE_C1E_MASK)
+			return 1;
+		break;
+	default:
+		/* err on the side of caution */
+		return 1;
+	}
+	return 0;
+}
+
 static void __cpuinit init_amd(struct cpuinfo_x86 *c)
 {
 	unsigned level;
@@ -617,6 +648,9 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
 	/* Family 10 doesn't support C states in MWAIT so don't use it */
 	if (c->x86 == 0x10 && !force_mwait)
 		clear_bit(X86_FEATURE_MWAIT, &c->x86_capability);
+
+	if (amd_apic_timer_broken())
+		disable_apic_timer = 1;
 }
 
 static void __cpuinit detect_ht(struct cpuinfo_x86 *c)
diff --git a/include/asm-x86_64/apic.h b/include/asm-x86_64/apic.h
index 85125ef..e458020 100644
--- a/include/asm-x86_64/apic.h
+++ b/include/asm-x86_64/apic.h
@@ -20,6 +20,7 @@ extern int apic_verbosity;
 extern int apic_runs_main_timer;
 extern int ioapic_force;
 extern int apic_mapped;
+extern int disable_apic_timer;
 
 /*
  * Define the default level of output to be very little
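
The detection logic in amd_apic_timer_broken() can be exercised outside the
kernel by passing the register values in as parameters. A minimal sketch
under stated assumptions: apic_timer_broken() is a hypothetical stand-in
that takes the CPUID leaf 1 EAX value and the low word of the C1E MSR
instead of calling cpuid_eax() and rdmsr(), the masks are assumed to be the
full 32-bit constants, and the comparison against CPUID_XMOD_REV_F is
assumed to be "less than".

```c
#include <stdint.h>

/* Assumed full-width mask values (see the note above). */
#define ENABLE_C1E_MASK   0x18000000u
#define CPUID_XFAM        0x0ff00000u
#define CPUID_XFAM_K8     0x00000000u
#define CPUID_XFAM_10H    0x00100000u
#define CPUID_XFAM_11H    0x00200000u
#define CPUID_XMOD        0x000f0000u
#define CPUID_XMOD_REV_F  0x00040000u

/*
 * eax:    CPUID leaf 1 EAX (processor signature)
 * msr_lo: low 32 bits of the C1E enable MSR
 * Returns 1 if the local APIC timer should not be trusted.
 */
static int apic_timer_broken(uint32_t eax, uint32_t msr_lo)
{
	switch (eax & CPUID_XFAM) {
	case CPUID_XFAM_K8:
		/* K8 revisions before rev F are treated as unaffected */
		if ((eax & CPUID_XMOD) < CPUID_XMOD_REV_F)
			break;
		/* fall through */
	case CPUID_XFAM_10H:
	case CPUID_XFAM_11H:
		if (msr_lo & ENABLE_C1E_MASK)
			return 1;
		break;
	default:
		/* unknown family: err on the side of caution */
		return 1;
	}
	return 0;
}
```

For example, a K8 with extended model 4 (eax 0x00040000) and either C1E bit
set in the MSR is reported as broken, while an older K8 revision is not,
whatever the MSR says.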




Re: [PATCH] UML - time build fix

2007-09-25 Thread Thomas Gleixner

On Tue, 2007-09-25 at 13:37 -0400, Jeff Dike wrote:
 Put back an implementation of timeval_to_ns in
 arch/um/os-Linux/time.c.  tglx pointed out in his review of tickless
 support that there was a perfectly good implementation of it in
 linux/time.h.  The problem is that this is userspace code which can't
 pull in kernel headers and there doesn't seem to be a libc version.

Oops. Did not notice. Can't we move it into some header file which is
accessible from everywhere ?

tglx

 So, I'm copying the version from linux/time.h rather than resurrecting
 my version.  This causes some declaration changes as it now returns a
 signed value rather than an unsigned value.
 
 Signed-off-by: Jeff Dike [EMAIL PROTECTED]
 ---
  arch/um/include/os.h|4 ++--
  arch/um/os-Linux/time.c |   22 +++---
  2 files changed, 21 insertions(+), 5 deletions(-)
 
 Index: linux-2.6.22/arch/um/include/os.h
 ===
 --- linux-2.6.22.orig/arch/um/include/os.h2007-09-25 09:26:42.0 
 -0400
 +++ linux-2.6.22/arch/um/include/os.h 2007-09-25 09:28:42.0 -0400
 @@ -252,9 +252,9 @@ extern void os_dump_core(void);
  extern void idle_sleep(unsigned long long nsecs);
  extern int set_interval(void);
  extern int timer_one_shot(int ticks);
 -extern unsigned long long disable_timer(void);
 +extern long long disable_timer(void);
  extern void uml_idle_timer(void);
 -extern unsigned long long os_nsecs(void);
 +extern long long os_nsecs(void);
  
  /* skas/mem.c */
  extern long run_syscall_stub(struct mm_id * mm_idp,
 Index: linux-2.6.22/arch/um/os-Linux/time.c
 ===
 --- linux-2.6.22.orig/arch/um/os-Linux/time.c 2007-09-25 09:26:42.0 
 -0400
 +++ linux-2.6.22/arch/um/os-Linux/time.c  2007-09-25 09:28:42.0 
 -0400
 @@ -39,7 +39,23 @@ int timer_one_shot(int ticks)
   return 0;
  }
  
 -unsigned long long disable_timer(void)
 +/**
 + * timeval_to_ns - Convert timeval to nanoseconds
 + * @ts:  pointer to the timeval variable to be converted
 + *
 + * Returns the scalar nanosecond representation of the timeval
 + * parameter.
 + *
 + * Ripped from linux/time.h because it's a kernel header, and thus
 + * unusable from here.
 + */
 +static inline long long timeval_to_ns(const struct timeval *tv)
 +{
 + return ((long long) tv->tv_sec * UM_NSEC_PER_SEC) +
 + tv->tv_usec * UM_NSEC_PER_USEC;
 +}
 +
 +long long disable_timer(void)
  {
   struct itimerval time = ((struct itimerval) { { 0, 0 }, { 0, 0 } });
  
 @@ -47,10 +63,10 @@ unsigned long long disable_timer(void)
   printk(UM_KERN_ERR "disable_timer - setitimer failed, "
          "errno = %d\n", errno);
  
 - return tv_to_nsec(&time.it_value);
 + return timeval_to_ns(&time.it_value);
  }
  
 -unsigned long long os_nsecs(void)
 +long long os_nsecs(void)
  {
   struct timeval tv;
  


Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-25 Thread Thomas Gleixner
Rafael,

On Tue, 2007-09-25 at 22:07 +0200, Rafael J. Wysocki wrote:
 On Tuesday, 25 September 2007 15:17, Thomas Gleixner wrote:
  On Tue, 2007-09-25 at 15:16 +0200, Rafael J. Wysocki wrote:
 [--snip--]
  
  I start to get desperate. Below is a patch, which moves the apic timer
  disable check after the calibration routine. Can you please apply on top
  of -hrt and add noapictimer to the command line ? Does it boot ?

 2.6.23-rc7 with patch-2.6.23-rc7-hrt1.patch and the patch below applied boots
 with noapictimer and doesn't boot without it.

That was expected. I explicitly asked to add noapictimer to the kernel
command line.

Ok, so we ruled out the apic timer calibration routine. I did not expect
that this would be the culprit, but with dark screen as the only debug
info, I need to resort to small steps.

Can you please send me the output of /proc/timer_list of 2.6.23-rc7-hrt1
after booting with noapictimer ?

I'm a bit confused by your earlier confirmation, that mainline w/o the
-hrt patches boots fine, when you add apicmaintimer to the kernel
command line. apicmaintimer stops the PIT like we do in -hrt and we
just use the local APIC timer for everything. Can you please retest and
confirm that this is correct ?

Is the 32 bit kernel working on that box ?

Thanks for your patience.

tglx

PS: I just sent out the "disable APIC timer for AMD C1E boxen" patch. We
debugged this half a year ago on a nx6325, but I completely forgot about
that. The explanation from AMD was sensible, but your "apicmaintimer
works" statement contradicts it.




Re: [PATCH] x86-64: Disable local APIC timer use on AMD systems with C1E

2007-09-25 Thread Thomas Gleixner
On Tue, 2007-09-25 at 22:55 +0200, Rafael J. Wysocki wrote:
 I have reworked the patch a bit so that it applies on top of 2.6.23-rc8-mm1
 and compiles (my version is attached).
 
 With this patch applied, the kernel boots correctly on the nx6325.

I know. It's basically enforced noapictimer. 

But this still does not explain why your nasty box booted current
mainline with apicmaintimer on the kernel command line.

tglx




Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-25 Thread Thomas Gleixner
Rafael,

On Tue, 2007-09-25 at 23:28 +0200, Rafael J. Wysocki wrote:
  I'm a bit confused by your earlier confirmation, that mainline w/o the
  -hrt patches boots fine, when you add apicmaintimer to the kernel
  command line. apicmaintimer stops the PIT like we do in -hrt and we
  just use the local APIC timer for everything. Can you please retest and
  confirm that this is correct ?
 
 No, it's not.  The mainline _usually_ doesn't boot with apicmaintimer.
 
 It seems to me that _sometimes_ the CPU just doesn't enter this C1E state
 and then everything goes fine ...

I'm relieved. I was really starting to go nuts over these contradictory
patterns.

Your box seems to be worse than the VAIO, it has some random surprise
generator built in :)

  Is the 32 bit kernel working on that box ?
 
 Can't tell, I have only 64-bit userland here.

Should be fine. The check is there since late 2.6.21-rc. I really could
kick my own ass that I did not remember the nx6325 wreckage in the
2.6.21-rc time frame. Sigh, way too much broken hardware out there to
keep track of it.

  Thanks for your patience.
 
 Well, I'm only making sure that future kernels will run on my box. ;-)

Nothing wrong with that. Thanks again for your help,

tglx




Re: 2.6.23-rc8-mm1: somewhat broken forced HPET on ICH5

2007-09-25 Thread Thomas Gleixner
Alexey,

On Wed, 2007-09-26 at 00:50 +0400, Alexey Dobriyan wrote:
 ich-force-hpet-ich5-quirk-to-force-detect-enable.patch
 is causing the following on Etch boot:
 
   [initscripts as usual]
   Setting system clock:
   [nothing happens for several seconds]
   select to /dev/rtc to wait for clock tick timed out
   [initscripts as usual]
 
 Then clock is skewed for 3 hours (GMT/MSK difference).

Can you please check, whether 

http://tglx.de/projects/hrtimers/2.6.23-rc7/patch-2.6.23-rc7-hrt1.patch

has the same problem ? It contains the hpet force enable patches as
well, but lacks the other crap^Wfeatures of -mm :)

Thanks,

tglx




Re: [PATCH] UML - time build fix

2007-09-25 Thread Thomas Gleixner
Jeff,

On Tue, 2007-09-25 at 17:56 -0400, Jeff Dike wrote:
 On Tue, Sep 25, 2007 at 09:54:15PM +0200, Thomas Gleixner wrote:
  On Tue, 2007-09-25 at 13:37 -0400, Jeff Dike wrote:
   Put back an implementation of timeval_to_ns in
   arch/um/os-Linux/time.c.  tglx pointed out in his review of tickless
   support that there was a perfectly good implementation of it in
   linux/time.h.  The problem is that this is userspace code which can't
   pull in kernel headers and there doesn't seem to be a libc version.
  
  Oops. Did not notice. 
 
 It's a UML peculiarity...
 
  Can't we move it into some header file which is accessible from everywhere ?
 
 Not in the generic kernel.  UML has some generally includable headers
 of its own, but that doesn't really help.
 
 The one thing that would help is a libc timeval_to_ns.

Fair enough.
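
For reference, a user-space version of the conversion needs nothing beyond
<sys/time.h>. A minimal sketch along the lines of the helper quoted in the
patch; NSEC_PER_SEC and NSEC_PER_USEC are defined locally here and stand in
for the UM_* constants, which is an assumption.

```c
#include <sys/time.h>

/* Local stand-ins for the UM_NSEC_PER_* constants (assumed values). */
#define NSEC_PER_SEC  1000000000LL
#define NSEC_PER_USEC 1000LL

/* Scalar nanosecond representation of a struct timeval. */
static long long timeval_to_ns(const struct timeval *tv)
{
	return (long long)tv->tv_sec * NSEC_PER_SEC +
	       (long long)tv->tv_usec * NSEC_PER_USEC;
}
```

A timeval of 2 s and 500000 us converts to 2500000000 ns.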

tglx




Re: 2.6.23-rc8-mm1: somewhat broken forced HPET on ICH5

2007-09-26 Thread Thomas Gleixner
On Wed, 2007-09-26 at 13:14 +0400, Alexey Dobriyan wrote:
 On Tue, Sep 25, 2007 at 11:45:17PM +0200, Thomas Gleixner wrote:
  On Wed, 2007-09-26 at 00:50 +0400, Alexey Dobriyan wrote:
   ich-force-hpet-ich5-quirk-to-force-detect-enable.patch
   is causing the following on Etch boot:
   
 [initscripts as usual]
 Setting system clock:
 [nothing happens for several seconds]
 select to /dev/rtc to wait for clock tick timed out
 [initscripts as usual]
   
   Then clock is skewed for 3 hours (GMT/MSK difference).
  
  Can you please check, whether 
  
  http://tglx.de/projects/hrtimers/2.6.23-rc7/patch-2.6.23-rc7-hrt1.patch
  
  has the same problem ? It contains the hpet force enable patches as
  well, but lacks the other crap^Wfeatures of -mm :)
 
 Yes, exactly same delay and clock skew.

Ok, stupid me. Did not look at your config snippet right away. Can you
please enable CONFIG_HPET_EMULATE_RTC ?

tglx




Re: [discuss] 2.6.23-rc8-mm1, -rc7-mm1 kill audio on HP nx6325

2007-09-26 Thread Thomas Gleixner
On Wed, 2007-09-26 at 08:32 +0100, Jan Beulich wrote:
 ioremap_nocache() does __ioremap(..., _PAGE_PCD);, then __ioremap() does
 ioremap_page_range(..., _PAGE_PCD | other_stuff) That's one.
 
 __ioremap() then does ioremap_change_attr(..., _PAGE_PCD);.  That's two.
 
 So I _think_ we're setting _PAGE_PCD twice on those pte's?  Unclear.  The
 implementation is rather different from i386, too.
 
 I dunno why __change_page_attr() failed though.  Perhaps this, in
 change_page_attr_addr():
 
  if (!kernel_map || pte_present(pfn_pte(0, prot))) {
 
 should be 
 
 Definitely not, and this code has been that way for a while.
 
 I rather suspect this change
 
 - if (!kpte) return 0;
 + if (!kpte)
 + return -EINVAL;
 
 to be the reason for the failure (and I had already sent a comment to this
 respect to Andi upon his review request).

This change exposes the problem. The question is why we do not have a
page table entry for the address, which was mapped right before that.

tglx




Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents

2007-09-26 Thread Thomas Gleixner
On Wed, 2007-09-26 at 17:25 +0200, Rafael J. Wysocki wrote:
 There still are some oddities.
 
 First, with the x86-64: Disable local APIC timer use on AMD systems with C1E
 patch and my collection of suspend patches applied, the box doesn't boot
 (the suspend patches don't even thouch the boot code, so they should be
 irrelevant here).  However, it boots if patch-2.6.23-rc7-hrt1.patch (adjusted
 for 2.6.23-rc8) is applied in addition.  Is this expected?

No. That's odd. It is nothing else than adding noapictimer to the
kernel command line.

 Next, on 2.6.23-rc8 with the patches from:
 
 http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.23-rc8/patches/
 
 plus the x86-64: Disable local APIC timer use on AMD systems with C1E patch
 and patch-2.6.23-rc7-hrt1.patch (adjusted for 2.6.23-rc8), hibernation doesn't
 work correctly.  Although the box hibernates and restores, there is a 
 temporary
 hang during the resume hardware sequence, after which the lock led 
 starts
 to blink (and remains in this state) and something like this appears in dmesg:
 
 Extended CMOS year: 2000
 Enabling non-boot CPUs ...
 SMP alternatives: switching to SMP code
 Booting processor 1/2 APIC 0x1
 Initializing CPU#1
 Calibrating delay using timer specific routine.. 3990.36 BogoMIPS 
 (lpj=7980735)
 CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
 CPU: L2 Cache: 512K (64 bytes/line)
 Unable to handle kernel paging request at 806c64d4 RIP: 
  [802104cb] identify_cpu+0x2ac/0x5a1

Hmm. That's really early in the CPU bring up. The only change in this
area is the C1E patch. Can you decode the exact source line, where it is
failing ?

tglx





Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-26 Thread Thomas Gleixner
Rafael,

On Wed, 2007-09-26 at 23:00 +0200, Rafael J. Wysocki wrote:
First, with the x86-64: Disable local APIC timer use on AMD systems 
with C1E
patch and my collection of suspend patches applied, the box doesn't boot
(the suspend patches don't even thouch the boot code, so they should be
irrelevant here).  However, it boots if patch-2.6.23-rc7-hrt1.patch 
(adjusted
for 2.6.23-rc8) is applied in addition.  Is this expected?
   
   No. That's odd. It is nothing else than adding noapictimer to the
   kernel command line.
  
  Seems to be reproducible, though.  I'll investigate further.
 
 So far, the results are the following:
 
 1) current Linus' tree doesn't boot with any command line (regression)
 
 [  Linus, please revert commit e66485d747505e9d960b864fc6c37f8b2afafaf0
 
x86-64: Disable local APIC timer use on AMD systems with C1E
 
It's not necessary for 2.6.23 and actually kills the box that it's 
 supposed to fix. ]
 
 2) 2.6.23-rc8 w/ the x86-64: Disable local APIC timer use on AMD systems 
 with C1E
patch applied behaves like the current -git
 
 3) 2.6.23-rc8 w/o this patch doesn't boot with either noapictimer _or_

OK, this explains 2) and 3). I just looked into the code and the logic
vs. noapictimer on SMP is completely broken.

On i386 the noapictimer option not only disables the local APIC timer,
it also registers the CPUs for broadcasting via IPI on SMP systems. 

The x86_64 code uses the broadcast only when the local APIC timer is
active, i.e. when noapictimer is not on the command line. This defeats the
whole purpose of noapictimer. It should be there to make boxen work
where the local APIC timer actually has a hardware problem, e.g. the
nx6325.

The current implementation of x86_64 only fixes the ACPI c-states
related problem where the APIC timer stops in C3(2), nothing else.

On nx6325 and other AMD X2 equipped systems which have the C1E enabled
we run into the following:

PIT keeps jiffies (and the system) running, but the local APIC timer
interrupts can get out of sync due to this C1E effect. 

I don't think this is a critical problem, but it is wrong nevertheless.

I think it's safe to revert the C1E patch and postpone the fix to the
clock events conversion.

   apicmaintimer

on your box is not going to work. See the C1E patch. apicmaintimer
switches off the PIT and then waits forever for the local APIC timer
interrupts.

 4) 2.6.22 behaves like 2.6.23-rc8

No surprise

 5) 2.6.23-rc8 with (adjusted) patch-2.6.23-rc7-hrt1.patch boots only with
noapictimer
 
 6) 2.6.23-rc8 with (adjusted) patch-2.6.23-rc7-hrt1.patch and with the
x86-64: Disable local APIC timer use on AMD systems with C1E patch boots
without any extra command line options

That's consistent behaviour.

 Tested for a couple of times with each kernel, the results seem to be
 reproducible 100% of the time.

Thanks for going through this debug marathon.

tglx




Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-26 Thread Thomas Gleixner
On Wed, 2007-09-26 at 15:22 -0700, Linus Torvalds wrote:
 
 On Wed, 26 Sep 2007, Thomas Gleixner wrote:
   
   1) current Linus' tree doesn't boot with any command line (regression)
   
   [  Linus, please revert commit e66485d747505e9d960b864fc6c37f8b2afafaf0
 
 Reverted.
 
  OK, this explains 2) and 3). I just looked into the code and the logic
  vs. noapictimer on SMP is completely broken.
 
 ..and thanks for the explanation.
 
 Thanks for finding it so quickly guys. Sounds like this will be fixed 
 properly in 2.6.24 with the x86 merge (which hopefully brings in the hrt 
 patch too)

It's even worse than I thought on the first check:

noapictimer on the command line of an SMP box prevents _ONLY_ the boot
CPU apic timer from being used. But the secondary CPU is still
unconditionally setting up the APIC timer and uses the non calibrated
variable calibration_result, which is of course 0, to setup the APIC
timer. Wreckage guaranteed.

tglx




Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-26 Thread Thomas Gleixner
On Thu, 2007-09-27 at 01:30 +0200, Rafael J. Wysocki wrote:
   Tested for a couple of times with each kernel, the results seem to be
   reproducible 100% of the time.
  
  Thanks for going through this debug marathon.
 
 No big deal.  I'm glad that you've found what's up.
 
 Well, we still have the CPU hotplug during suspend w/ the hrt patch problem
 to debug ... ;-)

Yeah. Knowing the actual line of code where it breaks might be helpful.

tglx




Re: [PATCH] Compile handle_percpu_irq even for uniprocessor kernels

2007-09-27 Thread Thomas Gleixner

On Thu, 2007-09-27 at 12:24 +0100, Ralf Baechle wrote:
 Compiling handle_percpu_irq only on SMP creates an artificial special
 case, so a typical use like:
 
   set_irq_chip_and_handler(irq, some_irq_type, handle_percpu_irq);
 
 needs to be conditionally compiled only on SMP systems as well and an
 alternative UP construct is usually needed - for no good reason.
 
 Signed-off-by: Ralf Baechle [EMAIL PROTECTED]

Makes sense.

Acked-by: Thomas Gleixner [EMAIL PROTECTED]

 ---
 This fixes uniprocessor configurations for some MIPS SMP systems.
 
 diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
 index f1a73f0..9b5dff6 100644
 --- a/kernel/irq/chip.c
 +++ b/kernel/irq/chip.c
 @@ -503,7 +503,6 @@ out_unlock:
   spin_unlock(&desc->lock);
  }
  
 -#ifdef CONFIG_SMP
  /**
   *   handle_percpu_IRQ - Per CPU local irq handler
   *   @irq:   the interrupt number
 @@ -529,8 +528,6 @@ handle_percpu_irq(unsigned int irq, struct irq_desc *desc)
   desc->chip->eoi(irq);
  }
  
 -#endif /* CONFIG_SMP */
 -
  void
  __set_irq_handler(unsigned int irq, irq_flow_handler_t handle, int 
 is_chained,
 const char *name)


Re: 2.6.23-rc8-mm2: problems on HP nx6325

2007-09-27 Thread Thomas Gleixner
On Thu, 2007-09-27 at 17:59 +0200, Rafael J. Wysocki wrote:
  2) CPU hotplug is busted (onlining of CPU1 kills the kernel), probably due to
     the same issue that I'm having with the -hrt version of 2.6.23-rc8 (we're
     debugging it right now)
 
 This one is fixed by the following patch:
 
 ---
 From: Rafael J. Wysocki [EMAIL PROTECTED]
 
 Fix CPU hotplug breakage on HP nx6325 and similar boxes caused by a reference
 to disable_apic_timer (labeled as __initdata) from the CPU initialization 
 code.
 
 Signed-off-by: Rafael J. Wysocki [EMAIL PROTECTED]

Doh, I knew I blew it.

Good catch, thanks,

tglx

 ---
  arch/x86_64/kernel/apic.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 Index: linux-2.6.23-rc8-mm2/arch/x86_64/kernel/apic.c
 ===
 --- linux-2.6.23-rc8-mm2.orig/arch/x86_64/kernel/apic.c
 +++ linux-2.6.23-rc8-mm2/arch/x86_64/kernel/apic.c
 @@ -42,7 +42,7 @@
  
  int apic_verbosity;
  static int apic_calibrate_pmtmr __initdata;
 -int disable_apic_timer __initdata;
 +int disable_apic_timer __cpuinitdata;
  
  /* Local APIC timer works in C2? */
  int local_apic_timer_c2_ok;



Re: NO_HZ hangs up AMD MK-36

2007-09-27 Thread Thomas Gleixner
On Thu, 2007-09-27 at 23:28 +0300, Dmitry Tyschenko wrote:
 I have an Asus X50M laptop, running an old Debian Etch from February.
 Kernels since 2.6.21 don't boot; they hang 10 seconds to 1 minute
 after the GRUB screen.
 I have tried different versions of gcc (4.1.1, 4.1.2, 4.2.1) to build
 the 2.6.22.8 kernel, but with no results.
 But if I disable the NO_HZ option, 2.6.21 works fine for me.

We have fixed a bunch of bugs in this area. Can you please try the
latest mainline kernel and check whether the problem still persists?

Thanks,

tglx




Re: NO_HZ hangs up AMD MK-36

2007-09-27 Thread Thomas Gleixner
On Fri, 2007-09-28 at 00:01 +0300, Dmitry Tyschenko wrote:
 Sorry, I am newbie in linux. Hope you was talking about:
 /boot/vmlinuz-2.6.22-1-k7 root=/dev/sda5 ro nohz=off

Yes.

 But it doesn't help for Debians 2.6.22-1 (I don't have another
 prebuiled) still same problems.

Can you please add: nolapic_timer instead ?

Thanks,

tglx




[PATCH] clockevents: fix bogus next_event reset for oneshot broadcast devices

2007-09-27 Thread Thomas Gleixner
In periodic broadcast mode the next_event member of the broadcast device
structure is set to KTIME_MAX in the interrupt handler. This is wrong,
as we calculate the next periodic interrupt with this variable.

Remove it.

Noticed by Ralf. MIPS is the first user of this mode, so it does not
affect existing users.

Signed-off-by: Thomas Gleixner [EMAIL PROTECTED]
Acked-and-tested-by: Ralf Baechle [EMAIL PROTECTED]
---

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 0962e05..acf15b4 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -176,8 +176,6 @@ static void tick_do_periodic_broadcast(void)
  */
 static void tick_handle_periodic_broadcast(struct clock_event_device *dev)
 {
-   dev->next_event.tv64 = KTIME_MAX;
-
tick_do_periodic_broadcast();
 
/*




Re: 2.6.23-rc6-mm1 powerpc - kgdb is broken

2007-09-28 Thread Thomas Gleixner
On Fri, 2007-09-28 at 16:07 +0530, Kamalesh Babulal wrote:
 The kgdb is also broken with 2.6.23-rc8-mm2 on the powerpc .
 The below patch disables the kgdb from getting compiled over
 powerpc platform.
 
 Signed-off-by : Kamalesh Babulal [EMAIL PROTECTED]
 ---
 
 --- linux-2.6.23-rc8/lib/Kconfig.kgdb   2007-09-28 06:33:37.0 +0530
 +++ linux-2.6.23-rc8/lib/~Kconfig.kgdb  2007-09-28 23:48:33.0 +0530
 @@ -14,7 +14,7 @@ config KGDB
 bool "KGDB: kernel debugging with remote gdb"
 select WANT_EXTRA_DEBUG_INFORMATION
 select KGDB_ARCH_HAS_SHADOW_INFO if X86_64
 -   depends on DEBUG_KERNEL && (ARM || X86 || MIPS || (SUPERH && !SUPERH64) || IA64 || PPC)
 +   depends on DEBUG_KERNEL && (ARM || X86 || MIPS || (SUPERH && !SUPERH64) || IA64 || !PPC)

This enables the KGDB config for _ALL_ platforms except powerpc. 

Just remove PPC completely.
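
The effect of the "|| !PPC" term can be checked with plain boolean logic. The sketch below is a user-space model, not Kconfig: struct toy_arch, the three sample configurations and both function names are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-build symbol values; the names mirror the Kconfig symbols. */
struct toy_arch {
    bool debug_kernel;
    bool arm, x86, mips, superh, superh64, ia64, ppc;
};

/* Dependency as written in the quoted patch: "... || IA64 || !PPC". */
bool kgdb_depends_patched(const struct toy_arch *c)
{
    return c->debug_kernel &&
           (c->arm || c->x86 || c->mips || (c->superh && !c->superh64) ||
            c->ia64 || !c->ppc);
}

/* Dependency with PPC simply dropped from the list. */
bool kgdb_depends_without_ppc(const struct toy_arch *c)
{
    return c->debug_kernel &&
           (c->arm || c->x86 || c->mips || (c->superh && !c->superh64) ||
            c->ia64);
}

/* Sample builds: an architecture that is not in the list, powerpc, and x86. */
const struct toy_arch other_arch = { .debug_kernel = true };
const struct toy_arch ppc_arch = { .debug_kernel = true, .ppc = true };
const struct toy_arch x86_arch = { .debug_kernel = true, .x86 = true };

/* Self-check of the claim: an architecture outside the list gets enabled. */
void kgdb_dependency_self_check(void)
{
    assert(kgdb_depends_patched(&other_arch));
    assert(!kgdb_depends_without_ppc(&other_arch));
}
```

Both variants keep KGDB off for powerpc, but only the variant without PPC also keeps it off for architectures that were never in the list.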

tglx




Re: [PATCH] spin_lock_unlocked cleanups

2007-09-28 Thread Thomas Gleixner
On Fri, 2007-09-28 at 09:56 +0100, Andy Whitcroft wrote:
   I think we're ready to wire checkpatch up to a email robot which monitors
   the mailing lists and sends people nastygrams.  I bet that'll be popular 
   ;)
  
  We should wire it up to git-commit as well. A lot of that comes in via
  git subsystems.
 
 The problem with git-commit is who's repo to add the hook to.  I did
 attempt to do this by picking up each of linus' main releases and then
 using the git blame engine to attribute each failure to a particular
 commit.  The plan then would be to send a nasty-gram to the committer
 about violations there-in.
 
 I'll try and find some time to get this bit polished and at least
 emailing me.

The question is whether we can convince the git developers to integrate
it: when a commit happens and checkpatch.pl is in scripts/, run the
patch through it before doing the actual commit.

tglx





Re: [PATCH] disable non-boot CPUs before poweroff

2007-09-28 Thread Thomas Gleixner

On Fri, 2007-09-28 at 09:52 -0400, Mark Lord wrote:
 We need to disable all CPUs other than the boot CPU (usually 0)
 before attempting to power-off modern SMP machines.
 This seems to fix the hang-on-poweroff issue
 that one of my SMP boxes exhibits.  More testing required.
 
 Signed-off-by: Mark Lord [EMAIL PROTECTED]

Fixes my new toybox as well. Thanks for tracking it down before I had to
dig in.

Acked-by: Thomas Gleixner [EMAIL PROTECTED]

 ---
 
 --- linux/kernel/sys.c.orig   2007-09-13 09:49:11.0 -0400
 +++ linux/kernel/sys.c2007-09-28 09:48:54.0 -0400
 @@ -32,6 +32,7 @@
  #include <linux/getcpu.h>
  #include <linux/task_io_accounting_ops.h>
  #include <linux/seccomp.h>
 +#include <linux/cpu.h>
  
  #include <linux/compat.h>
  #include <linux/syscalls.h>
 @@ -879,6 +880,7 @@
   if (pm_power_off_prepare)
   pm_power_off_prepare();
   sysdev_shutdown();
 + disable_nonboot_cpus();
   printk(KERN_EMERG "Power down.\n");
   machine_power_off();
  }


Re: [PATCH] disable non-boot CPUs before poweroff

2007-09-28 Thread Thomas Gleixner
On Fri, 2007-09-28 at 17:05 +0200, Rafael J. Wysocki wrote:
  if (pm_power_off_prepare)
  pm_power_off_prepare();
  sysdev_shutdown();
  +   disable_nonboot_cpus();
 
 Before sysdev_shutdown(), please.
 
 sysdev_shutdown() may touch things that belong to CPU0.

Damn, you're right. Missed that.

tglx






Re: [REGRESSION from 2.6.23-rc8]

2007-09-28 Thread Thomas Gleixner
On Fri, 2007-09-28 at 11:07 -0400, Chuck Ebbert wrote:
 On 09/26/2007 06:35 PM, Thomas Gleixner wrote:
  It's even worse than I thought on the first check:
  
  noapictimer on the command line of an SMP box prevents _ONLY_ the boot
  CPU apic timer from being used. But the secondary CPU is still
  unconditionally setting up the APIC timer and uses the non calibrated
  variable calibration_result, which is of course 0, to setup the APIC
  timer. Wreckage guaranteed.
  
 
 Is this why I get 1000 spurious interrupts/second on IRQ7 when booting
 x86_64 with noapic?

No, that's a different problem. The wreckage is a stuck local APIC timer
interrupt.

tglx




Re: [PATCH] spin_lock_unlocked cleanups

2007-09-28 Thread Thomas Gleixner
On Fri, 2007-09-28 at 01:26 -0700, Andrew Morton wrote:
 On Fri, 28 Sep 2007 10:17:30 +0200 Thomas Gleixner [EMAIL PROTECTED] wrote:
 
  can we please add this to checkpatch.pl ? 
  
   -spinlock_t bpci_lock = SPIN_LOCK_UNLOCKED;
   +DEFINE_SPINLOCK(bpci_lock);
 
 That check is already in checkpatch.  Problem is that hardly anyone
 runs the thing.

Sigh, I forgot that perl is write only. :)

 I think we're ready to wire checkpatch up to a email robot which monitors
 the mailing lists and sends people nastygrams.  I bet that'll be popular ;)

We should wire it up to git-commit as well. A lot of that comes in via
git subsystems.

 (I'd love it if it could detect wordwrapped and tab-expanded patches, too. 
 You wouldn't _believe_...)

I know ...

tglx




Re: [PATCH] spin_lock_unlocked cleanups

2007-09-28 Thread Thomas Gleixner
On Thu, 2007-09-27 at 23:36 +0200, roel wrote:
 Replace some SPIN_LOCK_UNLOCKED with DEFINE_SPINLOCK
 
 Signed-off-by: Roel Kluin [EMAIL PROTECTED]

Acked-by: Thomas Gleixner [EMAIL PROTECTED]

Andy, Randy,

can we please add this to checkpatch.pl ? 

 -spinlock_t bpci_lock = SPIN_LOCK_UNLOCKED;
 +DEFINE_SPINLOCK(bpci_lock);

This code was introduced in June 2007, almost two years after the first
big DEFINE_SPINLOCK cleanup. Sigh.
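
The thread does not spell out why the definition macro is preferred. One commonly cited reason is that a per-lock definition lets debugging code give each lock its own identity, which a shared initializer cannot do. The user-space sketch below illustrates that idea only; struct toy_lock and both macros are hypothetical stand-ins, not the kernel's spinlock_t, SPIN_LOCK_UNLOCKED or DEFINE_SPINLOCK.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-in for a lock type; NOT the kernel's spinlock_t. */
struct toy_lock {
    int locked;
    const char *name; /* per-lock identity a debugging layer could use */
};

/* Old style: one anonymous initializer shared by every lock. */
#define TOY_LOCK_UNLOCKED { 0, "anonymous" }

/* New style: definition and initialization in one step, recording the name. */
#define DEFINE_TOY_LOCK(x) struct toy_lock x = { 0, #x }

struct toy_lock old_style_lock = TOY_LOCK_UNLOCKED;
DEFINE_TOY_LOCK(new_style_lock);

/* Returns nonzero when the lock carries the given name. */
int toy_lock_has_name(const struct toy_lock *lock, const char *name)
{
    assert(lock != NULL && name != NULL);
    return strcmp(lock->name, name) == 0;
}
```

Every lock initialized the old way reports the same name, while each lock defined through the macro reports its own variable name.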

Thanks,

tglx






Re: [patch 0/2] suspend/resume regression fixes

2007-09-28 Thread Thomas Gleixner
On Fri, 2007-09-28 at 16:27 -0400, Mark Lord wrote:
 Linus Torvalds wrote:
  
  On Sat, 22 Sep 2007, Thomas Gleixner wrote:
  My final enlightenment was, when I removed the ACPI processor module,
  which controls the lower idle C-states, right before resume; this
  worked fine all the time even without all the workaround hacks.
 
  I really hope that these two patches finally set an end to the jinxed
  VAIO heisenbug series, which started when we removed the periodic
  tick with the clockevents/dyntick patches.
  
  Ok, so the patches look fine, but I somehow have this slight feeling that 
  you gave up a bit too soon on the *why* does this happen? question.
 
 On a closely related note:  I just now submitted a patch to fix SMP-poweroff,
 by having it do disable_nonboot_cpus before doing poweroff.
 
 Which has led me to thinking..
 ..are similar precautions perhaps necessary for *all* ACPI BIOS calls?
 
 Because one never knows what the other CPUs are doing at the same time,
 and what the side effects may be on the ACPI BIOS functions.
 
 And also, I wonder if at a minimum we should be guaranteeing ACPI BIOS calls
 only ever happen from CPU#0 (or the boot CPU)?   Or do we do that already?

The ACPI calls are serialized in the kernel, AFAICT. But the fragile
situations (suspend, resume, shutdown, reboot) are probably those where
some BIOS implementations expect that certain things are not called or
not active.

tglx




Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)

2007-09-30 Thread Thomas Gleixner

On Sun, 30 Sep 2007, Andi Kleen wrote:


  OK, this explains 2) and 3). I just looked into the code and the logic
  vs. noapictimer on SMP is completely broken.

 noapictimer really doesn't make any sense on non SMP imho with the old
 timer architecture. That is why I never bothered to implement it.
 It's purely a UP hack.

It does not matter whether it makes sense to you or not. It is a command
line option which bricks systems. There is neither an explanation in
Documentation/kernel-parameters.txt nor a check in the code, which
disables this completely.

It makes a lot of sense even with the existing architecture. Trouble
shooting a box, where the local apic timer does not work correctly is not
an UP only requirement.

Yes, it is a hack, a _bad_ hack.

  ..and thanks for the explanation.

  Thanks for finding it so quickly guys. Sounds like this will be fixed
  properly in 2.6.24 with the x86 merge (which hopefully brings in the hrt
  patch too)

 There is nothing really to fix currently.  Clockevents changes behaviour
 majorly (always using APIC timers without irq 0 backups[1]) and that causes
 problems that need new workarounds and new fixes (surprise surprise!)

 That merge would probably fix a few more such "Thomas doesn't understand
 the code" bugs I guess because he hacks much more on i386 than x86-64;
 but if the overall result will be really better is a totally different
 question.

I understand the code quite well. I'm just surprised from time to time by
interesting hacks in the so clean x86_64 tree.

 [1]  Or let's call it I trust all my time to the CPU and no more southbridge
 aka put all eggs in one basket. Given the trends in CPU power saving that
 is a quite dangerous strategy.

No, it's not dangerous. We spent quite some time to make the clock events
layer flexible enough to handle the current problems and the design allows
adding more infrastructure when necessary. The maybe new (mis)features of
upcoming CPUs need to be addressed with or without clock events and they
need to be done carefully and not by random hacks.


  tglx


Re: [PATCH] robust futex thread exit race

2007-09-30 Thread Thomas Gleixner

On Sun, 30 Sep 2007, Ingo Molnar wrote:
 * Martin Schwidefsky [EMAIL PROTECTED] wrote:
 
  Hi Ingo,
  I finally found the bug that causes tst-robust8 from the glibc to fail
  on s390x. Turned out to be a common code problem with the processing of
  the robust futex list. The patch below fixes the bug for me.
 
 good catch! A quick preliminary review of your patch indicates it's fine 
 - and it might be v2.6.23 material.
 
 Acked-by: Ingo Molnar [EMAIL PROTECTED]

  Acked-by: Thomas Gleixner [EMAIL PROTECTED]
 
  Calling handle_futex_death in exit_robust_list for the different 
  robust mutexes of a thread basically frees the mutex. Another thread 
  might grab the lock immediately which updates the next pointer of the 
  mutex. fetch_robust_entry over the next pointer might therefore branch 
  into the robust mutex list of a different thread. This can cause two 
  problems: 1) some mutexes held by the dead thread are not getting 
  freed and 2) some mutexes held by a different thread are freed. The 
  next pointer needs to be read before calling handle_futex_death.
 
 nasty race... Ulrich, Thomas, do you concur?

Yes. Where do they sell those brown paperbags again ?
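
The ordering problem in the quoted description can be reproduced with an ordinary linked list. The following user-space C sketch is a toy model under stated assumptions: struct toy_entry, the three sample entries and all function names are hypothetical, and the relinking inside toy_handle_death() merely imitates another thread grabbing the lock right after it is released.

```c
#include <assert.h>
#include <stddef.h>

/* Toy list entry; a stand-in, not the kernel's robust list structures. */
struct toy_entry {
    struct toy_entry *next;
    int owner; /* 0 means released */
};

struct toy_entry dead_a, dead_b; /* held by the exiting thread (owner 1) */
struct toy_entry other;          /* held by a different thread (owner 2) */
struct toy_entry *foreign_list;  /* head of that other thread's list */

void toy_setup(void)
{
    other = (struct toy_entry){ NULL, 2 };
    dead_b = (struct toy_entry){ NULL, 1 };
    dead_a = (struct toy_entry){ &dead_b, 1 };
    foreign_list = &other;
}

/* Release the entry; a new owner may relink it into its own list at once. */
void toy_handle_death(struct toy_entry *e)
{
    assert(e != NULL);
    e->owner = 0;
    if (e != foreign_list)
        e->next = foreign_list; /* the new owner overwrote the link */
}

/* Racy order: release first, then follow a link that may have changed. */
void walk_release_then_read(struct toy_entry *e)
{
    while (e != NULL) {
        toy_handle_death(e);
        e = e->next;
    }
}

/* Safe order: fetch the next pointer before releasing the entry. */
void walk_read_then_release(struct toy_entry *e)
{
    while (e != NULL) {
        struct toy_entry *next = e->next;

        toy_handle_death(e);
        e = next;
    }
}
```

In the racy walk, dead_b is never released and other is released by mistake, which matches problems 1) and 2) in the description; the second walk releases exactly dead_a and dead_b.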

tglx



Re: x86-64 sporadic hang in 2.6.23rc7 and 2.6.22

2007-09-30 Thread Thomas Gleixner
On Sat, 29 Sep 2007, Helge Hafting wrote:
 Thomas Gleixner wrote:
   I have gone back to 2.6.22rc4, which seems to work.
   
   This is a single opteron, although on a dual-slot board.
   
  
  Can you switch to serial console, so we can get some information out of
  that box? Sysrq-B is working, so we can get info from other sysrq
  functions as well.

 I didn't need the serial - it crashes during console work too.
 I think a make clean was in progress at the time. There must be work going
 on in order to crash.
 
 This time 2.6.22rc4 died on me with a general protection fault
 
 I got two reports, the first one scrolled partially off screen but
 the whole trace was there:

That's why I asked for a serial console. That way we can get all the
information from the reports, including the register dumps.

 Then I got:
 spinlock lockup on cpu #0, kswapd 0/212

That's probably caused by the previous one.

   tglx


Re: Fwd: x86_64 and AMD with C1E

2007-10-01 Thread Thomas Gleixner
On Mon, 1 Oct 2007, Mikhail Kshevetskiy wrote:
 No, it boot and work normally. The only thing i bother, is the
 additional 260 timer interrupts per seconds.
 Here is short result:
 
 c1e enabled:
   -- power consumption about 23 watts
   -- there is only C1 power state enabled
   -- there are about 260 timer interrupts per seconds
 tested with  x86_64(2.6.22, 2.6.23-rc8, 2.6.23-rc8-hrt1 ),
 i386(2.6.21, 2.6.22, 2.6.23-rc5-hrt1)
 
 c1e disabled:
   -- power consumption about 27 watts
   -- there are no any power state enabled (including C1)
   -- there are no additional 260 timer interrupts per seconds
 tested with 2.6.23-rc6-hrt1/x86_64.
 
 I want to reduce the power consumption of my notebook. I see the 2 
 possibility:
   -- remove 260 additional timer interrupts (c1e enabled case )

There is work in progress on a patch which allows the HPET timers to be
used as per-CPU timers. This should solve the problem. Be patient.

Thanks,

tglx


Re: Fwd: x86_64 and AMD with C1E

2007-10-01 Thread Thomas Gleixner
On Mon, 1 Oct 2007, Andi Kleen wrote:
  There is work in progress on a patch which allows the HPET timers to be
  used as per-CPU timers. This should solve the problem. Be patient.
 
 Given that e.g. ICH8 only has 3 HPET timers that seems doubtful
 except for the special case of single-socket non hyper threaded dual core.
 You'll probably do a lot of broadcasting and IPI'ing still.
 
 Also you'll likely make user space unhappy which often requires 
 at least one free HPET timer for /dev/rtc. Ok I suppose that 
 could be replaced with a hrtimer.

Yes, we can replace rtc with a hrtimer. Also HPET can operate in non
legacy irq mode, so the legacy rtc is still available. So if the
number of hpet channels is greater/equal to the number of possible
CPUs it's perfectly fine and does not need IPI at all.

Thanks,

tglx



Re: Fwd: x86_64 and AMD with C1E

2007-10-01 Thread Thomas Gleixner
On Mon, 1 Oct 2007, Andi Kleen wrote:
  So if the
  number of hpet channels is greater/equal to the number of possible
  CPUs it's perfectly fine and does not need IPI at all.
 
 That is only a stop gap then. I don't see this being
 generally true in the future. e.g. Intel announced SMT will be soon 
 back so even a standard dual core would exceed it with
 current southbridges.

Sigh. We have to deal with current hardware and the problems of exactly 
that hardware. We have the possibility to solve problems and witchcrafting 
what might happen next is not a good reason not to do so.

 Also I'm not sure but I suspect non Intel HPETs have less than
 three timers. Certainly they generally miss the 64bitness.

Two timers are enough, and 64 bit is nice to have but not a requirement.

tglx



Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Thomas Gleixner
On Mon, 1 Oct 2007, Andi Kleen wrote:

 On Wednesday 26 September 2007 20:03:12 David Bahi wrote:
  Thanks to tglx and ghaskins for all the help in tracking down a very
  early nmi_watchdog crash on certain x86_64 machines.
 
 The patch is totally bogus. irq 0 doesn't say anything about whether
 the current CPU still works or not. You always need some local
 interrupt. This basically disables the NMI watchdog for the non boot CPUs.
 
 It's even wrong on i386 -- i wonder how that broken patch
 made it in there. I'll remove it there.

Right, it's wrong for the broadcast case, but simply removing it will
trigger false positives on the CPU which runs the broadcast timer. I'll
fix this properly.

tglx


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Thomas Gleixner
On Mon, 1 Oct 2007, Andi Kleen wrote:
 On Monday 01 October 2007 20:54:21 Thomas Gleixner wrote:
  On Mon, 1 Oct 2007, Andi Kleen wrote:
  
   On Wednesday 26 September 2007 20:03:12 David Bahi wrote:
    Thanks to tglx and ghaskins for all the help in tracking down a very
    early nmi_watchdog crash on certain x86_64 machines.
   
   The patch is totally bogus. irq 0 doesn't say anything about whether
   the current CPU still works or not. You always need some local
   interrupt. This basically disables the NMI watchdog for the non boot CPUs.
   
   It's even wrong on i386 -- i wonder how that broken patch
   made it in there. I'll remove it there.
  
  Right, it's wrong for the broadcast case, but simply removing it will
  trigger false positives on the CPU which runs the broadcast timer. I'll
  fix this properly.
 
 I already did this here by checking for cpu != 0. But it also needs either
 tracking or forbidding migrations of irq 0. I can take care of the patch.

I was thinking about the same fix. On i386 we already have the irq 
migration / balancing of irq 0 disabled. That's why we setup IRQ0 with
IRQ_NOBALANCING.

tglx






Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Thomas Gleixner
On Mon, 1 Oct 2007, Arjan van de Ven wrote:
   I already did this here by checking for cpu != 0. But it also needs
   either tracking or forbidding migrations of irq 0. I can take care
   of the patch.
  
  I was thinking about the same fix. On i386 we already have the irq 
  migration / balancing of irq 0 disabled. That's why we setup IRQ0 with
  IRQ_NOBALANCING.
 
 btw doing this is a problem if the user decides to hot(un)plug cpu 0...
 he then can't move the irqs away to do that

IRQ_NOBALANCING does not prevent cpu unplug. It moves the affinity to the
next CPU, but the check in the NMI watchdog for CPU == 0 would no longer
work.

Fix below. Post .23 material. I'll work out a separate one for the x86_64
clock events series.

tglx



[PATCH] i386: Fix nmi watchdog per cpu timer irq accounting

The clock events patches changed the interrupt distribution and the
local apic timer interrupt accounting for the broadcast case. The per
cpu clock events handler of the cpu, which runs the broadcast
interrupt, is executed directly in the broadcast irq context. This
does not invoke the low level arch code, which does the local apic
timer irq accounting. The work around for false positives in the nmi
watchdog was to add the irq0 interrupts (broadcast device) to the
local apic timer interrupts. This falsifies the results for the CPUs
which are not handling the broadcast interrupt, i.e. stuck CPUs might
be not detected, as noticed by Andi Kleen.

It would be possible to move the clockevents handler invocation of the
CPU which runs the broadcast interrupt into the tick device broadcast
function, but this would require to handle the per cpu device to this
function and perform the direct operation in the clock device specific
architecture code. Right now this is only i386 and x86_64, but MIPS is
on the way to use the broadcast mode as well.

Introduce a weak function tick_broadcast_account(), which allows x86
to adjust the local apic timer interrupt counter in the case when the
cpu local timer handler has been invoked. This keeps the cpu local
handler decision and invocation in the common code and allows x86 to
handle the nmi watchdog accounting correctly.
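
The accounting idea described above can be sketched in user space as follows. This is a toy model under stated assumptions: all toy_* names are hypothetical, the per-CPU counters are plain arrays, and the weak default is only mentioned in a comment because a weak and a strong definition of the same symbol need separate translation units.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 2

/* Toy per-CPU counters; stand-ins for the per-CPU timer irq statistics. */
unsigned int toy_apic_timer_irqs[TOY_NR_CPUS];
unsigned int toy_last_irq_sum[TOY_NR_CPUS];

/*
 * Architecture side of the hook. In the patch, common code carries an
 * empty weak default and x86 supplies this kind of override.
 */
void toy_broadcast_account(int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    toy_apic_timer_irqs[cpu]++;
}

/* Common code: run the local handler from broadcast context, then account. */
void toy_do_broadcast_local(int cpu, void (*handler)(void))
{
    handler();
    toy_broadcast_account(cpu);
}

/* Watchdog check: a CPU whose count did not move since last time looks stuck. */
bool toy_watchdog_cpu_stuck(int cpu)
{
    unsigned int sum = toy_apic_timer_irqs[cpu];
    bool stuck = (sum == toy_last_irq_sum[cpu]);

    toy_last_irq_sum[cpu] = sum;
    return stuck;
}

/* Placeholder for the per-CPU clock event handler. */
void toy_noop_handler(void)
{
}
```

Because only the CPU that actually ran its handler is credited, a CPU that receives no ticks still shows an unchanged count and is flagged by the check.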

Signed-off-by: Thomas Gleixner [EMAIL PROTECTED]

diff --git a/arch/i386/kernel/apic.c b/arch/i386/kernel/apic.c
index 3d67ae1..180dde8 100644
--- a/arch/i386/kernel/apic.c
+++ b/arch/i386/kernel/apic.c
@@ -283,6 +283,16 @@ static void lapic_timer_broadcast(cpumask_t mask)
 }
 
 /*
+ * Called from the broadcasting code to keep the local apic timer irq
+ * accounting straight for the nmi watchdog. Is called with interrupts
+ * disabled.
+ */
+void tick_broadcast_account(int cpu)
+{
+   per_cpu(irq_stat, cpu).apic_timer_irqs++;
+}
+
+/*
  * Setup the local APIC timer for this CPU. Copy the initilized values
  * of the boot CPU and register the clock event in the framework.
  */
diff --git a/arch/i386/kernel/nmi.c b/arch/i386/kernel/nmi.c
index c7227e2..03cdcaf 100644
--- a/arch/i386/kernel/nmi.c
+++ b/arch/i386/kernel/nmi.c
@@ -349,11 +349,7 @@ __kprobes int nmi_watchdog_tick(struct pt_regs * regs, unsigned reason)
cpu_clear(cpu, backtrace_mask);
}
 
-   /*
-* Take the local apic timer and PIT/HPET into account. We don't
-* know which one is active, when we have highres/dyntick on
-*/
-   sum = per_cpu(irq_stat, cpu).apic_timer_irqs + kstat_cpu(cpu).irqs[0];
+   sum = per_cpu(irq_stat, cpu).apic_timer_irqs;
 
/* if the none of the timers isn't firing, this cpu isn't doing much */
if (!touched && last_irq_sums[cpu] == sum) {
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 9a7252e..99b3021 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -73,6 +73,7 @@ static inline void tick_cancel_sched_timer(int cpu) { }
 # ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 extern struct tick_device *tick_get_broadcast_device(void);
 extern cpumask_t *tick_get_broadcast_mask(void);
+extern void tick_broadcast_account(int cpu);
 
 #  ifdef CONFIG_TICK_ONESHOT
 extern cpumask_t *tick_get_broadcast_oneshot_mask(void);
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 0962e05..43d0085 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -123,6 +123,16 @@ int tick_device_uses_broadcast(struct clock_event_device *dev, int cpu)
 }
 
 /*
+ * Weak function for cpu local interrupt accounting. Used by x86 to
+ * keep the lapic accounting correct for nmi_watchdog.
+ *
+ * Must be called with interrupts disabled.
+ */
+void __attribute__((weak)) tick_broadcast_account(int cpu)
+{
+}
+
+/*
  * Broadcast the event to the cpus, which are set in the mask
  */
 int tick_do_broadcast(cpumask_t mask)
@@ -137,6 +147,7 @@ int tick_do_broadcast(cpumask_t mask)
cpu_clear(cpu, mask);
td = per_cpu(tick_cpu_device, cpu);
td->evtdev->event_handler(td->evtdev

Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Thomas Gleixner
On Mon, 1 Oct 2007, Andi Kleen wrote:
 
  IRQ_NOBALANCING does not prevent cpu unplug. It moves the affinity to the
  next CPU, but the check in the NMI watchdog for CPU == 0 would no longer
  work.
 
 That cannot happen right now because cpu_disable() on both i386/x86-64
 reject CPU #0. So just setting IRQ_NOBALANCING is sufficient and both
 do that already. I was wrong earlier in being concerned about this.
 
   int tick_do_broadcast(cpumask_t mask)
  @@ -137,6 +147,7 @@ int tick_do_broadcast(cpumask_t mask)
  cpu_clear(cpu, mask);
  td = per_cpu(tick_cpu_device, cpu);
  td-evtdev-event_handler(td-evtdev);
  +   tick_broadcast_account(cpu);
 
 That would not handle the case with a single CPU running only
 irq  0 but not broadcasting I think.

Hmm. The only situation where this can happen is when you add
nolapic_timer to the command line on a single CPU system. We do not
register the lapic dummy clock event device then.

> I believe 
> ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/fix-watchdog
> is the correct fix

Yup, I completely missed the fact that we reject CPU#0 unplugging, so
your fix does indeed seem to be more correct and simpler.

OTOH, the accounting hook would allow us to remove the IRQ#0 - CPU#0
restriction. Not sure whether it's worth the trouble.
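
For illustration only, here is a minimal user-space sketch of what such an accounting hook does. It is not the kernel code: NR_CPUS, local_tick_count[], pending_ipi_mask, fake_event_handler() and the plain bitmask are stand-ins assumed for the example, and sketch_do_broadcast() / sketch_broadcast_account() merely model tick_do_broadcast() / tick_broadcast_account() from the patch.

```c
#include <assert.h>

#define NR_CPUS 4	/* assumption: small fixed CPU count for the sketch */

/* Assumed stand-in for the per-CPU tick counter the NMI watchdog reads. */
static unsigned int local_tick_count[NR_CPUS];
/* Assumed stand-in for "these CPUs would get the broadcast IPI". */
static unsigned int pending_ipi_mask;
static unsigned int handled_events[NR_CPUS];

/* Assumed stand-in for td->evtdev->event_handler(td->evtdev). */
static void fake_event_handler(int cpu)
{
	handled_events[cpu]++;
}

/*
 * Models the arch override of the weak hook: credit the tick to the
 * CPU that handled it, wherever the broadcast interrupt arrived.
 */
void sketch_broadcast_account(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	local_tick_count[cpu]++;
}

/*
 * Models the broadcast path: handle the event locally if this CPU is
 * in the mask, account for it, and leave the rest for an IPI.
 * Returns 1 if any CPU in the mask was served, 0 otherwise.
 */
int sketch_do_broadcast(unsigned int mask, int this_cpu)
{
	int ret = 0;

	assert(this_cpu >= 0 && this_cpu < NR_CPUS);

	if (mask & (1u << this_cpu)) {
		mask &= ~(1u << this_cpu);
		fake_event_handler(this_cpu);
		sketch_broadcast_account(this_cpu);
		ret = 1;
	}

	if (mask) {
		pending_ipi_mask = mask;	/* the real code sends an IPI here */
		ret = 1;
	}
	return ret;
}
```

Calling sketch_do_broadcast(0x6, 1) credits one tick to CPU 1 and leaves CPU 2 pending, so the accounting follows whichever CPU took the interrupt instead of assuming CPU#0.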

 tglx


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Thomas Gleixner
On Tue, 2 Oct 2007, Andi Kleen wrote:
> 
> > OTOH, the accounting hook would allow us to remove the IRQ#0 - CPU#0
> > restriction. Not sure whether it's worth the trouble.
> 
> Some SIS chipsets hang the machine when you migrate irq 0 to another
> CPU. It's better to keep that. Also I wouldn't be surprised if there are some
> other assumptions about this elsewhere.
> 
> Ok in theory it could be done only on SIS, but that probably would really
> not be worth the trouble

Agreed.

I just got a x8664-hrt report, where I found the following oddity:

 0:   1197 172881   IO-APIC-edge  timer

That's one of those infamous AMD C1E boxen. Strange, all my systems have 
IRQ#0 on CPU#0 and nowhere else. Any idea ?
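
To check other boxes for the same pattern, the per-CPU columns of a /proc/interrupts line can be pulled apart with a few lines of C. This is only a sketch: parse_irq_counts() and cpus_that_saw_irq() are hypothetical helper names, and the parser assumes the usual "<irq>: <count per CPU>... <chip> <name>" layout.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse the per-CPU counters from one /proc/interrupts line such as
 *   " 0:   1197 172881   IO-APIC-edge  timer"
 * Stores up to 'max' counters in 'counts' and returns how many were
 * found, or -1 if the line has no "<irq>:" prefix.
 */
int parse_irq_counts(const char *line, unsigned long *counts, int max)
{
	const char *p = strchr(line, ':');
	int n = 0;

	if (!p)
		return -1;
	p++;	/* skip the ':' after the IRQ number */

	while (n < max) {
		char *end;
		unsigned long val;

		while (isspace((unsigned char)*p))
			p++;
		if (!isdigit((unsigned char)*p))
			break;		/* reached the chip name column */
		val = strtoul(p, &end, 10);
		if (*end != '\0' && !isspace((unsigned char)*end))
			break;		/* token only starts with digits */
		counts[n++] = val;
		p = end;
	}
	return n;
}

/* Count how many CPUs have a non-zero counter for this IRQ. */
int cpus_that_saw_irq(const unsigned long *counts, int n)
{
	int i, hits = 0;

	for (i = 0; i < n; i++)
		if (counts[i] != 0)
			hits++;
	return hits;
}
```

For the line quoted above this finds two counters, 1197 and 172881, both non-zero, which is exactly the oddity: IRQ#0 was delivered on both CPUs.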

tglx



Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-02 Thread Thomas Gleixner
On Tue, 2 Oct 2007, Andi Kleen wrote:
> > Agreed.
> > 
> > I just got a x8664-hrt report, where I found the following oddity:
> > 
> >  0:   1197 172881   IO-APIC-edge  timer
> > 
> > That's one of those infamous AMD C1E boxen. Strange, all my systems have 
> > IRQ#0 on CPU#0 and nowhere else. Any idea ?
> 
> Hmm, in lowestpriority mode it would be possible that the APIC changes
> the CPU to #1 once; but IRQ 0 is always set to fixed mode. Also even 
> if that happens you should have them all on 1.
> 
> Maybe the chipset is just ignoring the IO-APIC configuration in this case?
> 
> Is it always the same chipset? Is it seen on i386 too?
> 
> The problem is really that if this happens it's more than the NMI watchdog
> that is broken. If you don't run an additional APIC timer interrupt on CPU #0
> it's possible that CPU #0 won't schedule at all.
>
> The only workaround for chipsets ignoring IRQ affinity would be to keep
> track on which CPU irq 0 happens and then restart APIC timer interrupts
> on the others (or send IPIs) as needed. But that would be fairly ugly.

The clock events code does handle this already. The broadcast interrupt 
can come in on any cpu. It's just the nmi watchdog which would be affected 
by that.
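
To show why only the watchdog cares, here is a rough user-space sketch of its per-CPU stall check. The struct, the array names and STALL_LIMIT are assumptions made for the example; only the "!touched and the interrupt sum did not move" test is taken from the watchdog code under discussion.

```c
#include <assert.h>

#define NR_CPUS 2	/* assumption: two CPUs, as on the reported box */
#define STALL_LIMIT 5	/* assumption: consecutive idle checks before alarm */

/* Assumed stand-in for the per-CPU timer interrupt statistics. */
struct cpu_irq_stats {
	unsigned int apic_timer_irqs;	/* local APIC timer ticks */
	unsigned int irq0_irqs;		/* IRQ#0 ticks seen on this CPU */
};

static unsigned int last_irq_sums[NR_CPUS];
static unsigned int alert_counter[NR_CPUS];

/*
 * One watchdog check for 'cpu'. Returns 1 once the CPU has shown no
 * timer activity for STALL_LIMIT consecutive checks, 0 otherwise.
 */
int watchdog_check(int cpu, const struct cpu_irq_stats *st, int touched)
{
	unsigned int sum;

	assert(cpu >= 0 && cpu < NR_CPUS);
	sum = st->apic_timer_irqs + st->irq0_irqs;

	/* no timer fired since the last check: this cpu isn't doing much */
	if (!touched && last_irq_sums[cpu] == sum) {
		if (++alert_counter[cpu] >= STALL_LIMIT)
			return 1;
	} else {
		last_irq_sums[cpu] = sum;
		alert_counter[cpu] = 0;
	}
	return 0;
}
```

A CPU whose ticks arrive through a path that bumps neither counter looks dead to this check even though it is running fine, which is why the accounting matters for the watchdog and not for the clock events code.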

tglx


Re: x86 patches was Re: -mm merge plans for 2.6.24

2007-10-02 Thread Thomas Gleixner
On Tue, 2 Oct 2007, Andi Kleen wrote:
> On Tue, Oct 02, 2007 at 09:37:03AM +0200, Ingo Molnar wrote:
> > 
> > * Andrew Morton [EMAIL PROTECTED] wrote:
> > 
> > > On 02 Oct 2007 08:18:17 +0200 Andi Kleen [EMAIL PROTECTED] wrote:
> > >
> > > > The clockevents patches are not included in this; but given the
> > > > recent trouble i'm not 100% sure they are even ready yet.
> > 
> > i'm curious, which recent trouble do you refer to? (The NMI watchdog
> > bug [which is off by default] was fixed quickly. The C1E bug was found
> > and fixed quickly. Anything else i missed?)
> 
> C1e and now the misrouted irq 0s Thomas reported.
> 
> Also i'm a little worried about the missing C1e check; it looks
> like it needs a re-review to make sure no other infrastructure was
> missing.

I had completely forgotten about the C1E problem, which we debugged
half a year ago on 32bit. I went through the other pitfalls we had in
32bit carefully again and they are all covered on 64 bit too. C1E was
the only one I missed.

The irq0 problem is not a real one. The clock events code has no irq0 
bound to cpuX assumption at all. The only affected part is nmi_watchdog 
and I have a fix ready to handle this even for the irq#0 not on cpu#0 
case.

  tglx


Re: Linux 2.6.23-rc9 and a heads-up for the 2.6.24 series..

2007-10-02 Thread Thomas Gleixner
On Mon, 1 Oct 2007, Linus Torvalds wrote:
> This is also a good time to warn about the fact that we're doing the x86
> merge very soon (as in the next day or two) after 2.6.23 is out, so if you
> have pending patches for the next series that touch arch/i386 or x86-64,
> you should get in touch with Thomas Gleixner and Ingo Molnar, who are the
> keepers of the merge scripts, and will help you prepare..
> 
> Doing it as early as possible in the 2.6.24-rc series (basically I'll do
> it first thing) will mean that we'll have the maximum amount of time to
> sort out any issues, and the thing is, Thomas and Ingo already have a tree
> ready to go, so people can check their work against that, and don't need
> to think that they have to do any fixups after it hits *my* tree. It would
> be much better if everybody was just ready for it, and not taken by
> surprise.
> 
> In other words, people who know they may be affected and would want to
> prepare can look at (for example)
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86.git x86
> 
> and generally get ready for the switch-over.

I have uploaded an update of the arch/x86 tree based on -rc9 to

git://git.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86.git x86

For convenience there is a patch fixup script which helps you to
convert pending patches against this tree.

http://userweb.kernel.org/~tglx/x86/x86-fixup-patches.py

It's generated from the merge script and fixes the namespace of
patches. There will still be some rejects which can not be fixed up
automatically, but this should be rare.

I did a test with Andrew's -mm series and only ~10 arch/x86 related
patches had rejects, out of 230+ patches, so the 100%-painless
conversion ratio is better than 95%. Those patches with rejects were
trivial to fix.

Usage: x86-fixup-patches.py sourcepatch destpatch

source and dest can be the same.

A helper script to convert complete quilt series is here:
http://userweb.kernel.org/~tglx/x86/fixupseries.sh

If there is anything we can help with during the transition, please do
not hesitate to ask.

Thanks,

Thomas, Ingo

