subject:"Strange hang on ia64 with CONFIG_PRINTK

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-20 Thread Roland Dreier

BTW, sorry I didn't get a chance to try some of the other debugging
you suggested yet... got busy with other stuff.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-20 Thread Roland Dreier

 > I just tried Ingo's patch[1] on a 2.6.25-rc2 kernel with printk timestamps
 > turned on ... and it booted just fine on my tiger4.  The default path
 > for non-boot cpus is from head.S to start_secondary(), and that
 > calls cpu_init() pretty quickly.  There shouldn't normally[2] be any
 > printk() calls on the non-boot cpu before it is safe to do so.

Hmm, current git + Ingo's patch still hangs on boot on my system
("Hitachi Cold Fusion-3e 2S4U" according to DMI).

 - R.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-20 Thread David Miller

From: Ingo Molnar <[EMAIL PROTECTED]>
Date: Wed, 20 Feb 2008 09:12:28 +0100

> 
> * Tony Luck <[EMAIL PROTECTED]> wrote:
> 
> > > Perhaps what is happening is that cpu0 comes online ... safely skips 
> > > over the early printk calls.  Calls cpu_init() which sets up the 
> > > resources *it* needs (ar.k3 points to per-cpu space), and then 
> > > executes sched_init() which marks it safe for all printk's. Then 
> > > cpu1 comes up and does a printk before it gets to cpu_init().
> > 
> > I just tried Ingo's patch[1] on a 2.6.25-rc2 kernel with printk 
> > timestamps turned on ... and it booted just fine on my tiger4.  The 
> > default path for non-boot cpus is from head.S to start_secondary(), 
> > and that calls cpu_init() pretty quickly.  There shouldn't normally[2] 
> > be any printk() calls on the non-boot cpu before it is safe to do so.
> 
> ok. I've queued up the patch again in sched.git, as a .25 fix. Can i add 
> your Tested-by and Acked-by tags?

Acked-by: David S. Miller <[EMAIL PROTECTED]>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-20 Thread Ingo Molnar


* Tony Luck <[EMAIL PROTECTED]> wrote:

> > Perhaps what is happening is that cpu0 comes online ... safely skips 
> > over the early printk calls.  Calls cpu_init() which sets up the 
> > resources *it* needs (ar.k3 points to per-cpu space), and then 
> > executes sched_init() which marks it safe for all printk's. Then 
> > cpu1 comes up and does a printk before it gets to cpu_init().
> 
> I just tried Ingo's patch[1] on a 2.6.25-rc2 kernel with printk 
> timestamps turned on ... and it booted just fine on my tiger4.  The 
> default path for non-boot cpus is from head.S to start_secondary(), 
> and that calls cpu_init() pretty quickly.  There shouldn't normally[2] 
> be any printk() calls on the non-boot cpu before it is safe to do so.

ok. I've queued up the patch again in sched.git, as a .25 fix. Can i add 
your Tested-by and Acked-by tags?

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-20 Thread Ingo Molnar


* Tony Luck [EMAIL PROTECTED] wrote:

  Perhaps what is happening is that cpu0 comes online ... safely skips 
  over the early printk calls.  Calls cpu_init() which sets up the 
  resources *it* needs (ar.k3 points to per-cpu space), and then 
  executes sched_init() which marks it safe for all printk's. Then 
  cpu1 comes up and does a printk before it gets to cpu_init().
 
 I just tried Ingo's patch[1] on a 2.6.25-rc2 kernel with printk 
 timestamps turned on ... and it booted just fine on my tiger4.  The 
 default path for non-boot cpus is from head.S to start_secondary(), 
 and that calls cpu_init() pretty quickly.  There shouldn't normally[2] 
 be any printk() calls on the non-boot cpu before it is safe to do so.

ok. I've queued up the patch again in sched.git, as a .25 fix. Can i add 
your Tested-by and Acked-by tags?

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-20 Thread David Miller

From: Ingo Molnar [EMAIL PROTECTED]
Date: Wed, 20 Feb 2008 09:12:28 +0100

 * Tony Luck [EMAIL PROTECTED] wrote:

   Perhaps what is happening is that cpu0 comes online ... safely skips 
   over the early printk calls.  Calls cpu_init() which sets up the 
   resources *it* needs (ar.k3 points to per-cpu space), and then 
   executes sched_init() which marks it safe for all printk's. Then 
   cpu1 comes up and does a printk before it gets to cpu_init().

  I just tried Ingo's patch[1] on a 2.6.25-rc2 kernel with printk 
  timestamps turned on ... and it booted just fine on my tiger4.  The 
  default path for non-boot cpus is from head.S to start_secondary(), 
  and that calls cpu_init() pretty quickly.  There shouldn't normally[2] 
  be any printk() calls on the non-boot cpu before it is safe to do so.

 ok. I've queued up the patch again in sched.git, as a .25 fix. Can i add 
 your Tested-by and Acked-by tags?

Acked-by: David S. Miller [EMAIL PROTECTED]
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-20 Thread Roland Dreier

  I just tried Ingo's patch[1] on a 2.6.25-rc2 kernel with printk timestamps
  turned on ... and it booted just fine on my tiger4.  The default path
  for non-boot cpus is from head.S to start_secondary(), and that
  calls cpu_init() pretty quickly.  There shouldn't normally[2] be any
  printk() calls on the non-boot cpu before it is safe to do so.

Hmm, current git + Ingo's patch still hangs on boot on my system
(Hitachi Cold Fusion-3e 2S4U according to DMI).

 - R.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-20 Thread Roland Dreier

BTW, sorry I didn't get a chance to try some of the other debugging
you suggested yet... got busy with other stuff.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-19 Thread Tony Luck

> Perhaps what is happening is that cpu0 comes online ... safely skips
> over the early printk calls.  Calls cpu_init() which sets up the resources
> *it* needs (ar.k3 points to per-cpu space), and then executes
> sched_init() which marks it safe for all printk's. Then cpu1 comes
> up and does a printk before it gets to cpu_init().

I just tried Ingo's patch[1] on a 2.6.25-rc2 kernel with printk timestamps
turned on ... and it booted just fine on my tiger4.  The default path
for non-boot cpus is from head.S to start_secondary(), and that
calls cpu_init() pretty quickly.  There shouldn't normally[2] be any
printk() calls on the non-boot cpu before it is safe to do so.

-Tony

[1] Attached
[2] If you set #define SMP_DEBUG in arch/ia64/kernel/smpboot.c
that enables at least one printk() that will cause problems if you have
also configured timestamps.
 kernel/sched.c |   14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

Index: linux-x86.q/kernel/sched.c
===
--- linux-x86.q.orig/kernel/sched.c
+++ linux-x86.q/kernel/sched.c
@@ -666,6 +666,8 @@ const_debug unsigned int sysctl_sched_rt
  */
 const_debug unsigned int sysctl_sched_rt_ratio = 62259;
 
+static __read_mostly int scheduler_running;
+
 /*
  * For kernel-internal use: high-speed (but slightly incorrect) per-cpu
  * clock constructed from sched_clock():
@@ -676,14 +678,16 @@ unsigned long long cpu_clock(int cpu)
 	unsigned long flags;
 	struct rq *rq;
 
-	local_irq_save(flags);
-	rq = cpu_rq(cpu);
 	/*
 	 * Only call sched_clock() if the scheduler has already been
 	 * initialized (some code might call cpu_clock() very early):
 	 */
-	if (rq->idle)
-		update_rq_clock(rq);
+	if (unlikely(!scheduler_running))
+		return 0;
+
+	local_irq_save(flags);
+	rq = cpu_rq(cpu);
+	update_rq_clock(rq);
 	now = rq->clock;
 	local_irq_restore(flags);
 
@@ -7255,6 +7259,8 @@ void __init sched_init(void)
 	 * During early bootup we pretend to be a normal task:
 	 */
 	current->sched_class = _sched_class;
+
+	scheduler_running = 1;
 }
 
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
--

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-19 Thread Tony Luck

 Perhaps what is happening is that cpu0 comes online ... safely skips
 over the early printk calls.  Calls cpu_init() which sets up the resources
 *it* needs (ar.k3 points to per-cpu space), and then executes
 sched_init() which marks it safe for all printk's. Then cpu1 comes
 up and does a printk before it gets to cpu_init().

I just tried Ingo's patch[1] on a 2.6.25-rc2 kernel with printk timestamps
turned on ... and it booted just fine on my tiger4.  The default path
for non-boot cpus is from head.S to start_secondary(), and that
calls cpu_init() pretty quickly.  There shouldn't normally[2] be any
printk() calls on the non-boot cpu before it is safe to do so.

-Tony

[1] Attached
[2] If you set #define SMP_DEBUG in arch/ia64/kernel/smpboot.c
that enables at least one printk() that will cause problems if you have
also configured timestamps.
 kernel/sched.c |   14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

Index: linux-x86.q/kernel/sched.c
===
--- linux-x86.q.orig/kernel/sched.c
+++ linux-x86.q/kernel/sched.c
@@ -666,6 +666,8 @@ const_debug unsigned int sysctl_sched_rt
  */
 const_debug unsigned int sysctl_sched_rt_ratio = 62259;
 
+static __read_mostly int scheduler_running;
+
 /*
  * For kernel-internal use: high-speed (but slightly incorrect) per-cpu
  * clock constructed from sched_clock():
@@ -676,14 +678,16 @@ unsigned long long cpu_clock(int cpu)
 	unsigned long flags;
 	struct rq *rq;
 
-	local_irq_save(flags);
-	rq = cpu_rq(cpu);
 	/*
 	 * Only call sched_clock() if the scheduler has already been
 	 * initialized (some code might call cpu_clock() very early):
 	 */
-	if (rq-idle)
-		update_rq_clock(rq);
+	if (unlikely(!scheduler_running))
+		return 0;
+
+	local_irq_save(flags);
+	rq = cpu_rq(cpu);
+	update_rq_clock(rq);
 	now = rq-clock;
 	local_irq_restore(flags);
 
@@ -7255,6 +7259,8 @@ void __init sched_init(void)
 	 * During early bootup we pretend to be a normal task:
 	 */
 	current-sched_class = fair_sched_class;
+
+	scheduler_running = 1;
 }
 
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
--

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Tony Luck

> We *ought* to be safe after cpu_init() ... which is called from setup_arch(),
> which is several calls before sched_init().

Perhaps what is happening is that cpu0 comes online ... safely skips
over the early printk calls.  Calls cpu_init() which sets up the resources
*it* needs (ar.k3 points to per-cpu space), and then executes
sched_init() which marks it safe for all printk's. Then cpu1 comes
up and does a printk before it gets to cpu_init().

Try with Ingo patch and CONFIG_SMP=n to see if you can come
up on a uni-processor.

-Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

RE: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Luck, Tony

> I guess sched_init() is too early... it does seem really strange to
> me, but I just double checked with Ingo's patch and it does indeed
> hang.  The slow way to make progress is just to go through
> start_kernel() line-by-line and enable cpu_clock() at each stage, and
> see where it stops hanging.  I'll give that a shot as a background
> process (my ia64 box takes quite a while to boot, so each test takes a
> long time but requires very little of my attention).

We *ought* to be safe after cpu_init() ... which is called from setup_arch(),
which is several calls before sched_init().

Thanks for looking at this though ... my ability to test just went
away for a while ... some lab re-organization means all my systems
just got powered off and removed from their rack so the rack can be
moved to a new location.

-Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Roland Dreier

 > >  The strange thing is that Ingo's patch to make cpu_clock() a NOP until
 > >  after sched_init() didn't fix things for me...

 > Very strange.  I threw in an output line counter into the printk code() ... 
 > if I
 > disable the timestamps for the first 30 lines, then everything is good (so 
 > the
 > basic timestamping code does still work on ia64). But I would have thought
 > that Ingo's delay until sched_init() ought to be long enough too. Clearly I
 > need to figure out exactly what needs to be initialized to prevent the
 > hang/crash.

I guess sched_init() is too early... it does seem really strange to
me, but I just double checked with Ingo's patch and it does indeed
hang.  The slow way to make progress is just to go through
start_kernel() line-by-line and enable cpu_clock() at each stage, and
see where it stops hanging.  I'll give that a shot as a background
process (my ia64 box takes quite a while to boot, so each test takes a
long time but requires very little of my attention).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Tony Luck

On Wed, Feb 13, 2008 at 7:47 PM, Roland Dreier <[EMAIL PROTECTED]> wrote:
>  The strange thing is that Ingo's patch to make cpu_clock() a NOP until
>  after sched_init() didn't fix things for me...

Very strange.  I threw in an output line counter into the printk code() ... if I
disable the timestamps for the first 30 lines, then everything is good (so the
basic timestamping code does still work on ia64). But I would have thought
that Ingo's delay until sched_init() ought to be long enough too. Clearly I
need to figure out exactly what needs to be initialized to prevent the
hang/crash.

-Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

RE: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Luck, Tony

> Just curious -- can you reproduce the same problem with
> CONFIG_PRINTK_TIME as I'm seeing?

Yes I can reproduce this (on latest Linus tree).  System
dies with no console output ... looks like the boot cpu
may have taken a machine check (it isn't responding to my
debugger).

-Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

RE: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Luck, Tony

 Just curious -- can you reproduce the same problem with
 CONFIG_PRINTK_TIME as I'm seeing?

Yes I can reproduce this (on latest Linus tree).  System
dies with no console output ... looks like the boot cpu
may have taken a machine check (it isn't responding to my
debugger).

-Tony
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Tony Luck

On Wed, Feb 13, 2008 at 7:47 PM, Roland Dreier [EMAIL PROTECTED] wrote:
  The strange thing is that Ingo's patch to make cpu_clock() a NOP until
  after sched_init() didn't fix things for me...

Very strange.  I threw in an output line counter into the printk code() ... if I
disable the timestamps for the first 30 lines, then everything is good (so the
basic timestamping code does still work on ia64). But I would have thought
that Ingo's delay until sched_init() ought to be long enough too. Clearly I
need to figure out exactly what needs to be initialized to prevent the
hang/crash.

-Tony
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Roland Dreier

The strange thing is that Ingo's patch to make cpu_clock() a NOP until
after sched_init() didn't fix things for me...

  Very strange.  I threw in an output line counter into the printk code() ... 
  if I
  disable the timestamps for the first 30 lines, then everything is good (so 
  the
  basic timestamping code does still work on ia64). But I would have thought
  that Ingo's delay until sched_init() ought to be long enough too. Clearly I
  need to figure out exactly what needs to be initialized to prevent the
  hang/crash.

I guess sched_init() is too early... it does seem really strange to
me, but I just double checked with Ingo's patch and it does indeed
hang.  The slow way to make progress is just to go through
start_kernel() line-by-line and enable cpu_clock() at each stage, and
see where it stops hanging.  I'll give that a shot as a background
process (my ia64 box takes quite a while to boot, so each test takes a
long time but requires very little of my attention).
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

RE: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Luck, Tony

 I guess sched_init() is too early... it does seem really strange to
 me, but I just double checked with Ingo's patch and it does indeed
 hang.  The slow way to make progress is just to go through
 start_kernel() line-by-line and enable cpu_clock() at each stage, and
 see where it stops hanging.  I'll give that a shot as a background
 process (my ia64 box takes quite a while to boot, so each test takes a
 long time but requires very little of my attention).

We *ought* to be safe after cpu_init() ... which is called from setup_arch(),
which is several calls before sched_init().

Thanks for looking at this though ... my ability to test just went
away for a while ... some lab re-organization means all my systems
just got powered off and removed from their rack so the rack can be
moved to a new location.

-Tony
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Tony Luck

 We *ought* to be safe after cpu_init() ... which is called from setup_arch(),
 which is several calls before sched_init().

Perhaps what is happening is that cpu0 comes online ... safely skips
over the early printk calls.  Calls cpu_init() which sets up the resources
*it* needs (ar.k3 points to per-cpu space), and then executes
sched_init() which marks it safe for all printk's. Then cpu1 comes
up and does a printk before it gets to cpu_init().

Try with Ingo patch and CONFIG_SMP=n to see if you can come
up on a uni-processor.

-Tony
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Roland Dreier

 > I'll take a closer look at what is needed tomorrow.

Hi Tony,

Just curious -- can you reproduce the same problem with
CONFIG_PRINTK_TIME as I'm seeing?  If not I'm happy to test anything
you want to try.

The strange thing is that Ingo's patch to make cpu_clock() a NOP until
after sched_init() didn't fix things for me...

 - R.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Tony Luck

> That's right.  I thought you guys had something that would handle that
> early on, but looking at how the trick works in the vmlinux.lds.S ia64
> uses that isn't the case.

We try to get things set up pertty early ... but I agree this is
fragile.  Adding code to printk() to not provide a timestamp
before some safe point in boot is a workaround to the
current problem.  But it may come back to haunt us if
other per-cpu data is added that needs to be accessed
early during boot.

There are some changes going on at the moment on how
we allocate the space for the per-cpu area.  It is likely that
for a non-boot cpu we might be able to get everything that
we need for per-cpu access to work done in head.S before
we can get to any C code.  Boot cpu may be harder unless
we statically allocate space for its per-cpu area in
vmlinux.lds.S

I'll take a closer look at what is needed tomorrow.

-Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread David Miller

From: "Tony Luck" <[EMAIL PROTECTED]>
Date: Wed, 13 Feb 2008 16:59:30 -0800

> > It is legal to access per-cpu data as early as you like,
> > it just evaluates to the static copy in the per-cpu section
> > of the kernel image until the per-cpu areas are setup.
> 
> On ia64 per-cpu variables are mapped into the top 64K
> of the address space.  Accessing them before the
> resources to handle the access to that virtual address
> have been set up would cause problems.

That's right.  I thought you guys had something that would handle that
early on, but looking at how the trick works in the vmlinux.lds.S ia64
uses that isn't the case.

I guess we really can't do per-cpu accesses early on.

This is extremely fragile, we should find some way to formalize this
to make accidental early per-cpu references trap at build time or
similar.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Tony Luck

> It is legal to access per-cpu data as early as you like,
> it just evaluates to the static copy in the per-cpu section
> of the kernel image until the per-cpu areas are setup.

On ia64 per-cpu variables are mapped into the top 64K
of the address space.  Accessing them before the
resources to handle the access to that virtual address
have been set up would cause problems.

-Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Roland Dreier

 > > so .. how about the patch below? Note that we already had an "early 
 > > bootup" special (the rq->idle check), it's now just made explicit via 
 > > the scheduler_running flag.
 > 
 > the one below even builds. (untested otherwise)

I just tried this... it doesn't work on top of current git (same
behavior: hang before the kernel prints anything).

I also tried reverting 19ef9309 on top of a checkout of the tree state
at 19ef9309, and that hung before printing anything too.

 - R.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Ingo Molnar

* David Miller <[EMAIL PROTECTED]> wrote:

> I see nothing illegal in what cpu_clock() is doing, that's why I fixed 
> the sparc64 per-cpu problem I ran into since sparc64 was doing the 
> wrong thing when booted on a non-zero cpu.

okay. I'm just somewhat uneasy about potentially losing the ability to 
printk. printk must stay simple and dependable. Maybe PRINTK_TIMESTAMPs 
should be delayed to at the point when system_state == SYSTEM_RUNNING.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Ingo Molnar


* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> so .. how about the patch below? Note that we already had an "early 
> bootup" special (the rq->idle check), it's now just made explicit via 
> the scheduler_running flag.

the one below even builds. (untested otherwise)

Ingo

--->
Subject: sched: make sched_clock() early-bootup capable
From: Ingo Molnar <[EMAIL PROTECTED]>
Date: Wed Feb 13 13:49:36 CET 2008

do not call sched_clock() too early. Not only might rq->idle
not be set up - but pure per-cpu data might not be accessible
either.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 kernel/sched.c |   14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

Index: linux-x86.q/kernel/sched.c
===
--- linux-x86.q.orig/kernel/sched.c
+++ linux-x86.q/kernel/sched.c
@@ -666,6 +666,8 @@ const_debug unsigned int sysctl_sched_rt
  */
 const_debug unsigned int sysctl_sched_rt_ratio = 62259;
 
+static __read_mostly int scheduler_running;
+
 /*
  * For kernel-internal use: high-speed (but slightly incorrect) per-cpu
  * clock constructed from sched_clock():
@@ -676,14 +678,16 @@ unsigned long long cpu_clock(int cpu)
unsigned long flags;
struct rq *rq;
 
-   local_irq_save(flags);
-   rq = cpu_rq(cpu);
/*
 * Only call sched_clock() if the scheduler has already been
 * initialized (some code might call cpu_clock() very early):
 */
-   if (rq->idle)
-   update_rq_clock(rq);
+   if (unlikely(!scheduler_running))
+   return 0;
+
+   local_irq_save(flags);
+   rq = cpu_rq(cpu);
+   update_rq_clock(rq);
now = rq->clock;
local_irq_restore(flags);
 
@@ -7255,6 +7259,8 @@ void __init sched_init(void)
 * During early bootup we pretend to be a normal task:
 */
current->sched_class = _sched_class;
+
+   scheduler_running = 1;
 }
 
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread David Miller

From: Ingo Molnar <[EMAIL PROTECTED]>
Date: Wed, 13 Feb 2008 13:57:25 +0100

> so .. how about the patch below? Note that we already had an "early 
> bootup" special (the rq->idle check), it's now just made explicit via 
> the scheduler_running flag.

I don't see what the problem is.

It is legal to access per-cpu data as early as you like,
it just evaluates to the static copy in the per-cpu section
of the kernel image until the per-cpu areas are setup.

rq->idle should also be zero this early as well, that's
also legal to rely upon

I see nothing illegal in what cpu_clock() is doing, that's
why I fixed the sparc64 per-cpu problem I ran into since
sparc64 was doing the wrong thing when booted on a non-zero
cpu.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Ingo Molnar


* David Miller <[EMAIL PROTECTED]> wrote:

> The kernel now derefernces per-cpu variables very early, essentially 
> in the very first printk() (via printk()'s call to cpu_clock()).
> 
> This bit me on sparc64 because of how I do the per-cpu address 
> formation.  If I booted on a non-zero cpuid things would explode.
> 
> You might be hitting something similar.

hm. But the raw sched_clock() use was wrong. We could either go back to 
jiffies (which is certainly the simplest and was used before 
printk_clock() was introduced which incorrectly relied on sched_clock()) 
- but that loses precision and the same issue will re-visit us once we 
go totally tickless and start to map jiffies to GTOD ...

so .. how about the patch below? Note that we already had an "early 
bootup" special (the rq->idle check), it's now just made explicit via 
the scheduler_running flag.

Ingo

-->
Subject: sched: make sched_clock() early-bootup capable
From: Ingo Molnar <[EMAIL PROTECTED]>
Date: Wed Feb 13 13:49:36 CET 2008

do not call sched_clock() too early. Not only might rq->idle
not be set up - but pure per-cpu data might not be accessible
either.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 kernel/sched.c |   14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

Index: linux-x86.q/kernel/sched.c
===
--- linux-x86.q.orig/kernel/sched.c
+++ linux-x86.q/kernel/sched.c
@@ -666,6 +666,8 @@ const_debug unsigned int sysctl_sched_rt
  */
 const_debug unsigned int sysctl_sched_rt_ratio = 62259;
 
+static __readmostly int scheduler_running;
+
 /*
  * For kernel-internal use: high-speed (but slightly incorrect) per-cpu
  * clock constructed from sched_clock():
@@ -676,14 +678,16 @@ unsigned long long cpu_clock(int cpu)
unsigned long flags;
struct rq *rq;
 
-   local_irq_save(flags);
-   rq = cpu_rq(cpu);
/*
 * Only call sched_clock() if the scheduler has already been
 * initialized (some code might call cpu_clock() very early):
 */
-   if (rq->idle)
-   update_rq_clock(rq);
+   if (unlikely(!scheduler_running))
+   return 0;
+
+   local_irq_save(flags);
+   rq = cpu_rq(cpu);
+   update_rq_clock(rq);
now = rq->clock;
local_irq_restore(flags);
 
@@ -7255,6 +7259,8 @@ void __init sched_init(void)
 * During early bootup we pretend to be a normal task:
 */
current->sched_class = _sched_class;
+
+   scheduler_running = 1;
 }
 
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Ingo Molnar


* David Miller [EMAIL PROTECTED] wrote:

 The kernel now derefernces per-cpu variables very early, essentially 
 in the very first printk() (via printk()'s call to cpu_clock()).
 
 This bit me on sparc64 because of how I do the per-cpu address 
 formation.  If I booted on a non-zero cpuid things would explode.
 
 You might be hitting something similar.

hm. But the raw sched_clock() use was wrong. We could either go back to 
jiffies (which is certainly the simplest and was used before 
printk_clock() was introduced which incorrectly relied on sched_clock()) 
- but that loses precision and the same issue will re-visit us once we 
go totally tickless and start to map jiffies to GTOD ...

so .. how about the patch below? Note that we already had an early 
bootup special (the rq-idle check), it's now just made explicit via 
the scheduler_running flag.

Ingo

--
Subject: sched: make sched_clock() early-bootup capable
From: Ingo Molnar [EMAIL PROTECTED]
Date: Wed Feb 13 13:49:36 CET 2008

do not call sched_clock() too early. Not only might rq-idle
not be set up - but pure per-cpu data might not be accessible
either.

Signed-off-by: Ingo Molnar [EMAIL PROTECTED]
---
 kernel/sched.c |   14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

Index: linux-x86.q/kernel/sched.c
===
--- linux-x86.q.orig/kernel/sched.c
+++ linux-x86.q/kernel/sched.c
@@ -666,6 +666,8 @@ const_debug unsigned int sysctl_sched_rt
  */
 const_debug unsigned int sysctl_sched_rt_ratio = 62259;
 
+static __readmostly int scheduler_running;
+
 /*
  * For kernel-internal use: high-speed (but slightly incorrect) per-cpu
  * clock constructed from sched_clock():
@@ -676,14 +678,16 @@ unsigned long long cpu_clock(int cpu)
unsigned long flags;
struct rq *rq;
 
-   local_irq_save(flags);
-   rq = cpu_rq(cpu);
/*
 * Only call sched_clock() if the scheduler has already been
 * initialized (some code might call cpu_clock() very early):
 */
-   if (rq-idle)
-   update_rq_clock(rq);
+   if (unlikely(!scheduler_running))
+   return 0;
+
+   local_irq_save(flags);
+   rq = cpu_rq(cpu);
+   update_rq_clock(rq);
now = rq-clock;
local_irq_restore(flags);
 
@@ -7255,6 +7259,8 @@ void __init sched_init(void)
 * During early bootup we pretend to be a normal task:
 */
current-sched_class = fair_sched_class;
+
+   scheduler_running = 1;
 }
 
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Ingo Molnar


* Ingo Molnar [EMAIL PROTECTED] wrote:

 so .. how about the patch below? Note that we already had an early 
 bootup special (the rq-idle check), it's now just made explicit via 
 the scheduler_running flag.

the one below even builds. (untested otherwise)

Ingo

---
Subject: sched: make sched_clock() early-bootup capable
From: Ingo Molnar [EMAIL PROTECTED]
Date: Wed Feb 13 13:49:36 CET 2008

do not call sched_clock() too early. Not only might rq-idle
not be set up - but pure per-cpu data might not be accessible
either.

Signed-off-by: Ingo Molnar [EMAIL PROTECTED]
---
 kernel/sched.c |   14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

Index: linux-x86.q/kernel/sched.c
===
--- linux-x86.q.orig/kernel/sched.c
+++ linux-x86.q/kernel/sched.c
@@ -666,6 +666,8 @@ const_debug unsigned int sysctl_sched_rt
  */
 const_debug unsigned int sysctl_sched_rt_ratio = 62259;
 
+static __read_mostly int scheduler_running;
+
 /*
  * For kernel-internal use: high-speed (but slightly incorrect) per-cpu
  * clock constructed from sched_clock():
@@ -676,14 +678,16 @@ unsigned long long cpu_clock(int cpu)
unsigned long flags;
struct rq *rq;
 
-   local_irq_save(flags);
-   rq = cpu_rq(cpu);
/*
 * Only call sched_clock() if the scheduler has already been
 * initialized (some code might call cpu_clock() very early):
 */
-   if (rq-idle)
-   update_rq_clock(rq);
+   if (unlikely(!scheduler_running))
+   return 0;
+
+   local_irq_save(flags);
+   rq = cpu_rq(cpu);
+   update_rq_clock(rq);
now = rq-clock;
local_irq_restore(flags);
 
@@ -7255,6 +7259,8 @@ void __init sched_init(void)
 * During early bootup we pretend to be a normal task:
 */
current-sched_class = fair_sched_class;
+
+   scheduler_running = 1;
 }
 
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread David Miller

From: Ingo Molnar [EMAIL PROTECTED]
Date: Wed, 13 Feb 2008 13:57:25 +0100

 so .. how about the patch below? Note that we already had an early 
 bootup special (the rq-idle check), it's now just made explicit via 
 the scheduler_running flag.

I don't see what the problem is.

It is legal to access per-cpu data as early as you like,
it just evaluates to the static copy in the per-cpu section
of the kernel image until the per-cpu areas are setup.

rq-idle should also be zero this early as well, that's
also legal to rely upon

I see nothing illegal in what cpu_clock() is doing, that's
why I fixed the sparc64 per-cpu problem I ran into since
sparc64 was doing the wrong thing when booted on a non-zero
cpu.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Ingo Molnar


* David Miller [EMAIL PROTECTED] wrote:

 I see nothing illegal in what cpu_clock() is doing, that's why I fixed 
 the sparc64 per-cpu problem I ran into since sparc64 was doing the 
 wrong thing when booted on a non-zero cpu.

okay. I'm just somewhat uneasy about potentially losing the ability to 
printk. printk must stay simple and dependable. Maybe PRINTK_TIMESTAMPs 
should be delayed to at the point when system_state == SYSTEM_RUNNING.

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Roland Dreier

   so .. how about the patch below? Note that we already had an early 
   bootup special (the rq-idle check), it's now just made explicit via 
   the scheduler_running flag.
  
  the one below even builds. (untested otherwise)

I just tried this... it doesn't work on top of current git (same
behavior: hang before the kernel prints anything).

I also tried reverting 19ef9309 on top of a checkout of the tree state
at 19ef9309, and that hung before printing anything too.

 - R.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Tony Luck

 It is legal to access per-cpu data as early as you like,
 it just evaluates to the static copy in the per-cpu section
 of the kernel image until the per-cpu areas are setup.

On ia64 per-cpu variables are mapped into the top 64K
of the address space.  Accessing them before the
resources to handle the access to that virtual address
have been set up would cause problems.

-Tony
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread David Miller

From: Tony Luck [EMAIL PROTECTED]
Date: Wed, 13 Feb 2008 16:59:30 -0800

  It is legal to access per-cpu data as early as you like,
  it just evaluates to the static copy in the per-cpu section
  of the kernel image until the per-cpu areas are setup.

 On ia64 per-cpu variables are mapped into the top 64K
 of the address space.  Accessing them before the
 resources to handle the access to that virtual address
 have been set up would cause problems.

That's right.  I thought you guys had something that would handle that
early on, but looking at how the trick works in the vmlinux.lds.S ia64
uses that isn't the case.

I guess we really can't do per-cpu accesses early on.

This is extremely fragile, we should find some way to formalize this
to make accidental early per-cpu references trap at build time or
similar.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Tony Luck

 That's right.  I thought you guys had something that would handle that
 early on, but looking at how the trick works in the vmlinux.lds.S ia64
 uses that isn't the case.

We try to get things set up pertty early ... but I agree this is
fragile.  Adding code to printk() to not provide a timestamp
before some safe point in boot is a workaround to the
current problem.  But it may come back to haunt us if
other per-cpu data is added that needs to be accessed
early during boot.

There are some changes going on at the moment on how
we allocate the space for the per-cpu area.  It is likely that
for a non-boot cpu we might be able to get everything that
we need for per-cpu access to work done in head.S before
we can get to any C code.  Boot cpu may be harder unless
we statically allocate space for its per-cpu area in
vmlinux.lds.S

I'll take a closer look at what is needed tomorrow.

-Tony
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Roland Dreier

  I'll take a closer look at what is needed tomorrow.

Hi Tony,

Just curious -- can you reproduce the same problem with
CONFIG_PRINTK_TIME as I'm seeing?  If not I'm happy to test anything
you want to try.

The strange thing is that Ingo's patch to make cpu_clock() a NOP until
after sched_init() didn't fix things for me...

 - R.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-12 Thread David Miller

From: Roland Dreier <[EMAIL PROTECTED]>
Date: Tue, 12 Feb 2008 22:24:05 -0800

> I'm seeing a strange hang with current git (head 96b5a46e) on an ia64
> box -- an Intel SDV with 2 dual core hyperthreaded Itanium 2 CPUs (so
> 8 logical CPUs to the kernel).  It hangs without printing anything
> ("Uncompressing Linux... done" from ELILO is the last thing I see) if
> I have CONFIG_PRINTK_TIME=y; it works fine with CONFIG_PRINTK_TIME=n.
> 
> The really strange thing is that I have bisected this down to 326e96b9
> ("printk: revert ktime_get() timestamps"), and verified that if revert
> this one patch on top of my current git tree, then the kernel boots
> fine with CONFIG_PRINTK_TIME=y.  The strange thing is that I have also
> checked that the real v2.6.24 kernel boots fine on this system, and as
> far as I can tell, 2.6.24 didn't have the commit that 326e96b9 reverts
> (19ef9309), so there is some interaction with another patch that made
> 19ef9309 necessary on my system.
> 
> Any good idea how to debug this, given that the broken kernels don't
> give any output at all?

The kernel now derefernces per-cpu variables very early, essentially
in the very first printk() (via printk()'s call to cpu_clock()).

This bit me on sparc64 because of how I do the per-cpu address
formation.  If I booted on a non-zero cpuid things would explode.

You might be hitting something similar.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-12 Thread Roland Dreier

I'm seeing a strange hang with current git (head 96b5a46e) on an ia64
box -- an Intel SDV with 2 dual core hyperthreaded Itanium 2 CPUs (so
8 logical CPUs to the kernel).  It hangs without printing anything
("Uncompressing Linux... done" from ELILO is the last thing I see) if
I have CONFIG_PRINTK_TIME=y; it works fine with CONFIG_PRINTK_TIME=n.

The really strange thing is that I have bisected this down to 326e96b9
("printk: revert ktime_get() timestamps"), and verified that if revert
this one patch on top of my current git tree, then the kernel boots
fine with CONFIG_PRINTK_TIME=y.  The strange thing is that I have also
checked that the real v2.6.24 kernel boots fine on this system, and as
far as I can tell, 2.6.24 didn't have the commit that 326e96b9 reverts
(19ef9309), so there is some interaction with another patch that made
19ef9309 necessary on my system.

Any good idea how to debug this, given that the broken kernels don't
give any output at all?

Thanks,
  Roland
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-12 Thread Roland Dreier

I'm seeing a strange hang with current git (head 96b5a46e) on an ia64
box -- an Intel SDV with 2 dual core hyperthreaded Itanium 2 CPUs (so
8 logical CPUs to the kernel).  It hangs without printing anything
(Uncompressing Linux... done from ELILO is the last thing I see) if
I have CONFIG_PRINTK_TIME=y; it works fine with CONFIG_PRINTK_TIME=n.

The really strange thing is that I have bisected this down to 326e96b9
(printk: revert ktime_get() timestamps), and verified that if revert
this one patch on top of my current git tree, then the kernel boots
fine with CONFIG_PRINTK_TIME=y.  The strange thing is that I have also
checked that the real v2.6.24 kernel boots fine on this system, and as
far as I can tell, 2.6.24 didn't have the commit that 326e96b9 reverts
(19ef9309), so there is some interaction with another patch that made
19ef9309 necessary on my system.

Any good idea how to debug this, given that the broken kernels don't
give any output at all?

Thanks,
  Roland
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-12 Thread David Miller

From: Roland Dreier [EMAIL PROTECTED]
Date: Tue, 12 Feb 2008 22:24:05 -0800

 I'm seeing a strange hang with current git (head 96b5a46e) on an ia64
 box -- an Intel SDV with 2 dual core hyperthreaded Itanium 2 CPUs (so
 8 logical CPUs to the kernel).  It hangs without printing anything
 (Uncompressing Linux... done from ELILO is the last thing I see) if
 I have CONFIG_PRINTK_TIME=y; it works fine with CONFIG_PRINTK_TIME=n.

 The really strange thing is that I have bisected this down to 326e96b9
 (printk: revert ktime_get() timestamps), and verified that if revert
 this one patch on top of my current git tree, then the kernel boots
 fine with CONFIG_PRINTK_TIME=y.  The strange thing is that I have also
 checked that the real v2.6.24 kernel boots fine on this system, and as
 far as I can tell, 2.6.24 didn't have the commit that 326e96b9 reverts
 (19ef9309), so there is some interaction with another patch that made
 19ef9309 necessary on my system.

 Any good idea how to debug this, given that the broken kernels don't
 give any output at all?

The kernel now derefernces per-cpu variables very early, essentially
in the very first printk() (via printk()'s call to cpu_clock()).

This bit me on sparc64 because of how I do the per-cpu address
formation.  If I booted on a non-zero cpuid things would explode.

You might be hitting something similar.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

42 matches

Mail list logo