Re: [REGRESSION] Boot hang with 939f04bec printk: enable interrupts before calling console_trylock_for_printk()

2014-07-23 Thread Jan Kara
On Wed 23-07-14 01:29:32, Andreas Bombe wrote:
> On Mon, Jul 21, 2014 at 12:04:34PM +0200, Jan Kara wrote:
> > On Sat 19-07-14 00:50:05, Andreas Bombe wrote:
> > > I don't see anything in /sys/kernel/debug/tracing/trace_pipe or
> > > .../trace (besides the header) with your patch applied. In case you
> > > meant to test it with the problematic printk change, I also tried with
> > > the revert reverted. That still hangs as before without any error report
> > > to see.
> >   Yes, I meant testing my lockdep patch with the problematic printk change.
> > Thanks for having a look. I'm puzzled why it didn't help.
> > 
> > > I checked the kernel logs and there is also no lockdep report anywhere.
> > > I get the "trace_printk() being used" notice but nothing else of
> > > interest around there. Though the notice should mean trace_printk() was
> > > used at least once?
> >   Yes. Anyway, I'd be grateful if you could run one more test for me so
> > that I can better understand what's going on. Can you take recent vanilla
> > kernel (with the revert) and apply attached patch to it? It again enables
> > interrupts when calling console_unlock() but keeps lockdep coverage
> > unchanged. It helped Sasha so I want to see whether your case is similar or
> > different. Thanks!
> 
> Applied on top of 15ba2236f, works fine. 
  Great. Thanks for testing. I'll send the patch to Andrew.

> I still don't see what printing could have triggered the problem. The
> only thing that is a warning is from the PCI code about some missing
> pcie-to-pci bridge (which I really should report some time). That isn't
> the culprit however since I tested a build with that WARN_ON_ONCE
> removed and it still hung.
> 
> Okay, there's ACPI errors, but these seem to be rather late to matter, I
> think? Anyway, here's the log of this working boot from the start to
> just before initrd gets started:
  The prints you can see are those that are fine ;). The thing is that when
lockdep covers more of the printk & console code itself (as was the case
with my original patch), it finds something which makes it crash the
machine. I'm not sure why my patch to make lockdep use trace_printk didn't
really help, maybe the nature of the crash is different than I thought.
Sasha Levin was able to reproduce the problem with a virtual machine so
I'll ask him for a config and will hope I'll be able to reproduce myself
and experiment with it. Thanks again for testing!

Honza
-- 
Jan Kara 
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REGRESSION] Boot hang with 939f04bec printk: enable interrupts before calling console_trylock_for_printk()

2014-07-23 Thread Jan Kara
On Wed 23-07-14 01:29:32, Andreas Bombe wrote:
 On Mon, Jul 21, 2014 at 12:04:34PM +0200, Jan Kara wrote:
  On Sat 19-07-14 00:50:05, Andreas Bombe wrote:
   I don't see anything in /sys/kernel/debug/tracing/trace_pipe or
   .../trace (besides the header) with your patch applied. In case you
   meant to test it with the problematic printk change, I also tried with
   the revert reverted. That still hangs as before without any error report
   to see.
Yes, I meant testing my lockdep patch with the problematic printk change.
  Thanks for having a look. I'm puzzled why it didn't help.
  
   I checked the kernel logs and there is also no lockdep report anywhere.
   I get the trace_printk() being used notice but nothing else of
   interest around there. Though the notice should mean trace_printk() was
   used at least once?
Yes. Anyway, I'd be grateful if you could run one more test for me so
  that I can better understand what's going on. Can you take recent vanilla
  kernel (with the revert) and apply attached patch to it? It again enables
  interrupts when calling console_unlock() but keeps lockdep coverage
  unchanged. It helped Sasha so I want to see whether your case is similar or
  different. Thanks!
 
 Applied on top of 15ba2236f, works fine. 
  Great. Thanks for testing. I'll send the patch to Andrew.

 I still don't see what printing could have triggered the problem. The
 only thing that is a warning is from the PCI code about some missing
 pcie-to-pci bridge (which I really should report some time). That isn't
 the culprit however since I tested a build with that WARN_ON_ONCE
 removed and it still hung.
 
 Okay, there's ACPI errors, but these seem to be rather late to matter, I
 think? Anyway, here's the log of this working boot from the start to
 just before initrd gets started:
  The prints you can see are those that are fine ;). The thing is that when
lockdep covers more of the printk  console code itself (as was the case
with my original patch), it finds something which makes it crash the
machine. I'm not sure why my patch to make lockdep use trace_printk didn't
really help, maybe the nature of the crash is different than I thought.
Sasha Levin was able to reproduce the problem with a virtual machine so
I'll ask him for a config and will hope I'll be able to reproduce myself
and experiment with it. Thanks again for testing!

Honza
-- 
Jan Kara j...@suse.cz
SUSE Labs, CR
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REGRESSION] Boot hang with 939f04bec printk: enable interrupts before calling console_trylock_for_printk()

2014-07-21 Thread Jan Kara
On Sat 19-07-14 00:50:05, Andreas Bombe wrote:
> On Thu, Jul 17, 2014 at 10:31:37AM +0200, Jan Kara wrote:
> > On Wed 16-07-14 23:34:08, Andreas Bombe wrote:
> > > On Mon, Jul 14, 2014 at 10:35:27AM +0200, Jan Kara wrote:
> > > > On Sun 29-06-14 00:50:50, Andreas Bombe wrote:
> > > > > None of the post 3.15 kernel boot for me. They all hang at the GRUB
> > > > > screen telling me it loaded and started the kernel, but the kernel
> > > > > itself stops before it prints anything (or even replaces the GRUB
> > > > > background graphics).
> > > > > 
> > > > > I bisected it down to 939f04bec1a4ef6ba4370b0f34b01decc844b1b1 
> > > > > "printk:
> > > > > enable interrupts before calling console_trylock_for_printk()".
> > > > > Reverting that patch on the latest kernel (git 24b414d5a7) allows me 
> > > > > to
> > > > > boot normally. I fixed the conflict in the revert by leaving in the 
> > > > > "if
> > > > > (in_sched) return printed_len;".
> > > > > 
> > > > > I have the "early printk via the EFI framebuffer" option enabled,
> > > > > disabling it made no difference however.
> > > >   Thanks for report. I've been on vacation so I'm replying with a 
> > > > delay. I
> > > > believe this is one of the issues where this patch just uncovers 
> > > > underlying
> > > > problem - I belive lockdep tries to report some locking issue in console
> > > > driver code (this patch increased lockdep coverage of console driver 
> > > > code)
> > > > however we are holding some locks in printk code which make lockdep
> > > > deadlock. Can you try running with the attached patch?
> > > 
> > > EUNABLE
> > > 
> > > You forgot to attach a patch.
> >   Bah, sorry. Attaching now.
> 
> I don't see anything in /sys/kernel/debug/tracing/trace_pipe or
> .../trace (besides the header) with your patch applied. In case you
> meant to test it with the problematic printk change, I also tried with
> the revert reverted. That still hangs as before without any error report
> to see.
  Yes, I meant testing my lockdep patch with the problematic printk change.
Thanks for having a look. I'm puzzled why it didn't help.

> I checked the kernel logs and there is also no lockdep report anywhere.
> I get the "trace_printk() being used" notice but nothing else of
> interest around there. Though the notice should mean trace_printk() was
> used at least once?
  Yes. Anyway, I'd be grateful if you could run one more test for me so
that I can better understand what's going on. Can you take recent vanilla
kernel (with the revert) and apply attached patch to it? It again enables
interrupts when calling console_unlock() but keeps lockdep coverage
unchanged. It helped Sasha so I want to see whether your case is similar or
different. Thanks!

Honza
-- 
Jan Kara 
SUSE Labs, CR
>From c73d4093c3fd5013dbf5baccd495d6db24de31cc Mon Sep 17 00:00:00 2001
From: Jan Kara 
Date: Wed, 4 Jun 2014 16:11:37 -0700
Subject: [PATCH] printk: enable interrupts before calling
 console_trylock_for_printk()

We need interrupts disabled when calling console_trylock_for_printk()
only so that cpu id we pass to can_use_console() remains valid (for
other things console_sem provides all the exclusion we need and
deadlocks on console_sem due to interrupts are impossible because we use
down_trylock()).  However if we are rescheduled, we are guaranteed to
run on an online cpu so we can easily just get the cpu id in
can_use_console().

We can lose a bit of performance when we enable interrupts in
vprintk_emit() and then disable them again in console_unlock() but OTOH
it can somewhat reduce interrupt latency caused by console_unlock().

We avoid calling console_unlock() from vprintk_emit() with lockdep
enabled as that has unveiled quite some bugs leading to system freezes
during boot (e.g. https://lkml.org/lkml/2014/5/30/242).

Signed-off-by: Jan Kara 
---
 kernel/printk/printk.c | 31 +--
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 13e839dbca07..fe4154d83fe4 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1416,10 +1416,9 @@ static int have_callable_console(void)
 /*
  * Can we actually use the console at this time on this cpu?
  *
- * Console drivers may assume that per-cpu resources have
- * been allocated. So unless they're explicitly marked as
- * being able to cope (CON_ANYTIME) don't call them until
- * this CPU is officially up.
+ * Console drivers may assume that per-cpu resources have been allocated. So
+ * unless they're explicitly marked as being able to cope (CON_ANYTIME) don't
+ * call them until this CPU is officially up.
  */
 static inline int can_use_console(unsigned int cpu)
 {
@@ -1432,8 +1431,10 @@ static inline int can_use_console(unsigned int cpu)
  * console_lock held, and 'console_locked' set) if it
  * is successful, false otherwise.
  */
-static int console_trylock_for_printk(unsigned int 

Re: [REGRESSION] Boot hang with 939f04bec printk: enable interrupts before calling console_trylock_for_printk()

2014-07-21 Thread Jan Kara
On Sat 19-07-14 00:50:05, Andreas Bombe wrote:
 On Thu, Jul 17, 2014 at 10:31:37AM +0200, Jan Kara wrote:
  On Wed 16-07-14 23:34:08, Andreas Bombe wrote:
   On Mon, Jul 14, 2014 at 10:35:27AM +0200, Jan Kara wrote:
On Sun 29-06-14 00:50:50, Andreas Bombe wrote:
 None of the post 3.15 kernel boot for me. They all hang at the GRUB
 screen telling me it loaded and started the kernel, but the kernel
 itself stops before it prints anything (or even replaces the GRUB
 background graphics).
 
 I bisected it down to 939f04bec1a4ef6ba4370b0f34b01decc844b1b1 
 printk:
 enable interrupts before calling console_trylock_for_printk().
 Reverting that patch on the latest kernel (git 24b414d5a7) allows me 
 to
 boot normally. I fixed the conflict in the revert by leaving in the 
 if
 (in_sched) return printed_len;.
 
 I have the early printk via the EFI framebuffer option enabled,
 disabling it made no difference however.
  Thanks for report. I've been on vacation so I'm replying with a 
delay. I
believe this is one of the issues where this patch just uncovers 
underlying
problem - I belive lockdep tries to report some locking issue in console
driver code (this patch increased lockdep coverage of console driver 
code)
however we are holding some locks in printk code which make lockdep
deadlock. Can you try running with the attached patch?
   
   EUNABLE
   
   You forgot to attach a patch.
Bah, sorry. Attaching now.
 
 I don't see anything in /sys/kernel/debug/tracing/trace_pipe or
 .../trace (besides the header) with your patch applied. In case you
 meant to test it with the problematic printk change, I also tried with
 the revert reverted. That still hangs as before without any error report
 to see.
  Yes, I meant testing my lockdep patch with the problematic printk change.
Thanks for having a look. I'm puzzled why it didn't help.

 I checked the kernel logs and there is also no lockdep report anywhere.
 I get the trace_printk() being used notice but nothing else of
 interest around there. Though the notice should mean trace_printk() was
 used at least once?
  Yes. Anyway, I'd be grateful if you could run one more test for me so
that I can better understand what's going on. Can you take recent vanilla
kernel (with the revert) and apply attached patch to it? It again enables
interrupts when calling console_unlock() but keeps lockdep coverage
unchanged. It helped Sasha so I want to see whether your case is similar or
different. Thanks!

Honza
-- 
Jan Kara j...@suse.cz
SUSE Labs, CR
From c73d4093c3fd5013dbf5baccd495d6db24de31cc Mon Sep 17 00:00:00 2001
From: Jan Kara j...@suse.cz
Date: Wed, 4 Jun 2014 16:11:37 -0700
Subject: [PATCH] printk: enable interrupts before calling
 console_trylock_for_printk()

We need interrupts disabled when calling console_trylock_for_printk()
only so that cpu id we pass to can_use_console() remains valid (for
other things console_sem provides all the exclusion we need and
deadlocks on console_sem due to interrupts are impossible because we use
down_trylock()).  However if we are rescheduled, we are guaranteed to
run on an online cpu so we can easily just get the cpu id in
can_use_console().

We can lose a bit of performance when we enable interrupts in
vprintk_emit() and then disable them again in console_unlock() but OTOH
it can somewhat reduce interrupt latency caused by console_unlock().

We avoid calling console_unlock() from vprintk_emit() with lockdep
enabled as that has unveiled quite some bugs leading to system freezes
during boot (e.g. https://lkml.org/lkml/2014/5/30/242).

Signed-off-by: Jan Kara j...@suse.cz
---
 kernel/printk/printk.c | 31 +--
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 13e839dbca07..fe4154d83fe4 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1416,10 +1416,9 @@ static int have_callable_console(void)
 /*
  * Can we actually use the console at this time on this cpu?
  *
- * Console drivers may assume that per-cpu resources have
- * been allocated. So unless they're explicitly marked as
- * being able to cope (CON_ANYTIME) don't call them until
- * this CPU is officially up.
+ * Console drivers may assume that per-cpu resources have been allocated. So
+ * unless they're explicitly marked as being able to cope (CON_ANYTIME) don't
+ * call them until this CPU is officially up.
  */
 static inline int can_use_console(unsigned int cpu)
 {
@@ -1432,8 +1431,10 @@ static inline int can_use_console(unsigned int cpu)
  * console_lock held, and 'console_locked' set) if it
  * is successful, false otherwise.
  */
-static int console_trylock_for_printk(unsigned int cpu)
+static int console_trylock_for_printk(void)
 {
+	unsigned int cpu = smp_processor_id();
+
 	if (!console_trylock())
 	

Re: [REGRESSION] Boot hang with 939f04bec printk: enable interrupts before calling console_trylock_for_printk()

2014-07-18 Thread Andreas Bombe
On Thu, Jul 17, 2014 at 10:31:37AM +0200, Jan Kara wrote:
> On Wed 16-07-14 23:34:08, Andreas Bombe wrote:
> > On Mon, Jul 14, 2014 at 10:35:27AM +0200, Jan Kara wrote:
> > > On Sun 29-06-14 00:50:50, Andreas Bombe wrote:
> > > > None of the post 3.15 kernel boot for me. They all hang at the GRUB
> > > > screen telling me it loaded and started the kernel, but the kernel
> > > > itself stops before it prints anything (or even replaces the GRUB
> > > > background graphics).
> > > > 
> > > > I bisected it down to 939f04bec1a4ef6ba4370b0f34b01decc844b1b1 "printk:
> > > > enable interrupts before calling console_trylock_for_printk()".
> > > > Reverting that patch on the latest kernel (git 24b414d5a7) allows me to
> > > > boot normally. I fixed the conflict in the revert by leaving in the "if
> > > > (in_sched) return printed_len;".
> > > > 
> > > > I have the "early printk via the EFI framebuffer" option enabled,
> > > > disabling it made no difference however.
> > >   Thanks for report. I've been on vacation so I'm replying with a delay. I
> > > believe this is one of the issues where this patch just uncovers 
> > > underlying
> > > problem - I belive lockdep tries to report some locking issue in console
> > > driver code (this patch increased lockdep coverage of console driver code)
> > > however we are holding some locks in printk code which make lockdep
> > > deadlock. Can you try running with the attached patch?
> > 
> > EUNABLE
> > 
> > You forgot to attach a patch.
>   Bah, sorry. Attaching now.

I don't see anything in /sys/kernel/debug/tracing/trace_pipe or
.../trace (besides the header) with your patch applied. In case you
meant to test it with the problematic printk change, I also tried with
the revert reverted. That still hangs as before without any error report
to see.

I checked the kernel logs and there is also no lockdep report anywhere.
I get the "trace_printk() being used" notice but nothing else of
interest around there. Though the notice should mean trace_printk() was
used at least once?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REGRESSION] Boot hang with 939f04bec printk: enable interrupts before calling console_trylock_for_printk()

2014-07-18 Thread Andreas Bombe
On Thu, Jul 17, 2014 at 10:31:37AM +0200, Jan Kara wrote:
 On Wed 16-07-14 23:34:08, Andreas Bombe wrote:
  On Mon, Jul 14, 2014 at 10:35:27AM +0200, Jan Kara wrote:
   On Sun 29-06-14 00:50:50, Andreas Bombe wrote:
None of the post 3.15 kernel boot for me. They all hang at the GRUB
screen telling me it loaded and started the kernel, but the kernel
itself stops before it prints anything (or even replaces the GRUB
background graphics).

I bisected it down to 939f04bec1a4ef6ba4370b0f34b01decc844b1b1 printk:
enable interrupts before calling console_trylock_for_printk().
Reverting that patch on the latest kernel (git 24b414d5a7) allows me to
boot normally. I fixed the conflict in the revert by leaving in the if
(in_sched) return printed_len;.

I have the early printk via the EFI framebuffer option enabled,
disabling it made no difference however.
 Thanks for report. I've been on vacation so I'm replying with a delay. I
   believe this is one of the issues where this patch just uncovers 
   underlying
   problem - I belive lockdep tries to report some locking issue in console
   driver code (this patch increased lockdep coverage of console driver code)
   however we are holding some locks in printk code which make lockdep
   deadlock. Can you try running with the attached patch?
  
  EUNABLE
  
  You forgot to attach a patch.
   Bah, sorry. Attaching now.

I don't see anything in /sys/kernel/debug/tracing/trace_pipe or
.../trace (besides the header) with your patch applied. In case you
meant to test it with the problematic printk change, I also tried with
the revert reverted. That still hangs as before without any error report
to see.

I checked the kernel logs and there is also no lockdep report anywhere.
I get the trace_printk() being used notice but nothing else of
interest around there. Though the notice should mean trace_printk() was
used at least once?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REGRESSION] Boot hang with 939f04bec printk: enable interrupts before calling console_trylock_for_printk()

2014-07-17 Thread Jan Kara
On Wed 16-07-14 23:34:08, Andreas Bombe wrote:
> On Mon, Jul 14, 2014 at 10:35:27AM +0200, Jan Kara wrote:
> > On Sun 29-06-14 00:50:50, Andreas Bombe wrote:
> > > None of the post 3.15 kernel boot for me. They all hang at the GRUB
> > > screen telling me it loaded and started the kernel, but the kernel
> > > itself stops before it prints anything (or even replaces the GRUB
> > > background graphics).
> > > 
> > > I bisected it down to 939f04bec1a4ef6ba4370b0f34b01decc844b1b1 "printk:
> > > enable interrupts before calling console_trylock_for_printk()".
> > > Reverting that patch on the latest kernel (git 24b414d5a7) allows me to
> > > boot normally. I fixed the conflict in the revert by leaving in the "if
> > > (in_sched) return printed_len;".
> > > 
> > > I have the "early printk via the EFI framebuffer" option enabled,
> > > disabling it made no difference however.
> >   Thanks for report. I've been on vacation so I'm replying with a delay. I
> > believe this is one of the issues where this patch just uncovers underlying
> > problem - I belive lockdep tries to report some locking issue in console
> > driver code (this patch increased lockdep coverage of console driver code)
> > however we are holding some locks in printk code which make lockdep
> > deadlock. Can you try running with the attached patch?
> 
> EUNABLE
> 
> You forgot to attach a patch.
  Bah, sorry. Attaching now.

Honza
-- 
Jan Kara 
SUSE Labs, CR
>From 93930c92c664d8b834e1e04b92c00023626fc07a Mon Sep 17 00:00:00 2001
From: Jan Kara 
Date: Thu, 19 Jun 2014 15:36:09 +0200
Subject: [PATCH] lockdep: Dump info via tracing

Signed-off-by: Jan Kara 
---
 kernel/locking/lockdep.c | 707 +++
 1 file changed, 402 insertions(+), 305 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index d24e4339b46d..b15e7dec55f6 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -77,6 +77,26 @@ module_param(lock_stat, int, 0644);
  */
 static arch_spinlock_t lockdep_lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
 
+static void sprint_ip_sym(char *buf, unsigned long ip)
+{
+	sprintf(buf, "[<%p>] %pS\n", (void *) ip, (void *) ip);
+}
+
+static void trace_print_stack_trace(struct stack_trace *trace, int spaces)
+{
+	int i, n;
+	char buf[256];
+
+	if (!trace->entries)
+		return;
+
+	for (i = 0; i < trace->nr_entries; i++) {
+		n = sprintf(buf, "%*c", 1 + spaces, ' ');
+		sprint_ip_sym(buf + n, trace->entries[i]);
+		trace_printk(buf);
+	}
+}
+
 static int graph_lock(void)
 {
 	arch_spin_lock(_lock);
@@ -382,9 +402,9 @@ static unsigned long stack_trace[MAX_STACK_TRACE_ENTRIES];
 
 static void print_lockdep_off(const char *bug_msg)
 {
-	printk(KERN_DEBUG "%s\n", bug_msg);
-	printk(KERN_DEBUG "turning off the locking correctness validator.\n");
-	printk(KERN_DEBUG "Please attach the output of /proc/lock_stat to the bug report\n");
+	trace_printk("%s\n", bug_msg);
+	trace_printk("turning off the locking correctness validator.\n");
+	trace_printk("Please attach the output of /proc/lock_stat to the bug report\n");
 }
 
 static int save_trace(struct stack_trace *trace)
@@ -417,7 +437,7 @@ static int save_trace(struct stack_trace *trace)
 			return 0;
 
 		print_lockdep_off("BUG: MAX_STACK_TRACE_ENTRIES too low!");
-		dump_stack();
+		trace_dump_stack(0);
 
 		return 0;
 	}
@@ -506,7 +526,7 @@ void get_usage_chars(struct lock_class *class, char usage[LOCK_USAGE_CHARS])
 	usage[i] = '\0';
 }
 
-static void __print_lock_name(struct lock_class *class)
+static void __sprint_lock_name(char *buf, struct lock_class *class)
 {
 	char str[KSYM_NAME_LEN];
 	const char *name;
@@ -514,28 +534,28 @@ static void __print_lock_name(struct lock_class *class)
 	name = class->name;
 	if (!name) {
 		name = __get_key_name(class->key, str);
-		printk("%s", name);
+		strcpy(buf, name);
 	} else {
-		printk("%s", name);
+		strcpy(buf, name);
 		if (class->name_version > 1)
-			printk("#%d", class->name_version);
+			sprintf(buf + strlen(buf), "#%d", class->name_version);
 		if (class->subclass)
-			printk("/%d", class->subclass);
+			sprintf(buf + strlen(buf), "/%d", class->subclass);
 	}
 }
 
-static void print_lock_name(struct lock_class *class)
+static void sprint_lock_name(char *buf, struct lock_class *class)
 {
 	char usage[LOCK_USAGE_CHARS];
 
 	get_usage_chars(class, usage);
 
-	printk(" (");
-	__print_lock_name(class);
-	printk("){%s}", usage);
+	strcpy(buf, " (");
+	__sprint_lock_name(buf, class);
+	sprintf(buf + strlen(buf), "){%s}", usage);
 }
 
-static void print_lockdep_cache(struct lockdep_map *lock)
+static void sprint_lockdep_cache(char *buf, struct lockdep_map *lock)
 {
 	const char *name;
 	char str[KSYM_NAME_LEN];
@@ -544,14 +564,14 @@ static void print_lockdep_cache(struct lockdep_map *lock)
 	if (!name)
 		name = __get_key_name(lock->key->subkeys, str);
 
-	printk("%s", name);
+	strcpy(buf, name);
 }
 
-static void 

Re: [REGRESSION] Boot hang with 939f04bec printk: enable interrupts before calling console_trylock_for_printk()

2014-07-17 Thread Jan Kara
On Wed 16-07-14 23:34:08, Andreas Bombe wrote:
 On Mon, Jul 14, 2014 at 10:35:27AM +0200, Jan Kara wrote:
  On Sun 29-06-14 00:50:50, Andreas Bombe wrote:
   None of the post 3.15 kernel boot for me. They all hang at the GRUB
   screen telling me it loaded and started the kernel, but the kernel
   itself stops before it prints anything (or even replaces the GRUB
   background graphics).
   
   I bisected it down to 939f04bec1a4ef6ba4370b0f34b01decc844b1b1 printk:
   enable interrupts before calling console_trylock_for_printk().
   Reverting that patch on the latest kernel (git 24b414d5a7) allows me to
   boot normally. I fixed the conflict in the revert by leaving in the if
   (in_sched) return printed_len;.
   
   I have the early printk via the EFI framebuffer option enabled,
   disabling it made no difference however.
Thanks for report. I've been on vacation so I'm replying with a delay. I
  believe this is one of the issues where this patch just uncovers underlying
  problem - I belive lockdep tries to report some locking issue in console
  driver code (this patch increased lockdep coverage of console driver code)
  however we are holding some locks in printk code which make lockdep
  deadlock. Can you try running with the attached patch?
 
 EUNABLE
 
 You forgot to attach a patch.
  Bah, sorry. Attaching now.

Honza
-- 
Jan Kara j...@suse.cz
SUSE Labs, CR
From 93930c92c664d8b834e1e04b92c00023626fc07a Mon Sep 17 00:00:00 2001
From: Jan Kara j...@suse.cz
Date: Thu, 19 Jun 2014 15:36:09 +0200
Subject: [PATCH] lockdep: Dump info via tracing

Signed-off-by: Jan Kara j...@suse.cz
---
 kernel/locking/lockdep.c | 707 +++
 1 file changed, 402 insertions(+), 305 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index d24e4339b46d..b15e7dec55f6 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -77,6 +77,26 @@ module_param(lock_stat, int, 0644);
  */
 static arch_spinlock_t lockdep_lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
 
+static void sprint_ip_sym(char *buf, unsigned long ip)
+{
+	sprintf(buf, [%p] %pS\n, (void *) ip, (void *) ip);
+}
+
+static void trace_print_stack_trace(struct stack_trace *trace, int spaces)
+{
+	int i, n;
+	char buf[256];
+
+	if (!trace-entries)
+		return;
+
+	for (i = 0; i  trace-nr_entries; i++) {
+		n = sprintf(buf, %*c, 1 + spaces, ' ');
+		sprint_ip_sym(buf + n, trace-entries[i]);
+		trace_printk(buf);
+	}
+}
+
 static int graph_lock(void)
 {
 	arch_spin_lock(lockdep_lock);
@@ -382,9 +402,9 @@ static unsigned long stack_trace[MAX_STACK_TRACE_ENTRIES];
 
 static void print_lockdep_off(const char *bug_msg)
 {
-	printk(KERN_DEBUG %s\n, bug_msg);
-	printk(KERN_DEBUG turning off the locking correctness validator.\n);
-	printk(KERN_DEBUG Please attach the output of /proc/lock_stat to the bug report\n);
+	trace_printk(%s\n, bug_msg);
+	trace_printk(turning off the locking correctness validator.\n);
+	trace_printk(Please attach the output of /proc/lock_stat to the bug report\n);
 }
 
 static int save_trace(struct stack_trace *trace)
@@ -417,7 +437,7 @@ static int save_trace(struct stack_trace *trace)
 			return 0;
 
 		print_lockdep_off(BUG: MAX_STACK_TRACE_ENTRIES too low!);
-		dump_stack();
+		trace_dump_stack(0);
 
 		return 0;
 	}
@@ -506,7 +526,7 @@ void get_usage_chars(struct lock_class *class, char usage[LOCK_USAGE_CHARS])
 	usage[i] = '\0';
 }
 
-static void __print_lock_name(struct lock_class *class)
+static void __sprint_lock_name(char *buf, struct lock_class *class)
 {
 	char str[KSYM_NAME_LEN];
 	const char *name;
@@ -514,28 +534,28 @@ static void __print_lock_name(struct lock_class *class)
 	name = class-name;
 	if (!name) {
 		name = __get_key_name(class-key, str);
-		printk(%s, name);
+		strcpy(buf, name);
 	} else {
-		printk(%s, name);
+		strcpy(buf, name);
 		if (class-name_version  1)
-			printk(#%d, class-name_version);
+			sprintf(buf + strlen(buf), #%d, class-name_version);
 		if (class-subclass)
-			printk(/%d, class-subclass);
+			sprintf(buf + strlen(buf), /%d, class-subclass);
 	}
 }
 
-static void print_lock_name(struct lock_class *class)
+static void sprint_lock_name(char *buf, struct lock_class *class)
 {
 	char usage[LOCK_USAGE_CHARS];
 
 	get_usage_chars(class, usage);
 
-	printk( ();
-	__print_lock_name(class);
-	printk(){%s}, usage);
+	strcpy(buf,  ();
+	__sprint_lock_name(buf, class);
+	sprintf(buf + strlen(buf), ){%s}, usage);
 }
 
-static void print_lockdep_cache(struct lockdep_map *lock)
+static void sprint_lockdep_cache(char *buf, struct lockdep_map *lock)
 {
 	const char *name;
 	char str[KSYM_NAME_LEN];
@@ -544,14 +564,14 @@ static void print_lockdep_cache(struct lockdep_map *lock)
 	if (!name)
 		name = __get_key_name(lock-key-subkeys, str);
 
-	printk(%s, name);
+	strcpy(buf, name);
 }
 
-static void print_lock(struct held_lock *hlock)
+static void sprint_lock(char *buf, struct 

Re: [REGRESSION] Boot hang with 939f04bec printk: enable interrupts before calling console_trylock_for_printk()

2014-07-16 Thread Andreas Bombe
On Mon, Jul 14, 2014 at 10:35:27AM +0200, Jan Kara wrote:
> On Sun 29-06-14 00:50:50, Andreas Bombe wrote:
> > None of the post 3.15 kernel boot for me. They all hang at the GRUB
> > screen telling me it loaded and started the kernel, but the kernel
> > itself stops before it prints anything (or even replaces the GRUB
> > background graphics).
> > 
> > I bisected it down to 939f04bec1a4ef6ba4370b0f34b01decc844b1b1 "printk:
> > enable interrupts before calling console_trylock_for_printk()".
> > Reverting that patch on the latest kernel (git 24b414d5a7) allows me to
> > boot normally. I fixed the conflict in the revert by leaving in the "if
> > (in_sched) return printed_len;".
> > 
> > I have the "early printk via the EFI framebuffer" option enabled,
> > disabling it made no difference however.
>   Thanks for report. I've been on vacation so I'm replying with a delay. I
> believe this is one of the issues where this patch just uncovers underlying
> problem - I belive lockdep tries to report some locking issue in console
> driver code (this patch increased lockdep coverage of console driver code)
> however we are holding some locks in printk code which make lockdep
> deadlock. Can you try running with the attached patch?

EUNABLE

You forgot to attach a patch.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REGRESSION] Boot hang with 939f04bec printk: enable interrupts before calling console_trylock_for_printk()

2014-07-16 Thread Andreas Bombe
On Mon, Jul 14, 2014 at 10:35:27AM +0200, Jan Kara wrote:
 On Sun 29-06-14 00:50:50, Andreas Bombe wrote:
  None of the post 3.15 kernel boot for me. They all hang at the GRUB
  screen telling me it loaded and started the kernel, but the kernel
  itself stops before it prints anything (or even replaces the GRUB
  background graphics).
  
  I bisected it down to 939f04bec1a4ef6ba4370b0f34b01decc844b1b1 printk:
  enable interrupts before calling console_trylock_for_printk().
  Reverting that patch on the latest kernel (git 24b414d5a7) allows me to
  boot normally. I fixed the conflict in the revert by leaving in the if
  (in_sched) return printed_len;.
  
  I have the early printk via the EFI framebuffer option enabled,
  disabling it made no difference however.
   Thanks for report. I've been on vacation so I'm replying with a delay. I
 believe this is one of the issues where this patch just uncovers underlying
 problem - I belive lockdep tries to report some locking issue in console
 driver code (this patch increased lockdep coverage of console driver code)
 however we are holding some locks in printk code which make lockdep
 deadlock. Can you try running with the attached patch?

EUNABLE

You forgot to attach a patch.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REGRESSION] Boot hang with 939f04bec printk: enable interrupts before calling console_trylock_for_printk()

2014-07-02 Thread Andreas Bombe
On Mon, Jun 30, 2014 at 01:20:30PM -0700, Andrew Morton wrote:
> On Sun, 29 Jun 2014 00:50:50 +0200 Andreas Bombe  wrote:
> 
> > None of the post 3.15 kernel boot for me. They all hang at the GRUB
> > screen telling me it loaded and started the kernel, but the kernel
> > itself stops before it prints anything (or even replaces the GRUB
> > background graphics).
> > 
> > I bisected it down to 939f04bec1a4ef6ba4370b0f34b01decc844b1b1 "printk:
> > enable interrupts before calling console_trylock_for_printk()".
> > Reverting that patch on the latest kernel (git 24b414d5a7) allows me to
> > boot normally.
> 
> OK, thanks, I queued the below reversion.  I'll hold off for a week or
> two in case we come up with a fix.
> 
> > I fixed the conflict in the revert by leaving in the "if
> > (in_sched) return printed_len;".
> 
> hm, that might have been buggy - we still need to the lockdep_on() and
> local_irq_restore().  You're a victim of
> someone-stuck-a-random-return-in-the-middle-of-a-function.

To be honest, I just randomly picked one possible conflict resolution to
confirm whether the revert actually helps.

> Please review this and if possible, run-time test it?  Against current
> mainline.

I've applied it (actually I picked the one from your patch series but
they are identical I think) and it does boot. I'm writing this on the
latest kernel + your revert.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REGRESSION] Boot hang with 939f04bec printk: enable interrupts before calling console_trylock_for_printk()

2014-07-02 Thread Andreas Bombe
On Mon, Jun 30, 2014 at 01:20:30PM -0700, Andrew Morton wrote:
 On Sun, 29 Jun 2014 00:50:50 +0200 Andreas Bombe a...@debian.org wrote:
 
  None of the post 3.15 kernel boot for me. They all hang at the GRUB
  screen telling me it loaded and started the kernel, but the kernel
  itself stops before it prints anything (or even replaces the GRUB
  background graphics).
  
  I bisected it down to 939f04bec1a4ef6ba4370b0f34b01decc844b1b1 printk:
  enable interrupts before calling console_trylock_for_printk().
  Reverting that patch on the latest kernel (git 24b414d5a7) allows me to
  boot normally.
 
 OK, thanks, I queued the below reversion.  I'll hold off for a week or
 two in case we come up with a fix.
 
  I fixed the conflict in the revert by leaving in the if
  (in_sched) return printed_len;.
 
 hm, that might have been buggy - we still need to the lockdep_on() and
 local_irq_restore().  You're a victim of
 someone-stuck-a-random-return-in-the-middle-of-a-function.

To be honest, I just randomly picked one possible conflict resolution to
confirm whether the revert actually helps.

 Please review this and if possible, run-time test it?  Against current
 mainline.

I've applied it (actually I picked the one from your patch series but
they are identical I think) and it does boot. I'm writing this on the
latest kernel + your revert.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REGRESSION] Boot hang with 939f04bec printk: enable interrupts before calling console_trylock_for_printk()

2014-06-30 Thread Andrew Morton
On Sun, 29 Jun 2014 00:50:50 +0200 Andreas Bombe  wrote:

> None of the post 3.15 kernel boot for me. They all hang at the GRUB
> screen telling me it loaded and started the kernel, but the kernel
> itself stops before it prints anything (or even replaces the GRUB
> background graphics).
> 
> I bisected it down to 939f04bec1a4ef6ba4370b0f34b01decc844b1b1 "printk:
> enable interrupts before calling console_trylock_for_printk()".
> Reverting that patch on the latest kernel (git 24b414d5a7) allows me to
> boot normally.

OK, thanks, I queued the below reversion.  I'll hold off for a week or
two in case we come up with a fix.

> I fixed the conflict in the revert by leaving in the "if
> (in_sched) return printed_len;".

hm, that might have been buggy - we still need to the lockdep_on() and
local_irq_restore().  You're a victim of
someone-stuck-a-random-return-in-the-middle-of-a-function.

Please review this and if possible, run-time test it?  Against current
mainline.


From: Andrew Morton 
Subject: kernel/printk/printk.c: revert "printk: enable interrupts before 
calling console_trylock_for_printk()"

Revert 939f04bec1a4 ("printk: enable interrupts before calling
console_trylock_for_printk()").

Andreas reported:

: None of the post 3.15 kernel boot for me. They all hang at the GRUB
: screen telling me it loaded and started the kernel, but the kernel
: itself stops before it prints anything (or even replaces the GRUB
: background graphics).

939f04bec1a4 is modest latency reduction.  Revert it until we understand
the reason for these failures.

Reported-by: Andreas Bombe 
Cc: Jan Kara 
Cc: Steven Rostedt 
Signed-off-by: Andrew Morton 
---

 kernel/printk/printk.c |   44 +++
 1 file changed, 18 insertions(+), 26 deletions(-)

diff -puN 
kernel/printk/printk.c~kernel-printk-printkc-revert-printk-enable-interrupts-before-calling-console_trylock_for_printk
 kernel/printk/printk.c
--- 
a/kernel/printk/printk.c~kernel-printk-printkc-revert-printk-enable-interrupts-before-calling-console_trylock_for_printk
+++ a/kernel/printk/printk.c
@@ -1416,9 +1416,10 @@ static int have_callable_console(void)
 /*
  * Can we actually use the console at this time on this cpu?
  *
- * Console drivers may assume that per-cpu resources have been allocated. So
- * unless they're explicitly marked as being able to cope (CON_ANYTIME) don't
- * call them until this CPU is officially up.
+ * Console drivers may assume that per-cpu resources have
+ * been allocated. So unless they're explicitly marked as
+ * being able to cope (CON_ANYTIME) don't call them until
+ * this CPU is officially up.
  */
 static inline int can_use_console(unsigned int cpu)
 {
@@ -1431,10 +1432,8 @@ static inline int can_use_console(unsign
  * console_lock held, and 'console_locked' set) if it
  * is successful, false otherwise.
  */
-static int console_trylock_for_printk(void)
+static int console_trylock_for_printk(unsigned int cpu)
 {
-   unsigned int cpu = smp_processor_id();
-
if (!console_trylock())
return 0;
/*
@@ -1609,8 +1608,7 @@ asmlinkage int vprintk_emit(int facility
 */
if (!oops_in_progress && !lockdep_recursing(current)) {
recursion_bug = 1;
-   local_irq_restore(flags);
-   return 0;
+   goto out_restore_irqs;
}
zap_locks();
}
@@ -1718,27 +1716,21 @@ asmlinkage int vprintk_emit(int facility
 
logbuf_cpu = UINT_MAX;
raw_spin_unlock(_lock);
-   lockdep_on();
-   local_irq_restore(flags);
 
/* If called from the scheduler, we can not call up(). */
-   if (in_sched)
-   return printed_len;
-
-   /*
-* Disable preemption to avoid being preempted while holding
-* console_sem which would prevent anyone from printing to console
-*/
-   preempt_disable();
-   /*
-* Try to acquire and then immediately release the console semaphore.
-* The release will print out buffers and wake up /dev/kmsg and syslog()
-* users.
-*/
-   if (console_trylock_for_printk())
-   console_unlock();
-   preempt_enable();
+   if (!in_sched) {
+   /*
+* Try to acquire and then immediately release the console
+* semaphore.  The release will print out buffers and wake up
+* /dev/kmsg and syslog() users.
+*/
+   if (console_trylock_for_printk(this_cpu))
+   console_unlock();
+   }
 
+   lockdep_on();
+out_restore_irqs:
+   local_irq_restore(flags);
return printed_len;
 }
 EXPORT_SYMBOL(vprintk_emit);
_

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [REGRESSION] Boot hang with 939f04bec printk: enable interrupts before calling console_trylock_for_printk()

2014-06-30 Thread Andrew Morton
On Sun, 29 Jun 2014 00:50:50 +0200 Andreas Bombe a...@debian.org wrote:

 None of the post 3.15 kernel boot for me. They all hang at the GRUB
 screen telling me it loaded and started the kernel, but the kernel
 itself stops before it prints anything (or even replaces the GRUB
 background graphics).
 
 I bisected it down to 939f04bec1a4ef6ba4370b0f34b01decc844b1b1 printk:
 enable interrupts before calling console_trylock_for_printk().
 Reverting that patch on the latest kernel (git 24b414d5a7) allows me to
 boot normally.

OK, thanks, I queued the below reversion.  I'll hold off for a week or
two in case we come up with a fix.

 I fixed the conflict in the revert by leaving in the if
 (in_sched) return printed_len;.

hm, that might have been buggy - we still need to the lockdep_on() and
local_irq_restore().  You're a victim of
someone-stuck-a-random-return-in-the-middle-of-a-function.

Please review this and if possible, run-time test it?  Against current
mainline.


From: Andrew Morton a...@linux-foundation.org
Subject: kernel/printk/printk.c: revert printk: enable interrupts before 
calling console_trylock_for_printk()

Revert 939f04bec1a4 (printk: enable interrupts before calling
console_trylock_for_printk()).

Andreas reported:

: None of the post 3.15 kernel boot for me. They all hang at the GRUB
: screen telling me it loaded and started the kernel, but the kernel
: itself stops before it prints anything (or even replaces the GRUB
: background graphics).

939f04bec1a4 is modest latency reduction.  Revert it until we understand
the reason for these failures.

Reported-by: Andreas Bombe a...@debian.org
Cc: Jan Kara j...@suse.cz
Cc: Steven Rostedt rost...@goodmis.org
Signed-off-by: Andrew Morton a...@linux-foundation.org
---

 kernel/printk/printk.c |   44 +++
 1 file changed, 18 insertions(+), 26 deletions(-)

diff -puN 
kernel/printk/printk.c~kernel-printk-printkc-revert-printk-enable-interrupts-before-calling-console_trylock_for_printk
 kernel/printk/printk.c
--- 
a/kernel/printk/printk.c~kernel-printk-printkc-revert-printk-enable-interrupts-before-calling-console_trylock_for_printk
+++ a/kernel/printk/printk.c
@@ -1416,9 +1416,10 @@ static int have_callable_console(void)
 /*
  * Can we actually use the console at this time on this cpu?
  *
- * Console drivers may assume that per-cpu resources have been allocated. So
- * unless they're explicitly marked as being able to cope (CON_ANYTIME) don't
- * call them until this CPU is officially up.
+ * Console drivers may assume that per-cpu resources have
+ * been allocated. So unless they're explicitly marked as
+ * being able to cope (CON_ANYTIME) don't call them until
+ * this CPU is officially up.
  */
 static inline int can_use_console(unsigned int cpu)
 {
@@ -1431,10 +1432,8 @@ static inline int can_use_console(unsign
  * console_lock held, and 'console_locked' set) if it
  * is successful, false otherwise.
  */
-static int console_trylock_for_printk(void)
+static int console_trylock_for_printk(unsigned int cpu)
 {
-   unsigned int cpu = smp_processor_id();
-
if (!console_trylock())
return 0;
/*
@@ -1609,8 +1608,7 @@ asmlinkage int vprintk_emit(int facility
 */
if (!oops_in_progress  !lockdep_recursing(current)) {
recursion_bug = 1;
-   local_irq_restore(flags);
-   return 0;
+   goto out_restore_irqs;
}
zap_locks();
}
@@ -1718,27 +1716,21 @@ asmlinkage int vprintk_emit(int facility
 
logbuf_cpu = UINT_MAX;
raw_spin_unlock(logbuf_lock);
-   lockdep_on();
-   local_irq_restore(flags);
 
/* If called from the scheduler, we can not call up(). */
-   if (in_sched)
-   return printed_len;
-
-   /*
-* Disable preemption to avoid being preempted while holding
-* console_sem which would prevent anyone from printing to console
-*/
-   preempt_disable();
-   /*
-* Try to acquire and then immediately release the console semaphore.
-* The release will print out buffers and wake up /dev/kmsg and syslog()
-* users.
-*/
-   if (console_trylock_for_printk())
-   console_unlock();
-   preempt_enable();
+   if (!in_sched) {
+   /*
+* Try to acquire and then immediately release the console
+* semaphore.  The release will print out buffers and wake up
+* /dev/kmsg and syslog() users.
+*/
+   if (console_trylock_for_printk(this_cpu))
+   console_unlock();
+   }
 
+   lockdep_on();
+out_restore_irqs:
+   local_irq_restore(flags);
return printed_len;
 }
 EXPORT_SYMBOL(vprintk_emit);
_

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to