Re: [PATCH v4 7/8] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables

2020-06-30 Thread Peter Zijlstra
On Tue, Jun 30, 2020 at 07:59:39AM +0200, Ahmed S. Darwish wrote:
> Peter Zijlstra wrote:
> 
> ...
> 
> > -#define lockdep_assert_irqs_disabled() do {\
> > -   WARN_ONCE(debug_locks && !current->lockdep_recursion && \
> > - current->hardirqs_enabled,\
> > - "IRQs not disabled as expected\n");   \
> > -   } while (0)
> 
> ...
> 
> > +#define lockdep_assert_irqs_disabled() \
> > +do {								\
> > +   WARN_ON_ONCE(debug_locks && this_cpu_read(hardirqs_enabled));   \
> > +} while (0)
> 
> I think it would be nice to keep the "IRQs not disabled as expected"
> message. It makes the lockdep splat much more readable.
> 
> This is similarly the case for the v3 lockdep preemption macros:
> 
>   https://lkml.kernel.org/r/20200630054452.3675847-5-a.darw...@linutronix.de
> 
> I did not add a message, though, to stay in sync with the IRQ macros above.

Hurmph.. the file:line output of a splat is usually all I look at; also,
__WARN_printf() generates such atrocious crap code that I try to not use
it.

I suppose I should do a __WARN_str() or something, but then people are
unlikely to want to use that, too much variation etc. :/

Cursed if you do, cursed if you don't.


Re: [PATCH v4 7/8] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables

2020-06-30 Thread Ahmed S. Darwish
Peter Zijlstra wrote:

...

> -#define lockdep_assert_irqs_disabled()   do {\
> - WARN_ONCE(debug_locks && !current->lockdep_recursion && \
> -   current->hardirqs_enabled,\
> -   "IRQs not disabled as expected\n");   \
> - } while (0)

...

> +#define lockdep_assert_irqs_disabled()   \
> +do { \
> + WARN_ON_ONCE(debug_locks && this_cpu_read(hardirqs_enabled));   \
> +} while (0)

I think it would be nice to keep the "IRQs not disabled as expected"
message. It makes the lockdep splat much more readable.

This is similarly the case for the v3 lockdep preemption macros:

  https://lkml.kernel.org/r/20200630054452.3675847-5-a.darw...@linutronix.de

I did not add a message, though, to stay in sync with the IRQ macros above.

Thanks,

--
Ahmed S. Darwish
Linutronix GmbH


Re: [PATCH v4 7/8] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables

2020-06-24 Thread Peter Zijlstra
On Wed, Jun 24, 2020 at 01:32:46PM +0200, Marco Elver wrote:
> From: Marco Elver 
> Date: Wed, 24 Jun 2020 11:23:22 +0200
> Subject: [PATCH] kcsan: Make KCSAN compatible with new IRQ state tracking
> 
> The new IRQ state tracking code does not honor lockdep_off(), and as
> such we should again permit tracing by using non-raw functions in
> core.c. Update the lockdep_off() comment in report.c, to reflect the
> fact there is still a potential risk of deadlock due to using printk()
> from scheduler code.
> 
> Suggested-by: Peter Zijlstra (Intel) 
> Signed-off-by: Marco Elver 

Thanks!

I've put this in front of the series at hand. I'll wait a little while
longer for arch people to give feedback on their header patches before I
stuff the lot into tip/locking/core.


Re: [PATCH v4 7/8] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables

2020-06-24 Thread Peter Zijlstra
On Tue, Jun 23, 2020 at 10:24:04PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 23, 2020 at 08:12:32PM +0200, Peter Zijlstra wrote:
> > Fair enough; I'll rip it all up and boot a KCSAN kernel, see what, if
> > anything, happens.
> 
> OK, so the below patch doesn't seem to have any nasty recursion issues
> here. The only 'problem' is that lockdep now sees report_lock can cause
> deadlocks.
> 
> It is completely right about it too, but I don't suspect there's much we
> can do about it, it's pretty much the standard printk() with scheduler
> locks held report.

So I've been getting tons and tons of this:

[   60.471348] ==============================================================
[   60.479427] BUG: KCSAN: data-race in __rcu_read_lock / __rcu_read_unlock
[   60.486909]
[   60.488572] write (marked) to 0x88840fff1cf0 of 4 bytes by interrupt on cpu 1:
[   60.497026]  __rcu_read_lock+0x37/0x60
[   60.501214]  cpuacct_account_field+0x1b/0x170
[   60.506081]  task_group_account_field+0x32/0x160
[   60.511238]  account_system_time+0xe6/0x110
[   60.515912]  update_process_times+0x1d/0xd0
[   60.520585]  tick_sched_timer+0xfc/0x180
[   60.524967]  __hrtimer_run_queues+0x271/0x440
[   60.529832]  hrtimer_interrupt+0x222/0x670
[   60.534409]  __sysvec_apic_timer_interrupt+0xb3/0x1a0
[   60.540052]  asm_call_on_stack+0x12/0x20
[   60.544434]  sysvec_apic_timer_interrupt+0xba/0x130
[   60.549882]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[   60.555621]  delay_tsc+0x7d/0xe0
[   60.559226]  kcsan_setup_watchpoint+0x292/0x4e0
[   60.564284]  __rcu_read_unlock+0x73/0x2c0
[   60.568763]  __unlock_page_memcg+0xda/0xf0
[   60.573338]  unlock_page_memcg+0x32/0x40
[   60.577721]  page_remove_rmap+0x5c/0x200
[   60.582104]  unmap_page_range+0x83c/0xc10
[   60.586582]  unmap_single_vma+0xb0/0x150
[   60.590963]  unmap_vmas+0x81/0xe0
[   60.594663]  exit_mmap+0x135/0x2b0
[   60.598464]  __mmput+0x21/0x150
[   60.601970]  mmput+0x2a/0x30
[   60.605176]  exit_mm+0x2fc/0x350
[   60.608780]  do_exit+0x372/0xff0
[   60.612385]  do_group_exit+0x139/0x140
[   60.616571]  __do_sys_exit_group+0xb/0x10
[   60.621048]  __se_sys_exit_group+0xa/0x10
[   60.625524]  __x64_sys_exit_group+0x1b/0x20
[   60.630189]  do_syscall_64+0x6c/0xe0
[   60.634182]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   60.639820]
[   60.641485] read to 0x88840fff1cf0 of 4 bytes by task 2430 on cpu 1:
[   60.648969]  __rcu_read_unlock+0x73/0x2c0
[   60.653446]  __unlock_page_memcg+0xda/0xf0
[   60.658019]  unlock_page_memcg+0x32/0x40
[   60.662400]  page_remove_rmap+0x5c/0x200
[   60.666782]  unmap_page_range+0x83c/0xc10
[   60.671259]  unmap_single_vma+0xb0/0x150
[   60.675641]  unmap_vmas+0x81/0xe0
[   60.679341]  exit_mmap+0x135/0x2b0
[   60.683141]  __mmput+0x21/0x150
[   60.686647]  mmput+0x2a/0x30
[   60.689853]  exit_mm+0x2fc/0x350
[   60.693458]  do_exit+0x372/0xff0
[   60.697062]  do_group_exit+0x139/0x140
[   60.701248]  __do_sys_exit_group+0xb/0x10
[   60.705724]  __se_sys_exit_group+0xa/0x10
[   60.710201]  __x64_sys_exit_group+0x1b/0x20
[   60.714872]  do_syscall_64+0x6c/0xe0
[   60.718864]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   60.724503]
[   60.726156] Reported by Kernel Concurrency Sanitizer on:
[   60.732089] CPU: 1 PID: 2430 Comm: sshd Not tainted 5.8.0-rc2-00186-gb4ee11fe08b3-dirty #303
[   60.741510] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[   60.752957] ==============================================================

And I figured a quick way to get rid of that would be something like the
below, seeing how volatile accesses get auto-annotated... but that doesn't
seem to actually work.

What am I missing?



diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 352223664ebd..b08861118e1a 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -351,17 +351,17 @@ static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp)
 
 static void rcu_preempt_read_enter(void)
 {
-   current->rcu_read_lock_nesting++;
+   (*(volatile int *)&current->rcu_read_lock_nesting)++;
 }
 
 static int rcu_preempt_read_exit(void)
 {
-   return --current->rcu_read_lock_nesting;
+   return --(*(volatile int *)&current->rcu_read_lock_nesting);
 }
 
 static void rcu_preempt_depth_set(int val)
 {
-   current->rcu_read_lock_nesting = val;
+   WRITE_ONCE(current->rcu_read_lock_nesting, val);
 }
 
 /*



Re: [PATCH v4 7/8] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables

2020-06-23 Thread Ahmed S. Darwish
On Tue, Jun 23, 2020 at 05:24:50PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 23, 2020 at 05:00:31PM +0200, Ahmed S. Darwish wrote:
> > On Tue, Jun 23, 2020 at 10:36:52AM +0200, Peter Zijlstra wrote:
> > ...
> > > -#define lockdep_assert_irqs_disabled() do {			\
> > > - WARN_ONCE(debug_locks && !current->lockdep_recursion && \
> > > -   current->hardirqs_enabled,\
> > > -   "IRQs not disabled as expected\n");   \
> > > - } while (0)
> > > +#define lockdep_assert_irqs_enabled()				\
> > > +do {								\
> > > + WARN_ON_ONCE(debug_locks && !this_cpu_read(hardirqs_enabled));  \
> > > +} while (0)
> > >
> >
> > Can we add a small comment on top of lockdep_off(), stating that lockdep
> > IRQ tracking will still be kept after a lockdep_off call?
>
> That would only legitimize lockdep_off(). The only comment I want to put
> on that is: "if you use this, you're doing it wrong".
>

Well, freshly merged code is using it. For example, KCSAN:

=> f1bc96210c6a ("kcsan: Make KCSAN compatible with lockdep")
=> kernel/kcsan/report.c:

void kcsan_report(...)
{
...
/*
 * With TRACE_IRQFLAGS, lockdep's IRQ trace state becomes corrupted if
 * we do not turn off lockdep here; this could happen due to recursion
 * into lockdep via KCSAN if we detect a race in utilities used by
 * lockdep.
 */
lockdep_off();
...
}

thanks,

--
Ahmed S. Darwish
Linutronix GmbH


Re: [PATCH v4 7/8] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables

2020-06-23 Thread Ahmed S. Darwish
On Tue, Jun 23, 2020 at 10:36:52AM +0200, Peter Zijlstra wrote:
...
> -#define lockdep_assert_irqs_disabled() do {			\
> - WARN_ONCE(debug_locks && !current->lockdep_recursion && \
> -   current->hardirqs_enabled,\
> -   "IRQs not disabled as expected\n");   \
> - } while (0)
> +#define lockdep_assert_irqs_enabled()				\
> +do { \
> + WARN_ON_ONCE(debug_locks && !this_cpu_read(hardirqs_enabled));  \
> +} while (0)
>

Can we add a small comment on top of lockdep_off(), stating that lockdep
IRQ tracking will still be kept after a lockdep_off call?

thanks,

--
Ahmed S. Darwish
Linutronix GmbH


Re: [PATCH v4 7/8] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables

2020-06-23 Thread Peter Zijlstra
On Tue, Jun 23, 2020 at 10:24:04PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 23, 2020 at 08:12:32PM +0200, Peter Zijlstra wrote:
> > Fair enough; I'll rip it all up and boot a KCSAN kernel, see what, if
> > anything, happens.
> 
> OK, so the below patch doesn't seem to have any nasty recursion issues
> here. The only 'problem' is that lockdep now sees report_lock can cause
> deadlocks.
> 
> It is completely right about it too, but I don't suspect there's much we
> can do about it, it's pretty much the standard printk() with scheduler
> locks held report.

Just for giggles I added the below and that works fine too. Right until
the report_lock deadlock splat of course, thereafter lockdep is
disabled.

diff --git a/kernel/kcsan/report.c b/kernel/kcsan/report.c
index ac5f8345bae9..a011cf0a1611 100644
--- a/kernel/kcsan/report.c
+++ b/kernel/kcsan/report.c
@@ -459,6 +459,8 @@ static void set_other_info_task_blocking(unsigned long *flags,
 */
int timeout = max(kcsan_udelay_task, kcsan_udelay_interrupt);

+   lockdep_assert_held(&report_lock);
+
other_info->task = current;
do {
if (is_running) {
@@ -495,6 +497,8 @@ static void set_other_info_task_blocking(unsigned long *flags,
 other_info->task == current);
if (is_running)
set_current_state(TASK_RUNNING);
+
+   lockdep_assert_held(&report_lock);
 }

 /* Populate @other_info; requires that the provided @other_info not in use. */


Re: [PATCH v4 7/8] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables

2020-06-23 Thread Peter Zijlstra
On Tue, Jun 23, 2020 at 08:12:32PM +0200, Peter Zijlstra wrote:
> Fair enough; I'll rip it all up and boot a KCSAN kernel, see what, if
> anything, happens.

OK, so the below patch doesn't seem to have any nasty recursion issues
here. The only 'problem' is that lockdep now sees report_lock can cause
deadlocks.

It is completely right about it too, but I don't suspect there's much we
can do about it, it's pretty much the standard printk() with scheduler
locks held report.

---
diff --git a/kernel/kcsan/core.c b/kernel/kcsan/core.c
index 15f67949d11e..732623c30359 100644
--- a/kernel/kcsan/core.c
+++ b/kernel/kcsan/core.c
@@ -397,8 +397,7 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
}
 
if (!kcsan_interrupt_watcher)
-   /* Use raw to avoid lockdep recursion via IRQ flags tracing. */
-   raw_local_irq_save(irq_flags);
+   local_irq_save(irq_flags);
 
watchpoint = insert_watchpoint((unsigned long)ptr, size, is_write);
if (watchpoint == NULL) {
@@ -539,7 +538,7 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
kcsan_counter_dec(KCSAN_COUNTER_USED_WATCHPOINTS);
 out_unlock:
if (!kcsan_interrupt_watcher)
-   raw_local_irq_restore(irq_flags);
+   local_irq_restore(irq_flags);
 out:
user_access_restore(ua_flags);
 }
diff --git a/kernel/kcsan/report.c b/kernel/kcsan/report.c
index ac5f8345bae9..ef31c1d2dac3 100644
--- a/kernel/kcsan/report.c
+++ b/kernel/kcsan/report.c
@@ -605,14 +605,6 @@ void kcsan_report(const volatile void *ptr, size_t size, int access_type,
	if (WARN_ON(watchpoint_idx < 0 || watchpoint_idx >= ARRAY_SIZE(other_infos)))
goto out;
 
-   /*
-* With TRACE_IRQFLAGS, lockdep's IRQ trace state becomes corrupted if
-* we do not turn off lockdep here; this could happen due to recursion
-* into lockdep via KCSAN if we detect a race in utilities used by
-* lockdep.
-*/
-   lockdep_off();
-
if (prepare_report(, type, , other_info)) {
/*
 * Never report if value_change is FALSE, only if we it is
@@ -628,7 +620,6 @@ void kcsan_report(const volatile void *ptr, size_t size, int access_type,
release_report(, other_info);
}
 
-   lockdep_on();
 out:
kcsan_enable_current();
 }



Re: [PATCH v4 7/8] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables

2020-06-23 Thread Peter Zijlstra
On Tue, Jun 23, 2020 at 09:13:35PM +0200, Marco Elver wrote:
> I see the below report when I boot with your branch + KCSAN and
> PROVE_LOCKING. config attached. Trying to make sense of what's
> happening.

Ah, I was still playing with tip/master + PROVE_LOCKING + KCSAN and
slowly removing parts of that annotation patch to see what would come
unstuck.

I think I just hit a genuine but unavoidable lockdep report on
report_lock.

> -- >8 --
> 
> [   10.182354] [ cut here ]
> [   10.183058] WARNING: CPU: 7 PID: 136 at kernel/locking/lockdep.c:398 lockdep_hardirqs_on_prepare+0x1c6/0x270
> [   10.184347] Modules linked in:
> [   10.184771] CPU: 7 PID: 136 Comm: systemd-journal Not tainted 5.8.0-rc1+ #3
> [   10.185706] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
> [   10.186821] RIP: 0010:lockdep_hardirqs_on_prepare+0x1c6/0x270
> [   10.187594] Code: 75 28 65 48 8b 04 25 28 00 00 00 48 3b 44 24 08 0f 85 b9 00 00 00 48 83 c4 10 5b 41 5e 41 5f c3 65 48 ff 05 d4 24 4e 75 eb d8 <0f> 0b 90 41 c7 86 c4 08 00 00 00 00 00 00 eb c8 e8 65 09 71 01 85
> [   10.190203] RSP: 0018:a7ee802b7848 EFLAGS: 00010017
> [   10.190989] RAX: 0001 RBX: 955e92a34ab0 RCX: 0001
> [   10.192053] RDX: 0006 RSI: 955e92a34a88 RDI: 955e92a341c0
> [   10.193117] RBP: a7ee802b7be8 R08:  R09: 
> [   10.194186] R10:  R11: 8d07e268 R12: 0001
> [   10.195249] R13: 8e41bb10 R14: 955e92a341c0 R15: 0001
> [   10.196312] FS:  7fd6862aa8c0() GS:955e9fd8() knlGS:
> [   10.197513] CS:  0010 DS:  ES:  CR0: 80050033
> [   10.198373] CR2: 7fd6837dd000 CR3: 000812acc001 CR4: 00760ee0
> [   10.199436] DR0:  DR1:  DR2: 
> [   10.200494] DR3:  DR6: fffe0ff0 DR7: 0400
> [   10.201554] PKRU: 5554
> [   10.201967] Call Trace:
> [   10.202348]  ? _raw_spin_unlock_irqrestore+0x40/0x70
> [   10.203093]  trace_hardirqs_on+0x56/0x60   <- enter IRQ flags tracing code?
> [   10.203686]  _raw_spin_unlock_irqrestore+0x40/0x70 <- take report_lock
> [   10.204406]  prepare_report+0x11f/0x150
> [   10.204986]  kcsan_report+0xca/0x6c0   <- 
> generating a KCSAN report
> [   10.212669]  kcsan_found_watchpoint+0xe5/0x110

That appears to be warning about a lockdep_recursion underflow, weird.
I'll go stare at it.




Re: [PATCH v4 7/8] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables

2020-06-23 Thread Peter Zijlstra
On Tue, Jun 23, 2020 at 07:59:57PM +0200, Marco Elver wrote:
> On Tue, Jun 23, 2020 at 06:37PM +0200, Peter Zijlstra wrote:
> > On Tue, Jun 23, 2020 at 06:13:21PM +0200, Ahmed S. Darwish wrote:
> > > Well, freshly merged code is using it. For example, KCSAN:
> > > 
> > > => f1bc96210c6a ("kcsan: Make KCSAN compatible with lockdep")
> > > => kernel/kcsan/report.c:
> > > 
> > > void kcsan_report(...)
> > > {
> > >   ...
> > > /*
> > >  * With TRACE_IRQFLAGS, lockdep's IRQ trace state becomes corrupted if
> > >  * we do not turn off lockdep here; this could happen due to recursion
> > >  * into lockdep via KCSAN if we detect a race in utilities used by
> > >  * lockdep.
> > >  */
> > > lockdep_off();
> > >   ...
> > > }
> > 
> > Marco, do you remember what exactly happened there? Because I'm about to
> > wreck that. That is, I'm going to make TRACE_IRQFLAGS ignore
> > lockdep_off().
> 
> Yeah, I was trying to squash any kind of recursion:
> 
>   lockdep -> other libs ->
>   -> KCSAN
>   -> print report
>   -> dump stack, printk and friends
>   -> lockdep -> other libs
>   -> KCSAN ...
> 
> Some history:
> 
> * Initial patch to fix:
>   https://lore.kernel.org/lkml/20200115162512.70807-1-el...@google.com/

That patch is weird; just :=n on lockdep.c should've cured that, the
rest is massive overkill.

> * KCSAN+lockdep+ftrace:
>   https://lore.kernel.org/lkml/20200214211035.209972-1-el...@google.com/

That doesn't really have anything useful..

> lockdep now has KCSAN_SANITIZE := n, but we still need to ensure that
> there are no paths out of lockdep, or the IRQ flags tracing code, that
> might lead through other libs, through KCSAN, libs used to generate a
> report, and back to lockdep.
> 
> I never quite figured out the exact trace that led to corruption, but
> avoiding any kind of potential for recursion was the only thing that
> would avoid the check_flags() warnings.

Fair enough; I'll rip it all up and boot a KCSAN kernel, see what, if
anything, happens.


Re: [PATCH v4 7/8] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables

2020-06-23 Thread Peter Zijlstra
On Tue, Jun 23, 2020 at 06:13:21PM +0200, Ahmed S. Darwish wrote:
> Well, freshly merged code is using it. For example, KCSAN:
> 
> => f1bc96210c6a ("kcsan: Make KCSAN compatible with lockdep")
> => kernel/kcsan/report.c:
> 
> void kcsan_report(...)
> {
>   ...
> /*
>  * With TRACE_IRQFLAGS, lockdep's IRQ trace state becomes corrupted if
>  * we do not turn off lockdep here; this could happen due to recursion
>  * into lockdep via KCSAN if we detect a race in utilities used by
>  * lockdep.
>  */
> lockdep_off();
>   ...
> }

Marco, do you remember what exactly happened there? Because I'm about to
wreck that. That is, I'm going to make TRACE_IRQFLAGS ignore
lockdep_off().


Re: [PATCH v4 7/8] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables

2020-06-23 Thread Peter Zijlstra
On Tue, Jun 23, 2020 at 05:00:31PM +0200, Ahmed S. Darwish wrote:
> On Tue, Jun 23, 2020 at 10:36:52AM +0200, Peter Zijlstra wrote:
> ...
> > -#define lockdep_assert_irqs_disabled() do {			\
> > -   WARN_ONCE(debug_locks && !current->lockdep_recursion && \
> > - current->hardirqs_enabled,\
> > - "IRQs not disabled as expected\n");   \
> > -   } while (0)
> > +#define lockdep_assert_irqs_enabled()				\
> > +do {								\
> > +   WARN_ON_ONCE(debug_locks && !this_cpu_read(hardirqs_enabled));  \
> > +} while (0)
> >
> 
> Can we add a small comment on top of lockdep_off(), stating that lockdep
> IRQ tracking will still be kept after a lockdep_off call?

That would only legitimize lockdep_off(). The only comment I want to put
on that is: "if you use this, you're doing it wrong".


[PATCH v4 7/8] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables

2020-06-23 Thread Peter Zijlstra
Currently all IRQ-tracking state is in task_struct, this means that
task_struct needs to be defined before we use it.

Especially for lockdep_assert_irq*() this can lead to header-hell.

Move the hardirq state into per-cpu variables to avoid the task_struct
dependency.

Signed-off-by: Peter Zijlstra (Intel) 
---
 include/linux/irqflags.h |   19 ---
 include/linux/lockdep.h  |   34 ++
 include/linux/sched.h|2 --
 kernel/fork.c|4 +---
 kernel/locking/lockdep.c |   30 +++---
 kernel/softirq.c |6 ++
 6 files changed, 52 insertions(+), 43 deletions(-)

--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -14,6 +14,7 @@
 
 #include 
 #include 
+#include 
 
 /* Currently lockdep_softirqs_on/off is used only by lockdep */
 #ifdef CONFIG_PROVE_LOCKING
@@ -31,18 +32,22 @@
 #endif
 
 #ifdef CONFIG_TRACE_IRQFLAGS
+
+DECLARE_PER_CPU(int, hardirqs_enabled);
+DECLARE_PER_CPU(int, hardirq_context);
+
   extern void trace_hardirqs_on_prepare(void);
   extern void trace_hardirqs_off_finish(void);
   extern void trace_hardirqs_on(void);
   extern void trace_hardirqs_off(void);
-# define lockdep_hardirq_context(p)((p)->hardirq_context)
+# define lockdep_hardirq_context(p)(this_cpu_read(hardirq_context))
 # define lockdep_softirq_context(p)((p)->softirq_context)
-# define lockdep_hardirqs_enabled(p)   ((p)->hardirqs_enabled)
+# define lockdep_hardirqs_enabled(p)   (this_cpu_read(hardirqs_enabled))
 # define lockdep_softirqs_enabled(p)   ((p)->softirqs_enabled)
-# define lockdep_hardirq_enter()   \
-do {   \
-   if (!current->hardirq_context++)\
-   current->hardirq_threaded = 0;  \
+# define lockdep_hardirq_enter()   \
+do {   \
+   if (this_cpu_inc_return(hardirq_context) == 1)  \
+   current->hardirq_threaded = 0;  \
 } while (0)
 # define lockdep_hardirq_threaded()\
 do {   \
@@ -50,7 +55,7 @@ do {  \
 } while (0)
 # define lockdep_hardirq_exit()\
 do {   \
-   current->hardirq_context--; \
+   this_cpu_dec(hardirq_context);  \
 } while (0)
 # define lockdep_softirq_enter()   \
 do {   \
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -20,6 +20,7 @@ extern int lock_stat;
 #define MAX_LOCKDEP_SUBCLASSES 8UL
 
 #include 
+#include 
 
 enum lockdep_wait_type {
LD_WAIT_INV = 0,/* not checked, catch all */
@@ -703,28 +704,29 @@ do {								\
lock_release(&(lock)->dep_map, _THIS_IP_);  \
 } while (0)
 
-#define lockdep_assert_irqs_enabled()  do {\
-   WARN_ONCE(debug_locks && !current->lockdep_recursion && \
- !current->hardirqs_enabled,   \
- "IRQs not enabled as expected\n");\
-   } while (0)
+DECLARE_PER_CPU(int, hardirqs_enabled);
+DECLARE_PER_CPU(int, hardirq_context);
 
-#define lockdep_assert_irqs_disabled() do {\
-   WARN_ONCE(debug_locks && !current->lockdep_recursion && \
- current->hardirqs_enabled,\
- "IRQs not disabled as expected\n");   \
-   } while (0)
+#define lockdep_assert_irqs_enabled()  \
+do {   \
+   WARN_ON_ONCE(debug_locks && !this_cpu_read(hardirqs_enabled));  \
+} while (0)
 
-#define lockdep_assert_in_irq() do {   \
-   WARN_ONCE(debug_locks && !current->lockdep_recursion && \
- !current->hardirq_context,\
- "Not in hardirq as expected\n");  \
-   } while (0)
+#define lockdep_assert_irqs_disabled() \
+do {   \
+   WARN_ON_ONCE(debug_locks && this_cpu_read(hardirqs_enabled));   \
+} while (0)
+
+#define lockdep_assert_in_irq()					\
+do {   \
+   WARN_ON_ONCE(debug_locks && !this_cpu_read(hardirq_context));   \
+} while (0)
 
 #else
 # define might_lock(lock) do { } while (0)
 # define might_lock_read(lock) do { } while (0)
 # define might_lock_nested(lock, subclass) do { } while (0)
+
 # define lockdep_assert_irqs_enabled() do { } while (0)
 # define