Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-05-01 Thread Waiman Long
On Wed, Apr 03, 2019 at 11:34:04AM -0600, Khalid Aziz wrote:
> diff --git a/Documentation/admin-guide/kernel-parameters.txt
b/Documentation/admin-guide/kernel-parameters.txt

> index 858b6c0b9a15..9b36da94760e 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -2997,6 +2997,12 @@
>
>  nox2apic    [X86-64,APIC] Do not enable x2APIC mode.
>
> +    noxpfo    [XPFO] Disable eXclusive Page Frame Ownership (XPFO)
> +    when CONFIG_XPFO is on. Physical pages mapped into
> +    user applications will also be mapped in the
> +    kernel's address space as if CONFIG_XPFO was not
> +    enabled.
> +
>  cpu0_hotplug    [X86] Turn on CPU0 hotplug feature when
>  CONFIG_BOOTPARAM_HOTPLUG_CPU0 is off.
>  Some features depend on CPU0. Known dependencies are:

Given the big performance impact that XPFO can have, it should be off
by default when configured. Instead, an xpfo option should be used to
enable it.
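
To illustrate the suggestion, an opt-in switch could follow the usual
early_param pattern. This is only a rough sketch; the "xpfo" parameter
name and the xpfo_enabled variable are my assumptions, not something the
posted patch provides (it implements the inverse "noxpfo" switch):

/* Sketch only (assumed names): XPFO stays off unless "xpfo" is given. */
static bool xpfo_enabled __ro_after_init;

static int __init xpfo_setup(char *str)
{
	xpfo_enabled = true;
	return 0;
}
early_param("xpfo", xpfo_setup);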

Cheers,
Longman


Re: [PATCH v2 07/17] debugobjects: Move printk out of db lock critical sections

2018-11-22 Thread Waiman Long
On 11/22/2018 11:02 AM, Petr Mladek wrote:
> On Thu 2018-11-22 11:04:22, Sergey Senozhatsky wrote:
>> On (11/21/18 11:49), Waiman Long wrote:
>> [..]
>>>>case ODEBUG_STATE_ACTIVE:
>>>> -  debug_print_object(obj, "init");
>>>>state = obj->state;
>>>>raw_spin_unlock_irqrestore(&db->lock, flags);
>>>> +  debug_print_object(obj, "init");
>>>>debug_object_fixup(descr->fixup_init, addr, state);
>>>>return;
>>>>  
>>>>case ODEBUG_STATE_DESTROYED:
>>>> -  debug_print_object(obj, "init");
>>>> +  debug_printobj = true;
>>>>break;
>>>>default:
>>>>break;
>>>>}
>>>>  
>>>>raw_spin_unlock_irqrestore(&db->lock, flags);
>>>> +  if (debug_chkstack)
>>>> +  debug_object_is_on_stack(addr, onstack);
>>>> +  if (debug_printobj)
>>>> +  debug_print_object(obj, "init");
>>>>
>> [..]
>>> As a side note, one of the test systems that I used generated a
>>> debugobjects splat in the bootup process and the system hung
>>> afterward. Applying this patch alone fixes the hanging problem and the
>>> system booted up successfully. So it is not really a good idea to call
>>> printk() while holding a raw spinlock.
> Please, was the system hang reproducible? I wonder if it was a
> deadlock described by Sergey below.

Yes, it is 100% reproducible on the testing system that I used.

> The commit message is right. printk() might take too long and
> cause softlockup or livelock. But it does not explain why
> the system could completely hang.
>
> Also note that printk() should no longer block a single process
> indefinitely thanks to the commit dbdda842fe96f8932 ("printk:
> Add console owner and waiter logic to load balance console writes").

The problem might have been caused by the fact that IRQs were also
disabled in the lock critical section.

>> Some serial consoles call mod_timer(). So what we could have with the
>> debug objects enabled was
>>
>>  mod_timer()
>>   lock_timer_base()
>>debug_activate()
>> printk()
>>  call_console_drivers()
>>   foo_console()
>>mod_timer()
>> lock_timer_base()   << deadlock
> Anyway, I wonder what was the primary motivation for this patch.
> Was it the system hang? Or was it a lockdep report about nesting
> two terminal locks: db->lock, pool_lock with logbuf_lock?

The primary motivation was to make sure that printk() won't be called
while holding either db->lock or pool_lock in the debugobjects code. In
determining which locks can be made terminal, I focused on local
spinlocks whose critical sections don't cross the boundary into an
unrelated subsystem, as that would greatly complicate the analysis.
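
In essence, the patch turns the in-lock print into a post-unlock print.
A simplified sketch of the pattern (using the names from the patch; not
a literal excerpt):

bool debug_printobj = false;

raw_spin_lock_irqsave(&db->lock, flags);
/* ... inspect obj->state; no printk() while the raw lock is held ... */
if (obj->state == ODEBUG_STATE_DESTROYED)
	debug_printobj = true;
raw_spin_unlock_irqrestore(&db->lock, flags);

/* The raw spinlock has been released, so printing is safe now. */
if (debug_printobj)
	debug_print_object(obj, "init");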

I didn't realize that it fixed a hang problem I was seeing until I did
a bisection a few days ago and found that the hang was caused by the
same patch that caused the debugobjects splat in the first place. I was
comparing the performance of the pre-patch and post-patch kernels. The
pre-patch kernel failed to boot on one of my test systems, but the
post-patch kernel could. I narrowed it down to this patch as the fix
for the hang problem.

Cheers,
Longman





Re: [PATCH v2 09/17] debugobjects: Make object hash locks nestable terminal locks

2018-11-22 Thread Waiman Long
On 11/22/2018 10:33 AM, Petr Mladek wrote:
> On Mon 2018-11-19 13:55:18, Waiman Long wrote:
>> By making the object hash locks nestable terminal locks, we can avoid
>> a bunch of unnecessary lockdep validations as well as save space
>> in the lockdep tables.
> Please, explain which terminal lock might be nested.
>
> Hmm, it would hide any eventual nesting of other terminal locks.
> It might reduce lockdep reliability. I wonder if the space
> optimization is worth it.
>
> Finally, it might be good to add a short explanation what (nested)
> terminal locks mean into each commit. It would help people to
> understand the effect without digging into the lockdep code, ...
>
> Best Regards,
> Petr

Nested terminal locks are currently only used in the debugobjects code,
and they should only be used on a case-by-case basis. In the case of the
debugobjects code, the locking patterns are:

(1)

raw_spin_lock(&pool_lock);
...
raw_spin_unlock(&pool_lock);

(2)

raw_spin_lock(&db->lock);
...
raw_spin_lock(&pool_lock);
...
raw_spin_unlock(&pool_lock);
...
raw_spin_unlock(&db->lock);

(3)

raw_spin_lock(&db->lock);
...
raw_spin_unlock(&db->lock);

So db->lock is made nestable so that it can allow the acquisition of
pool_lock (a regular terminal lock) underneath it.
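
Concretely, that corresponds to the annotations in patches 06 and 09 of
this series (abbreviated here):

/* pool_lock is a regular terminal lock (patch 06). */
static DEFINE_RAW_TERMINAL_SPINLOCK(pool_lock);

/* Each obj_hash bucket lock is a nestable terminal lock (patch 09),
 * so lockdep still accepts pattern (2) above. */
raw_spin_lock_init(&obj_hash[i].lock);
lockdep_set_terminal_nestable_class(&obj_hash[i].lock);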

Cheers,
Longman




Re: [PATCH v2 07/17] debugobjects: Move printk out of db lock critical sections

2018-11-22 Thread Waiman Long
On 11/21/2018 09:04 PM, Sergey Senozhatsky wrote:
> On (11/21/18 11:49), Waiman Long wrote:
> [..]
>>> case ODEBUG_STATE_ACTIVE:
>>> -   debug_print_object(obj, "init");
>>> state = obj->state;
>>> raw_spin_unlock_irqrestore(&db->lock, flags);
>>> +   debug_print_object(obj, "init");
>>> debug_object_fixup(descr->fixup_init, addr, state);
>>> return;
>>>  
>>> case ODEBUG_STATE_DESTROYED:
>>> -   debug_print_object(obj, "init");
>>> +   debug_printobj = true;
>>> break;
>>> default:
>>> break;
>>> }
>>>  
>>> raw_spin_unlock_irqrestore(&db->lock, flags);
>>> +   if (debug_chkstack)
>>> +   debug_object_is_on_stack(addr, onstack);
>>> +   if (debug_printobj)
>>> +   debug_print_object(obj, "init");
>>>
> [..]
>> As a side note, one of the test systems that I used generated a
>> debugobjects splat in the bootup process and the system hung
>> afterward. Applying this patch alone fixes the hanging problem and the
>> system booted up successfully. So it is not really a good idea to call
>> printk() while holding a raw spinlock.
> Right, I like this patch.
> And I think that we, maybe, can go even further.
>
> Some serial consoles call mod_timer(). So what we could have with the
> debug objects enabled was
>
>   mod_timer()
>lock_timer_base()
> debug_activate()
>  printk()
>   call_console_drivers()
>foo_console()
> mod_timer()
>  lock_timer_base()   << deadlock
>
> That's one possible scenario. The other one can involve console's
> IRQ handler, uart port spinlock, mod_timer, debug objects, printk,
> and an eventual deadlock on the uart port spinlock. This one can
> be mitigated with printk_safe. But mod_timer() deadlock will require
> a different fix.
>
> So maybe we need to switch debug objects print-outs to _always_
> printk_deferred(). Debug objects can be used in code which cannot
> do direct printk() - timekeeping is just one example.
>
>   -ss

Actually, I don't think that was the cause of the hang. The debugobjects
splat was caused by debug_object_is_on_stack(); below is the output:

[    6.890048] ODEBUG: object (____ptrval____) is NOT on stack (____ptrval____), but annotated.
[    6.891000] WARNING: CPU: 28 PID: 1 at lib/debugobjects.c:369
__debug_object_init.cold.11+0x51/0x2d6
[    6.891000] Modules linked in:
[    6.891000] CPU: 28 PID: 1 Comm: swapper/0 Not tainted
4.18.0-41.el8.bz1651764_cgroup_debug.x86_64+debug #1
[    6.891000] Hardware name: HPE ProLiant DL120 Gen10/ProLiant DL120
Gen10, BIOS U36 11/14/2017
[    6.891000] RIP: 0010:__debug_object_init.cold.11+0x51/0x2d6
[    6.891000] Code: ea 03 80 3c 02 00 0f 85 85 02 00 00 49 8b 54 24 18
48 89 de 4c 89 44 24 10 48 c7 c7 00 ce 22 94 e8 73 18 62 ff 4c 8b 44 24
10 <0f> 0b e9 60 db ff ff 41 83 c4 01 b8 ff ff 37 00 44 89 25 ce 46 f9
[    6.891000] RSP: :880104187960 EFLAGS: 00010086
[    6.891000] RAX: 0050 RBX: 9764c570 RCX:

[    6.891000] RDX:  RSI:  RDI:
880104178ca8
[    6.891000] RBP: 110020830f34 R08: 8807ce68a1d0 R09:
fbfff2923554
[    6.891000] R10: fbfff2923554 R11: 9491aaa3 R12:
880104178000
[    6.891000] R13: 96c809b8 R14: a370 R15:
8807ce68a1c0
[    6.891000] FS:  () GS:8807d420()
knlGS:
[    6.891000] CS:  0010 DS:  ES:  CR0: 80050033
[    6.891000] CR2:  CR3: 00028de16001 CR4:
007606e0
[    6.891000] DR0:  DR1:  DR2:

[    6.891000] DR3:  DR6: fffe0ff0 DR7:
0400
[    6.891000] PKRU: 
[    6.891000] Call Trace:
[    6.891000]  ? debug_object_fixup+0x30/0x30
[    6.891000]  ? _raw_spin_unlock_irqrestore+0x4b/0x60
[    6.891000]  ? __lockdep_init_map+0x12f/0x510
[    6.891000]  ? __lockdep_init_map+0x12f/0x510
[    6.891000]  virt_efi_get_next_variable+0xa2/0x160
[    6.891000]  efivar_init+0x1c4/0x6d7
[    6.891000]  ? efivar_ssdt_setup+0x3b/0x3b
[    6.891000]  ? efivar_entry_iter+0x120/0x120
[    6.891000]  ? find_held_lock+0x3a/0x1c0
[    6.891000]  ? lock_downgrade+0x5e0/0x5e0
[    6.891000]  ? kmsg_dump_rewind_nolock+0xd9/0xd9
[    6.891000]  ? _raw_spin_unlock_irqrestore+0x4b/0x60
[    6.891000]  ? trace_hardirqs_on_caller+0x381/0x570
[    6.891000]  ? efivar_ssdt_iter+0x1f4/0x1f4
[    6.891000]  efisubsys_init+0x1be/0x4ae
[ 

Re: [PATCH v2 07/17] debugobjects: Move printk out of db lock critical sections

2018-11-21 Thread Waiman Long
On 11/19/2018 01:55 PM, Waiman Long wrote:
> The db->lock is a raw spinlock and so the lock hold time is supposed
> to be short. This will not be the case when printk() is invoked in
> some of the critical sections. In order to avoid the long hold time
> when some messages need to be printed, the debug_object_is_on_stack()
> and debug_print_object() calls are now moved out of those critical
> sections.
>
> Signed-off-by: Waiman Long 
> ---
>  lib/debugobjects.c | 61 
> +-
>  1 file changed, 42 insertions(+), 19 deletions(-)
>
> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index 403dd95..4216d3d 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -376,6 +376,8 @@ static void debug_object_is_on_stack(void *addr, int 
> onstack)
>   struct debug_bucket *db;
>   struct debug_obj *obj;
>   unsigned long flags;
> + bool debug_printobj = false;
> + bool debug_chkstack = false;
>  
>   fill_pool();
>  
> @@ -392,7 +394,7 @@ static void debug_object_is_on_stack(void *addr, int 
> onstack)
>   debug_objects_oom();
>   return;
>   }
> - debug_object_is_on_stack(addr, onstack);
> + debug_chkstack = true;
>   }
>  
>   switch (obj->state) {
> @@ -403,20 +405,25 @@ static void debug_object_is_on_stack(void *addr, int 
> onstack)
>   break;
>  
>   case ODEBUG_STATE_ACTIVE:
> - debug_print_object(obj, "init");
>   state = obj->state;
> raw_spin_unlock_irqrestore(&db->lock, flags);
> + debug_print_object(obj, "init");
>   debug_object_fixup(descr->fixup_init, addr, state);
>   return;
>  
>   case ODEBUG_STATE_DESTROYED:
> - debug_print_object(obj, "init");
> + debug_printobj = true;
>   break;
>   default:
>   break;
>   }
>  
> raw_spin_unlock_irqrestore(&db->lock, flags);
> + if (debug_chkstack)
> + debug_object_is_on_stack(addr, onstack);
> + if (debug_printobj)
> + debug_print_object(obj, "init");
> +
>  }
>  
>  /**
> @@ -474,6 +481,8 @@ int debug_object_activate(void *addr, struct 
> debug_obj_descr *descr)
>  
>   obj = lookup_object(addr, db);
>   if (obj) {
> + bool debug_printobj = false;
> +
>   switch (obj->state) {
>   case ODEBUG_STATE_INIT:
>   case ODEBUG_STATE_INACTIVE:
> @@ -482,14 +491,14 @@ int debug_object_activate(void *addr, struct 
> debug_obj_descr *descr)
>   break;
>  
>   case ODEBUG_STATE_ACTIVE:
> - debug_print_object(obj, "activate");
>   state = obj->state;
> raw_spin_unlock_irqrestore(&db->lock, flags);
> + debug_print_object(obj, "activate");
>   ret = debug_object_fixup(descr->fixup_activate, addr, 
> state);
>   return ret ? 0 : -EINVAL;
>  
>   case ODEBUG_STATE_DESTROYED:
> - debug_print_object(obj, "activate");
> + debug_printobj = true;
>   ret = -EINVAL;
>   break;
>   default:
> @@ -497,10 +506,13 @@ int debug_object_activate(void *addr, struct 
> debug_obj_descr *descr)
>   break;
>   }
> raw_spin_unlock_irqrestore(&db->lock, flags);
> + if (debug_printobj)
> + debug_print_object(obj, "activate");
>   return ret;
>   }
>  
> raw_spin_unlock_irqrestore(&db->lock, flags);
> +
>   /*
>* We are here when a static object is activated. We
>* let the type specific code confirm whether this is
> @@ -532,6 +544,7 @@ void debug_object_deactivate(void *addr, struct 
> debug_obj_descr *descr)
>   struct debug_bucket *db;
>   struct debug_obj *obj;
>   unsigned long flags;
> + bool debug_printobj = false;
>  
>   if (!debug_objects_enabled)
>   return;
> @@ -549,24 +562,27 @@ void debug_object_deactivate(void *addr, struct 
> debug_obj_descr *descr)
>   if (!obj->astate)
>   obj->state = ODEBUG_STATE_INACTIVE;
>   else
> - debug_print_object(obj, "deactivate"

[PATCH v2 11/17] locking/rwsem: Mark rwsem.wait_lock as a terminal lock

2018-11-19 Thread Waiman Long
The wait_lock in a rwsem is always acquired with IRQs disabled. For the
rwsem-xadd.c implementation, no other lock is acquired while holding
the wait_lock. So it satisfies the condition of being a terminal lock.
By marking it as terminal, the lockdep overhead will be reduced.

Signed-off-by: Waiman Long 
---
 include/linux/rwsem.h   | 11 ++-
 kernel/locking/rwsem-xadd.c |  1 +
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
index 67dbb57..a2a2385 100644
--- a/include/linux/rwsem.h
+++ b/include/linux/rwsem.h
@@ -77,16 +77,25 @@ static inline int rwsem_is_locked(struct rw_semaphore *sem)
 # define __RWSEM_DEP_MAP_INIT(lockname)
 #endif
 
+/*
+ * The wait_lock is marked as a terminal lock to reduce lockdep overhead
+ * when the rwsem-xadd.c is used. This is implied when
+ * CONFIG_RWSEM_SPIN_ON_OWNER is true. The rwsem-spinlock.c implementation
+ * allows calling wake_up_process() while holding the wait_lock. So it
+ * can't be marked as terminal in this case.
+ */
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
 #define __RWSEM_OPT_INIT(lockname) , .osq = OSQ_LOCK_UNLOCKED, .owner = NULL
+#define __RWSEM_WAIT_LOCK_INIT(x)  __RAW_TERMINAL_SPIN_LOCK_UNLOCKED(x)
 #else
 #define __RWSEM_OPT_INIT(lockname)
+#define __RWSEM_WAIT_LOCK_INIT(x)  __RAW_SPIN_LOCK_UNLOCKED(x)
 #endif
 
 #define __RWSEM_INITIALIZER(name)  \
{ __RWSEM_INIT_COUNT(name), \
  .wait_list = LIST_HEAD_INIT((name).wait_list),\
- .wait_lock = __RAW_SPIN_LOCK_UNLOCKED(name.wait_lock) \
+ .wait_lock = __RWSEM_WAIT_LOCK_INIT(name.wait_lock)   \
  __RWSEM_OPT_INIT(name)\
  __RWSEM_DEP_MAP_INIT(name) }
 
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 09b1800..3dbe593 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -85,6 +85,7 @@ void __init_rwsem(struct rw_semaphore *sem, const char *name,
 #endif
atomic_long_set(&sem->count, RWSEM_UNLOCKED_VALUE);
raw_spin_lock_init(&sem->wait_lock);
+   lockdep_set_terminal_class(&sem->wait_lock);
INIT_LIST_HEAD(&sem->wait_list);
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
sem->owner = NULL;
-- 
1.8.3.1



[PATCH v2 17/17] locking/lockdep: Check raw/non-raw locking conflicts

2018-11-19 Thread Waiman Long
A task holding a raw spinlock should not acquire a non-raw lock, as
that will break the PREEMPT_RT kernel. A check is now added, and a
lockdep warning will be printed if that happens.

Signed-off-by: Waiman Long 
---
 include/linux/lockdep.h |  6 ++
 include/linux/spinlock_types.h  |  4 ++--
 kernel/locking/lockdep.c| 15 +--
 kernel/locking/spinlock_debug.c |  1 +
 4 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index b9435fb..9a6fe0e 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -150,12 +150,15 @@ struct lock_class_stats {
  *another lock within its critical section is not allowed.
  * 3) LOCKDEP_FLAG_TERMINAL_NESTABLE: This is a terminal lock that can
  *allow one more regular terminal lock to be nested underneath it.
+ * 4) LOCKDEP_FLAG_RAW: This is a raw spinlock. A task holding a raw
+ *spinlock should not acquire a non-raw lock.
  *
  * Only the least significant 4 bits of the flags will be copied to the
  * held_lock structure.
  */
 #define LOCKDEP_FLAG_TERMINAL  (1 << 0)
 #define LOCKDEP_FLAG_TERMINAL_NESTABLE (1 << 1)
+#define LOCKDEP_FLAG_RAW   (1 << 2)
 #define LOCKDEP_FLAG_NOVALIDATE    (1 << 4)
 
 #define LOCKDEP_HLOCK_FLAGS_MASK   0x0f
@@ -333,6 +336,8 @@ extern void lockdep_init_map(struct lockdep_map *lock, 
const char *name,
do { (lock)->dep_map.flags |= LOCKDEP_FLAG_TERMINAL; } while (0)
 #define lockdep_set_terminal_nestable_class(lock) \
do { (lock)->dep_map.flags |= LOCKDEP_FLAG_TERMINAL_NESTABLE; } while (0)
+#define lockdep_set_raw_class(lock) \
+   do { (lock)->dep_map.flags |= LOCKDEP_FLAG_RAW; } while (0)
 
 /*
  * Compare locking classes
@@ -448,6 +453,7 @@ static inline void lockdep_on(void)
do { (void)(key); } while (0)
 #define lockdep_set_subclass(lock, sub)    do { } while (0)
 
+#define lockdep_set_raw_class(lock)    do { } while (0)
 #define lockdep_set_novalidate_class(lock) do { } while (0)
 #define lockdep_set_terminal_class(lock)   do { } while (0)
 #define lockdep_set_terminal_nestable_class(lock)  do { } while (0)
diff --git a/include/linux/spinlock_types.h b/include/linux/spinlock_types.h
index 6a8086e..1d2114b 100644
--- a/include/linux/spinlock_types.h
+++ b/include/linux/spinlock_types.h
@@ -55,11 +55,11 @@
SPIN_DEP_MAP_INIT(lockname, f) }
 
 #define __RAW_SPIN_LOCK_UNLOCKED(lockname) \
-   (raw_spinlock_t) __RAW_SPIN_LOCK_INITIALIZER(lockname, 0)
+   (raw_spinlock_t) __RAW_SPIN_LOCK_INITIALIZER(lockname, LOCKDEP_FLAG_RAW)
 
 #define __RAW_TERMINAL_SPIN_LOCK_UNLOCKED(lockname)\
(raw_spinlock_t) __RAW_SPIN_LOCK_INITIALIZER(lockname, \
-LOCKDEP_FLAG_TERMINAL)
+   LOCKDEP_FLAG_TERMINAL|LOCKDEP_FLAG_RAW)
 
 #define DEFINE_RAW_SPINLOCK(x) raw_spinlock_t x = __RAW_SPIN_LOCK_UNLOCKED(x)
 #define DEFINE_RAW_TERMINAL_SPINLOCK(x)\
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 5a853a6..efafd2d 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3273,8 +3273,8 @@ static int __lock_acquire(struct lockdep_map *lock, 
unsigned int subclass,
 * one isn't a regular terminal lock.
 */
prev_type = hlock_is_terminal(hlock);
-   if (DEBUG_LOCKS_WARN_ON((prev_type == LOCKDEP_FLAG_TERMINAL) ||
-   ((prev_type == LOCKDEP_FLAG_TERMINAL_NESTABLE) &&
+   if (DEBUG_LOCKS_WARN_ON((prev_type & LOCKDEP_FLAG_TERMINAL) ||
+   ((prev_type & LOCKDEP_FLAG_TERMINAL_NESTABLE) &&
 (flags_is_terminal(class->flags) !=
LOCKDEP_FLAG_TERMINAL)))) {
pr_warn("Terminal lock error: prev lock = %s, curr lock 
= %s\n",
@@ -3282,6 +3282,17 @@ static int __lock_acquire(struct lockdep_map *lock, 
unsigned int subclass,
return 0;
}
 
+   /*
+* A task holding a raw spinlock should not acquire another
+* non-raw lock.
+*/
+   if (DEBUG_LOCKS_WARN_ON((prev_type & LOCKDEP_FLAG_RAW) &&
+   !(class->flags & LOCKDEP_FLAG_RAW))) {
+   pr_warn("Raw lock error: prev lock = %s, curr lock = 
%s\n",
+   hlock->instance->name, class->name);
+   return 0;
+   }
+
if (hlock->class_idx == class_idx && nest_lock) {
if (hlock->references) {
/*
diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
index 

[PATCH v2 14/17] dma-debug: Mark free_entries_lock as terminal

2018-11-19 Thread Waiman Long
Making free_entries_lock a terminal spinlock reduces the lockdep
overhead when this lock is used.

Signed-off-by: Waiman Long 
---
 kernel/dma/debug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 231ca46..f891688 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -106,7 +106,7 @@ struct hash_bucket {
 /* List of pre-allocated dma_debug_entry's */
 static LIST_HEAD(free_entries);
 /* Lock for the list above */
-static DEFINE_SPINLOCK(free_entries_lock);
+static DEFINE_TERMINAL_SPINLOCK(free_entries_lock);
 
 /* Global disable flag - will be set in case of an error */
 static bool global_disable __read_mostly;
-- 
1.8.3.1



[PATCH v2 13/17] mm/kasan: Make quarantine_lock a terminal lock

2018-11-19 Thread Waiman Long
Making quarantine_lock a terminal spinlock reduces the lockdep
overhead when this lock is used.

Signed-off-by: Waiman Long 
---
 mm/kasan/quarantine.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
index b209dba..c9d36ab 100644
--- a/mm/kasan/quarantine.c
+++ b/mm/kasan/quarantine.c
@@ -103,7 +103,7 @@ static void qlist_move_all(struct qlist_head *from, struct 
qlist_head *to)
 static int quarantine_tail;
 /* Total size of all objects in global_quarantine across all batches. */
 static unsigned long quarantine_size;
-static DEFINE_RAW_SPINLOCK(quarantine_lock);
+static DEFINE_RAW_TERMINAL_SPINLOCK(quarantine_lock);
 DEFINE_STATIC_SRCU(remove_cache_srcu);
 
 /* Maximum size of the global queue. */
-- 
1.8.3.1



[PATCH v2 10/17] lib/stackdepot: Make depot_lock a terminal spinlock

2018-11-19 Thread Waiman Long
Defining depot_lock as a terminal spinlock reduces the
lockdep overhead when this lock is used.

Signed-off-by: Waiman Long 
---
 lib/stackdepot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index e513459..fb17888 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -78,7 +78,7 @@ struct stack_record {
 static int depot_index;
 static int next_slab_inited;
 static size_t depot_offset;
-static DEFINE_SPINLOCK(depot_lock);
+static DEFINE_TERMINAL_SPINLOCK(depot_lock);
 
 static bool init_stack_slab(void **prealloc)
 {
-- 
1.8.3.1



[PATCH v2 16/17] delay_acct: Mark task's delays->lock as terminal spinlock

2018-11-19 Thread Waiman Long
Making the task's delays->lock a terminal spinlock reduces the
lockdep overhead when this lock is used.

Signed-off-by: Waiman Long 
---
 kernel/delayacct.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/delayacct.c b/kernel/delayacct.c
index 2a12b98..49dd8d3 100644
--- a/kernel/delayacct.c
+++ b/kernel/delayacct.c
@@ -43,8 +43,10 @@ void delayacct_init(void)
 void __delayacct_tsk_init(struct task_struct *tsk)
 {
tsk->delays = kmem_cache_zalloc(delayacct_cache, GFP_KERNEL);
-   if (tsk->delays)
+   if (tsk->delays) {
raw_spin_lock_init(&tsk->delays->lock);
+   lockdep_set_terminal_class(&tsk->delays->lock);
+   }
 }
 
 /*
-- 
1.8.3.1



[PATCH v2 09/17] debugobjects: Make object hash locks nestable terminal locks

2018-11-19 Thread Waiman Long
By making the object hash locks nestable terminal locks, we can avoid
a bunch of unnecessary lockdep validations as well as save space
in the lockdep tables.

Signed-off-by: Waiman Long 
---
 lib/debugobjects.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 4216d3d..c6f3967 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -1129,8 +1129,13 @@ void __init debug_objects_early_init(void)
 {
int i;
 
-   for (i = 0; i < ODEBUG_HASH_SIZE; i++)
+   /*
+* Make the obj_hash locks nestable terminal locks.
+*/
+   for (i = 0; i < ODEBUG_HASH_SIZE; i++) {
raw_spin_lock_init(&obj_hash[i].lock);
+   lockdep_set_terminal_nestable_class(&obj_hash[i].lock);
+   }
 
for (i = 0; i < ODEBUG_POOL_SIZE; i++)
hlist_add_head(&obj_static_pool[i].node, &obj_pool);
-- 
1.8.3.1



[PATCH v2 08/17] locking/lockdep: Add support for nestable terminal locks

2018-11-19 Thread Waiman Long
There are use cases where we want to allow nesting of one terminal lock
underneath another terminal-like lock. That new lock type is called a
nestable terminal lock, which can optionally allow the acquisition of
no more than one regular (non-nestable) terminal lock underneath it.

Signed-off-by: Waiman Long 
---
 include/linux/lockdep.h|  9 -
 kernel/locking/lockdep.c   | 15 +--
 kernel/locking/lockdep_internals.h |  2 +-
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index a146bca..b9435fb 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -148,16 +148,20 @@ struct lock_class_stats {
  * 1) LOCKDEP_FLAG_NOVALIDATE: No full validation, just simple checks.
  * 2) LOCKDEP_FLAG_TERMINAL: This is a terminal lock where lock/unlock on
  *another lock within its critical section is not allowed.
+ * 3) LOCKDEP_FLAG_TERMINAL_NESTABLE: This is a terminal lock that can
+ *allow one more regular terminal lock to be nested underneath it.
  *
  * Only the least significant 4 bits of the flags will be copied to the
  * held_lock structure.
  */
 #define LOCKDEP_FLAG_TERMINAL  (1 << 0)
+#define LOCKDEP_FLAG_TERMINAL_NESTABLE (1 << 1)
 #define LOCKDEP_FLAG_NOVALIDATE    (1 << 4)
 
 #define LOCKDEP_HLOCK_FLAGS_MASK   0x0f
 #define LOCKDEP_NOCHECK_FLAGS  (LOCKDEP_FLAG_NOVALIDATE |\
-LOCKDEP_FLAG_TERMINAL)
+LOCKDEP_FLAG_TERMINAL   |\
+LOCKDEP_FLAG_TERMINAL_NESTABLE)
 
 /*
  * Map the lock object (the lock instance) to the lock-class object.
@@ -327,6 +331,8 @@ extern void lockdep_init_map(struct lockdep_map *lock, 
const char *name,
do { (lock)->dep_map.flags |= LOCKDEP_FLAG_NOVALIDATE; } while (0)
 #define lockdep_set_terminal_class(lock) \
do { (lock)->dep_map.flags |= LOCKDEP_FLAG_TERMINAL; } while (0)
+#define lockdep_set_terminal_nestable_class(lock) \
+   do { (lock)->dep_map.flags |= LOCKDEP_FLAG_TERMINAL_NESTABLE; } while (0)
 
 /*
  * Compare locking classes
@@ -444,6 +450,7 @@ static inline void lockdep_on(void)
 
 #define lockdep_set_novalidate_class(lock) do { } while (0)
 #define lockdep_set_terminal_class(lock)   do { } while (0)
+#define lockdep_set_terminal_nestable_class(lock)  do { } while (0)
 
 /*
  * We don't define lockdep_match_class() and lockdep_match_key() for !LOCKDEP
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 40894c1..5a853a6 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3263,13 +3263,24 @@ static int __lock_acquire(struct lockdep_map *lock, 
unsigned int subclass,
class_idx = class - lock_classes + 1;
 
if (depth) {
+   int prev_type;
+
hlock = curr->held_locks + depth - 1;
 
/*
-* Warn if the previous lock is a terminal lock.
+* Warn if the previous lock is a terminal lock or the
+* previous lock is a nestable terminal lock and the current
+* one isn't a regular terminal lock.
 */
-   if (DEBUG_LOCKS_WARN_ON(hlock_is_terminal(hlock)))
+   prev_type = hlock_is_terminal(hlock);
+   if (DEBUG_LOCKS_WARN_ON((prev_type == LOCKDEP_FLAG_TERMINAL) ||
+   ((prev_type == LOCKDEP_FLAG_TERMINAL_NESTABLE) &&
+(flags_is_terminal(class->flags) !=
+   LOCKDEP_FLAG_TERMINAL)))) {
+   pr_warn("Terminal lock error: prev lock = %s, curr lock 
= %s\n",
+   hlock->instance->name, class->name);
return 0;
+   }
 
if (hlock->class_idx == class_idx && nest_lock) {
if (hlock->references) {
diff --git a/kernel/locking/lockdep_internals.h 
b/kernel/locking/lockdep_internals.h
index 271fba8..51fa141 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -215,5 +215,5 @@ static inline unsigned long debug_class_ops_read(struct 
lock_class *class)
 
 static inline unsigned int flags_is_terminal(unsigned int flags)
 {
-   return flags & LOCKDEP_FLAG_TERMINAL;
+   return flags & (LOCKDEP_FLAG_TERMINAL|LOCKDEP_FLAG_TERMINAL_NESTABLE);
 }
-- 
1.8.3.1



[PATCH v2 15/17] kernfs: Mark kernfs_open_node_lock as terminal lock

2018-11-19 Thread Waiman Long
Making kernfs_open_node_lock a terminal spinlock reduces the
lockdep overhead when this lock is used.

Signed-off-by: Waiman Long 
---
 fs/kernfs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index dbf5bc2..a86fe22 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -29,7 +29,7 @@
  * kernfs_open_file.  kernfs_open_files are chained at
  * kernfs_open_node->files, which is protected by kernfs_open_file_mutex.
  */
-static DEFINE_SPINLOCK(kernfs_open_node_lock);
+static DEFINE_TERMINAL_SPINLOCK(kernfs_open_node_lock);
 static DEFINE_MUTEX(kernfs_open_file_mutex);
 
 struct kernfs_open_node {
-- 
1.8.3.1



[PATCH v2 12/17] cgroup: Mark the rstat percpu lock as terminal

2018-11-19 Thread Waiman Long
Classifying the cgroup rstat percpu locks as terminal locks reduces
the lockdep overhead when these locks are used.

Signed-off-by: Waiman Long 
---
 kernel/cgroup/rstat.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index d503d1a..47f7ffb 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -291,8 +291,13 @@ void __init cgroup_rstat_boot(void)
 {
int cpu;
 
-   for_each_possible_cpu(cpu)
-   raw_spin_lock_init(per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu));
+   for_each_possible_cpu(cpu) {
+   raw_spinlock_t *cgroup_rstat_percpu_lock =
+   per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu);
+
+   raw_spin_lock_init(cgroup_rstat_percpu_lock);
+   lockdep_set_terminal_class(cgroup_rstat_percpu_lock);
+   }
 
BUG_ON(cgroup_rstat_init(&cgrp_dfl_root.cgrp));
 }
-- 
1.8.3.1



[PATCH v2 07/17] debugobjects: Move printk out of db lock critical sections

2018-11-19 Thread Waiman Long
The db->lock is a raw spinlock and so the lock hold time is supposed
to be short. This will not be the case when printk() is invoked in
some of the critical sections. In order to avoid the long hold time
when some messages need to be printed, the debug_object_is_on_stack()
and debug_print_object() calls are now moved out of those critical
sections.

Signed-off-by: Waiman Long 
---
 lib/debugobjects.c | 61 +-
 1 file changed, 42 insertions(+), 19 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 403dd95..4216d3d 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -376,6 +376,8 @@ static void debug_object_is_on_stack(void *addr, int 
onstack)
struct debug_bucket *db;
struct debug_obj *obj;
unsigned long flags;
+   bool debug_printobj = false;
+   bool debug_chkstack = false;
 
fill_pool();
 
@@ -392,7 +394,7 @@ static void debug_object_is_on_stack(void *addr, int 
onstack)
debug_objects_oom();
return;
}
-   debug_object_is_on_stack(addr, onstack);
+   debug_chkstack = true;
}
 
switch (obj->state) {
@@ -403,20 +405,25 @@ static void debug_object_is_on_stack(void *addr, int 
onstack)
break;
 
case ODEBUG_STATE_ACTIVE:
-   debug_print_object(obj, "init");
state = obj->state;
raw_spin_unlock_irqrestore(&db->lock, flags);
+   debug_print_object(obj, "init");
debug_object_fixup(descr->fixup_init, addr, state);
return;
 
case ODEBUG_STATE_DESTROYED:
-   debug_print_object(obj, "init");
+   debug_printobj = true;
break;
default:
break;
}
 
raw_spin_unlock_irqrestore(&db->lock, flags);
+   if (debug_chkstack)
+   debug_object_is_on_stack(addr, onstack);
+   if (debug_printobj)
+   debug_print_object(obj, "init");
+
 }
 
 /**
@@ -474,6 +481,8 @@ int debug_object_activate(void *addr, struct 
debug_obj_descr *descr)
 
obj = lookup_object(addr, db);
if (obj) {
+   bool debug_printobj = false;
+
switch (obj->state) {
case ODEBUG_STATE_INIT:
case ODEBUG_STATE_INACTIVE:
@@ -482,14 +491,14 @@ int debug_object_activate(void *addr, struct 
debug_obj_descr *descr)
break;
 
case ODEBUG_STATE_ACTIVE:
-   debug_print_object(obj, "activate");
state = obj->state;
raw_spin_unlock_irqrestore(&db->lock, flags);
+   debug_print_object(obj, "activate");
ret = debug_object_fixup(descr->fixup_activate, addr, 
state);
return ret ? 0 : -EINVAL;
 
case ODEBUG_STATE_DESTROYED:
-   debug_print_object(obj, "activate");
+   debug_printobj = true;
ret = -EINVAL;
break;
default:
@@ -497,10 +506,13 @@ int debug_object_activate(void *addr, struct 
debug_obj_descr *descr)
break;
}
raw_spin_unlock_irqrestore(&db->lock, flags);
+   if (debug_printobj)
+   debug_print_object(obj, "activate");
return ret;
}
 
raw_spin_unlock_irqrestore(&db->lock, flags);
+
/*
 * We are here when a static object is activated. We
 * let the type specific code confirm whether this is
@@ -532,6 +544,7 @@ void debug_object_deactivate(void *addr, struct 
debug_obj_descr *descr)
struct debug_bucket *db;
struct debug_obj *obj;
unsigned long flags;
+   bool debug_printobj = false;
 
if (!debug_objects_enabled)
return;
@@ -549,24 +562,27 @@ void debug_object_deactivate(void *addr, struct 
debug_obj_descr *descr)
if (!obj->astate)
obj->state = ODEBUG_STATE_INACTIVE;
else
-   debug_print_object(obj, "deactivate");
+   debug_printobj = true;
break;
 
case ODEBUG_STATE_DESTROYED:
-   debug_print_object(obj, "deactivate");
+   debug_printobj = true;
break;
default:
break;
}
-   } else {
+   }
+
+   raw_spin_unlock_irqrestore(&db->lock, flags);
+   if (!obj) {
struct debug_obj o = { .object = addr,
   

[PATCH v2 04/17] locking/lockdep: Add DEFINE_TERMINAL_SPINLOCK() and related macros

2018-11-19 Thread Waiman Long
Add the new DEFINE_RAW_TERMINAL_SPINLOCK() and DEFINE_TERMINAL_SPINLOCK()
macros to define a raw terminal spinlock and a terminal spinlock.

Signed-off-by: Waiman Long 
---
 include/linux/spinlock_types.h | 34 +++---
 kernel/printk/printk_safe.c|  2 +-
 2 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/include/linux/spinlock_types.h b/include/linux/spinlock_types.h
index 24b4e6f..6a8086e 100644
--- a/include/linux/spinlock_types.h
+++ b/include/linux/spinlock_types.h
@@ -33,9 +33,10 @@
 #define SPINLOCK_OWNER_INIT    ((void *)-1L)
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define SPIN_DEP_MAP_INIT(lockname)   .dep_map = { .name = #lockname }
+# define SPIN_DEP_MAP_INIT(lockname, f) .dep_map = { .name = #lockname, \
+.flags = f }
 #else
-# define SPIN_DEP_MAP_INIT(lockname)
+# define SPIN_DEP_MAP_INIT(lockname, f)
 #endif
 
 #ifdef CONFIG_DEBUG_SPINLOCK
@@ -47,16 +48,22 @@
 # define SPIN_DEBUG_INIT(lockname)
 #endif
 
-#define __RAW_SPIN_LOCK_INITIALIZER(lockname)  \
-   {   \
-   .raw_lock = __ARCH_SPIN_LOCK_UNLOCKED,  \
-   SPIN_DEBUG_INIT(lockname)   \
-   SPIN_DEP_MAP_INIT(lockname) }
+#define __RAW_SPIN_LOCK_INITIALIZER(lockname, f)   \
+   {   \
+   .raw_lock = __ARCH_SPIN_LOCK_UNLOCKED,  \
+   SPIN_DEBUG_INIT(lockname)   \
+   SPIN_DEP_MAP_INIT(lockname, f) }
 
 #define __RAW_SPIN_LOCK_UNLOCKED(lockname) \
-   (raw_spinlock_t) __RAW_SPIN_LOCK_INITIALIZER(lockname)
+   (raw_spinlock_t) __RAW_SPIN_LOCK_INITIALIZER(lockname, 0)
+
+#define __RAW_TERMINAL_SPIN_LOCK_UNLOCKED(lockname)\
+   (raw_spinlock_t) __RAW_SPIN_LOCK_INITIALIZER(lockname, \
+LOCKDEP_FLAG_TERMINAL)
 
 #define DEFINE_RAW_SPINLOCK(x) raw_spinlock_t x = __RAW_SPIN_LOCK_UNLOCKED(x)
+#define DEFINE_RAW_TERMINAL_SPINLOCK(x)\
+   raw_spinlock_t x = __RAW_TERMINAL_SPIN_LOCK_UNLOCKED(x)
 
 typedef struct spinlock {
union {
@@ -72,13 +79,18 @@
};
 } spinlock_t;
 
-#define __SPIN_LOCK_INITIALIZER(lockname) \
-   { { .rlock = __RAW_SPIN_LOCK_INITIALIZER(lockname) } }
+#define __SPIN_LOCK_INITIALIZER(lockname, f) \
+   { { .rlock = __RAW_SPIN_LOCK_INITIALIZER(lockname, f) } }
 
 #define __SPIN_LOCK_UNLOCKED(lockname) \
-   (spinlock_t ) __SPIN_LOCK_INITIALIZER(lockname)
+   (spinlock_t) __SPIN_LOCK_INITIALIZER(lockname, 0)
+
+#define __TERMINAL_SPIN_LOCK_UNLOCKED(lockname) \
+   (spinlock_t) __SPIN_LOCK_INITIALIZER(lockname, LOCKDEP_FLAG_TERMINAL)
 
 #define DEFINE_SPINLOCK(x) spinlock_t x = __SPIN_LOCK_UNLOCKED(x)
+#define DEFINE_TERMINAL_SPINLOCK(x) \
+   spinlock_t x = __TERMINAL_SPIN_LOCK_UNLOCKED(x)
 
 #include <linux/rwlock_types.h>
 
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index 0913b4d..8ff1033 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -192,7 +192,7 @@ static void report_message_lost(struct printk_safe_seq_buf 
*s)
 static void __printk_safe_flush(struct irq_work *work)
 {
static raw_spinlock_t read_lock =
-   __RAW_SPIN_LOCK_INITIALIZER(read_lock);
+   __RAW_SPIN_LOCK_UNLOCKED(read_lock);
struct printk_safe_seq_buf *s =
container_of(work, struct printk_safe_seq_buf, work);
unsigned long flags;
-- 
1.8.3.1



[PATCH v2 06/17] debugobjects: Mark pool_lock as a terminal lock

2018-11-19 Thread Waiman Long
By marking the internal pool_lock as a terminal lock, lockdep will be
able to skip full validation to improve locking performance.

Signed-off-by: Waiman Long 
---
 lib/debugobjects.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 70935ed..403dd95 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -39,7 +39,7 @@ struct debug_bucket {
 
static struct debug_obj obj_static_pool[ODEBUG_POOL_SIZE] __initdata;
 
-static DEFINE_RAW_SPINLOCK(pool_lock);
+static DEFINE_RAW_TERMINAL_SPINLOCK(pool_lock);
 
 static HLIST_HEAD(obj_pool);
 static HLIST_HEAD(obj_to_free);
-- 
1.8.3.1



[PATCH v2 02/17] locking/lockdep: Rework lockdep_set_novalidate_class()

2018-11-19 Thread Waiman Long
The current lockdep_set_novalidate_class() implementation is something
of a hack. It assigns a special class key to that lock and calls
lockdep_init_map() twice.

This patch changes the implementation to use a flag bit instead.
This will allow other special locking class types to be defined and
used in the lockdep code.  A new "flags" field is added to both the
lockdep_map and lock_class structures.

The new field can now be used to designate a lock and a class object
as novalidate. The lockdep_set_novalidate_class() call, however, should
be called only after lock initialization which calls lockdep_init_map().

For 64-bit architectures, the new flags field won't increase the size
of the lock_class structure. The lockdep_map structure won't change
either for 64-bit architectures with CONFIG_LOCK_STAT configured.

Please note that lockdep_set_novalidate_class() should not be used at
all unless there is an overwhelming reason to do so.  Hopefully we can
retire it in the near future.

Signed-off-by: Waiman Long 
---
 include/linux/lockdep.h  | 17 ++---
 kernel/locking/lockdep.c | 14 +++---
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index c5335df..8fe5b4f 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -58,8 +58,6 @@ struct lock_class_key {
struct lockdep_subclass_key subkeys[MAX_LOCKDEP_SUBCLASSES];
 };
 
-extern struct lock_class_key __lockdep_no_validate__;
-
 #define LOCKSTAT_POINTS    4
 
 /*
@@ -94,6 +92,11 @@ struct lock_class {
 struct list_head    locks_after, locks_before;
 
/*
+* Lock class type flags
+*/
+   unsigned int    flags;
+
+   /*
 * Generation counter, when doing certain classes of graph walking,
 * to ensure that we check one node only once:
 */
@@ -140,6 +143,12 @@ struct lock_class_stats {
 #endif
 
 /*
+ * Lockdep class type flags
+ * 1) LOCKDEP_FLAG_NOVALIDATE: No full validation, just simple checks.
+ */
+#define LOCKDEP_FLAG_NOVALIDATE    (1 << 0)
+
+/*
  * Map the lock object (the lock instance) to the lock-class object.
  * This is embedded into specific lock instances:
  */
@@ -147,6 +156,7 @@ struct lockdep_map {
struct lock_class_key   *key;
 struct lock_class    *class_cache[NR_LOCKDEP_CACHING_CLASSES];
const char  *name;
+   unsigned int    flags;
 #ifdef CONFIG_LOCK_STAT
int cpu;
unsigned long   ip;
@@ -294,7 +304,8 @@ extern void lockdep_init_map(struct lockdep_map *lock, 
const char *name,
 (lock)->dep_map.key, sub)
 
 #define lockdep_set_novalidate_class(lock) \
-   lockdep_set_class_and_name(lock, &__lockdep_no_validate__, #lock)
+   do { (lock)->dep_map.flags |= LOCKDEP_FLAG_NOVALIDATE; } while (0)
+
 /*
  * Compare locking classes
  */
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 1efada2..493b567 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -692,10 +692,11 @@ static int count_matching_names(struct lock_class 
*new_class)
hlist_for_each_entry_rcu(class, hash_head, hash_entry) {
if (class->key == key) {
/*
-* Huh! same key, different name? Did someone trample
-* on some memory? We're most confused.
+* Huh! same key, different name or flags? Did someone
+* trample on some memory? We're most confused.
 */
-   WARN_ON_ONCE(class->name != lock->name);
+   WARN_ON_ONCE((class->name  != lock->name) ||
+(class->flags != lock->flags));
return class;
}
}
@@ -788,6 +789,7 @@ static bool assign_lock_key(struct lockdep_map *lock)
debug_atomic_inc(nr_unused_locks);
class->key = key;
class->name = lock->name;
+   class->flags = lock->flags;
class->subclass = subclass;
INIT_LIST_HEAD(&class->lock_entry);
INIT_LIST_HEAD(&class->locks_before);
@@ -3108,6 +3110,7 @@ static void __lockdep_init_map(struct lockdep_map *lock, 
const char *name,
return;
}
 
+   lock->flags = 0;
lock->name = name;
 
/*
@@ -3152,9 +3155,6 @@ void lockdep_init_map(struct lockdep_map *lock, const 
char *name,
 }
 EXPORT_SYMBOL_GPL(lockdep_init_map);
 
-struct lock_class_key __lockdep_no_validate__;
-EXPORT_SYMBOL_GPL(__lockdep_no_validate__);
-
 static int
 print_lock_nested_lock_not_held(struct task_struct *curr,
struct held_lock *hlock,
@@ -3215,7 +3215,7 @@ static int __lock_a

[PATCH v2 03/17] locking/lockdep: Add a new terminal lock type

2018-11-19 Thread Waiman Long
A terminal lock is a lock where further locking or unlocking on another
lock is not allowed. IOW, no forward dependency is permitted.

With such a restriction in place, we don't really need to do a full
validation of the lock chain involving a terminal lock.  Instead,
we just check if there is any further locking or unlocking on another
lock when a terminal lock is being held.

Only spinlocks which are acquired by the _irq or _irqsave variants
or in IRQ disabled context should be classified as terminal locks.

By adding this new lock type, we can save entries in lock_chains[],
chain_hlocks[], list_entries[] and stack_trace[]. By marking suitable
locks as terminal, we reduce the chance of overflowing those tables,
allowing them to focus on locks that can have both forward and backward
dependencies.

Four bits are stolen from the pin_count of the held_lock structure
to hold a new 4-bit flags field. The pin_count field is essentially a
summation of 16-bit random cookie values. Removing 4 bits still allows
the pin_count to accumulate up to almost 4096 of those cookie values.

Signed-off-by: Waiman Long 
---
 include/linux/lockdep.h| 29 ---
 kernel/locking/lockdep.c   | 47 --
 kernel/locking/lockdep_internals.h |  5 
 kernel/locking/lockdep_proc.c  | 11 +++--
 4 files changed, 80 insertions(+), 12 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 8fe5b4f..a146bca 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -144,9 +144,20 @@ struct lock_class_stats {
 
 /*
  * Lockdep class type flags
+ *
  * 1) LOCKDEP_FLAG_NOVALIDATE: No full validation, just simple checks.
+ * 2) LOCKDEP_FLAG_TERMINAL: This is a terminal lock where lock/unlock on
+ *another lock within its critical section is not allowed.
+ *
+ * Only the least significant 4 bits of the flags will be copied to the
+ * held_lock structure.
  */
-#define LOCKDEP_FLAG_NOVALIDATE    (1 << 0)
+#define LOCKDEP_FLAG_TERMINAL  (1 << 0)
+#define LOCKDEP_FLAG_NOVALIDATE    (1 << 4)
+
+#define LOCKDEP_HLOCK_FLAGS_MASK   0x0f
+#define LOCKDEP_NOCHECK_FLAGS  (LOCKDEP_FLAG_NOVALIDATE |\
+LOCKDEP_FLAG_TERMINAL)
 
 /*
  * Map the lock object (the lock instance) to the lock-class object.
@@ -263,7 +274,16 @@ struct held_lock {
unsigned int check:1;   /* see lock_acquire() comment */
unsigned int hardirqs_off:1;
unsigned int references:12; /* 32 bits */
-   unsigned int pin_count;
+   /*
+* Four bits are stolen from pin_count for flags so as not to
+* increase the size of the structure. The stolen bits may not
+* be enough in the future as more flag bits are added. However,
+* not all of them may need to be checked in the held_lock
+* structure. We just have to make sure that the relevant
+* ones will be in the 4 least significant bits.
+*/
+   unsigned int pin_count:28;
+   unsigned int flags:4;
 };
 
 /*
@@ -305,6 +325,8 @@ extern void lockdep_init_map(struct lockdep_map *lock, 
const char *name,
 
 #define lockdep_set_novalidate_class(lock) \
do { (lock)->dep_map.flags |= LOCKDEP_FLAG_NOVALIDATE; } while (0)
+#define lockdep_set_terminal_class(lock) \
+   do { (lock)->dep_map.flags |= LOCKDEP_FLAG_TERMINAL; } while (0)
 
 /*
  * Compare locking classes
@@ -420,7 +442,8 @@ static inline void lockdep_on(void)
do { (void)(key); } while (0)
#define lockdep_set_subclass(lock, sub)    do { } while (0)
 
-#define lockdep_set_novalidate_class(lock) do { } while (0)
+#define lockdep_set_novalidate_class(lock) do { } while (0)
+#define lockdep_set_terminal_class(lock)   do { } while (0)
 
 /*
  * We don't define lockdep_match_class() and lockdep_match_key() for !LOCKDEP
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 493b567..40894c1 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3020,6 +3020,11 @@ static inline int separate_irq_context(struct 
task_struct *curr,
 
 #endif /* defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING) */
 
+static int hlock_is_terminal(struct held_lock *hlock)
+{
+   return flags_is_terminal(hlock->flags);
+}
+
 /*
  * Mark a lock with a usage bit, and validate the state transition:
  */
@@ -3047,7 +3052,11 @@ static int mark_lock(struct task_struct *curr, struct 
held_lock *this,
 
hlock_class(this)->usage_mask |= new_mask;
 
-   if (!save_trace(hlock_class(this)->usage_traces + new_bit))
+   /*
+* We don't need to save the stack trace for terminal locks.
+*/
+   if (!hlock_is_terminal(this) &&
+   !save_trace(hlock_class(this)->usage_traces + new_bit))
return 0;
 
   

[PATCH v2 01/17] locking/lockdep: Remove version from lock_class structure

2018-11-19 Thread Waiman Long
It turns out the version field in the lock_class structure isn't used
anywhere. Just remove it.

Signed-off-by: Waiman Long 
---
 include/linux/lockdep.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 1fd82ff..c5335df 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -97,8 +97,6 @@ struct lock_class {
 * Generation counter, when doing certain classes of graph walking,
 * to ensure that we check one node only once:
 */
-   unsigned intversion;
-
int name_version;
const char  *name;
 
-- 
1.8.3.1



[PATCH v2 05/17] printk: Mark logbuf_lock & console_owner_lock as terminal locks

2018-11-19 Thread Waiman Long
Marking logbuf_lock and console_owner_lock as terminal locks reduces
the performance overhead when those locks are used with lockdep
enabled.

Signed-off-by: Waiman Long 
---
 kernel/printk/printk.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 1b2a029..bdbbe31 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -367,7 +367,7 @@ __packed __aligned(4)
  * within the scheduler's rq lock. It must be released before calling
  * console_unlock() or anything else that might wake up a process.
  */
-DEFINE_RAW_SPINLOCK(logbuf_lock);
+DEFINE_RAW_TERMINAL_SPINLOCK(logbuf_lock);
 
 /*
  * Helper macros to lock/unlock logbuf_lock and switch between
@@ -1568,7 +1568,7 @@ int do_syslog(int type, char __user *buf, int len, int 
source)
 };
 #endif
 
-static DEFINE_RAW_SPINLOCK(console_owner_lock);
+static DEFINE_RAW_TERMINAL_SPINLOCK(console_owner_lock);
 static struct task_struct *console_owner;
 static bool console_waiter;
 
-- 
1.8.3.1



[PATCH v2 00/17] locking/lockdep: Add a new class of terminal locks

2018-11-19 Thread Waiman Long
 v1->v2:
  - Mark more locks as terminal.
  - Add a patch to remove the unused version field from lock_class.
  - Steal some bits from pin_count of held_lock for flags.
  - Add a patch to warn if a task holding a raw spinlock is acquiring an
non-raw lock.

The purpose of this patchset is to add a new class of locks called
terminal locks and to convert some of the low-level raw or regular
spinlocks to terminal locks. A terminal lock does not have forward
dependencies and won't allow a lock or unlock operation on another
lock. Two-level nesting of terminal locks is allowed, though.

Only spinlocks that are acquired with the _irq/_irqsave variants or
acquired in an IRQ disabled context should be classified as terminal
locks.
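
For reference, a lock can be marked terminal either statically with the
new macros from patch 04 or at init time with the setter from patch 03,
e.g. (abbreviated from patches 06 and 11):

/* Static definition of a raw terminal spinlock (patch 06). */
static DEFINE_RAW_TERMINAL_SPINLOCK(pool_lock);

/* Run-time marking of an already-initialized lock (patch 11). */
raw_spin_lock_init(&sem->wait_lock);
lockdep_set_terminal_class(&sem->wait_lock);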

Because of the restrictions on terminal locks, we can do simple checks on
them without using the lockdep lock validation machinery. The advantages
of making these changes are as follows:

 1) It saves table entries used by the validation code and hence makes
it harder to overflow those tables.

 2) The lockdep check in __lock_acquire() will be a little bit faster
for terminal locks without using the lock validation code.

In fact, it is possible to overflow some of the tables by running
a variety of different workloads on a debug kernel. I have seen bug
reports about exhausting MAX_LOCKDEP_KEYS, MAX_LOCKDEP_ENTRIES and
MAX_STACK_TRACE_ENTRIES. This patch will help to reduce the chance
of overflowing some of the tables.

Below are selected output lines from the lockdep_stats files of the
patched and unpatched kernels after bootup and after running parallel
kernel builds and some perf benchmarks.

  Item                     Unpatched kernel  Patched kernel  % Change
  ----                     ----------------  --------------  --------
  direct dependencies             9924             9032        -9.0%
  dependency chains              18258            16326       -10.6%
  dependency chain hlocks        73198            64927       -11.3%
  stack-trace entries           110502           103225        -6.6%

There were some reductions in the size of the lockdep tables. They were
not significant, but it is still a good start to rein in the number of
entries in those tables to make it harder to overflow them.

In terms of performance, there isn't much noticeable difference in
either the kernel build or the perf benchmarks. A low-level locking
microbenchmark with 4 locking threads shows the following locking rates
on a Haswell system:

   Kernel            Lock           Rate
   ------            ----           ----
   Unpatched kernel  Regular lock   2,288 kop/s
   Patched kernel    Regular lock   2,352 kop/s
                     Terminal lock  2,512 kop/s

I was not sure why there was a slight performance improvement with
the patched kernel. However, comparing the regular and terminal lock
results, there was a slight 7% improvement in locking throughput for
terminal locks.

Looking at the perf output for regular lock:

   5.43%  run-locktest  [kernel.vmlinux]  [k] __lock_acquire
   4.65%  run-locktest  [kernel.vmlinux]  [k] lock_contended
   2.80%  run-locktest  [kernel.vmlinux]  [k] lock_acquired
   2.53%  run-locktest  [kernel.vmlinux]  [k] lock_release
   1.42%  run-locktest  [kernel.vmlinux]  [k] mark_lock

For terminal lock:

   5.00%  run-locktest  [kernel.vmlinux]  [k] __lock_acquire
   4.66%  run-locktest  [kernel.vmlinux]  [k] lock_contended
   2.88%  run-locktest  [kernel.vmlinux]  [k] lock_acquired
   2.55%  run-locktest  [kernel.vmlinux]  [k] lock_release
   1.34%  run-locktest  [kernel.vmlinux]  [k] mark_lock

The __lock_acquire() function ran a bit faster with terminal locks,
but the other lockdep functions remain more or less the same in terms
of performance.

In terms of internal lockdep structure sizes, there should be no size
increase for 64-bit architectures with CONFIG_LOCK_STAT defined.
The lockdep_map structure will increase in size for 32-bit architectures
or when CONFIG_LOCK_STAT isn't defined.

Waiman Long (17):
  locking/lockdep: Remove version from lock_class structure
  locking/lockdep: Rework lockdep_set_novalidate_class()
  locking/lockdep: Add a new terminal lock type
  locking/lockdep: Add DEFINE_TERMINAL_SPINLOCK() and related macros
  printk: Mark logbuf_lock & console_owner_lock as terminal locks
  debugobjects: Mark pool_lock as a terminal lock
  debugobjects: Move printk out of db lock critical sections
  locking/lockdep: Add support for nestable terminal locks
  debugobjects: Make object hash locks nestable terminal locks
  lib/stackdepot: Make depot_lock a terminal spinlock
  locking/rwsem: Mark rwsem.wait_lock as a terminal lock
  cgroup: Mark the rstat percpu lock as terminal
  mm/kasan: Make quarantine_lock a terminal lock
  dma-debug: Mark free_entries_lock as terminal
  kernfs: Mark kernfs_open_node_lock as terminal lock
  delay_acct: Mark task's delays->lock as terminal spinlock
  locking/lockdep: