Re: [RFC PATCH] Add split_lock

2021-04-09 Thread Thomas Gleixner
On Thu, Apr 08 2021 at 12:18, Peter Zijlstra wrote:
> On Thu, Apr 08, 2021 at 06:23:38AM +0100, Matthew Wilcox (Oracle) wrote:
>> bit_spinlocks are horrible on RT because there's absolutely nowhere
>> to put the mutex to sleep on.  They also do not participate in lockdep
>> because there's nowhere to put the map.
>> 
>> Most (all?) bit spinlocks are actually a split lock; logically they
>> could be treated as a single spinlock, but for performance, we want to
>> split the lock over many objects.  Introduce the split_lock as somewhere
>> to store the lockdep map and as somewhere that the RT kernel can put
>> a mutex.  It may also let us store a ticket lock for better performance
>> on non-RT kernels in the future, but I have left the current cpu_relax()
>> implementation intact for now.
>
> I think I like it, but I'm not sure it'll work for RT as is. It's a bit
> like qrwlock in that it only uses the internal (split) lock for
> contention, but that doesn't work for PI.
>
> I've not recently looked at RT, but I think they simply used to bloat a
> number of the data structures with a real lock. Sebastian and Thomas
> will know better.

Correct, but it depends on the nature of the bit spinlock and the lock
held times. Some of them we just keep as-is because it's not worth the
trouble.

Thanks,

tglx


Re: [RFC PATCH] Add split_lock

2021-04-08 Thread Peter Zijlstra
On Thu, Apr 08, 2021 at 06:23:38AM +0100, Matthew Wilcox (Oracle) wrote:
> bit_spinlocks are horrible on RT because there's absolutely nowhere
> to put the mutex to sleep on.  They also do not participate in lockdep
> because there's nowhere to put the map.
> 
> Most (all?) bit spinlocks are actually a split lock; logically they
> could be treated as a single spinlock, but for performance, we want to
> split the lock over many objects.  Introduce the split_lock as somewhere
> to store the lockdep map and as somewhere that the RT kernel can put
> a mutex.  It may also let us store a ticket lock for better performance
> on non-RT kernels in the future, but I have left the current cpu_relax()
> implementation intact for now.

I think I like it, but I'm not sure it'll work for RT as is. It's a bit
like qrwlock in that it only uses the internal (split) lock for
contention, but that doesn't work for PI.

I've not recently looked at RT, but I think they simply used to bloat a
number of the data structures with a real lock. Sebastian and Thomas
will know better.
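
The "bloat a number of the data structures with a real lock" approach
mentioned above would look roughly like the toy sketch below. This is
purely illustrative (the struct name is invented, not from any tree):
on PREEMPT_RT each protected object carries a real sleeping lock whose
owner is known and can be priority-boosted, while !RT builds keep only
the lock bit.

/*
 * Toy sketch of the RT approach mentioned above, for illustration only:
 * on PREEMPT_RT each bucket carries a real spinlock_t (rtmutex-backed
 * there), giving PI a known owner to boost; on !RT the bit in the head
 * pointer remains the only lock.
 */
#include <linux/list_bl.h>
#include <linux/spinlock.h>

struct example_bl_bucket {
	struct hlist_bl_head	head;
#ifdef CONFIG_PREEMPT_RT
	spinlock_t		lock;	/* sleeping, PI-aware lock on RT */
#endif
};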


[RFC PATCH] Add split_lock

2021-04-07 Thread Matthew Wilcox (Oracle)
bit_spinlocks are horrible on RT because there's absolutely nowhere
to put the mutex to sleep on.  They also do not participate in lockdep
because there's nowhere to put the map.

Most (all?) bit spinlocks are actually a split lock; logically they
could be treated as a single spinlock, but for performance, we want to
split the lock over many objects.  Introduce the split_lock as somewhere
to store the lockdep map and as somewhere that the RT kernel can put
a mutex.  It may also let us store a ticket lock for better performance
on non-RT kernels in the future, but I have left the current cpu_relax()
implementation intact for now.

The API change breaks all users except for the two which have been
converted.  This is an RFC, and I'm willing to fix all the rest.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 fs/dcache.c                  | 25 ++--
 include/linux/bit_spinlock.h | 36 ++---
 include/linux/list_bl.h      |  9
 include/linux/split_lock.h   | 45
 mm/slub.c                    |  6 +++--
 5 files changed, 84 insertions(+), 37 deletions(-)
 create mode 100644 include/linux/split_lock.h
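
For orientation before the hunks below, here is a minimal sketch of what
such a split_lock could look like, assuming only lockdep storage is
needed on !RT builds. Everything beyond struct split_lock and
DEFINE_SPLIT_LOCK() is illustrative; the patch's new
include/linux/split_lock.h is authoritative.

#include <linux/lockdep.h>

/*
 * Sketch only -- see the patch's include/linux/split_lock.h for the
 * real thing.  On !RT the split_lock needs no storage beyond the
 * lockdep map; the per-object bit remains the actual lock word.
 */
struct split_lock {
#ifdef CONFIG_DEBUG_LOCK_ALLOC
	struct lockdep_map dep_map;
#endif
};

#ifdef CONFIG_DEBUG_LOCK_ALLOC
#define DEFINE_SPLIT_LOCK(name)						\
	struct split_lock name = {					\
		.dep_map = STATIC_LOCKDEP_MAP_INIT(#name, &name),	\
	}
#else
#define DEFINE_SPLIT_LOCK(name)	struct split_lock name = { }
#endif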

diff --git a/fs/dcache.c b/fs/dcache.c
index 7d24ff7eb206..a3861d330001 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -96,6 +96,7 @@ EXPORT_SYMBOL(slash_name);
 
 static unsigned int d_hash_shift __read_mostly;
 
+static DEFINE_SPLIT_LOCK(d_hash_lock);
 static struct hlist_bl_head *dentry_hashtable __read_mostly;
 
 static inline struct hlist_bl_head *d_hash(unsigned int hash)
@@ -469,9 +470,9 @@ static void ___d_drop(struct dentry *dentry)
else
b = d_hash(dentry->d_name.hash);
 
-   hlist_bl_lock(b);
+   hlist_bl_lock(b, &d_hash_lock);
__hlist_bl_del(&dentry->d_hash);
-   hlist_bl_unlock(b);
+   hlist_bl_unlock(b, &d_hash_lock);
 }
 
 void __d_drop(struct dentry *dentry)
@@ -2074,9 +2075,9 @@ static struct dentry *__d_instantiate_anon(struct dentry *dentry,
__d_set_inode_and_type(dentry, inode, add_flags);
hlist_add_head(&dentry->d_u.d_alias, &inode->i_dentry);
if (!disconnected) {
-   hlist_bl_lock(&dentry->d_sb->s_roots);
+   hlist_bl_lock(&dentry->d_sb->s_roots, &d_hash_lock);
hlist_bl_add_head(&dentry->d_hash, &dentry->d_sb->s_roots);
-   hlist_bl_unlock(&dentry->d_sb->s_roots);
+   hlist_bl_unlock(&dentry->d_sb->s_roots, &d_hash_lock);
}
spin_unlock(&dentry->d_lock);
spin_unlock(&inode->i_lock);
@@ -2513,9 +2514,9 @@ static void __d_rehash(struct dentry *entry)
 {
struct hlist_bl_head *b = d_hash(entry->d_name.hash);
 
-   hlist_bl_lock(b);
+   hlist_bl_lock(b, &d_hash_lock);
hlist_bl_add_head_rcu(&entry->d_hash, b);
-   hlist_bl_unlock(b);
+   hlist_bl_unlock(b, &d_hash_lock);
 }
 
 /**
@@ -2606,9 +2607,9 @@ struct dentry *d_alloc_parallel(struct dentry *parent,
goto retry;
}
 
-   hlist_bl_lock(b);
+   hlist_bl_lock(b, &d_hash_lock);
if (unlikely(READ_ONCE(parent->d_inode->i_dir_seq) != seq)) {
-   hlist_bl_unlock(b);
+   hlist_bl_unlock(b, &d_hash_lock);
rcu_read_unlock();
goto retry;
}
@@ -2626,7 +2627,7 @@ struct dentry *d_alloc_parallel(struct dentry *parent,
continue;
if (!d_same_name(dentry, parent, name))
continue;
-   hlist_bl_unlock(b);
+   hlist_bl_unlock(b, &d_hash_lock);
/* now we can try to grab a reference */
if (!lockref_get_not_dead(&dentry->d_lockref)) {
rcu_read_unlock();
@@ -2664,7 +2665,7 @@ struct dentry *d_alloc_parallel(struct dentry *parent,
new->d_flags |= DCACHE_PAR_LOOKUP;
new->d_wait = wq;
hlist_bl_add_head_rcu(&new->d_u.d_in_lookup_hash, b);
-   hlist_bl_unlock(b);
+   hlist_bl_unlock(b, &d_hash_lock);
return new;
 mismatch:
spin_unlock(&dentry->d_lock);
@@ -2677,12 +2678,12 @@ void __d_lookup_done(struct dentry *dentry)
 {
struct hlist_bl_head *b = in_lookup_hash(dentry->d_parent,
 dentry->d_name.hash);
-   hlist_bl_lock(b);
+   hlist_bl_lock(b, &d_hash_lock);
dentry->d_flags &= ~DCACHE_PAR_LOOKUP;
__hlist_bl_del(&dentry->d_u.d_in_lookup_hash);
wake_up_all(dentry->d_wait);
dentry->d_wait = NULL;
-   hlist_bl_unlock(b);
+   hlist_bl_unlock(b, &d_hash_lock);
INIT_HLIST_NODE(&dentry->d_u.d_alias);
INIT_LIST_HEAD(&dentry->d_lru);
 }
diff --git a/include/linux/bit_spinlock.h b/include/linux/bit_spinlock.h
index bbc4730a6505..641623d471b0 100644
--- a/include/linux/bit_spinlock.h
+++ b/include/linux/bit_spinlock.h
@@ -2,6 +2,7 @@
 #ifndef __LINUX_BIT_SPINLOCK_H
 #define __LINUX_BIT_SPINLOCK_H
 
+#include <linux/split_lock.h>
 #include <linux/kernel.h>
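
Independent of the exact bit_spinlock.h and list_bl.h changes, the
two-argument helpers the dcache.c hunks above call might be built along
the lines of the sketch below. This assumes the spin-on-cpu_relax()
behaviour the changelog says is kept for now; the example_ names and
bodies are illustrative, not the patch's own definitions.

#include <linux/bitops.h>
#include <linux/list_bl.h>
#include <linux/lockdep.h>
#include <linux/preempt.h>
#include <linux/processor.h>
#include <linux/split_lock.h>	/* the new header added by this patch */

/*
 * Illustrative sketch only (example_ prefix); the real helpers are the
 * patch's hlist_bl_lock()/hlist_bl_unlock().  Bit 0 of the head pointer
 * stays the per-bucket lock word, and the shared split_lock supplies
 * the lockdep map (and, on RT, a place to put a sleeping lock).
 */
static inline void example_hlist_bl_lock(struct hlist_bl_head *b,
					 struct split_lock *sl)
{
	preempt_disable();
	spin_acquire(&sl->dep_map, 0, 0, _RET_IP_);
	while (unlikely(test_and_set_bit_lock(0, (unsigned long *)b)))
		cpu_relax();	/* changelog: spinning kept for now */
}

static inline void example_hlist_bl_unlock(struct hlist_bl_head *b,
					   struct split_lock *sl)
{
	spin_release(&sl->dep_map, _RET_IP_);
	clear_bit_unlock(0, (unsigned long *)b);
	preempt_enable();
}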