On Thu, Apr 27, 2023 at 03:19:12PM +0100, Richard W.M. Jones wrote:
> We've long known that nbdkit-memory-plugin with the default sparse
> array allocator suffers because a global lock must be taken whenever
> any read or write operation is performed.  This commit aims to safely
> improve performance by converting the lock into a read-write lock.
I searched for other approaches, and found this to be an interesting
read:
https://stackoverflow.com/questions/2407558/pthreads-reader-writer-locks-upgrading-read-lock-to-write-lock

One of the suggestions there is using fcntl() locking (which does
support upgrade attempts) instead of pthread_rwlock*.  Might be
interesting to code up and compare an approach along those lines.

> +++ b/common/allocators/sparse.c
> @@ -129,11 +129,20 @@ DEFINE_VECTOR_TYPE (l1_dir, struct l1_entry);
>  struct sparse_array {
>    struct allocator a;           /* Must come first. */
> 
> -  /* This lock is highly contended.  When hammering the memory plugin
> -   * with 8 fio threads, about 30% of the total system time was taken
> -   * just waiting for this lock.  Fixing this is quite hard.
> +  /* The shared (read) lock must be held if you just want to access
> +   * the data without modifying any of the L1/L2 metadata or
> +   * allocating or freeing any page.
> +   *
> +   * To modify the L1/L2 metadata, including allocating or freeing
> +   * any page, you must hold the exclusive (write) lock.
> +   *
> +   * Because POSIX rwlocks are not upgradable, this presents a
> +   * problem.  We solve it below by speculatively performing the
> +   * request while holding the shared lock, but if we run into an
> +   * operation that needs to update the metadata, we restart the
> +   * entire request holding the exclusive lock.
>    */
> -  pthread_mutex_t lock;
> +  pthread_rwlock_t lock;

But as written, this does look like a nice division of labor that
reduces serialization in the common case.  And yes, pthread rwlocks do
look a bit heavier-weight than a bare mutex, which explains the
slowdown on the read-heavy workload even while improving the
write-heavy workload.

Reviewed-by: Eric Blake <ebl...@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
+1-919-301-3266
Virtualization:  qemu.org | libvirt.org

_______________________________________________
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs