[Qemu-devel] [PULL 01/11] rcu: add rcu library

2015-01-30 Thread Paolo Bonzini
This includes a (mangled) copy of the liburcu code.  The main changes
are: 1) removing dependencies on many other header files in liburcu; 2)
removing for simplicity the tentative busy waiting in synchronize_rcu,
which has limited performance effects; 3) replacing futexes in
synchronize_rcu with QemuEvents for Win32 portability.  The API is
the same as liburcu, so it should be possible in the future to require
liburcu on POSIX systems for example and use our copy only on Windows.

Among the various versions available I chose urcu-mb, which is the
least invasive implementation even though it does not have the
fastest rcu_read_{lock,unlock} implementation.  The urcu flavor can
be changed later, after benchmarking.

Reviewed-by: Fam Zheng f...@redhat.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 docs/rcu.txt  | 285 ++
 hw/9pfs/virtio-9p-synth.c |   1 +
 include/qemu/atomic.h |  61 ++
 include/qemu/queue.h  |  13 +++
 include/qemu/rcu.h| 112 ++
 include/qemu/thread.h |   3 -
 util/Makefile.objs|   1 +
 util/rcu.c| 172 
 8 files changed, 645 insertions(+), 3 deletions(-)
 create mode 100644 docs/rcu.txt
 create mode 100644 include/qemu/rcu.h
 create mode 100644 util/rcu.c

diff --git a/docs/rcu.txt b/docs/rcu.txt
new file mode 100644
index 000..9938ad3
--- /dev/null
+++ b/docs/rcu.txt
@@ -0,0 +1,285 @@
+Using RCU (Read-Copy-Update) for synchronization
+
+
+Read-copy update (RCU) is a synchronization mechanism that is used to
+protect read-mostly data structures.  RCU is very efficient and scalable
+on the read side (it is wait-free), and thus can make the read paths
+extremely fast.
+
+RCU supports concurrency between a single writer and multiple readers,
+thus it is not used alone.  Typically, the write-side will use a lock to
+serialize multiple updates, but other approaches are possible (e.g.,
+restricting updates to a single task).  In QEMU, when a lock is used,
+this will often be the iothread mutex, also known as the big QEMU
+lock (BQL).  Also, restricting updates to a single task is done in
+QEMU using the bottom half API.
+
+RCU is fundamentally a wait-to-finish mechanism.  The read side marks
+sections of code with critical sections, and the update side will wait
+for the execution of all *currently running* critical sections before
+proceeding, or before asynchronously executing a callback.
+
+The key point here is that only the currently running critical sections
+are waited for; critical sections that are started _after_ the beginning
+of the wait do not extend the wait, despite running concurrently with
+the updater.  This is the reason why RCU is more scalable than,
+for example, reader-writer locks.  It is so much more scalable that
+the system will have a single instance of the RCU mechanism; a single
+mechanism can be used for an arbitrary number of things, without
+having to worry about things such as contention or deadlocks.
+
+How is this possible?  The basic idea is to split updates in two phases,
+removal and reclamation.  During removal, we ensure that subsequent
+readers will not be able to get a reference to the old data.  After
+removal has completed, a critical section will not be able to access
+the old data.  Therefore, critical sections that begin after removal
+do not matter; as soon as all previous critical sections have finished,
+there cannot be any readers who hold references to the data structure,
+and these can now be safely reclaimed (e.g., freed or unref'ed).
+
+Here is a picutre:
+
+thread 1  thread 2  thread 3
+------
+enter RCU crit.sec.
+   |finish removal phase
+   |begin wait
+   |  |enter RCU crit.sec.
+exit RCU crit.sec |   |
+complete wait |
+begin reclamation phase   |
+   exit RCU crit.sec.
+
+
+Note how thread 3 is still executing its critical section when thread 2
+starts reclaiming data.  This is possible, because the old version of the
+data structure was not accessible at the time thread 3 began executing
+that critical section.
+
+
+RCU API
+===
+
+The core RCU API is small:
+
+ void rcu_read_lock(void);
+
+Used by a reader to inform the reclaimer that the reader is
+entering an RCU read-side critical section.
+
+ void rcu_read_unlock(void);
+
+Used by a reader to inform the reclaimer that the reader is
+exiting an RCU read-side critical section.  Note that RCU
+read-side critical sections may be nested and/or 

[Qemu-devel] [PULL 01/11] rcu: add rcu library

2015-01-30 Thread Paolo Bonzini
This includes a (mangled) copy of the liburcu code.  The main changes
are: 1) removing dependencies on many other header files in liburcu; 2)
removing for simplicity the tentative busy waiting in synchronize_rcu,
which has limited performance effects; 3) replacing futexes in
synchronize_rcu with QemuEvents for Win32 portability.  The API is
the same as liburcu, so it should be possible in the future to require
liburcu on POSIX systems for example and use our copy only on Windows.

Among the various versions available I chose urcu-mb, which is the
least invasive implementation even though it does not have the
fastest rcu_read_{lock,unlock} implementation.  The urcu flavor can
be changed later, after benchmarking.

Reviewed-by: Fam Zheng f...@redhat.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 docs/rcu.txt  | 285 ++
 hw/9pfs/virtio-9p-synth.c |   1 +
 include/qemu/atomic.h |  61 ++
 include/qemu/queue.h  |  13 +++
 include/qemu/rcu.h| 112 ++
 include/qemu/thread.h |   3 -
 util/Makefile.objs|   1 +
 util/rcu.c| 172 
 8 files changed, 645 insertions(+), 3 deletions(-)
 create mode 100644 docs/rcu.txt
 create mode 100644 include/qemu/rcu.h
 create mode 100644 util/rcu.c

diff --git a/docs/rcu.txt b/docs/rcu.txt
new file mode 100644
index 000..9938ad3
--- /dev/null
+++ b/docs/rcu.txt
@@ -0,0 +1,285 @@
+Using RCU (Read-Copy-Update) for synchronization
+
+
+Read-copy update (RCU) is a synchronization mechanism that is used to
+protect read-mostly data structures.  RCU is very efficient and scalable
+on the read side (it is wait-free), and thus can make the read paths
+extremely fast.
+
+RCU supports concurrency between a single writer and multiple readers,
+thus it is not used alone.  Typically, the write-side will use a lock to
+serialize multiple updates, but other approaches are possible (e.g.,
+restricting updates to a single task).  In QEMU, when a lock is used,
+this will often be the iothread mutex, also known as the big QEMU
+lock (BQL).  Also, restricting updates to a single task is done in
+QEMU using the bottom half API.
+
+RCU is fundamentally a wait-to-finish mechanism.  The read side marks
+sections of code with critical sections, and the update side will wait
+for the execution of all *currently running* critical sections before
+proceeding, or before asynchronously executing a callback.
+
+The key point here is that only the currently running critical sections
+are waited for; critical sections that are started _after_ the beginning
+of the wait do not extend the wait, despite running concurrently with
+the updater.  This is the reason why RCU is more scalable than,
+for example, reader-writer locks.  It is so much more scalable that
+the system will have a single instance of the RCU mechanism; a single
+mechanism can be used for an arbitrary number of things, without
+having to worry about things such as contention or deadlocks.
+
+How is this possible?  The basic idea is to split updates in two phases,
+removal and reclamation.  During removal, we ensure that subsequent
+readers will not be able to get a reference to the old data.  After
+removal has completed, a critical section will not be able to access
+the old data.  Therefore, critical sections that begin after removal
+do not matter; as soon as all previous critical sections have finished,
+there cannot be any readers who hold references to the data structure,
+and these can now be safely reclaimed (e.g., freed or unref'ed).
+
+Here is a picutre:
+
+thread 1  thread 2  thread 3
+------
+enter RCU crit.sec.
+   |finish removal phase
+   |begin wait
+   |  |enter RCU crit.sec.
+exit RCU crit.sec |   |
+complete wait |
+begin reclamation phase   |
+   exit RCU crit.sec.
+
+
+Note how thread 3 is still executing its critical section when thread 2
+starts reclaiming data.  This is possible, because the old version of the
+data structure was not accessible at the time thread 3 began executing
+that critical section.
+
+
+RCU API
+===
+
+The core RCU API is small:
+
+ void rcu_read_lock(void);
+
+Used by a reader to inform the reclaimer that the reader is
+entering an RCU read-side critical section.
+
+ void rcu_read_unlock(void);
+
+Used by a reader to inform the reclaimer that the reader is
+exiting an RCU read-side critical section.  Note that RCU
+read-side critical sections may be nested and/or