v1: https://lists.gnu.org/archive/html/qemu-devel/2016-05/msg03661.html
Patch 1 is unchanged from v1 (where it was patch 2, though).

Patches 2 and 3 fix a not-so-small-after-all RCU performance regression that we introduced when transitioning to __atomic primitives. I got an arm64 machine to test on today, and a workload that issues a lot of atomic_read_rcu's, such as a 100%-lookup qht-bench test, can gain ~12% in performance.

[ In v1's 0/2 message I mentioned rcutorture; it turns out it's not as dependent on atomic_read_rcu as I thought, so it's not a good benchmark for measuring this effect. ]

Thanks,

Emilio