Excerpts from Waiman Long's message of July 8, 2020 1:33 pm:
> On 7/7/20 1:57 AM, Nicholas Piggin wrote:
>> Yes, powerpc could certainly get more performance out of the slow
>> paths, and then there are a few parameters to tune.
>>
>> We don't have a good alternate patching for function calls yet, but
>> that would be something to do for native vs pv.
>>
>> And then there seem to be one or two tunable parameters we could
>> experiment with.
>>
>> The paravirt locks may need a bit more tuning. Some simple testing
>> under KVM shows we might be a bit slower in some cases. Whether this
>> is fairness or something else I'm not sure. The current simple pv
>> spinlock code can do a directed yield to the lock holder CPU, whereas
>> the pv qspl here just does a general yield. I think we might actually
>> be able to change that to also support directed yield. Though I'm
>> not sure if this is actually the cause of the slowdown yet.
>
> Regarding the paravirt lock, I have taken a further look into the
> current PPC spinlock code. There is an equivalent of pv_wait() but no
> pv_kick(). Maybe PPC doesn't really need that.

So powerpc has two types of wait: undirected ("all processors"), or
directed at a specific processor which has been preempted by the
hypervisor.

The simple spinlock code does a directed wait, because it knows which
CPU is holding the lock. In this case there is a sequence that ensures
we don't wait if the condition has already become true, and the target
CPU does not need to kick the waiter; the wakeup happens automatically
(see splpar_spin_yield). This is preferable because we only wait as
needed and don't require the kick operation.

The pv spinlock code I did uses the undirected wait, because we don't
know the CPU number we are waiting on. This is undesirable because it
has higher overhead and the wait is less accurate.

I think perhaps we could change things so we wait on the correct CPU
when queued, which might be good enough (we could also put the lock
owner CPU in the spinlock word, if we add another format).

> Attached are two
> additional qspinlock patches that adds a CONFIG_PARAVIRT_QSPINLOCKS_LITE
> option to not require pv_kick(). There is also a fixup patch to be
> applied after your patchset.
>
> I don't have access to a PPC LPAR with shared processor at the moment,
> so I can't test the performance of the paravirt code. Would you mind
> adding my patches and do some performance test on your end to see if it
> gives better result?

Great, I'll do some tests. Any suggestions for what to try?

Thanks,
Nick
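
For reference, the directed-wait sequence mentioned above can be
modelled roughly as below. This is only a userspace sketch of the idea,
not the kernel code: model_state, model_spin_yield, MODEL_HELD and
MODEL_NR_CPUS are hypothetical names made up for illustration, and the
lock word layout (holder CPU in the low 16 bits) and the "odd yield
count means preempted" convention are assumptions of the model. Where
the real code would make the directed yield to the hypervisor, the
model just returns the target CPU and the yield count it sampled.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4            /* hypothetical, small for illustration */
#define MODEL_HELD    0x80000000u  /* assumed "locked" bit in the lock word */

/* Simplified stand-in for the lock word plus per-vCPU yield counts. */
struct model_state {
	uint32_t slock;                      /* 0 = free, else MODEL_HELD | cpu */
	uint32_t yield_count[MODEL_NR_CPUS]; /* assumed: odd = vCPU preempted */
};

enum yield_action { YIELD_NONE, YIELD_DIRECTED };

struct yield_decision {
	enum yield_action action;
	uint32_t target_cpu;   /* valid only for YIELD_DIRECTED */
	uint32_t yield_count;  /* count sampled before deciding to yield */
};

/*
 * Decide whether a spinning waiter should do a directed yield.
 * Every early return is a case where waiting would be pointless.
 */
static struct yield_decision model_spin_yield(const struct model_state *s)
{
	struct yield_decision d = { YIELD_NONE, 0, 0 };
	uint32_t lock_value = s->slock;
	uint32_t holder, yc;

	if (lock_value == 0)
		return d;               /* lock is free, just retry */

	holder = lock_value & 0xffff;
	assert(holder < MODEL_NR_CPUS);

	yc = s->yield_count[holder];
	if ((yc & 1) == 0)
		return d;               /* holder is running, keep spinning */

	/* Real code needs a read barrier here before re-reading the lock. */
	if (s->slock != lock_value)
		return d;               /* lock changed hands, start over */

	/* Holder is preempted: yield to it, passing the sampled count. */
	d.action = YIELD_DIRECTED;
	d.target_cpu = holder;
	d.yield_count = yc;
	return d;
}
```

The point of handing over the sampled yield count is that a stale
request can be ignored if the holder has been dispatched since we
looked, which is why no explicit kick from the lock holder is needed
in the directed case.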