On 07/15/2015 06:01 AM, Peter Zijlstra wrote:
On Tue, Jul 14, 2015 at 10:13:37PM -0400, Waiman Long wrote:+static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev) { struct pv_node *pn = (struct pv_node *)node; + struct pv_node *pp = (struct pv_node *)prev; + bool wait_early, can_wait_early; int loop;for (;;) { - for (loop = SPIN_THRESHOLD; loop; loop--) { + /* + * Spin less if the previous vCPU was in the halted state + * and it is not the queue head. + */ + can_wait_early = (pn->waithist> PV_WAITHIST_THRESHOLD); + wait_early = can_wait_early&& !READ_ONCE(prev->locked)&& + (READ_ONCE(pp->state) == vcpu_halted); + loop = wait_early ? QNODE_SPIN_THRESHOLD_SHORT + : QNODE_SPIN_THRESHOLD; + for (; loop; loop--, cpu_relax()) { + bool halted; + if (READ_ONCE(node->locked)) return; - cpu_relax(); + + if (!can_wait_early || (loop& QNODE_SPIN_CHECK_MASK)) + continue; + + /* + * Look for state transition at previous node. + * + * running => halted: + * call pv_wait() now if kick-ahead is enabled + * or reduce spin threshold to + * QNODE_SPIN_THRESHOLD_SHORT or less. + * halted => running: + * reset spin threshold to QNODE_SPIN_THRESHOLD + */ + halted = (READ_ONCE(pp->state) == vcpu_halted)&& + !READ_ONCE(prev->locked); + if (wait_early == halted) + continue; + wait_early = halted; + + if (!wait_early) + loop = QNODE_SPIN_THRESHOLD; + else if (pv_kick_ahead) + break; + else if (loop> QNODE_SPIN_THRESHOLD_SHORT) + loop = QNODE_SPIN_THRESHOLD_SHORT; } + if (wait_early) + pvstat_inc(pvstat_wait_early); + + /* + * A pv_wait while !wait_early has higher weight than when + * wait_early is true. + */ + if (pn->waithist< PV_WAITHIST_MAX) + pn->waithist += wait_early ? 1 : PV_WAIT_INC;So when you looked at this patch, you didn't go like, OMFG!? So what was wrong with something like: static inline int pv_spin_threshold(struct pv_node *prev) { if (READ_ONCE(prev->locked)) /* it can run, wait for it */ return SPIN_THRESHOLD; if (READ_ONCE(prev->state) == vcpu_halted) /* its not running, do not wait */ return 1; return SPIN_THRESHOLD; } static void pv_wait_head(...) { for (;;) { for (loop = pv_spin_threshold(pp); loop; loop--) { if (READ_ONCE(node->locked)) return; cpu_relax(); } if (!lp) { ... } pv_wait(&l->locked, _Q_SLOW_VAL); } } What part of: "keep it simple" and "gradual complexity" have you still not grasped?
I confess that I was a bit sloppy in that part of the code. I want to get it out for review ASAP without doing too much fine tuning as I expect at least a few iterations for this patchset. I will certainly change it in the new patch. Anyway, thanks for the great suggestion.
Cheers, Longman -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

