On 01/22/2016 01:09 AM, Davidlohr Bueso wrote:
On Thu, 21 Jan 2016, Waiman Long wrote:
On 01/21/2016 04:29 AM, Ding Tianhong wrote:
I got the vmcore and found that ifconfig had been sitting in the wait list of the
rtnl_lock for 120 seconds, while my own process could take and release the rtnl_lock
normally several times per second. In other words, my process kept jumping the queue
and ifconfig could never get the rtnl_lock. Looking at the mutex slow path, I found
that a mutex may spin on the owner regardless of whether the wait list is empty, so
tasks on the wait list can be cut in line indefinitely. Add a test for the wait list
in mutex_can_spin_on_owner() to avoid this problem.
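
Roughly, the idea is a check like the following sketch, written against the
mutex_can_spin_on_owner() in kernel/locking/mutex.c of that era; this is only an
illustration of the proposal, not the exact patch that was posted:

static inline int mutex_can_spin_on_owner(struct mutex *lock)
{
	struct task_struct *owner;
	int retval = 1;

	if (need_resched())
		return 0;

	/*
	 * Sketch of the proposed change: refuse to spin if other tasks
	 * are already queued on the wait list, so a stream of newly
	 * arriving spinners cannot starve the sleepers.  This is an
	 * unlocked, opportunistic check.
	 */
	if (!list_empty(&lock->wait_list))
		return 0;

	rcu_read_lock();
	owner = READ_ONCE(lock->owner);
	if (owner)
		retval = owner->on_cpu;
	rcu_read_unlock();

	/*
	 * If lock->owner is not set, the owner may have just acquired
	 * the mutex and not set the field yet, or the mutex may have
	 * been released.
	 */
	return retval;
}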
So this has always been somewhat known, at least in theory, until now. It's the cost
of spinning without going through the wait queue, unlike other locks.
[...]
From: Waiman Long <waiman.l...@hpe.com>
Date: Thu, 21 Jan 2016 17:53:14 -0500
Subject: [PATCH] locking/mutex: Enable optimistic spinning of woken task in wait list
Ding Tianhong reported a live-lock situation where a constant stream of incoming
optimistic spinners blocked a task in the wait list from getting the mutex.

This patch attempts to fix this live-lock condition by enabling a woken task in the
wait list to enter the optimistic spinning loop itself, with precedence over the
spinners in the OSQ. This should prevent the live-lock condition from happening.
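
Conceptually, the woken waiter would do something like the sketch below. This is an
illustration of the idea only, not the actual diff: try_to_acquire() is a stand-in
for the slowpath trylock, and the real slowpath would do the sleeping part under
lock->wait_lock with proper task-state handling.

	/*
	 * Runs after the waiter at the head of the wait list has been
	 * woken: instead of going straight back to sleep, it spins on
	 * the current owner (as the OSQ spinners do), so later-arriving
	 * spinners cannot starve it.
	 */
	struct task_struct *owner;

	for (;;) {
		if (try_to_acquire(lock))	/* stand-in for the slowpath trylock */
			break;			/* got the mutex */

		owner = READ_ONCE(lock->owner);
		if ((owner && !mutex_spin_on_owner(lock, owner)) || need_resched()) {
			/*
			 * The owner is sleeping/preempted, or we need to
			 * reschedule: stop spinning, block like a normal
			 * waiter, and spin again after the next wakeup.
			 */
			set_current_state(TASK_UNINTERRUPTIBLE);
			schedule_preempt_disabled();
		}
	}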
And one of the reasons why we never bothered 'fixing' things was the additional
branching out in the slowpath (and the lack of a real issue, although this one is so
damn pathological). I fear that your approach is one of those scenarios where the
code ends up being bloated, albeit most of it is actually duplicated and can be
refactored *sigh*. So now we'd spin, then sleep, then try spinning, then sleep
again... phew. Not to mention the performance implications, i.e. losing the benefits
of osq over waiter spinning in scenarios that would otherwise have more osq spinners
as opposed to waiter spinners, or in setups where it is actually best to block
instead of spinning.
The patch that I sent out is just a proof of concept to make sure that it can fix
that particular case. I do plan to refactor it if I decide to go ahead with an
official one. Unlike the OSQ, there can be no more than one waiter spinner, as the
wakeup is directed to only the first task in the wait list and the spinning won't
happen until that task has been woken up. In the worst case, there are only 2
spinners spinning on the lock and the owner field, one from the OSQ and one from the
wait list. That shouldn't put too much cacheline contention traffic on the system.
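
For reference, the mutex unlock slowpath wakes only the task at the head of the wait
list, roughly like this (paraphrased from kernel/locking/mutex.c; the real code runs
under lock->wait_lock):

	if (!list_empty(&lock->wait_list)) {
		/* wake only the first entry on the wait list */
		struct mutex_waiter *waiter =
			list_entry(lock->wait_list.next, struct mutex_waiter, list);

		wake_up_process(waiter->task);
	}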
Cheers,
Longman