On 3/6/18 1:41 PM, Andrew Morton wrote:
On Tue, 6 Mar 2018 13:17:37 -0800 Yang Shi <yang....@linux.alibaba.com> wrote:
It just mitigates the hung task warning; it can't resolve the mmap_sem
scalability issue. Furthermore, waiting in a purely uninterruptible state
for reading /proc sounds unnecessary. It doesn't wait for I/O completion.
Since we already have the down_read_killable() APIs available, IMHO, giving
the application a chance to abort in some circumstances doesn't sound bad.
Where the heck are we holding mmap_sem for so long? Can that be fixed?
The mmap_sem is held for unmapping a large map which has every single
page mapped. This is not an issue in real production code. I just found it
by running vm-scalability on a machine with ~600GB memory.
AFAIK, I don't see any easy fix for the mmap_sem scalability issue. I
saw range locking patches (https://lwn.net/Articles/723648/) were
floating around. But they may not help much in the case of a large
map with every single page mapped.
Well it sounds fairly simple to mitigate? Simplistically: don't unmap
600G in a single hit; do it 1G at a time, dropping mmap_sem each time.
A smarter version might only come up for air if there are mmap_sem
waiters and if it has already done some work. I don't think we have
any particular atomicity requirements when unmapping?
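To make the simplistic version above concrete, here is a rough userspace sketch in C. It is not kernel code: fake_mmap_sem is a hypothetical pthread rwlock standing in for mmap_sem, and unmap_in_chunks() is an illustrative name, not an existing function. It only shows the shape of the loop: unmap one chunk, drop the lock, retake it for the next chunk.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical stand-in for mmap_sem; the real lock is a kernel rwsem. */
pthread_rwlock_t fake_mmap_sem = PTHREAD_RWLOCK_INITIALIZER;

/*
 * Unmap [addr, addr + len) in pieces of at most `chunk` bytes, dropping
 * the lock after every piece so that waiters get a chance to run.
 * `addr` and `chunk` are assumed to be page aligned.
 * Returns the number of pieces unmapped, or -1 if munmap() fails.
 */
long unmap_in_chunks(void *addr, size_t len, size_t chunk)
{
    char *p = addr;
    long pieces = 0;

    assert(chunk > 0);

    while (len > 0) {
        size_t n = len < chunk ? len : chunk;
        int rc;

        pthread_rwlock_wrlock(&fake_mmap_sem);
        rc = munmap(p, n);
        pthread_rwlock_unlock(&fake_mmap_sem);  /* come up for air */

        if (rc != 0)
            return -1;

        p += n;
        len -= n;
        pieces++;
    }
    return pieces;
}
```

Note that a reader taking the lock between two pieces sees the range half unmapped, which is exactly the atomicity question raised above. The smarter variant would drop the lock only when there are waiters and some work has already been done; that check is left out of this sketch.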
I'm not quite sure. But, the existing applications may assume munmap is