Re: [RFC PATCH v1 00/13] lru_lock scalability

Daniel Jordan Tue, 13 Feb 2018 13:08:35 -0800

On 02/08/2018 06:36 PM, Andrew Morton wrote:

On Wed, 31 Jan 2018 18:04:00 -0500 [email protected] wrote:

lru_lock, a per-node* spinlock that protects an LRU list, is one of the
hottest locks in the kernel.  On some workloads on large machines, it
shows up at the top of lock_stat.


Do you have details on which callsites are causing the problem?  That
would permit us to consider other approaches, perhaps.


Sure, there are two paths where we're seeing contention.

In the first one, a pagevec's worth of anonymous pages are added to
various LRUs when the per-cpu pagevec fills up:


  /* take an anonymous page fault, eventually end up at... */
  handle_pte_fault
    do_anonymous_page
      lru_cache_add_active_or_unevictable
        lru_cache_add
          __lru_cache_add
            __pagevec_lru_add
              pagevec_lru_move_fn
                /* contend on lru_lock */
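
To illustrate why this path leans on the lock, here's a rough user-space
model of the batching, not the kernel code.  All names (lru_model,
pvec_model, pvec_add, PVEC_BATCH) are made up for illustration, the batch
size of 15 is just a placeholder, and a C11 atomic_flag stands in for
lru_lock.  The lock is taken once per full batch rather than once per
page, but all CPUs adding pages to the same node's LRUs still drain under
the same lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define PVEC_BATCH 15	/* illustrative batch size, an assumption */

struct page_model {
	struct page_model *next;
};

/* Hypothetical stand-in for one node's LRU list and its lru_lock. */
struct lru_model {
	atomic_flag lock;
	struct page_model *head;
	unsigned long nr_pages;
	unsigned long lock_acquisitions;
};

/* Hypothetical stand-in for a per-cpu pagevec. */
struct pvec_model {
	unsigned int nr;
	struct page_model *pages[PVEC_BATCH];
};

static void model_lock(atomic_flag *lock)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;
}

static void model_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Move every batched page onto the LRU under a single lock hold. */
static void pvec_drain(struct pvec_model *pvec, struct lru_model *lru)
{
	unsigned int i;

	model_lock(&lru->lock);
	lru->lock_acquisitions++;
	for (i = 0; i < pvec->nr; i++) {
		pvec->pages[i]->next = lru->head;
		lru->head = pvec->pages[i];
		lru->nr_pages++;
	}
	model_unlock(&lru->lock);
	pvec->nr = 0;
}

/* Batch a page locally; only touch the shared lock when the batch fills. */
static void pvec_add(struct pvec_model *pvec, struct lru_model *lru,
		     struct page_model *page)
{
	assert(pvec->nr < PVEC_BATCH);
	pvec->pages[pvec->nr++] = page;
	if (pvec->nr == PVEC_BATCH)
		pvec_drain(pvec, lru);
}
```

In this model, adding 32 pages from one CPU takes the lock twice and
leaves two pages sitting in the batch.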

In the second, one or more pages are removed from an LRU under one hold
of lru_lock:


  // userland calls munmap or exit, eventually end up at...
  zap_pte_range
    __tlb_remove_page // returns true because we eventually hit
                      // MAX_GATHER_BATCH_COUNT in tlb_next_batch
    tlb_flush_mmu_free
      free_pages_and_swap_cache
        release_pages
          /* contend on lru_lock */

For a broader context, we've run decision support benchmarks where
lru_lock (and zone->lock) show long wait times.  But we're not the only
ones according to certain kernel comments:


mm/vmscan.c:
 * zone_lru_lock is heavily contended.  Some of the functions that
 * shrink the lists perform better by taking out a batch of pages
 * and working on them outside the LRU lock.
 *
 * For pagecache intensive workloads, this function is the hottest
 * spot in the kernel (apart from copy_*_user functions).
...
static unsigned long isolate_lru_pages(unsigned long nr_to_scan,


include/linux/mmzone.h:

 * zone->lock and the [pgdat->lru_lock] are two of the hottest locks in the
 * kernel.  So add a wild amount of padding here to ensure that they fall into
 * separate cachelines. ...
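
For anyone unfamiliar with the padding trick that comment describes, a
minimal sketch in plain C11 follows.  The struct, its field names, and
the 64-byte line size are assumptions for illustration; the kernel uses
its own padding macros and cacheline size.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define MODEL_CACHELINE 64	/* assumed cacheline size in bytes */

/* Hypothetical container holding two hot locks (ints stand in for them). */
struct node_model {
	int zone_lock;
	/* Force the second lock onto the start of its own cacheline. */
	alignas(MODEL_CACHELINE) int lru_lock;
};

/* Compile-time check that the two locks never share a cacheline. */
static_assert(offsetof(struct node_model, zone_lock) / MODEL_CACHELINE !=
	      offsetof(struct node_model, lru_lock) / MODEL_CACHELINE,
	      "locks share a cacheline");
```

Without the alignment, a CPU spinning on one lock would keep pulling the
line away from a CPU holding the other.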

Anyway, if you're seeing this lock in your workloads, I'm interested in
hearing what you're running so we can get more real world data on this.
