Re: [PATCH v3 2/2] virtio_balloon: replace oom notifier with shrinker

Wei Wang Sun, 05 Aug 2018 20:26:37 -0700

On 08/04/2018 03:15 AM, Michael S. Tsirkin wrote:

On Fri, Aug 03, 2018 at 04:32:26PM +0800, Wei Wang wrote:

The OOM notifier is being deprecated for the following reasons:
- As a callout from the oom context, it is too subtle and easy to
   generate bugs and corner cases which are hard to track;
- It is called too late (after the reclaiming has been performed).
   Drivers with a large amount of reclaimable memory are expected to
   release it at an early stage of memory pressure;
- The notifier callback isn't aware of oom constraints;
Link: https://lkml.org/lkml/2018/7/12/314


This patch replaces the virtio-balloon oom notifier with a shrinker
to release balloon pages on memory pressure. The balloon pages are
given back to mm adaptively by returning the number of pages that the
reclaimer is asking for (i.e. sc->nr_to_scan).

Currently the max possible value of sc->nr_to_scan passed to the balloon
shrinker is SHRINK_BATCH, which is 128. This is smaller than the
limitation that only VIRTIO_BALLOON_ARRAY_PFNS_MAX (256) pages can be
returned via one invocation of leak_balloon. But this patch still
considers the case that SHRINK_BATCH or shrinker->batch could be changed
to a value larger than VIRTIO_BALLOON_ARRAY_PFNS_MAX, which will need to
do multiple invocations of leak_balloon.
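
A minimal userspace sketch of the multiple-invocation case described above, not the driver code itself. The names toy_balloon, toy_leak_balloon and toy_shrink_balloon_pages are hypothetical stand-ins; only the 256-page limit per leak_balloon call is taken from the text. It assumes each call releases at most one PFN array and that the caller loops until the request is met or the balloon is empty.

```c
#include <assert.h>
#include <stddef.h>

/* Limit quoted in the commit message: pages returned per leak_balloon call. */
#define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256UL

/* Hypothetical stand-in for the driver state: just a page counter. */
struct toy_balloon {
	unsigned long num_pages;	/* pages currently held by the balloon */
};

/* Toy model of leak_balloon(): releases at most one PFN array per call. */
static unsigned long toy_leak_balloon(struct toy_balloon *b, unsigned long num)
{
	if (num > VIRTIO_BALLOON_ARRAY_PFNS_MAX)
		num = VIRTIO_BALLOON_ARRAY_PFNS_MAX;
	if (num > b->num_pages)
		num = b->num_pages;
	b->num_pages -= num;
	return num;
}

/*
 * Release up to pages_to_free pages. When the request exceeds the
 * per-call limit, the leak routine is simply invoked several times.
 */
static unsigned long toy_shrink_balloon_pages(struct toy_balloon *b,
					      unsigned long pages_to_free)
{
	unsigned long freed = 0;

	assert(b != NULL);
	while (freed < pages_to_free && b->num_pages > 0)
		freed += toy_leak_balloon(b, pages_to_free - freed);
	return freed;
}
```

With a request of 128 pages (the current SHRINK_BATCH) the loop body runs once; a request of 600 pages would take three calls in this model.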

Historically, the feature VIRTIO_BALLOON_F_DEFLATE_ON_OOM has been used
to release balloon pages on OOM. We continue to use this feature bit for
the shrinker, so the shrinker is only registered when this feature bit
has been negotiated with host.

Signed-off-by: Wei Wang <wei.w.w...@intel.com>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Michal Hocko <mho...@kernel.org>
Cc: Andrew Morton <a...@linux-foundation.org>


Could you add data on how this was tested and how guest
behaviour changed? Which configurations see an improvement?


Yes. Please see the differences between the "*1" and "*2" cases below.

I also take this chance to use "*2" and "*3" to show Michal and others the difference between applying and not applying the shrinker fix patch here: https://lkml.org/lkml/2018/8/3/384



*1. V3 patches
1) After inflating some amount of memory, actual=1000001536 Bytes
free -m
              total        used        free      shared  buff/cache   available
Mem:           7975        7289         514          10         171         447
Swap:         10236           0       10236

2) dd if=478MB_file of=/dev/null, actual=1058721792 Bytes
free -m
              total        used        free      shared  buff/cache   available
Mem:           7975        7233         102          10         639         475
Swap:         10236           0       10236

The advantage is that the inflated pages are given back to mm according to the amount the reclaimer is asking for, i.e. ~56MB (the difference between the two "actual" values above). This is more adaptive.
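
For reference, the ~56MB figure can be checked from the two "actual" values. The sketch below is only arithmetic; the helper names are hypothetical, and the 4KiB page size is an assumption (it is consistent with 1000000 pages being described as around 4GB elsewhere in this mail).

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096ULL			/* assumption: 4KiB pages */
#define TOY_MIB (1024ULL * 1024ULL)

/* Bytes given back to the guest between two "actual" readings. */
static uint64_t toy_released_bytes(uint64_t actual_before, uint64_t actual_after)
{
	assert(actual_after >= actual_before);
	return actual_after - actual_before;
}

/* Whole MiB contained in a byte count. */
static uint64_t toy_bytes_to_mib(uint64_t bytes)
{
	return bytes / TOY_MIB;
}

/* Whole pages contained in a byte count, under the assumed page size. */
static uint64_t toy_bytes_to_pages(uint64_t bytes)
{
	return bytes / TOY_PAGE_SIZE;
}
```

Calling toy_released_bytes(1000001536, 1058721792) gives 58720256 bytes, which is 56 MiB or 14336 pages under the 4KiB assumption.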

*2. V2 patches, balloon_pages_to_shrink=1000000 pages (around 4GB), with the shrinker fix patches applied.

1) After inflating some amount of memory, actual=1000001536 Bytes
free -m
              total        used        free      shared  buff/cache   available
Mem:           7975        7288         530          10         157         455
Swap:         10236           0       10236

2) dd if=478MB_file of=/dev/null, actual=5096001536 Bytes
free -m
              total        used        free      shared  buff/cache   available
Mem:           7975        3381        3953          10         640        4327
Swap:         10236           0       10236

In the above example, we set 4GB to shrink to make the difference obvious. Though the reclaimer only needs to reclaim ~56MB of memory, 4GB of inflated pages are given back to mm, which is unnecessary. From the user's perspective, there is no way to know how many pages should be given back at the time of setting the module parameter (balloon_pages_to_shrink). So I think the above "*1" is better.

*3. V2 patches, balloon_pages_to_shrink=1000000 pages (around 4GB), without the shrinker fix patches applied.

1) After inflating some amount of memory, actual=1000001536 Bytes
free -m
              total        used        free      shared  buff/cache   available
Mem:           7975        7292         524          10         158         450
Swap:         10236           0       10236

2) dd if=478MB_file of=/dev/null, actual=8589934592 Bytes
free -m
              total        used        free      shared  buff/cache   available
Mem:           7975          53        7281          10         640        7656
Swap:         10236           0       10236

Compared to *2, all the balloon pages are shrunk, but users expect only 4GB to be shrunk. The reason is that do_shrink_slab has a mistake in calculating shrinkctl->nr_scanned, which should be the actual number of pages that the shrinker has freed, but do_shrink_slab still treats that value as 128 (though 4GB has actually been freed).
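
The effect described for *2 versus *3 can be illustrated with a toy userspace model, which is a sketch and not the kernel's reclaim code. Everything prefixed toy_ is hypothetical, including the scan budget handed to the loop; the real kernel derives its scan target differently. The model assumes a shrinker that frees a fixed pages_to_shrink chunk per call regardless of the 128-page request, and a loop that charges either 128 or the actual freed count against its budget.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_BATCH 128UL		/* SHRINK_BATCH value quoted in this thread */

/* Hypothetical balloon state for the model. */
struct toy_vb {
	unsigned long num_pages;	/* pages currently in the balloon */
	unsigned long pages_to_shrink;	/* fixed chunk freed on every call */
};

/* Toy shrinker: ignores the requested count and frees a fixed chunk. */
static unsigned long toy_scan(struct toy_vb *vb)
{
	unsigned long n = vb->pages_to_shrink;

	if (n > vb->num_pages)
		n = vb->num_pages;
	vb->num_pages -= n;
	return n;
}

/*
 * Toy reclaim loop. With count_actual == false each call is charged as
 * TOY_BATCH pages, however many were really freed; with count_actual ==
 * true the real freed count is charged against the remaining budget.
 */
static unsigned long toy_reclaim(struct toy_vb *vb, unsigned long total_scan,
				 bool count_actual)
{
	unsigned long freed = 0;

	while (total_scan >= TOY_BATCH && vb->num_pages > 0) {
		unsigned long ret = toy_scan(vb);
		unsigned long nr_scanned = count_actual ? ret : TOY_BATCH;

		freed += ret;
		total_scan -= nr_scanned < total_scan ? nr_scanned : total_scan;
	}
	return freed;
}
```

With a 2000000-page balloon, pages_to_shrink=1000000 and a budget of two batches, the model empties the whole balloon when each call is charged as 128 pages, and stops after one 1000000-page chunk when the actual count is charged.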



Best,
Wei