Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
Sorry for the slow reply.

Re: Module parameters: I prefer not to have module parameters since they are controlled by the guest. In general, in virtualized environments the admins controlling the hypervisor are more knowledgeable about these things than the users. A feature bit seems useful so that the host knows what the guest behavior will be, and can change the host side implementation to make the experience good for the guest.

I worry that requiring global_node_page_state(NR_FILE_PAGES) == 0 before allowing deflation is too strict. One of the benefits of the shrinker API is that it is invoked before vmscan.c has gone through heroic efforts to reclaim the world. I'm not familiar enough with the code to judge how this patch impacts this, but would it be beneficial to allow deflation when vmscan.c is trying "too hard" to reclaim pages? Is there some softer condition than "global_node_page_state(NR_FILE_PAGES) == 0"?

For my own understanding, does this patch work by returning 0 pages when asked for pages? Are there cases where that results in an unnecessary OOM? For example, if global_node_page_state(NR_FILE_PAGES) == 1, and the guest needs 2?

Regarding other shrinkers (like KVM MMU cache): Reclaiming other shrinkers first would match the behavior of DEFLATE_ON_OOM when it was using the OOM notifier callback. On the other hand (awkwardly), the memory stats reported on the stats queue for "available memory" do not count shrinker memory as "available". So a balloon implementation that aims to reclaim some amount of available memory would not be able to tell how much memory was in the shrinkers and probably doesn't expect to reclaim them. For this reason, I think only looking at page cache size is the right choice. There should be a 1:1 relationship between stats reported and when DEFLATE_ON_OOM is invoked. Maybe in the future we add another stat that reports shrinker sizes, in which case we should also add a feature bit that allows other shrinkers to be pressured.

Regarding NUMA awareness: I agree it's out of scope for this patch since all implementations so far are not NUMA aware.

Would it be possible to back port this patch to 4.19 when the change to shrinker API was made?

On Tue, Feb 11, 2020 at 6:20 AM Tetsuo Handa <penguin-ker...@i-love.sakura.ne.jp> wrote:

> On 2020/02/10 16:27, Wang, Wei W wrote:
> >> Well, my comment is rather: "Do not try to reserve guest's memory. In other
> >> words, do not try to maintain balloons on the guest side. Since host would
> >> be able to cache file data on the host's cache, guests would be able to
> >> quickly fetch file data from host's cache via normal I/O requests." ;-)
> >
> > Didn't get this one. The discussion was about guest pagecache pages vs. guest balloon pages.
> > Why is host's pagecache here?
>
> I'm expecting a mode: "Guests should try to minimize pagecache pages (and teach
> host to treat reclaimed pages as if POSIX_FADV_DONTNEED) instead of managing
> guest balloon pages". In other words, as if
>
>   while :; do sleep 5; echo 1 > /proc/sys/vm/drop_caches; done
>
> is running in the guest's kernel. And as if
>
>   echo 2 > /proc/sys/vm/drop_caches
>
> is triggered in the guest's kernel when host requested guests to reclaim
> memory. No long-life balloons. Guest balloons do not need to care about
> NUMA. Just leave the management of pagecache pages to the host.

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
On 2020/02/10 16:27, Wang, Wei W wrote:
>> Well, my comment is rather: "Do not try to reserve guest's memory. In other
>> words, do not try to maintain balloons on the guest side. Since host would
>> be able to cache file data on the host's cache, guests would be able to
>> quickly fetch file data from host's cache via normal I/O requests." ;-)
>
> Didn't get this one. The discussion was about guest pagecache pages vs. guest balloon pages.
> Why is host's pagecache here?

I'm expecting a mode: "Guests should try to minimize pagecache pages (and teach
host to treat reclaimed pages as if POSIX_FADV_DONTNEED) instead of managing
guest balloon pages". In other words, as if

  while :; do sleep 5; echo 1 > /proc/sys/vm/drop_caches; done

is running in the guest's kernel. And as if

  echo 2 > /proc/sys/vm/drop_caches

is triggered in the guest's kernel when host requested guests to reclaim
memory. No long-life balloons. Guest balloons do not need to care about
NUMA. Just leave the management of pagecache pages to the host.
RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
On Monday, February 10, 2020 11:57 AM, Tetsuo Handa wrote:
> Then, "node-A's NR_FILE_PAGES is already 0 and node-B's NR_FILE_PAGES is
> not 0, but allocation request which triggered this shrinker wants to allocate
> from only node-A"
> would be confused by this change, for the pagecache pages for allocating
> thread's interested node are already depleted but the balloon cannot shrink
> when it should because the pagecache pages for allocating thread's
> uninterested nodes are not yet depleted.

The existing balloon isn't NUMA aware. "but the balloon cannot shrink" - even if we let the balloon shrink, it could shrink pages from the uninterested node.
When we have a NUMA-aware balloon, we could further update the shrinker to check the per-node counter, node_page_state(NR_FILE_PAGES).

> Well, my comment is rather: "Do not try to reserve guest's memory. In other
> words, do not try to maintain balloons on the guest side. Since host would
> be able to cache file data on the host's cache, guests would be able to
> quickly fetch file data from host's cache via normal I/O requests." ;-)

Didn't get this one. The discussion was about guest pagecache pages vs. guest balloon pages.
Why is host's pagecache here?

Best,
Wei
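The difference between the global check and a per-node check discussed above can be illustrated with a small userspace model. This is only a sketch: struct model_file_pages, MODEL_NR_NODES and the model_* helpers are hypothetical names invented for illustration, and plain arrays stand in for the kernel's NR_FILE_PAGES counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_NODES 2	/* hypothetical two-node guest */

/* Hypothetical stand-in for the per-node NR_FILE_PAGES counters. */
struct model_file_pages {
	unsigned long per_node[MODEL_NR_NODES];
};

/* Models the global counter as the sum over all nodes. */
static unsigned long model_global_file_pages(const struct model_file_pages *s)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < MODEL_NR_NODES; i++)
		sum += s->per_node[i];
	return sum;
}

/* Global gate, as in the RFC: deflate only when no pagecache is left anywhere. */
static bool model_may_deflate_global(const struct model_file_pages *s)
{
	return model_global_file_pages(s) == 0;
}

/* Per-node gate: deflate when the node the allocation targets has no pagecache. */
static bool model_may_deflate_node(const struct model_file_pages *s, size_t nid)
{
	assert(nid < MODEL_NR_NODES);
	return s->per_node[nid] == 0;
}
```

With node 0 depleted and node 1 still holding pagecache, the global gate refuses to deflate while the per-node gate for node 0 allows it, which is the situation described in the quoted question. As noted in the reply, a per-node gate alone would not help until the balloon itself can free pages from a chosen node.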
RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
Wang, Wei W wrote:
> On Saturday, February 8, 2020 8:33 PM, Tetsuo Handa wrote:
> >
> > Is this NUMA aware? Can "node-A's NR_FILE_PAGES is already 0 and
> > node-B's NR_FILE_PAGES is not 0, but allocation request which triggered this
> > shrinker wants to allocate from only node-B" happen?
>
> No, it's a global counter.
>
> > Can some thread keep
> > this shrinker defunctional by keep increasing NR_FILE_PAGES?
>
> Yes. Actually it's our intention - as long as there are pagecache pages,
> balloon pages are avoided to be reclaimed.

Then, "node-A's NR_FILE_PAGES is already 0 and node-B's NR_FILE_PAGES is
not 0, but allocation request which triggered this shrinker wants to allocate
from only node-A"
would be confused by this change, for the pagecache pages for allocating
thread's interested node are already depleted but the balloon cannot shrink
when it should because the pagecache pages for allocating thread's
uninterested nodes are not yet depleted.

> >
> > Is this patch from "Re: Balloon pressuring page cache" thread? I hope that
> > the guest could start reclaiming memory based on host's request (like OOM
> > notifier chain) which is issued when host thinks that host is getting close to
> > OOM and thus guests should start returning their unused memory to host.
> > Maybe "periodically (e.g. 5 minutes)" in addition to "upon close to OOM
> > condition" is also possible.
>
> That's about the host usages. The host side management software decides when to
> issue a request to balloon (either periodically or event driven), I think there
> isn't anything we need to do in the balloon driver here.

Well, my comment is rather: "Do not try to reserve guest's memory. In other
words, do not try to maintain balloons on the guest side. Since host would
be able to cache file data on the host's cache, guests would be able to
quickly fetch file data from host's cache via normal I/O requests." ;-)
RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
On Saturday, February 8, 2020 8:33 PM, Tetsuo Handa wrote:
>
> Is this NUMA aware? Can "node-A's NR_FILE_PAGES is already 0 and
> node-B's NR_FILE_PAGES is not 0, but allocation request which triggered this
> shrinker wants to allocate from only node-B" happen?

No, it's a global counter.

> Can some thread keep
> this shrinker defunctional by keep increasing NR_FILE_PAGES?

Yes. Actually it's our intention - as long as there are pagecache pages, balloon pages are avoided to be reclaimed.

>
> Is this patch from "Re: Balloon pressuring page cache" thread? I hope that
> the guest could start reclaiming memory based on host's request (like OOM
> notifier chain) which is issued when host thinks that host is getting close to
> OOM and thus guests should start returning their unused memory to host.
> Maybe "periodically (e.g. 5 minutes)" in addition to "upon close to OOM
> condition" is also possible.

That's about the host usages. The host side management software decides when to issue a request to balloon (either periodically or event driven), I think there isn't anything we need to do in the balloon driver here.

Best,
Wei
Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
On 2020/02/06 17:01, Wei Wang wrote:
> There are cases that users want to shrink balloon pages after the
> pagecache depleted. The conservative_shrinker lets the shrinker
> shrink balloon pages when all the pagecache has been reclaimed.
>
> @@ -796,6 +800,10 @@ static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
>  {
>  	unsigned long pages_freed = 0;
>
> +	/* Balloon pages only gets shrunk when the pagecache depleted */
> +	if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> +		return 0;
> +

Is this NUMA aware? Can "node-A's NR_FILE_PAGES is already 0 and
node-B's NR_FILE_PAGES is not 0, but allocation request which triggered this
shrinker wants to allocate from only node-B" happen? Can some thread keep
this shrinker defunctional by keep increasing NR_FILE_PAGES?

Is this patch from "Re: Balloon pressuring page cache" thread? I hope that
the guest could start reclaiming memory based on host's request (like OOM
notifier chain) which is issued when host thinks that host is getting close to
OOM and thus guests should start returning their unused memory to host.
Maybe "periodically (e.g. 5 minutes)" in addition to "upon close to OOM
condition" is also possible.
Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
On Thu, Feb 06, 2020 at 09:43:10AM +, Wang, Wei W wrote:
> On Thursday, February 6, 2020 5:31 PM, Michael S. Tsirkin wrote:
> >
> > How about just making this a last resort thing to be compatible with existing
> > hypervisors? if someone wants to change behaviour that really should use a
> > feature bit ...
>
> Yeah, sounds good to me to control via feature bits.
>
> Best,
> Wei

To clarify, shrinker use could be a feature bit. OOM behaviour was
there for years and has been used to dynamically size guests.

--
MST
Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
On 06.02.20 10:44, Wang, Wei W wrote:
> On Thursday, February 6, 2020 5:32 PM, David Hildenbrand wrote:
>>
>> If the page cache is empty, a drop_slab() will deflate the whole balloon if I
>> am not wrong.
>>
>> Especially, a echo 3 > /proc/sys/vm/drop_caches
>>
>> will first drop the page cache and then drop_slab()
>
> Then that's the problem of "echo 3 > /proc/sys/vm/drop_caches" itself. It
> invokes other shrinkers as well (if considered an issue), need to be tweaked
> in the mm.

In short, I don't like this approach as long as a drop_slab() can
deflate the whole balloon and don't think this is the right approach then.

--
Thanks,

David / dhildenb
RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
On Thursday, February 6, 2020 5:32 PM, David Hildenbrand wrote:
>
> If the page cache is empty, a drop_slab() will deflate the whole balloon if I
> am not wrong.
>
> Especially, a echo 3 > /proc/sys/vm/drop_caches
>
> will first drop the page cache and then drop_slab()

Then that's the problem of "echo 3 > /proc/sys/vm/drop_caches" itself. It invokes other shrinkers as well (if considered an issue), need to be tweaked in the mm.

Best,
Wei
RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
On Thursday, February 6, 2020 5:31 PM, Michael S. Tsirkin wrote:
>
> How about just making this a last resort thing to be compatible with existing
> hypervisors? if someone wants to change behaviour that really should use a
> feature bit ...

Yeah, sounds good to me to control via feature bits.

Best,
Wei
Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
On 06.02.20 10:28, Wang, Wei W wrote:
> On Thursday, February 6, 2020 5:10 PM, David Hildenbrand wrote:
>> so dropping caches (echo 3 > /proc/sys/vm/drop_caches) will no longer
>> deflate the balloon when conservative_shrinker=true?
>>
>
> Should be. Need Tyler's help to test it.
>

If the page cache is empty, a drop_slab() will deflate the whole balloon
if I am not wrong.

Especially, a

echo 3 > /proc/sys/vm/drop_caches

will first drop the page cache and then drop_slab()

While I like the general idea, it looks more like a hack to me, to try
to teach the shrinker something it was not built for/does not support yet.

--
Thanks,

David / dhildenb
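The ordering concern raised above can be traced with a small userspace model. This is a rough sketch, not kernel code: struct model_guest and the model_* functions are hypothetical names, the page cache drop is idealized as freeing every file page, and the balloon is assumed to deflate completely once its shrinker is allowed to run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of these names exist in the kernel. */
struct model_guest {
	unsigned long file_pages;	/* stands in for NR_FILE_PAGES */
	unsigned long balloon_pages;	/* pages currently held by the balloon */
	bool conservative_shrinker;	/* models the RFC's module parameter */
};

/* Models the balloon shrinker's scan path with the RFC's gate applied. */
static unsigned long model_balloon_scan(struct model_guest *g)
{
	unsigned long freed;

	assert(g);
	if (g->conservative_shrinker && g->file_pages)
		return 0;	/* pagecache remains: leave the balloon alone */

	freed = g->balloon_pages;
	g->balloon_pages = 0;
	return freed;
}

/* Models "echo 3 > drop_caches": page cache first, then the slab shrinkers. */
static unsigned long model_drop_caches_3(struct model_guest *g)
{
	assert(g);
	g->file_pages = 0;		/* step 1: drop the page cache (idealized) */
	return model_balloon_scan(g);	/* step 2: drop_slab() reaches the balloon */
}
```

In this model the gate holds while pagecache exists, but because step 1 empties the page cache before step 2 runs, the gate is already open by the time the balloon shrinker is reached, and the whole balloon is deflated.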
Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
On Thu, Feb 06, 2020 at 09:27:04AM +, Wang, Wei W wrote:
> On Thursday, February 6, 2020 5:04 PM, Michael S. Tsirkin wrote:
> > virtio_balloon_shrinker_count(struct shrinker *shrinker,
> > >  					struct virtio_balloon, shrinker);
> > >  	unsigned long count;
> > >
> > > -	count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> > > +	if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> >
> > I'd rather have an API for that in mm/. In particular, do we want other
> > shrinkers to run, not just pagecache? To pick an example I'm familiar
> > with, kvm mmu cache for nested virt?
>
> We could make it extendable:
>
> #define BALLOON_SHRINKER_AFTER_PAGE_CACHE	(1 << 0)
> #define BALLOON_SHRINKER_AFTER_KVM_MMU_CACHE	(1 << 1)
> ...
>
> uint64_t conservative_shrinker;
> if ((conservative_shrinker & BALLOON_SHRINKER_AFTER_PAGE_CACHE) &&
>     global_node_page_state(NR_FILE_PAGES))
> 	return 0;
>
> For now, we probably only need BALLOON_SHRINKER_AFTER_PAGE_CACHE.
>
> Best,
> Wei

How about just making this a last resort thing to be compatible with
existing hypervisors? if someone wants to change behaviour
that really should use a feature bit ...

--
MST
RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
On Thursday, February 6, 2020 5:10 PM, David Hildenbrand wrote:
> so dropping caches (echo 3 > /proc/sys/vm/drop_caches) will no longer
> deflate the balloon when conservative_shrinker=true?
>

Should be. Need Tyler's help to test it.

Best,
Wei
RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
On Thursday, February 6, 2020 5:04 PM, Michael S. Tsirkin wrote:
> virtio_balloon_shrinker_count(struct shrinker *shrinker,
> >  					struct virtio_balloon, shrinker);
> >  	unsigned long count;
> >
> > -	count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> > +	if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
>
> I'd rather have an API for that in mm/. In particular, do we want other
> shrinkers to run, not just pagecache? To pick an example I'm familiar
> with, kvm mmu cache for nested virt?

We could make it extendable:

#define BALLOON_SHRINKER_AFTER_PAGE_CACHE	(1 << 0)
#define BALLOON_SHRINKER_AFTER_KVM_MMU_CACHE	(1 << 1)
...

uint64_t conservative_shrinker;
if ((conservative_shrinker & BALLOON_SHRINKER_AFTER_PAGE_CACHE) &&
    global_node_page_state(NR_FILE_PAGES))
	return 0;

For now, we probably only need BALLOON_SHRINKER_AFTER_PAGE_CACHE.

Best,
Wei
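A flag is tested in a mask with a bitwise AND; an OR with a nonzero constant is always nonzero and would make the gate depend only on the counter. The extendable scheme proposed above could look roughly like the following userspace sketch. The two flag names come from the message; model_shrinker_must_wait() and its parameters are hypothetical, with plain arguments standing in for whatever counters the kernel would consult.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BALLOON_SHRINKER_AFTER_PAGE_CACHE	(UINT64_C(1) << 0)
#define BALLOON_SHRINKER_AFTER_KVM_MMU_CACHE	(UINT64_C(1) << 1)

/*
 * Hypothetical helper: returns true when the balloon shrinker should hold
 * back because a cache it has been told to wait for is still populated.
 */
static bool model_shrinker_must_wait(uint64_t conservative_shrinker,
				     unsigned long file_pages,
				     unsigned long kvm_mmu_pages)
{
	/* Each flag is tested with '&' so that only requested gates apply. */
	if ((conservative_shrinker & BALLOON_SHRINKER_AFTER_PAGE_CACHE) &&
	    file_pages)
		return true;
	if ((conservative_shrinker & BALLOON_SHRINKER_AFTER_KVM_MMU_CACHE) &&
	    kvm_mmu_pages)
		return true;
	return false;
}
```

With no flags set the shrinker never waits, and with only BALLOON_SHRINKER_AFTER_PAGE_CACHE set a populated KVM MMU cache does not hold it back.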
Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
On 06.02.20 09:01, Wei Wang wrote:
> There are cases that users want to shrink balloon pages after the
> pagecache depleted. The conservative_shrinker lets the shrinker
> shrink balloon pages when all the pagecache has been reclaimed.
>
> Signed-off-by: Wei Wang
> ---
>  drivers/virtio/virtio_balloon.c | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 93f995f6cf36..b4c5bb13a867 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -42,6 +42,10 @@
>  static struct vfsmount *balloon_mnt;
>  #endif
>
> +static bool conservative_shrinker = true;
> +module_param(conservative_shrinker, bool, 0644);
> +MODULE_PARM_DESC(conservative_shrinker, "conservatively shrink balloon pages");
> +
>  enum virtio_balloon_vq {
>  	VIRTIO_BALLOON_VQ_INFLATE,
>  	VIRTIO_BALLOON_VQ_DEFLATE,
> @@ -796,6 +800,10 @@ static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
>  {
>  	unsigned long pages_freed = 0;
>
> +	/* Balloon pages only gets shrunk when the pagecache depleted */
> +	if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> +		return 0;
> +
>  	/*
>  	 * One invocation of leak_balloon can deflate at most
>  	 * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
> @@ -837,7 +845,11 @@ static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
>  					struct virtio_balloon, shrinker);
>  	unsigned long count;
>
> -	count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> +	if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> +		count = 0;
> +	else
> +		count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> +
>  	count += vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>
>  	return count;
>

so dropping caches (echo 3 > /proc/sys/vm/drop_caches) will no longer
deflate the balloon when conservative_shrinker=true?

--
Thanks,

David / dhildenb
Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
On Thu, Feb 06, 2020 at 04:01:47PM +0800, Wei Wang wrote:
> There are cases that users want to shrink balloon pages after the
> pagecache depleted. The conservative_shrinker lets the shrinker
> shrink balloon pages when all the pagecache has been reclaimed.
>
> Signed-off-by: Wei Wang

I'd rather avoid module parameters, but otherwise looks
like a reasonable idea.
Tyler, what do you think?

> ---
>  drivers/virtio/virtio_balloon.c | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 93f995f6cf36..b4c5bb13a867 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -42,6 +42,10 @@
>  static struct vfsmount *balloon_mnt;
>  #endif
>
> +static bool conservative_shrinker = true;
> +module_param(conservative_shrinker, bool, 0644);
> +MODULE_PARM_DESC(conservative_shrinker, "conservatively shrink balloon pages");
> +
>  enum virtio_balloon_vq {
>  	VIRTIO_BALLOON_VQ_INFLATE,
>  	VIRTIO_BALLOON_VQ_DEFLATE,
> @@ -796,6 +800,10 @@ static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
>  {
>  	unsigned long pages_freed = 0;
>
> +	/* Balloon pages only gets shrunk when the pagecache depleted */
> +	if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> +		return 0;
> +
>  	/*
>  	 * One invocation of leak_balloon can deflate at most
>  	 * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
> @@ -837,7 +845,11 @@ static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
>  					struct virtio_balloon, shrinker);
>  	unsigned long count;
>
> -	count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> +	if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))

I'd rather have an API for that in mm/. In particular, do we want other
shrinkers to run, not just pagecache? To pick an example I'm familiar
with, kvm mmu cache for nested virt?

> +		count = 0;
> +	else
> +		count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> +
>  	count += vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>
>  	return count;
> --
> 2.17.1
[PATCH RFC] virtio_balloon: conservative balloon page shrinking
There are cases that users want to shrink balloon pages after the
pagecache depleted. The conservative_shrinker lets the shrinker
shrink balloon pages when all the pagecache has been reclaimed.

Signed-off-by: Wei Wang
---
 drivers/virtio/virtio_balloon.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 93f995f6cf36..b4c5bb13a867 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -42,6 +42,10 @@
 static struct vfsmount *balloon_mnt;
 #endif

+static bool conservative_shrinker = true;
+module_param(conservative_shrinker, bool, 0644);
+MODULE_PARM_DESC(conservative_shrinker, "conservatively shrink balloon pages");
+
 enum virtio_balloon_vq {
 	VIRTIO_BALLOON_VQ_INFLATE,
 	VIRTIO_BALLOON_VQ_DEFLATE,
@@ -796,6 +800,10 @@ static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
 {
 	unsigned long pages_freed = 0;

+	/* Balloon pages only gets shrunk when the pagecache depleted */
+	if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
+		return 0;
+
 	/*
 	 * One invocation of leak_balloon can deflate at most
 	 * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
@@ -837,7 +845,11 @@ static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
 					struct virtio_balloon, shrinker);
 	unsigned long count;

-	count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
+	if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
+		count = 0;
+	else
+		count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
+
 	count += vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;

 	return count;
--
2.17.1
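The effect of the last hunk on what the shrinker reports can be checked with a userspace model of the count path. This is a sketch under stated assumptions: struct model_balloon, model_shrinker_count() and the MODEL_* constants are hypothetical, the constant values are illustrative rather than taken from the kernel headers, and the pagecache counter is passed in as a plain argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the real constants live in the kernel sources. */
#define MODEL_PAGES_PER_PAGE	1UL
#define MODEL_HINT_BLOCK_PAGES	1024UL

/* Hypothetical stand-in for the two struct virtio_balloon fields used here. */
struct model_balloon {
	unsigned long num_pages;		/* balloon size, in balloon pages */
	unsigned long num_free_page_blocks;	/* free page hint blocks held */
};

/* Models virtio_balloon_shrinker_count() with the RFC's gate applied. */
static unsigned long model_shrinker_count(const struct model_balloon *vb,
					  bool conservative_shrinker,
					  unsigned long file_pages)
{
	unsigned long count;

	assert(vb);
	if (conservative_shrinker && file_pages)
		count = 0;	/* any pagecache at all hides the balloon pages */
	else
		count = vb->num_pages / MODEL_PAGES_PER_PAGE;

	/* Free page hint blocks are reported whether or not the gate applies. */
	count += vb->num_free_page_blocks * MODEL_HINT_BLOCK_PAGES;
	return count;
}
```

Note that the gate is all-or-nothing in this model: a single remaining file page is enough to make the balloon pages invisible to reclaim, while the free page hint blocks are still counted.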