Re: [PATCH 2/2] virtio_balloon: free some memory from baloon on OOM

2014-10-15 Thread Denis V. Lunev

On 14/10/14 13:10, Michael S. Tsirkin wrote:

On Tue, Oct 14, 2014 at 10:14:05AM +1030, Rusty Russell wrote:

Michael S. Tsirkin m...@redhat.com writes:


On Mon, Oct 13, 2014 at 04:02:52PM +1030, Rusty Russell wrote:

Denis V. Lunev d...@parallels.com writes:

From: Raushaniya Maksudova rmaksud...@parallels.com

Excessive virtio_balloon inflation can cause invocation of OOM-killer,
when Linux is under severe memory pressure. Various mechanisms are
responsible for correct virtio_balloon memory management. Nevertheless
it is often the case that these control tools do not have enough time
to react to a fast-changing memory load. As a result, the OS runs out of
memory and invokes the OOM-killer. The balancing of memory by use of the
virtio balloon should not cause the termination of processes while there
are pages in the balloon. Currently there is no way for the virtio balloon
driver to free some memory at the last moment before some process gets
killed by the OOM-killer.


This makes some amount of sense.


This reminds me of the balloon fs that Google once proposed.
This really needs to be controlled from host though.
At the moment host does not expect guest to deflate before
requests.
So as a minimum, add a feature bit for this.  What if you want a mix of
mandatory and optional ballooning? I guess we can use multiple balloons,
is that the idea?


Trying to claw back some pages on OOM is almost certainly correct,
even if the host doesn't expect it.  It's roughly equivalent to not
giving up pages in the first place.


Well the difference is that there are management tools that
poll balloon in host until they see balloon size reaches
the expected value.

They don't expect the balloon to shrink below num_pages and will respond in
various unexpected ways, e.g. killing the VM, if it does.
Killing a userspace process within the guest might be better
for VM health.

Besides the fact that we always did it like this, these tools seem to have
basis in the spec.
Specifically, this is based on this text from the spec:
the device asks for a certain amount of memory, and the driver
supplies it (or withdraws it, if the device has more than it asks for).
This allows the guest to adapt to changes in allowance of underlying
physical memory.

and

The device is driven by the receipt of a configuration change interrupt.




Cheers,
Rusty.
PS.  Yes, a real guest-driven balloon is preferable, but that's a much
  larger task.



Any objection to making the feature depend on a feature flag?




OK. I got the point. This sounds good to me. We will prepare a patch for
the kernel and the proper bits (a command-line option) for QEMU.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 2/2] virtio_balloon: free some memory from baloon on OOM

2014-10-15 Thread Rusty Russell
Michael S. Tsirkin m...@redhat.com writes:
 On Tue, Oct 14, 2014 at 10:14:05AM +1030, Rusty Russell wrote:
 Michael S. Tsirkin m...@redhat.com writes:
 
  On Mon, Oct 13, 2014 at 04:02:52PM +1030, Rusty Russell wrote:
  Denis V. Lunev d...@parallels.com writes:
   From: Raushaniya Maksudova rmaksud...@parallels.com
  
   Excessive virtio_balloon inflation can cause invocation of OOM-killer,
   when Linux is under severe memory pressure. Various mechanisms are
   responsible for correct virtio_balloon memory management. Nevertheless
   it is often the case that these control tools do not have enough time
   to react to a fast-changing memory load. As a result, the OS runs out of
   memory and invokes the OOM-killer. The balancing of memory by use of the
   virtio balloon should not cause the termination of processes while there
   are pages in the balloon. Currently there is no way for the virtio balloon
   driver to free some memory at the last moment before some process gets
   killed by the OOM-killer.
  
  This makes some amount of sense.
 
  This reminds me of the balloon fs that Google once proposed.
  This really needs to be controlled from host though.
  At the moment host does not expect guest to deflate before
  requests.
  So as a minimum, add a feature bit for this.  What if you want a mix of
  mandatory and optional ballooning? I guess we can use multiple balloons,
  is that the idea?
 
 Trying to claw back some pages on OOM is almost certainly correct,
 even if the host doesn't expect it.  It's roughly equivalent to not
 giving up pages in the first place.

 Well the difference is that there are management tools that
 poll balloon in host until they see balloon size reaches
 the expected value.

 They don't expect the balloon to shrink below num_pages and will respond in
 various unexpected ways, e.g. killing the VM, if it does.
 Killing a userspace process within the guest might be better
 for VM health.

 Besides the fact that we always did it like this, these tools seem to have
 basis in the spec.
 Specifically, this is based on this text from the spec:
   the device asks for a certain amount of memory, and the driver
   supplies it (or withdraws it, if the device has more than it asks for).
   This allows the guest to adapt to changes in allowance of underlying
   physical memory.

 and

   The device is driven by the receipt of a configuration change interrupt.



 Cheers,
 Rusty.
 PS.  Yes, a real guest-driven balloon is preferable, but that's a much
  larger task.


 Any objection to making the feature depend on a feature flag?

If you believe a guest which does this will cause drastic failure on the
host side (i.e. killing the VM), then yes, we can do this.

However, I'm not aware of anything that sophisticated...

Cheers,
Rusty.


Re: [PATCH 2/2] virtio_balloon: free some memory from baloon on OOM

2014-10-14 Thread Rusty Russell
Michael S. Tsirkin m...@redhat.com writes:

 On Mon, Oct 13, 2014 at 04:02:52PM +1030, Rusty Russell wrote:
 Denis V. Lunev d...@parallels.com writes:
  From: Raushaniya Maksudova rmaksud...@parallels.com
 
  Excessive virtio_balloon inflation can cause invocation of OOM-killer,
  when Linux is under severe memory pressure. Various mechanisms are
  responsible for correct virtio_balloon memory management. Nevertheless
  it is often the case that these control tools do not have enough time
  to react to a fast-changing memory load. As a result, the OS runs out of
  memory and invokes the OOM-killer. The balancing of memory by use of the
  virtio balloon should not cause the termination of processes while there
  are pages in the balloon. Currently there is no way for the virtio balloon
  driver to free some memory at the last moment before some process gets
  killed by the OOM-killer.
 
 This makes some amount of sense.

 This reminds me of the balloon fs that Google once proposed.
 This really needs to be controlled from host though.
 At the moment host does not expect guest to deflate before
 requests.
 So as a minimum, add a feature bit for this.  What if you want a mix of
 mandatory and optional ballooning? I guess we can use multiple balloons,
 is that the idea?

Trying to claw back some pages on OOM is almost certainly correct,
even if the host doesn't expect it.  It's roughly equivalent to not
giving up pages in the first place.

Cheers,
Rusty.
PS.  Yes, a real guest-driven balloon is preferable, but that's a much
 larger task.



Re: [PATCH 2/2] virtio_balloon: free some memory from baloon on OOM

2014-10-14 Thread Michael S. Tsirkin
On Tue, Oct 14, 2014 at 10:14:05AM +1030, Rusty Russell wrote:
 Michael S. Tsirkin m...@redhat.com writes:
 
  On Mon, Oct 13, 2014 at 04:02:52PM +1030, Rusty Russell wrote:
  Denis V. Lunev d...@parallels.com writes:
   From: Raushaniya Maksudova rmaksud...@parallels.com
  
   Excessive virtio_balloon inflation can cause invocation of OOM-killer,
   when Linux is under severe memory pressure. Various mechanisms are
   responsible for correct virtio_balloon memory management. Nevertheless
   it is often the case that these control tools do not have enough time
   to react to a fast-changing memory load. As a result, the OS runs out of
   memory and invokes the OOM-killer. The balancing of memory by use of the
   virtio balloon should not cause the termination of processes while there
   are pages in the balloon. Currently there is no way for the virtio balloon
   driver to free some memory at the last moment before some process gets
   killed by the OOM-killer.
  
  This makes some amount of sense.
 
  This reminds me of the balloon fs that Google once proposed.
  This really needs to be controlled from host though.
  At the moment host does not expect guest to deflate before
  requests.
  So as a minimum, add a feature bit for this.  What if you want a mix of
  mandatory and optional ballooning? I guess we can use multiple balloons,
  is that the idea?
 
 Trying to claw back some pages on OOM is almost certainly correct,
 even if the host doesn't expect it.  It's roughly equivalent to not
 giving up pages in the first place.

Well the difference is that there are management tools that
poll balloon in host until they see balloon size reaches
the expected value.

They don't expect the balloon to shrink below num_pages and will respond in
various unexpected ways, e.g. killing the VM, if it does.
Killing a userspace process within the guest might be better
for VM health.

Besides the fact that we always did it like this, these tools seem to have
basis in the spec.
Specifically, this is based on this text from the spec:
the device asks for a certain amount of memory, and the driver
supplies it (or withdraws it, if the device has more than it asks for).
This allows the guest to adapt to changes in allowance of underlying
physical memory.

and

The device is driven by the receipt of a configuration change interrupt.



 Cheers,
 Rusty.
 PS.  Yes, a real guest-driven balloon is preferable, but that's a much
  larger task.


Any objection to making the feature depend on a feature flag?

-- 
MST
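
The concern about host-side management tools can be made concrete with a
small model. This is a sketch only: the struct, the enum, and sketch_poll()
are hypothetical and do not describe any real management tool. It assumes a
tool that polls the guest-reported balloon size against the requested target
and treats a drop below the target, after the target was once reached, as
unexpected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side view of the two balloon sizes. */
struct sketch_config {
	unsigned int num_pages;	/* target requested by the host */
	unsigned int actual;	/* size reported back by the guest */
};

enum sketch_verdict {
	SKETCH_WAITING,			/* still inflating towards the target */
	SKETCH_REACHED,			/* target met on this poll */
	SKETCH_UNEXPECTED_SHRINK	/* fell below target after reaching it */
};

/*
 * One polling step of the imagined tool. 'reached' remembers whether the
 * target was already met on an earlier poll.
 */
static enum sketch_verdict sketch_poll(const struct sketch_config *cfg,
				       bool *reached)
{
	assert(cfg != NULL && reached != NULL);
	if (cfg->actual >= cfg->num_pages) {
		*reached = true;
		return SKETCH_REACHED;
	}
	/* Below target: normal while inflating, suspicious afterwards. */
	return *reached ? SKETCH_UNEXPECTED_SHRINK : SKETCH_WAITING;
}
```

A guest that deflates by itself on OOM would drive such a tool into the
SKETCH_UNEXPECTED_SHRINK state, which is the case a feature flag lets the
host opt into explicitly.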


Re: [PATCH 2/2] virtio_balloon: free some memory from baloon on OOM

2014-10-13 Thread Rusty Russell
Denis V. Lunev d...@parallels.com writes:
 From: Raushaniya Maksudova rmaksud...@parallels.com

 Excessive virtio_balloon inflation can cause invocation of OOM-killer,
 when Linux is under severe memory pressure. Various mechanisms are
 responsible for correct virtio_balloon memory management. Nevertheless
 it is often the case that these control tools do not have enough time
 to react to a fast-changing memory load. As a result, the OS runs out of
 memory and invokes the OOM-killer. The balancing of memory by use of the
 virtio balloon should not cause the termination of processes while there
 are pages in the balloon. Currently there is no way for the virtio balloon
 driver to free some memory at the last moment before some process gets
 killed by the OOM-killer.

This makes some amount of sense.

But I suggest a few minor changes:

 +static int oom_vballoon_pages = OOM_VBALLOON_DEFAULT_PAGES;
 +module_param(oom_vballoon_pages, int, S_IRUSR | S_IWUSR);
 +MODULE_PARM_DESC(oom_vballoon_pages, "pages to free on OOM");

Since this is already prefixed with virtio_balloon, I suggest just
calling it oom_pages.

 +static int virtballoon_oom_notify(struct notifier_block *self,
 +   unsigned long dummy, void *parm)
 +{
 + unsigned int num_freed_pages;
 + unsigned long *freed = (unsigned long *)parm;
 + struct virtio_balloon *vb = container_of((struct notifier_block *)self,
 +  struct virtio_balloon, nb);

Why cast self here?

 + num_freed_pages = leak_balloon(vb, oom_vballoon_pages);
 + update_balloon_size(vb);
 + *freed += num_freed_pages;
 +
 + return NOTIFY_OK;
 +}

Cheers,
Rusty.
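
The point behind "Why cast self here?" can be shown in a userspace sketch.
The structures below are trimmed-down stand-ins, the container_of() macro is
a simplified version of the kernel's, and leak_balloon() is a toy
replacement, so none of this is the driver code itself. Since self is
already a struct notifier_block pointer, container_of() accepts it directly,
and the void pointer parm converts to unsigned long * without a cast in C.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's container_of() (no type check). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define NOTIFY_OK 0x0001

/* Trimmed-down stand-ins for the kernel structures. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *self,
			     unsigned long action, void *data);
	int priority;
};

struct virtio_balloon {
	unsigned int num_pages;	/* pages currently held by the balloon */
	struct notifier_block nb;
};

/* Toy replacement for leak_balloon(): release up to 'want' pages. */
static unsigned int leak_balloon(struct virtio_balloon *vb, unsigned int want)
{
	unsigned int n = want < vb->num_pages ? want : vb->num_pages;

	vb->num_pages -= n;
	return n;
}

static int virtballoon_oom_notify(struct notifier_block *self,
				  unsigned long dummy, void *parm)
{
	/* 'self' already has the right type, so no cast is needed. */
	struct virtio_balloon *vb =
		container_of(self, struct virtio_balloon, nb);
	unsigned long *freed = parm;	/* implicit conversion from void * */

	(void)dummy;
	assert(freed != NULL);
	*freed += leak_balloon(vb, 256);
	return NOTIFY_OK;
}
```

Calling the notifier through vb.nb.notifier_call(&vb.nb, 0, &freed) recovers
the enclosing virtio_balloon from the embedded nb member and adds the number
of released pages to freed.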


Re: [PATCH 2/2] virtio_balloon: free some memory from baloon on OOM

2014-10-13 Thread Michael S. Tsirkin
On Mon, Oct 13, 2014 at 04:02:52PM +1030, Rusty Russell wrote:
 Denis V. Lunev d...@parallels.com writes:
  From: Raushaniya Maksudova rmaksud...@parallels.com
 
  Excessive virtio_balloon inflation can cause invocation of OOM-killer,
  when Linux is under severe memory pressure. Various mechanisms are
  responsible for correct virtio_balloon memory management. Nevertheless
  it is often the case that these control tools do not have enough time
  to react to a fast-changing memory load. As a result, the OS runs out of
  memory and invokes the OOM-killer. The balancing of memory by use of the
  virtio balloon should not cause the termination of processes while there
  are pages in the balloon. Currently there is no way for the virtio balloon
  driver to free some memory at the last moment before some process gets
  killed by the OOM-killer.
 
 This makes some amount of sense.

This reminds me of the balloon fs that Google once proposed.
This really needs to be controlled from host though.
At the moment host does not expect guest to deflate before
requests.
So as a minimum, add a feature bit for this.  What if you want a mix of
mandatory and optional ballooning? I guess we can use multiple balloons,
is that the idea?


 But I suggest a few minor changes:
 
  +static int oom_vballoon_pages = OOM_VBALLOON_DEFAULT_PAGES;
  +module_param(oom_vballoon_pages, int, S_IRUSR | S_IWUSR);
  +MODULE_PARM_DESC(oom_vballoon_pages, "pages to free on OOM");
 
 Since this is already prefixed with virtio_balloon, I suggest just
 calling it oom_pages.
 
  +static int virtballoon_oom_notify(struct notifier_block *self,
  + unsigned long dummy, void *parm)
  +{
  +   unsigned int num_freed_pages;
  +   unsigned long *freed = (unsigned long *)parm;
  +   struct virtio_balloon *vb = container_of((struct notifier_block *)self,
  +struct virtio_balloon, nb);
 
 Why cast self here?
 
  +   num_freed_pages = leak_balloon(vb, oom_vballoon_pages);
  +   update_balloon_size(vb);
  +   *freed += num_freed_pages;
  +
  +   return NOTIFY_OK;
  +}
 
 Cheers,
 Rusty.


Re: [PATCH 2/2] virtio_balloon: free some memory from baloon on OOM

2014-10-13 Thread Denis V. Lunev

On 13/10/14 09:32, Rusty Russell wrote:

Denis V. Lunev d...@parallels.com writes:

From: Raushaniya Maksudova rmaksud...@parallels.com

Excessive virtio_balloon inflation can cause invocation of OOM-killer,
when Linux is under severe memory pressure. Various mechanisms are
responsible for correct virtio_balloon memory management. Nevertheless
it is often the case that these control tools do not have enough time
to react to a fast-changing memory load. As a result, the OS runs out of
memory and invokes the OOM-killer. The balancing of memory by use of the
virtio balloon should not cause the termination of processes while there
are pages in the balloon. Currently there is no way for the virtio balloon
driver to free some memory at the last moment before some process gets
killed by the OOM-killer.

This makes some amount of sense.

But I suggest a few minor changes:


+static int oom_vballoon_pages = OOM_VBALLOON_DEFAULT_PAGES;
+module_param(oom_vballoon_pages, int, S_IRUSR | S_IWUSR);
+MODULE_PARM_DESC(oom_vballoon_pages, "pages to free on OOM");

Since this is already prefixed with virtio_balloon, I suggest just
calling it oom_pages.

ok, will do


+static int virtballoon_oom_notify(struct notifier_block *self,
+ unsigned long dummy, void *parm)
+{
+   unsigned int num_freed_pages;
+   unsigned long *freed = (unsigned long *)parm;
+   struct virtio_balloon *vb = container_of((struct notifier_block *)self,
+struct virtio_balloon, nb);

Why cast self here?

This is a leftover from a previous version of the patch. I'll fix this and
resend.



+   num_freed_pages = leak_balloon(vb, oom_vballoon_pages);
+   update_balloon_size(vb);
+   *freed += num_freed_pages;
+
+   return NOTIFY_OK;
+}

Cheers,
Rusty.




[PATCH 2/2] virtio_balloon: free some memory from baloon on OOM

2014-10-08 Thread Denis V. Lunev
From: Raushaniya Maksudova rmaksud...@parallels.com

Excessive virtio_balloon inflation can cause invocation of OOM-killer,
when Linux is under severe memory pressure. Various mechanisms are
responsible for correct virtio_balloon memory management. Nevertheless
it is often the case that these control tools do not have enough time
to react to a fast-changing memory load. As a result, the OS runs out of
memory and invokes the OOM-killer. The balancing of memory by use of the
virtio balloon should not cause the termination of processes while there
are pages in the balloon. Currently there is no way for the virtio balloon
driver to free some memory at the last moment before some process gets
killed by the OOM-killer.

This does not create a security hole, as the balloon itself runs
inside the guest OS and works in cooperation with the host. Thus
some improvements from the guest side should be considered normal.

To solve the problem, introduce a virtio_balloon callback which is
expected to be called from the oom notifier call chain in the
out_of_memory() function. If the virtio balloon can release some memory,
it will make the system return and retry the allocation that forced the
out-of-memory killer to run.

Signed-off-by: Raushaniya Maksudova rmaksud...@parallels.com
Signed-off-by: Denis V. Lunev d...@openvz.org
CC: Rusty Russell ru...@rustcorp.com.au
CC: Michael S. Tsirkin m...@redhat.com
CC: virtualization@lists.linux-foundation.org
---
 drivers/virtio/virtio_balloon.c | 46 +
 1 file changed, 46 insertions(+)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 213da41..ca77831 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -28,6 +28,7 @@
 #include <linux/slab.h>
 #include <linux/module.h>
 #include <linux/balloon_compaction.h>
+#include <linux/oom.h>
 
 /*
  * Balloon device works in 4K page units.  So each page is pointed to by
@@ -36,6 +37,12 @@
  */
 #define VIRTIO_BALLOON_PAGES_PER_PAGE (unsigned)(PAGE_SIZE >> VIRTIO_BALLOON_PFN_SHIFT)
 #define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
+#define OOM_VBALLOON_DEFAULT_PAGES 256
+#define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
+
+static int oom_vballoon_pages = OOM_VBALLOON_DEFAULT_PAGES;
+module_param(oom_vballoon_pages, int, S_IRUSR | S_IWUSR);
+MODULE_PARM_DESC(oom_vballoon_pages, "pages to free on OOM");
 
 struct virtio_balloon
 {
@@ -71,6 +78,9 @@ struct virtio_balloon
 	/* Memory statistics */
 	int need_stats_update;
 	struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR];
+
+	/* To register callback in oom notifier call chain */
+	struct notifier_block nb;
 };
 
 static struct virtio_device_id id_table[] = {
@@ -290,6 +300,33 @@ static void update_balloon_size(struct virtio_balloon *vb)
 		      &actual);
 }
 
+/*
+ * virtballoon_oom_notify - release pages when system is under severe
+ *  memory pressure (called from out_of_memory())
+ * @self : notifier block struct
+ * @dummy: not used
+ * @parm : returned - number of freed pages
+ *
+ * The balancing of memory by use of the virtio balloon should not cause
+ * the termination of processes while there are pages in the balloon.
+ * If virtio balloon manages to release some memory, it will make the system
+ * return and retry the allocation that forced the OOM killer to run.
+ */
+static int virtballoon_oom_notify(struct notifier_block *self,
+ unsigned long dummy, void *parm)
+{
+   unsigned int num_freed_pages;
+   unsigned long *freed = (unsigned long *)parm;
+   struct virtio_balloon *vb = container_of((struct notifier_block *)self,
+struct virtio_balloon, nb);
+
+   num_freed_pages = leak_balloon(vb, oom_vballoon_pages);
+   update_balloon_size(vb);
+   *freed += num_freed_pages;
+
+   return NOTIFY_OK;
+}
+
 static int balloon(void *_vballoon)
 {
struct virtio_balloon *vb = _vballoon;
@@ -474,6 +511,12 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	if (err)
 		goto out_free_vb_mapping;
 
+	vb->nb.notifier_call = virtballoon_oom_notify;
+	vb->nb.priority = VIRTBALLOON_OOM_NOTIFY_PRIORITY;
+	err = register_oom_notifier(&vb->nb);
+	if (err < 0)
+		goto out_oom_notify;
+
 	vb->thread = kthread_run(balloon, vb, "vballoon");
 	if (IS_ERR(vb->thread)) {
 		err = PTR_ERR(vb->thread);
@@ -483,6 +526,8 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	return 0;
 
 out_del_vqs:
+	unregister_oom_notifier(&vb->nb);
+out_oom_notify:
 	vdev->config->del_vqs(vdev);
 out_free_vb_mapping:
 	balloon_mapping_free(vb_mapping);
@@ -511,6 +556,7 @@ static void virtballoon_remove(struct virtio_device *vdev)
 {
 	struct virtio_balloon *vb = vdev->priv;
 
+	unregister_oom_notifier(&vb->nb);
 	kthread_stop(vb->thread);