Re: [Qemu-devel] [virtio-dev] Re: [PATCH v17 6/6] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_VQ

2017-11-19 Thread Michael S. Tsirkin
On Sat, Nov 18, 2017 at 05:22:28AM +, Wang, Wei W wrote:
> On Friday, November 17, 2017 8:45 PM, Michael S. Tsirkin wrote:
> > On Fri, Nov 17, 2017 at 07:35:03PM +0800, Wei Wang wrote:
> > > On 11/16/2017 09:27 PM, Wei Wang wrote:
> > > > On 11/16/2017 04:32 AM, Michael S. Tsirkin wrote:
> > > > > On Fri, Nov 03, 2017 at 04:13:06PM +0800, Wei Wang wrote:
> > > > > > Negotiation of the VIRTIO_BALLOON_F_FREE_PAGE_VQ feature
> > > > > > indicates the support of reporting hints of guest free pages to
> > > > > > the host via virtio-balloon. The host requests the guest to
> > > > > > report the free pages by sending commands via the virtio-balloon
> > > > > > > configuration registers.
> > > > > >
> > > > > > When the guest starts to report, the first element added to the
> > > > > > free page vq is a sequence id of the start reporting command.
> > > > > > The id is given by the host, and it indicates whether the
> > > > > > following free pages correspond to the command. For example, the
> > > > > > host may stop the report and start again with a new command id.
> > > > > > The obsolete pages for the previous start command can be
> > > > > > > detected by an id mismatch on the host. The id is added to
> > > > > > > the vq using an output buffer, and the free pages are added to
> > > > > > > the vq using an input buffer.
> > > > > >
> > > > > > > Here are some explanations about the added configuration registers:
> > > > > > - host2guest_cmd: a register used by the host to send commands
> > > > > > to the guest.
> > > > > > - guest2host_cmd: written by the guest to ACK to the host about
> > > > > > the commands that have been received. The host will clear the
> > > > > > corresponding bits on the host2guest_cmd register. The guest
> > > > > > > also uses this register to send commands to the host (e.g. when
> > > > > > > it finishes free page reporting).
> > > > > > - free_page_cmd_id: the sequence id of the free page report
> > > > > > command given by the host.
> > > > > >
> > > > > > Signed-off-by: Wei Wang 
> > > > > > Signed-off-by: Liang Li 
> > > > > > Cc: Michael S. Tsirkin 
> > > > > > Cc: Michal Hocko 
> > > > > > ---
> > > > > >
> > > > > > +
> > > > > > > +static void report_free_page(struct work_struct *work)
> > > > > > > +{
> > > > > > > +   struct virtio_balloon *vb;
> > > > > > > +
> > > > > > > +   vb = container_of(work, struct virtio_balloon, report_free_page_work);
> > > > > > > +   report_free_page_cmd_id(vb);
> > > > > > > +   walk_free_mem_block(vb, 0, _balloon_send_free_pages);
> > > > > > > +   /*
> > > > > > > +    * The last few free page blocks that were added may not reach the
> > > > > > > +    * batch size, but need a kick to notify the device to handle them.
> > > > > > > +    */
> > > > > > > +   virtqueue_kick(vb->free_page_vq);
> > > > > > > +   report_free_page_end(vb);
> > > > > > > +}
> > > > > > +
> > > > > I think there's an issue here: if pages are poisoned and
> > > > > hypervisor subsequently drops them, testing them after allocation
> > > > > will trigger a false positive.
> > > > >
> > > > > The specific configuration:
> > > > >
> > > > > PAGE_POISONING on
> > > > > PAGE_POISONING_NO_SANITY off
> > > > > PAGE_POISONING_ZERO off
> > > > >
> > > > >
> > > > > Solutions:
> > > > > 1. disable the feature in that configuration
> > > > > suggested as an initial step
> > > >
> > > > Thanks for the finding.
> > > > Similar to this option: I'm thinking could we make
> > > > walk_free_mem_block() simply return if that option is on?
> > > > That is, at the beginning of the function:
> > > > if (!page_poisoning_enabled())
> > > > return;
> > > >
> > >
> > >
> > > Thought about it more, I think it would be better to put this logic to
> > > virtio_balloon:
> > >
> > > > send_free_page_cmd_id(vb, >start_cmd_id);
> > > > if (page_poisoning_enabled() &&
> > > >     !IS_ENABLED(CONFIG_PAGE_POISONING_NO_SANITY))
> > > >         walk_free_mem_block(vb, 0, _balloon_send_free_pages);
> > > > send_free_page_cmd_id(vb, >stop_cmd_id);
> > >
> > >
> > > walk_free_mem_block() should be a more generic API, and this potential
> > > page poisoning issue is specific to live migration which is only one
> > > use case of this function, so I think it is better to handle it in the
> > > special use case itself.
> > >
> > > Best,
> > > Wei
> > >
> > 
> > It's a quick work-around but it doesn't make me very happy.
> > 
> > AFAIK e.g. RHEL has a debug kernel with poisoning enabled.
> > If this never uses free page hinting at all, it will be much less useful for
> > debugging guests.
> > 
> 
> I understand your concern. I think people who use debugging guests
> don't regard performance as the first priority, and most vendors
> usually wouldn't use debugging guests for their products.

And when one of these crashes but only after migration what do you do?  A
very common step 

Re: [Qemu-devel] [virtio-dev] Re: [PATCH v17 6/6] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_VQ

2017-11-17 Thread Wang, Wei W
On Friday, November 17, 2017 8:45 PM, Michael S. Tsirkin wrote:
> On Fri, Nov 17, 2017 at 07:35:03PM +0800, Wei Wang wrote:
> > On 11/16/2017 09:27 PM, Wei Wang wrote:
> > > On 11/16/2017 04:32 AM, Michael S. Tsirkin wrote:
> > > > On Fri, Nov 03, 2017 at 04:13:06PM +0800, Wei Wang wrote:
> > > > > Negotiation of the VIRTIO_BALLOON_F_FREE_PAGE_VQ feature
> > > > > indicates the support of reporting hints of guest free pages to
> > > > > the host via virtio-balloon. The host requests the guest to
> > > > > report the free pages by sending commands via the virtio-balloon
> > > > > > configuration registers.
> > > > >
> > > > > When the guest starts to report, the first element added to the
> > > > > free page vq is a sequence id of the start reporting command.
> > > > > The id is given by the host, and it indicates whether the
> > > > > following free pages correspond to the command. For example, the
> > > > > host may stop the report and start again with a new command id.
> > > > > The obsolete pages for the previous start command can be
> > > > > > detected by an id mismatch on the host. The id is added to
> > > > > > the vq using an output buffer, and the free pages are added to
> > > > > > the vq using an input buffer.
> > > > >
> > > > > > Here are some explanations about the added configuration registers:
> > > > > - host2guest_cmd: a register used by the host to send commands
> > > > > to the guest.
> > > > > - guest2host_cmd: written by the guest to ACK to the host about
> > > > > the commands that have been received. The host will clear the
> > > > > corresponding bits on the host2guest_cmd register. The guest
> > > > > > also uses this register to send commands to the host (e.g. when
> > > > > > it finishes free page reporting).
> > > > > - free_page_cmd_id: the sequence id of the free page report
> > > > > command given by the host.
> > > > >
> > > > > Signed-off-by: Wei Wang 
> > > > > Signed-off-by: Liang Li 
> > > > > Cc: Michael S. Tsirkin 
> > > > > Cc: Michal Hocko 
> > > > > ---
> > > > >
> > > > > +
> > > > > > +static void report_free_page(struct work_struct *work)
> > > > > > +{
> > > > > > +   struct virtio_balloon *vb;
> > > > > > +
> > > > > > +   vb = container_of(work, struct virtio_balloon, report_free_page_work);
> > > > > > +   report_free_page_cmd_id(vb);
> > > > > > +   walk_free_mem_block(vb, 0, _balloon_send_free_pages);
> > > > > > +   /*
> > > > > > +    * The last few free page blocks that were added may not reach the
> > > > > > +    * batch size, but need a kick to notify the device to handle them.
> > > > > > +    */
> > > > > > +   virtqueue_kick(vb->free_page_vq);
> > > > > > +   report_free_page_end(vb);
> > > > > > +}
> > > > > +
> > > > I think there's an issue here: if pages are poisoned and
> > > > hypervisor subsequently drops them, testing them after allocation
> > > > will trigger a false positive.
> > > >
> > > > The specific configuration:
> > > >
> > > > PAGE_POISONING on
> > > > PAGE_POISONING_NO_SANITY off
> > > > PAGE_POISONING_ZERO off
> > > >
> > > >
> > > > Solutions:
> > > > 1. disable the feature in that configuration
> > > > suggested as an initial step
> > >
> > > Thanks for the finding.
> > > Similar to this option: I'm thinking could we make
> > > walk_free_mem_block() simply return if that option is on?
> > > That is, at the beginning of the function:
> > > if (!page_poisoning_enabled())
> > > return;
> > >
> >
> >
> > Thought about it more, I think it would be better to put this logic to
> > virtio_balloon:
> >
> > > send_free_page_cmd_id(vb, >start_cmd_id);
> > > if (page_poisoning_enabled() &&
> > >     !IS_ENABLED(CONFIG_PAGE_POISONING_NO_SANITY))
> > >         walk_free_mem_block(vb, 0, _balloon_send_free_pages);
> > > send_free_page_cmd_id(vb, >stop_cmd_id);
> >
> >
> > walk_free_mem_block() should be a more generic API, and this potential
> > page poisoning issue is specific to live migration which is only one
> > use case of this function, so I think it is better to handle it in the
> > special use case itself.
> >
> > Best,
> > Wei
> >
> 
> It's a quick work-around but it doesn't make me very happy.
> 
> AFAIK e.g. RHEL has a debug kernel with poisoning enabled.
> If this never uses free page hinting at all, it will be much less useful for
> debugging guests.
> 

I understand your concern. I think people who use debugging guests don't regard 
performance as the first priority, and most vendors usually wouldn't use 
debugging guests for their products.

How about taking it as the initial solution? We can explore more solutions
after this series is done.

Best,
Wei





Re: [Qemu-devel] [virtio-dev] Re: [PATCH v17 6/6] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_VQ

2017-11-17 Thread Michael S. Tsirkin
On Thu, Nov 16, 2017 at 09:27:24PM +0800, Wei Wang wrote:
> On 11/16/2017 04:32 AM, Michael S. Tsirkin wrote:
> > On Fri, Nov 03, 2017 at 04:13:06PM +0800, Wei Wang wrote:
> > > Negotiation of the VIRTIO_BALLOON_F_FREE_PAGE_VQ feature indicates the
> > > support of reporting hints of guest free pages to the host via
> > > virtio-balloon. The host requests the guest to report the free pages by
> > > sending commands via the virtio-balloon configuration registers.
> > > 
> > > When the guest starts to report, the first element added to the free page
> > > vq is a sequence id of the start reporting command. The id is given by
> > > the host, and it indicates whether the following free pages correspond
> > > to the command. For example, the host may stop the report and start again
> > > with a new command id. The obsolete pages for the previous start command
> > > > can be detected by an id mismatch on the host. The id is added to the
> > > > vq using an output buffer, and the free pages are added to the vq using
> > > > an input buffer.
> > > 
> > > > Here are some explanations about the added configuration registers:
> > > - host2guest_cmd: a register used by the host to send commands to the
> > > guest.
> > > - guest2host_cmd: written by the guest to ACK to the host about the
> > > commands that have been received. The host will clear the corresponding
> > > bits on the host2guest_cmd register. The guest also uses this register
> > > > to send commands to the host (e.g. when it finishes free page reporting).
> > > - free_page_cmd_id: the sequence id of the free page report command
> > > given by the host.
> > > 
> > > Signed-off-by: Wei Wang 
> > > Signed-off-by: Liang Li 
> > > Cc: Michael S. Tsirkin 
> > > Cc: Michal Hocko 
> > > ---
> > > 
> > > +
> > > +static void report_free_page(struct work_struct *work)
> > > +{
> > > + struct virtio_balloon *vb;
> > > +
> > > + vb = container_of(work, struct virtio_balloon, report_free_page_work);
> > > + report_free_page_cmd_id(vb);
> > > + walk_free_mem_block(vb, 0, _balloon_send_free_pages);
> > > + /*
> > > +  * The last few free page blocks that were added may not reach the
> > > +  * batch size, but need a kick to notify the device to handle them.
> > > +  */
> > > + virtqueue_kick(vb->free_page_vq);
> > > + report_free_page_end(vb);
> > > +}
> > > +
> > I think there's an issue here: if pages are poisoned and hypervisor
> > subsequently drops them, testing them after allocation will
> > trigger a false positive.
> > 
> > The specific configuration:
> > 
> > PAGE_POISONING on
> > PAGE_POISONING_NO_SANITY off
> > PAGE_POISONING_ZERO off
> > 
> > 
> > Solutions:
> > 1. disable the feature in that configuration
> > suggested as an initial step
> 
> Thanks for the finding.
> Similar to this option: I'm thinking could we make walk_free_mem_block()
> simply return if that option is on?
> That is, at the beginning of the function:
> if (!page_poisoning_enabled())
> return;
> 
> I think in most usages, people would not choose to use the poisoning option
> due to the added overhead.
> 
> 
> Probably we could make it a separate fix patch of this report following
> patch 5 to explain the above reasons in the commit.
> 
> > 2. pass poison value to host so it can validate page content
> > before it drops it
> > 3. pass poison value to host so it can init allocated pages with that value
> > 
> > In fact one nice side effect would be that unmap
> > becomes safe even though free list is not locked anymore.
> 
> I haven't got this point yet. How would it bring a performance benefit?

Upon getting a free page, the host could check that its content
matches the poison value. If it doesn't, the page has been used.
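
A rough sketch of that host-side check, for illustration only. It is not code from this series: host_page_still_free() is a hypothetical helper, and it assumes the guest has passed its poison byte to the host and that the host can read the page contents directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns true when every byte of the hinted page
 * still carries the guest's poison value, i.e. the guest has not reused
 * the page since it was reported as free.
 */
static bool host_page_still_free(const unsigned char *page, size_t len,
                                 unsigned char poison)
{
    assert(page != NULL);

    for (size_t i = 0; i < len; i++) {
        if (page[i] != poison)
            return false;   /* content changed: page has been used */
    }
    return true;            /* safe to treat the page as free */
}
```

A host using something like this would only drop a hinted page when the helper returns true, and would otherwise treat the page as in use.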

But let's ignore this for now.

> > It would be interesting to see whether this last has
> > any value performance-wise.
> > 
> 
> Best,
> Wei



Re: [Qemu-devel] [virtio-dev] Re: [PATCH v17 6/6] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_VQ

2017-11-17 Thread Michael S. Tsirkin
On Fri, Nov 17, 2017 at 07:35:03PM +0800, Wei Wang wrote:
> On 11/16/2017 09:27 PM, Wei Wang wrote:
> > On 11/16/2017 04:32 AM, Michael S. Tsirkin wrote:
> > > On Fri, Nov 03, 2017 at 04:13:06PM +0800, Wei Wang wrote:
> > > > Negotiation of the VIRTIO_BALLOON_F_FREE_PAGE_VQ feature indicates the
> > > > support of reporting hints of guest free pages to the host via
> > > > virtio-balloon. The host requests the guest to report the free pages by
> > > > sending commands via the virtio-balloon configuration registers.
> > > > 
> > > > > When the guest starts to report, the first element added to the free page
> > > > > vq is a sequence id of the start reporting command. The id is given by
> > > > > the host, and it indicates whether the following free pages correspond
> > > > > to the command. For example, the host may stop the report and start again
> > > > > with a new command id. The obsolete pages for the previous start command
> > > > > can be detected by an id mismatch on the host. The id is added to the
> > > > > vq using an output buffer, and the free pages are added to the vq using
> > > > > an input buffer.
> > > > 
> > > > > Here are some explanations about the added configuration registers:
> > > > - host2guest_cmd: a register used by the host to send commands to the
> > > > guest.
> > > > - guest2host_cmd: written by the guest to ACK to the host about the
> > > > commands that have been received. The host will clear the corresponding
> > > > bits on the host2guest_cmd register. The guest also uses this register
> > > > > to send commands to the host (e.g. when it finishes free page reporting).
> > > > - free_page_cmd_id: the sequence id of the free page report command
> > > > given by the host.
> > > > 
> > > > Signed-off-by: Wei Wang 
> > > > Signed-off-by: Liang Li 
> > > > Cc: Michael S. Tsirkin 
> > > > Cc: Michal Hocko 
> > > > ---
> > > > 
> > > > +
> > > > +static void report_free_page(struct work_struct *work)
> > > > +{
> > > > > +   struct virtio_balloon *vb;
> > > > > +
> > > > > +   vb = container_of(work, struct virtio_balloon, report_free_page_work);
> > > > > +   report_free_page_cmd_id(vb);
> > > > > +   walk_free_mem_block(vb, 0, _balloon_send_free_pages);
> > > > > +   /*
> > > > > +    * The last few free page blocks that were added may not reach the
> > > > > +    * batch size, but need a kick to notify the device to handle them.
> > > > > +    */
> > > > > +   virtqueue_kick(vb->free_page_vq);
> > > > > +   report_free_page_end(vb);
> > > > > +}
> > > > +
> > > I think there's an issue here: if pages are poisoned and hypervisor
> > > subsequently drops them, testing them after allocation will
> > > trigger a false positive.
> > > 
> > > The specific configuration:
> > > 
> > > PAGE_POISONING on
> > > PAGE_POISONING_NO_SANITY off
> > > PAGE_POISONING_ZERO off
> > > 
> > > 
> > > Solutions:
> > > 1. disable the feature in that configuration
> > > suggested as an initial step
> > 
> > Thanks for the finding.
> > Similar to this option: I'm thinking could we make walk_free_mem_block()
> > simply return if that option is on?
> > That is, at the beginning of the function:
> > if (!page_poisoning_enabled())
> > return;
> > 
> 
> 
> Thought about it more, I think it would be better to put this logic to
> virtio_balloon:
> 
> send_free_page_cmd_id(vb, >start_cmd_id);
> if (page_poisoning_enabled() &&
> !IS_ENABLED(CONFIG_PAGE_POISONING_NO_SANITY))
> walk_free_mem_block(vb, 0, _balloon_send_free_pages);
> send_free_page_cmd_id(vb, >stop_cmd_id);
> 
> 
> walk_free_mem_block() should be a more generic API, and this potential page
> poisoning issue is specific to live migration which is only one use case of
> this function, so I think it is better to handle it in the special use case
> itself.
> 
> Best,
> Wei
> 

It's a quick work-around but it doesn't make me very happy.

AFAIK e.g. RHEL has a debug kernel with poisoning enabled.
If this never uses free page hinting at all, it will
be much less useful for debugging guests.

-- 
MST



Re: [Qemu-devel] [virtio-dev] Re: [PATCH v17 6/6] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_VQ

2017-11-17 Thread Wei Wang

On 11/17/2017 07:35 PM, Wei Wang wrote:

On 11/16/2017 09:27 PM, Wei Wang wrote:

On 11/16/2017 04:32 AM, Michael S. Tsirkin wrote:

On Fri, Nov 03, 2017 at 04:13:06PM +0800, Wei Wang wrote:

Negotiation of the VIRTIO_BALLOON_F_FREE_PAGE_VQ feature indicates the
support of reporting hints of guest free pages to the host via
virtio-balloon. The host requests the guest to report the free pages by
sending commands via the virtio-balloon configuration registers.

When the guest starts to report, the first element added to the free page
vq is a sequence id of the start reporting command. The id is given by
the host, and it indicates whether the following free pages correspond
to the command. For example, the host may stop the report and start again
with a new command id. The obsolete pages for the previous start command
can be detected by an id mismatch on the host. The id is added to the
vq using an output buffer, and the free pages are added to the vq using
an input buffer.

Here are some explanations about the added configuration registers:
- host2guest_cmd: a register used by the host to send commands to the
guest.
- guest2host_cmd: written by the guest to ACK to the host about the
commands that have been received. The host will clear the corresponding
bits on the host2guest_cmd register. The guest also uses this register
to send commands to the host (e.g. when it finishes free page reporting).
- free_page_cmd_id: the sequence id of the free page report command
given by the host.

Signed-off-by: Wei Wang 
Signed-off-by: Liang Li 
Cc: Michael S. Tsirkin 
Cc: Michal Hocko 
---

+
+static void report_free_page(struct work_struct *work)
+{
+   struct virtio_balloon *vb;
+
+   vb = container_of(work, struct virtio_balloon, report_free_page_work);
+   report_free_page_cmd_id(vb);
+   walk_free_mem_block(vb, 0, _balloon_send_free_pages);
+   /*
+    * The last few free page blocks that were added may not reach the
+    * batch size, but need a kick to notify the device to handle them.
+    */
+   virtqueue_kick(vb->free_page_vq);
+   report_free_page_end(vb);
+}
+

I think there's an issue here: if pages are poisoned and hypervisor
subsequently drops them, testing them after allocation will
trigger a false positive.

The specific configuration:

PAGE_POISONING on
PAGE_POISONING_NO_SANITY off
PAGE_POISONING_ZERO off


Solutions:
1. disable the feature in that configuration
suggested as an initial step


Thanks for the finding.
Similar to this option: I'm thinking could we make 
walk_free_mem_block() simply return if that option is on?

That is, at the beginning of the function:
if (!page_poisoning_enabled())
return;




Thought about it more, I think it would be better to put this logic to 
virtio_balloon:


send_free_page_cmd_id(vb, >start_cmd_id);
if (page_poisoning_enabled() &&
    !IS_ENABLED(CONFIG_PAGE_POISONING_NO_SANITY))
        walk_free_mem_block(vb, 0, _balloon_send_free_pages);


The logic should be inverted:
if (!(page_poisoning_enabled() &&
      !IS_ENABLED(CONFIG_PAGE_POISONING_NO_SANITY)))
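
For reference, the inverted condition can be checked against all four input combinations with a small standalone model. This is only a sketch: should_walk_free_pages() and check_truth_table() are hypothetical names, and the two booleans stand in for page_poisoning_enabled() and IS_ENABLED(CONFIG_PAGE_POISONING_NO_SANITY), which are not available outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of the inverted check. poisoning_enabled stands in for
 * page_poisoning_enabled(); no_sanity stands in for
 * IS_ENABLED(CONFIG_PAGE_POISONING_NO_SANITY). Both are assumptions
 * made so the logic can be exercised in user space.
 */
static bool should_walk_free_pages(bool poisoning_enabled, bool no_sanity)
{
    /* Skip reporting only when poisoning is on and pages are sanity-checked. */
    return !(poisoning_enabled && !no_sanity);
}

/* Exercise every combination of the two inputs. */
static void check_truth_table(void)
{
    assert(should_walk_free_pages(false, false));  /* no poisoning */
    assert(should_walk_free_pages(false, true));   /* no poisoning */
    assert(should_walk_free_pages(true, true));    /* poisoned, never checked */
    assert(!should_walk_free_pages(true, false));  /* poisoned and checked */
}
```

Only the last case, poisoning on with sanity checking, suppresses the walk, which matches the configuration flagged earlier in the thread.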

Best,
Wei




Re: [Qemu-devel] [virtio-dev] Re: [PATCH v17 6/6] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_VQ

2017-11-17 Thread Wei Wang

On 11/16/2017 09:27 PM, Wei Wang wrote:

On 11/16/2017 04:32 AM, Michael S. Tsirkin wrote:

On Fri, Nov 03, 2017 at 04:13:06PM +0800, Wei Wang wrote:

Negotiation of the VIRTIO_BALLOON_F_FREE_PAGE_VQ feature indicates the
support of reporting hints of guest free pages to the host via
virtio-balloon. The host requests the guest to report the free pages by
sending commands via the virtio-balloon configuration registers.

When the guest starts to report, the first element added to the free page
vq is a sequence id of the start reporting command. The id is given by
the host, and it indicates whether the following free pages correspond
to the command. For example, the host may stop the report and start again
with a new command id. The obsolete pages for the previous start command
can be detected by an id mismatch on the host. The id is added to the
vq using an output buffer, and the free pages are added to the vq using
an input buffer.

Here are some explanations about the added configuration registers:
- host2guest_cmd: a register used by the host to send commands to the
guest.
- guest2host_cmd: written by the guest to ACK to the host about the
commands that have been received. The host will clear the corresponding
bits on the host2guest_cmd register. The guest also uses this register
to send commands to the host (e.g. when it finishes free page reporting).
- free_page_cmd_id: the sequence id of the free page report command
given by the host.

Signed-off-by: Wei Wang 
Signed-off-by: Liang Li 
Cc: Michael S. Tsirkin 
Cc: Michal Hocko 
---

+
+static void report_free_page(struct work_struct *work)
+{
+   struct virtio_balloon *vb;
+
+   vb = container_of(work, struct virtio_balloon, report_free_page_work);
+   report_free_page_cmd_id(vb);
+   walk_free_mem_block(vb, 0, _balloon_send_free_pages);
+   /*
+    * The last few free page blocks that were added may not reach the
+    * batch size, but need a kick to notify the device to handle them.
+    */
+   virtqueue_kick(vb->free_page_vq);
+   report_free_page_end(vb);
+}
+

I think there's an issue here: if pages are poisoned and hypervisor
subsequently drops them, testing them after allocation will
trigger a false positive.

The specific configuration:

PAGE_POISONING on
PAGE_POISONING_NO_SANITY off
PAGE_POISONING_ZERO off


Solutions:
1. disable the feature in that configuration
suggested as an initial step


Thanks for the finding.
Similar to this option: I'm thinking could we make 
walk_free_mem_block() simply return if that option is on?

That is, at the beginning of the function:
if (!page_poisoning_enabled())
return;




Thought about it more, I think it would be better to put this logic to 
virtio_balloon:


send_free_page_cmd_id(vb, >start_cmd_id);
if (page_poisoning_enabled() &&
    !IS_ENABLED(CONFIG_PAGE_POISONING_NO_SANITY))
        walk_free_mem_block(vb, 0, _balloon_send_free_pages);
send_free_page_cmd_id(vb, >stop_cmd_id);


walk_free_mem_block() should be a more generic API, and this potential 
page poisoning issue is specific to live migration which is only one use 
case of this function, so I think it is better to handle it in the 
special use case itself.


Best,
Wei






Re: [Qemu-devel] [virtio-dev] Re: [PATCH v17 6/6] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_VQ

2017-11-16 Thread Wei Wang

On 11/16/2017 04:32 AM, Michael S. Tsirkin wrote:

On Fri, Nov 03, 2017 at 04:13:06PM +0800, Wei Wang wrote:

Negotiation of the VIRTIO_BALLOON_F_FREE_PAGE_VQ feature indicates the
support of reporting hints of guest free pages to the host via
virtio-balloon. The host requests the guest to report the free pages by
sending commands via the virtio-balloon configuration registers.

When the guest starts to report, the first element added to the free page
vq is a sequence id of the start reporting command. The id is given by
the host, and it indicates whether the following free pages correspond
to the command. For example, the host may stop the report and start again
with a new command id. The obsolete pages for the previous start command
can be detected by an id mismatch on the host. The id is added to the
vq using an output buffer, and the free pages are added to the vq using
an input buffer.
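
To illustrate the id matching described above, here is a minimal host-side sketch. It is not code from this patch: struct free_page_session and the session_* helpers are hypothetical names, and it assumes the host simply remembers the id of its latest start command and compares it with the id the guest placed at the head of the vq.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side bookkeeping; not taken from the patch. */
struct free_page_session {
    uint32_t current_cmd_id;   /* id of the host's most recent start command */
    uint32_t reported_cmd_id;  /* id the guest sent as the first vq element */
    size_t accepted;           /* hints accepted for the current command */
    size_t dropped;            /* obsolete hints that were ignored */
};

/* Host issues a new start command; reports for older ids become obsolete. */
static void session_start(struct free_page_session *s, uint32_t cmd_id)
{
    assert(s != NULL);
    s->current_cmd_id = cmd_id;
    s->accepted = 0;
}

/* First element of a report: the id of the command the guest is answering. */
static void session_recv_cmd_id(struct free_page_session *s, uint32_t cmd_id)
{
    s->reported_cmd_id = cmd_id;
}

/* A free page hint arrives; returns true if the host may act on it. */
static bool session_recv_hint(struct free_page_session *s)
{
    if (s->reported_cmd_id != s->current_cmd_id) {
        s->dropped++;          /* id mismatch: hint belongs to an old command */
        return false;
    }
    s->accepted++;
    return true;
}
```

With this model, hints that arrive after the host restarts with a new id are counted as dropped until the guest reports the new id.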

Here are some explanations about the added configuration registers:
- host2guest_cmd: a register used by the host to send commands to the
guest.
- guest2host_cmd: written by the guest to ACK to the host about the
commands that have been received. The host will clear the corresponding
bits on the host2guest_cmd register. The guest also uses this register
to send commands to the host (e.g. when it finishes free page reporting).
- free_page_cmd_id: the sequence id of the free page report command
given by the host.

Signed-off-by: Wei Wang 
Signed-off-by: Liang Li 
Cc: Michael S. Tsirkin 
Cc: Michal Hocko 
---

+
+static void report_free_page(struct work_struct *work)
+{
+   struct virtio_balloon *vb;
+
+   vb = container_of(work, struct virtio_balloon, report_free_page_work);
+   report_free_page_cmd_id(vb);
+   walk_free_mem_block(vb, 0, _balloon_send_free_pages);
+   /*
+    * The last few free page blocks that were added may not reach the
+    * batch size, but need a kick to notify the device to handle them.
+    */
+   virtqueue_kick(vb->free_page_vq);
+   report_free_page_end(vb);
+}
+

I think there's an issue here: if pages are poisoned and hypervisor
subsequently drops them, testing them after allocation will
trigger a false positive.

The specific configuration:

PAGE_POISONING on
PAGE_POISONING_NO_SANITY off
PAGE_POISONING_ZERO off
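
The failure in that configuration can be reproduced in a small user-space model. This is only a sketch: the helper names and SKETCH_* constants are hypothetical, 0xaa is used because it is the kernel's PAGE_POISON value when PAGE_POISONING_ZERO is off, and the model assumes a dropped page reads back as zeros on the destination.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096
#define SKETCH_POISON    0xaa   /* non-zero poison, as with PAGE_POISONING_ZERO off */

/* Guest frees a page: fill it with the poison pattern. */
static void guest_poison_page(unsigned char *page)
{
    assert(page != NULL);
    memset(page, SKETCH_POISON, SKETCH_PAGE_SIZE);
}

/*
 * Host treats the hinted page as free and does not transfer it.
 * Assumption: the destination then sees a zero-filled page.
 */
static void host_drop_page(unsigned char *page)
{
    memset(page, 0, SKETCH_PAGE_SIZE);
}

/* Guest allocation-time sanity check (active when NO_SANITY is off). */
static bool guest_poison_intact(const unsigned char *page)
{
    for (size_t i = 0; i < SKETCH_PAGE_SIZE; i++) {
        if (page[i] != SKETCH_POISON)
            return false;   /* would be reported as corruption */
    }
    return true;
}
```

Calling guest_poison_page(), then host_drop_page(), then guest_poison_intact() on the same buffer makes the check fail even though the guest never wrote to the page, which is the false positive described above.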


Solutions:
1. disable the feature in that configuration
suggested as an initial step


Thanks for the finding.
Similar to this option: I'm thinking could we make walk_free_mem_block() 
simply return if that option is on?

That is, at the beginning of the function:
if (!page_poisoning_enabled())
return;

I think in most usages, people would not choose to use the poisoning 
option due to the added overhead.



Probably we could make it a separate fix patch of this report following 
patch 5 to explain the above reasons in the commit.



2. pass poison value to host so it can validate page content
before it drops it
3. pass poison value to host so it can init allocated pages with that value

In fact one nice side effect would be that unmap
becomes safe even though free list is not locked anymore.


I haven't got this point yet. How would it bring a performance benefit?


It would be interesting to see whether this last has
any value performance-wise.



Best,
Wei