[PATCH] mm: don't zero ballooned pages

2017-07-30 Thread Wei Wang
Ballooned pages will be marked MADV_DONTNEED by the hypervisor and
shouldn't be given to the host ksmd to scan. Therefore, it is not
necessary to zero ballooned pages, which is very time-consuming when the
number of pages is large. Ongoing fast-balloon tests show that the time
to balloon 7G of pages increases from ~491ms to ~2.8s with __GFP_ZERO
added. So, this patch removes the flag.

Signed-off-by: Wei Wang 
---
 mm/balloon_compaction.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
index 9075aa5..b06d9fe 100644
--- a/mm/balloon_compaction.c
+++ b/mm/balloon_compaction.c
@@ -24,7 +24,7 @@ struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info)
 {
 	unsigned long flags;
 	struct page *page = alloc_page(balloon_mapping_gfp_mask() |
-			__GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_ZERO);
+			__GFP_NOMEMALLOC | __GFP_NORETRY);
 	if (!page)
 		return NULL;
 
-- 
2.7.4

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v12 5/8] virtio-balloon: VIRTIO_BALLOON_F_SG

2017-07-30 Thread Michael S. Tsirkin
On Sun, Jul 30, 2017 at 05:59:17AM +, Wang, Wei W wrote:
> On Sunday, July 30, 2017 12:23 PM, Michael S. Tsirkin wrote:
> > On Sat, Jul 29, 2017 at 08:47:08PM +0800, Wei Wang wrote:
> > > On 07/29/2017 07:08 AM, Michael S. Tsirkin wrote:
> > > > On Thu, Jul 27, 2017 at 10:50:11AM +0800, Wei Wang wrote:
> > > > > > > > OK I thought this over. While we might need these new APIs
> > > > > > > > in the future, I think that at the moment, there's a way to
> > > > > > > > implement this feature that is significantly simpler. Just
> > > > > > > > add each s/g as a separate input buffer.
> > > > > > > Should it be an output buffer?
> > > > > > Hypervisor overwrites these pages with zeroes. Therefore it is
> > > > > > writeable by device: DMA_FROM_DEVICE.
> > > > > Why would the hypervisor need to zero the buffer?
> > > > The page is supplied to hypervisor and can lose the value that is
> > > > there.  That is the definition of writeable by device.
> > >
> > > I think for the free pages, it should be clear that they will be added
> > > as output buffer to the device, because (as we discussed) they are
> > > just hints, and some of them may be used by the guest after the report_
> > > API is invoked.
> > > The device/hypervisor should not use or discard them.
> > 
> > Discarding contents is exactly what you propose doing if migration is
> > going on, isn't it?
> 
> That's actually a different concept. Please let me explain it with this
> example:
> 
> The hypervisor receives a hint saying the guest's PageX is a free page,
> but after the report_ API exits, the guest kernel may take PageX back into
> use, so PageX is no longer a free page. At that point, if the hypervisor
> writes to the page, it would crash the guest. So, I think the cornerstone
> of this work is that the hypervisor should not touch the reported pages.
> 
> Best,
> Wei

That's a hypervisor implementation detail. From the guest's point of view,
discarding contents cannot be distinguished from writing old contents.


Re: [PATCH net] Revert "vhost: cache used event for better performance"

2017-07-30 Thread K. Den
On Wed, 2017-07-26 at 19:08 +0300, Michael S. Tsirkin wrote:
> On Wed, Jul 26, 2017 at 09:37:15PM +0800, Jason Wang wrote:
> > 
> > 
> > On 2017年07月26日 21:18, Jason Wang wrote:
> > > 
> > > 
> > > On 2017年07月26日 20:57, Michael S. Tsirkin wrote:
> > > > On Wed, Jul 26, 2017 at 04:03:17PM +0800, Jason Wang wrote:
> > > > > This reverts commit 809ecb9bca6a9424ccd392d67e368160f8b76c92, since
> > > > > it was reported to break vhost_net. We want to cache the used event
> > > > > and use it to check for notification. We try to validate the cached
> > > > > used event by checking whether or not it is ahead of new, but this is
> > > > > not correct all the time: it could be stale and there's no way to know.
> > > > > 
> > > > > Signed-off-by: Jason Wang
> > > > 
> > > > Could you supply a bit more data here please?  How does it get stale?
> > > > What does guest need to do to make it stale?  This will be helpful if
> > > > anyone wants to bring it back, or if we want to extend the protocol.
> > > > 
> > > 
> > > The problem is that we don't know whether or not the guest has published
> > > a new used event. The check vring_need_event(vq->last_used_event,
> > > new + vq->num, new) is not sufficient for this.
> > > 
> > > Thanks
> > 
> > One more note: the previous assumption was that the used event does not
> > move back, but this can in fact happen if the idx wraps around.
> 
> You mean if the 16 bit index wraps around after 64K entries.
> Makes sense.
> 
> > Will repost and add this to the commit log.
> > 
> > Thanks

Hi,

I am just curious, but I have a question:
AFAIU, if you wanted to keep the caching mechanism alive in the code base,
would the following two changes resolve the issue?
(1) Always fetch the latest event value from the guest when the signalled_used
event is invalid, which includes the case where last_used_idx wraps around.
Otherwise we might need changes that would overly complicate the logic that
decides whether or not to skip signalling in the next vhost_notify round.
(2) On top of that, split the signal-postponing logic into three cases:
* if vq.num is in the interval [2^16, UINT_MAX]:
any cached event is in the should-postpone-signalling interval, so
paradoxically we must always signal.
* else if vq.num is in [2^15, 2^16):
the logic in the original patch (809ecb9bca6a9) suffices.
* else (vq.num < 2^15) (optional):
checking only vring_need_event(vq->last_used_event, new + vq->num, new)
would suffice.

Am I missing something, or is this irrelevant?
I would appreciate it if you could elaborate a bit more on how the situation
where the event idx wraps around and moves back causes trouble.

Thanks.
