Re: [PATCH] mm: don't zero ballooned pages

2017-10-26 Thread ZhenweiPi

On 07/31/2017 03:51 PM, Michal Hocko wrote:


On Mon 31-07-17 15:41:49, Wei Wang wrote:

>On 07/31/2017 02:55 PM, Michal Hocko wrote:

> >On Mon 31-07-17 12:13:33, Wei Wang wrote:

> >>Ballooned pages will be marked as MADV_DONTNEED by the hypervisor and
> >>shouldn't be given to the host ksmd to scan.

> >Could you point me where this MADV_DONTNEED is done, please?

>
>Sure. It's done in the hypervisor when the balloon pages are received.
>
>Please see line 40 at
>https://github.com/qemu/qemu/blob/master/hw/virtio/virtio-balloon.c

>And one more thing. I am not familiar with ksm much. But how is
>MADV_DONTNEED even helping? This madvise is not sticky - aka it will
>unmap the range without leaving any note behind. AFAICS the only way
>to have vma scanned is to have VM_MERGEABLE and that is an opt in:
>See Documentation/vm/ksm.txt
>"
>KSM only operates on those areas of address space which an application
>has advised to be likely candidates for merging, by using the madvise(2)
>system call: int madvise(addr, length, MADV_MERGEABLE).
>"
>
>So what exactly is going on here? The original patch looks highly
>suspicious as well. If somebody wants to make that memory mergable then
>the user of that memory should zero them out.


The kernel starts a kthread named "ksmd". ksmd scans VM_MERGEABLE
memory and merges identical pages (identical meaning memcmp(page1,
page2, PAGESIZE) == 0).

The guest cannot use ballooned pages, so these pages will not be accessed
for a long time. kswapd on the host will swap these pages out to free
more memory.

KSM performs better than swapping. Currently, pages in the balloon
device hold arbitrary values, so they usually cannot be merged.
Enqueueing zeroed pages resolves this problem.

Because MADV_DONTNEED depends on host OS capability and hypervisor capability,
I prefer to enqueue zero pages to the balloon device, which is why I made
this patch.
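
To make the merge criterion above concrete, here is a minimal userspace
sketch. It is not ksmd's actual code; EXAMPLE_PAGE_SIZE, pages_identical()
and alloc_example_page() are hypothetical names used only for illustration,
and a 4096-byte page is assumed. It shows that two zeroed buffers compare
equal under memcmp(), while buffers holding different leftover data do not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define EXAMPLE_PAGE_SIZE 4096 /* assumed page size, for illustration only */

/* Returns true when two page-sized buffers hold identical bytes. */
static bool pages_identical(const void *a, const void *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, EXAMPLE_PAGE_SIZE) == 0;
}

/*
 * Allocates one page-sized buffer. If 'zero' is set the buffer is zeroed;
 * otherwise it is filled with 'pattern', standing in for stale data.
 * Returns NULL if the allocation fails.
 */
static unsigned char *alloc_example_page(bool zero, unsigned char pattern)
{
    unsigned char *p = malloc(EXAMPLE_PAGE_SIZE);

    if (p != NULL)
        memset(p, zero ? 0 : pattern, EXAMPLE_PAGE_SIZE);
    return p;
}
```

Two buffers from alloc_example_page(true, 0) satisfy pages_identical();
buffers filled with different patterns do not, which is the situation
described above for balloon pages with leftover contents.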

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] mm: don't zero ballooned pages

2017-10-26 Thread ZhenweiPi

In upstream QEMU, the code in qemu/util/osdep.c is:

int qemu_madvise(void *addr, size_t len, int advice)
{
    if (advice == QEMU_MADV_INVALID) {
        errno = EINVAL;
        return -1;
    }
#if defined(CONFIG_MADVISE)
    return madvise(addr, len, advice);
#elif defined(CONFIG_POSIX_MADVISE)
    return posix_madvise(addr, len, advice);
#else
    errno = EINVAL;
    return -1;
#endif
}

The host OS may not support MADV_DONTNEED, and the madvise syscall
takes additional time.
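
For reference, a small sketch of what MADV_DONTNEED does on a Linux host,
assuming a private anonymous mapping: the dirtied range is dropped, and the
next access sees zero-filled memory. first_byte_after_dontneed() is a
hypothetical helper written for this illustration; it is not taken from
QEMU, and other operating systems may behave differently.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Dirties an anonymous private mapping, drops it with MADV_DONTNEED, and
 * returns the first byte read back afterwards, or -1 on failure.
 * Linux-specific behaviour is assumed here.
 */
static int first_byte_after_dontneed(size_t len)
{
    assert(len > 0);

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAA, len); /* make the pages dirty */

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    int value = p[0]; /* faults in a fresh page */
    munmap(p, len);
    return value;
}
```

On Linux the returned value is 0 for such a mapping, so the old contents
of a page dropped this way are no longer visible to the process.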


On 07/31/2017 02:55 PM, Michal Hocko wrote:
> On Mon 31-07-17 12:13:33, Wei Wang wrote:
> > Ballooned pages will be marked as MADV_DONTNEED by the hypervisor and
> > shouldn't be given to the host ksmd to scan.
>
> Could you point me where this MADV_DONTNEED is done, please?
>
> > Therefore, it is not
> > necessary to zero ballooned pages, which is very time consuming when
> > the page amount is large. The ongoing fast balloon tests show that the
> > time to balloon 7G pages is increased from ~491ms to 2.8 seconds with
> > __GFP_ZERO added. So, this patch removes the flag.
>
> Please make it obvious that this is a revert of bb01b64cfab7
> ("mm/balloon_compaction.c: enqueue zero page to balloon device").
>
> > Signed-off-by: Wei Wang
> > ---
> >  mm/balloon_compaction.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
> > index 9075aa5..b06d9fe 100644
> > --- a/mm/balloon_compaction.c
> > +++ b/mm/balloon_compaction.c
> > @@ -24,7 +24,7 @@ struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info)
> >  {
> >  	unsigned long flags;
> >  	struct page *page = alloc_page(balloon_mapping_gfp_mask() |
> > -					__GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_ZERO);
> > +				       __GFP_NOMEMALLOC | __GFP_NORETRY);
> >  	if (!page)
> >  		return NULL;
> >
> > --
> > 2.7.4



Re: [PATCH] mm: don't zero ballooned pages

2017-08-01 Thread Michael S. Tsirkin
On Mon, Jul 31, 2017 at 10:37:24AM +0200, Michal Hocko wrote:
> On Mon 31-07-17 16:23:26, ZhenweiPi wrote:
> > On 07/31/2017 03:51 PM, Michal Hocko wrote:
> > 
> > >On Mon 31-07-17 15:41:49, Wei Wang wrote:
> > >>>On 07/31/2017 02:55 PM, Michal Hocko wrote:
> >  >On Mon 31-07-17 12:13:33, Wei Wang wrote:
> > > >>Ballooned pages will be marked as MADV_DONTNEED by the hypervisor and
> > > >>shouldn't be given to the host ksmd to scan.
> >  >Could you point me where this MADV_DONTNEED is done, please?
> > >>>
> > >>>Sure. It's done in the hypervisor when the balloon pages are received.
> > >>>
> > >>>Please see line 40 at
> > >>>https://github.com/qemu/qemu/blob/master/hw/virtio/virtio-balloon.c
> > >And one more thing. I am not familiar with ksm much. But how is
> > >MADV_DONTNEED even helping? This madvise is not sticky - aka it will
> > >unmap the range without leaving any note behind. AFAICS the only way
> > >to have vma scanned is to have VM_MERGEABLE and that is an opt in:
> > >See Documentation/vm/ksm.txt
> > >"
> > >KSM only operates on those areas of address space which an application
> > >has advised to be likely candidates for merging, by using the madvise(2)
> > >system call: int madvise(addr, length, MADV_MERGEABLE).
> > >"
> > >
> > >So what exactly is going on here? The original patch looks highly
> > >suspicious as well. If somebody wants to make that memory mergable then
> > >the user of that memory should zero them out.
> > 
> > The kernel starts a kthread named "ksmd". ksmd scans VM_MERGEABLE
> > memory and merges identical pages (identical meaning memcmp(page1,
> > page2, PAGESIZE) == 0).
> > 
> > The guest cannot use ballooned pages, so these pages will not be accessed
> > for a long time. kswapd on the host will swap these pages out to free
> > more memory.
> > 
> > KSM performs better than swapping. Currently, pages in the balloon
> > device hold arbitrary values, so they usually cannot be merged.
> > Enqueueing zeroed pages resolves this problem.
> > 
> > Because MADV_DONTNEED depends on host OS capability and hypervisor capability,
> > I prefer to enqueue zero pages to the balloon device, which is why I made
> > this patch.

I think you should have the hypervisor zero them out if it wants to, then.
That seems cleaner.

> 
> So why exactly are we zeroing pages (and paying some cost for that) in
> the guest when we do not know what the host actually does with them?

I suspect this is some special hypervisor that somehow benefits from
this patch. It should just use a feature bit for its special needs
I think.

Michal is also exactly right that patches like this should come
with some performance numbers.
I'll post a patch adding virtio lists for mm/balloon_compaction.c
so that we notice when people tweak it like that.
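
A rough sketch of what gating the zeroing on a feature bit could look like.
Everything here is hypothetical: EXAMPLE_BALLOON_F_ZERO_PAGES is not an
existing virtio-balloon feature, its bit number is arbitrary, and the
EXAMPLE_GFP_* values merely stand in for the real allocation flags. The
point is only that the guest would pay for zeroing when the host has asked
for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit; not part of the real virtio-balloon device. */
#define EXAMPLE_BALLOON_F_ZERO_PAGES 20

/* Stand-ins for allocation flags; the values are arbitrary. */
#define EXAMPLE_GFP_BASE 0x1u
#define EXAMPLE_GFP_ZERO 0x2u

/* Returns true if 'bit' is set in the negotiated feature mask. */
static bool has_feature(uint64_t negotiated, unsigned int bit)
{
    assert(bit < 64);
    return ((negotiated >> bit) & 1u) != 0;
}

/* Requests zeroed pages only when the host negotiated the feature. */
static unsigned int balloon_alloc_flags(uint64_t negotiated)
{
    unsigned int flags = EXAMPLE_GFP_BASE;

    if (has_feature(negotiated, EXAMPLE_BALLOON_F_ZERO_PAGES))
        flags |= EXAMPLE_GFP_ZERO;
    return flags;
}
```

With no features negotiated, balloon_alloc_flags() returns only the base
flags, so hosts that do not need zeroed pages would not impose the cost on
the guest.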

> -- 
> Michal Hocko
> SUSE Labs


Re: [PATCH] mm: don't zero ballooned pages

2017-07-31 Thread Michal Hocko
On Mon 31-07-17 16:23:26, ZhenweiPi wrote:
> On 07/31/2017 03:51 PM, Michal Hocko wrote:
> 
> >On Mon 31-07-17 15:41:49, Wei Wang wrote:
> >>>On 07/31/2017 02:55 PM, Michal Hocko wrote:
>  >On Mon 31-07-17 12:13:33, Wei Wang wrote:
> > >>Ballooned pages will be marked as MADV_DONTNEED by the hypervisor and
> > >>shouldn't be given to the host ksmd to scan.
>  >Could you point me where this MADV_DONTNEED is done, please?
> >>>
> >>>Sure. It's done in the hypervisor when the balloon pages are received.
> >>>
> >>>Please see line 40 at
> >>>https://github.com/qemu/qemu/blob/master/hw/virtio/virtio-balloon.c
> >And one more thing. I am not familiar with ksm much. But how is
> >MADV_DONTNEED even helping? This madvise is not sticky - aka it will
> >unmap the range without leaving any note behind. AFAICS the only way
> >to have vma scanned is to have VM_MERGEABLE and that is an opt in:
> >See Documentation/vm/ksm.txt
> >"
> >KSM only operates on those areas of address space which an application
> >has advised to be likely candidates for merging, by using the madvise(2)
> >system call: int madvise(addr, length, MADV_MERGEABLE).
> >"
> >
> >So what exactly is going on here? The original patch looks highly
> >suspicious as well. If somebody wants to make that memory mergable then
> >the user of that memory should zero them out.
> 
> The kernel starts a kthread named "ksmd". ksmd scans VM_MERGEABLE
> memory and merges identical pages (identical meaning memcmp(page1,
> page2, PAGESIZE) == 0).
> 
> The guest cannot use ballooned pages, so these pages will not be accessed
> for a long time. kswapd on the host will swap these pages out to free
> more memory.
> 
> KSM performs better than swapping. Currently, pages in the balloon
> device hold arbitrary values, so they usually cannot be merged.
> Enqueueing zeroed pages resolves this problem.
> 
> Because MADV_DONTNEED depends on host OS capability and hypervisor capability,
> I prefer to enqueue zero pages to the balloon device, which is why I made
> this patch.

So why exactly are we zeroing pages (and paying some cost for that) in
the guest when we do not know what the host actually does with them?
-- 
Michal Hocko
SUSE Labs


Re: [PATCH] mm: don't zero ballooned pages

2017-07-31 Thread Michal Hocko
On Mon 31-07-17 15:41:49, Wei Wang wrote:
> On 07/31/2017 02:55 PM, Michal Hocko wrote:
> >On Mon 31-07-17 12:13:33, Wei Wang wrote:
> >>Ballooned pages will be marked as MADV_DONTNEED by the hypervisor and
> >>shouldn't be given to the host ksmd to scan.
> >Could you point me where this MADV_DONTNEED is done, please?
> 
> Sure. It's done in the hypervisor when the balloon pages are received.
> 
> Please see line 40 at
> https://github.com/qemu/qemu/blob/master/hw/virtio/virtio-balloon.c

Thanks. Are all hypervisors which are using this API doing this?
bb01b64cfab7 doesn't mention the specific hypervisor, nor does it mention
any real numbers, so I suspect the revert is the right thing to do, but
the changelog should mention all those details.
-- 
Michal Hocko
SUSE Labs


Re: [PATCH] mm: don't zero ballooned pages

2017-07-31 Thread Wei Wang

On 07/31/2017 02:55 PM, Michal Hocko wrote:
> On Mon 31-07-17 12:13:33, Wei Wang wrote:
> > Ballooned pages will be marked as MADV_DONTNEED by the hypervisor and
> > shouldn't be given to the host ksmd to scan.
>
> Could you point me where this MADV_DONTNEED is done, please?

Sure. It's done in the hypervisor when the balloon pages are received.

Please see line 40 at
https://github.com/qemu/qemu/blob/master/hw/virtio/virtio-balloon.c

> > Therefore, it is not
> > necessary to zero ballooned pages, which is very time consuming when
> > the page amount is large. The ongoing fast balloon tests show that the
> > time to balloon 7G pages is increased from ~491ms to 2.8 seconds with
> > __GFP_ZERO added. So, this patch removes the flag.
>
> Please make it obvious that this is a revert of bb01b64cfab7
> ("mm/balloon_compaction.c: enqueue zero page to balloon device").

Ok, will do.

Best,
Wei


Re: [PATCH] mm: don't zero ballooned pages

2017-07-31 Thread Michal Hocko
On Mon 31-07-17 12:13:33, Wei Wang wrote:
> Ballooned pages will be marked as MADV_DONTNEED by the hypervisor and
> shouldn't be given to the host ksmd to scan.

Could you point me where this MADV_DONTNEED is done, please?

> Therefore, it is not
> necessary to zero ballooned pages, which is very time consuming when
> the page amount is large. The ongoing fast balloon tests show that the
> time to balloon 7G pages is increased from ~491ms to 2.8 seconds with
> __GFP_ZERO added. So, this patch removes the flag.

Please make it obvious that this is a revert of bb01b64cfab7
("mm/balloon_compaction.c: enqueue zero page to balloon device").

> Signed-off-by: Wei Wang 
> ---
>  mm/balloon_compaction.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
> index 9075aa5..b06d9fe 100644
> --- a/mm/balloon_compaction.c
> +++ b/mm/balloon_compaction.c
> @@ -24,7 +24,7 @@ struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info)
>  {
>   unsigned long flags;
>   struct page *page = alloc_page(balloon_mapping_gfp_mask() |
> - __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_ZERO);
> +__GFP_NOMEMALLOC | __GFP_NORETRY);
>   if (!page)
>   return NULL;
>  
> -- 
> 2.7.4
> 

-- 
Michal Hocko
SUSE Labs


[PATCH] mm: don't zero ballooned pages

2017-07-30 Thread Wei Wang
Ballooned pages will be marked as MADV_DONTNEED by the hypervisor and
shouldn't be given to the host ksmd to scan. Therefore, it is not
necessary to zero ballooned pages, which is very time consuming when
the page amount is large. The ongoing fast balloon tests show that the
time to balloon 7G pages is increased from ~491ms to 2.8 seconds with
__GFP_ZERO added. So, this patch removes the flag.

Signed-off-by: Wei Wang 
---
 mm/balloon_compaction.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
index 9075aa5..b06d9fe 100644
--- a/mm/balloon_compaction.c
+++ b/mm/balloon_compaction.c
@@ -24,7 +24,7 @@ struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info)
 {
 	unsigned long flags;
 	struct page *page = alloc_page(balloon_mapping_gfp_mask() |
-					__GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_ZERO);
+				       __GFP_NOMEMALLOC | __GFP_NORETRY);
 	if (!page)
 		return NULL;
 
-- 
2.7.4
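
As a back-of-the-envelope reading of the numbers in the changelog, the
sketch below works out the added cost per page. It assumes that "7G" means
7 GiB of guest memory and that pages are 4 KiB; neither is stated in the
patch, and page_count() and extra_ns_per_page() are hypothetical helpers
written only for this calculation.

```c
#include <assert.h>
#include <stdint.h>

/* Number of whole pages in 'bytes' of memory for a given page size. */
static uint64_t page_count(uint64_t bytes, uint64_t page_size)
{
    assert(page_size > 0);
    return bytes / page_size;
}

/*
 * Extra time per page in nanoseconds, given the total inflation time with
 * and without zeroing (both in milliseconds) and the number of pages.
 */
static double extra_ns_per_page(double with_zero_ms, double without_zero_ms,
                                uint64_t pages)
{
    assert(pages > 0);
    return (with_zero_ms - without_zero_ms) * 1e6 / (double)pages;
}
```

Under those assumptions, page_count(7ULL << 30, 4096) gives 1835008 pages,
and extra_ns_per_page(2800.0, 491.0, 1835008) comes to roughly 1258 ns, so
zeroing adds a little over a microsecond per ballooned page.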
