[PATCH v1 18/19] zsmalloc: migrate tail pages in zspage

2016-03-10 Thread Minchan Kim
This patch enables tail page migration of zspage. At this point, I tested for zsmalloc regressions with a micro-benchmark which does zs_malloc/map/unmap/zs_free for every size class on every CPU (my system has 12) for 20 sec. It shows a 1% regression, which is really small when we consider the benefit of

[PATCH v1 15/19] zsmalloc: zs_compact refactoring

2016-03-10 Thread Minchan Kim
Currently, we rely on class->lock to prevent zspage destruction. That was okay until now because the critical section is short, but with run-time migration it could be long, so class->lock is not a good approach any more. So, this patch introduces [un]freeze_zspage functions which freeze allocated

[PATCH v1 16/19] zsmalloc: migrate head page of zspage

2016-03-10 Thread Minchan Kim
This patch introduces a run-time migration feature for zspage. To begin with, it supports only head page migration for easy review (later patches will support tail page migration). For migration, it supports three functions: * zs_page_isolate: It isolates a zspage which includes a subpage the VM wants to

[PATCH v1 19/19] zram: use __GFP_MOVABLE for memory allocation

2016-03-10 Thread Minchan Kim
Zsmalloc is ready for page migration, so zram can use __GFP_MOVABLE from now on. I ran a test to see how it helps to make higher-order pages. The test scenario is as follows. KVM guest, 1G memory, ext4 formatted zram block device, for i in `seq 1 8`; do dd if=/dev/vda1 of=mnt/test$i.txt bs=128M

[PATCH v1 17/19] zsmalloc: use single linked list for page chain

2016-03-10 Thread Minchan Kim
For tail page migration, we shouldn't use page->lru, which was used for page chaining, because the VM will use it for its own purpose, so we need another field for chaining. For chaining, a singly linked list is enough, and page->index of a tail page, which points to the first object offset in the page, could be replaced
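The chaining idea in this snippet can be illustrated outside the kernel. The following is a minimal user-space sketch, not the patch's code: `struct subpage`, `chain_subpages` and `chain_length` are hypothetical names, and a plain `next` pointer stands in for whichever `struct page` field the patch actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a zspage sub-page; not the kernel's struct page. */
struct subpage {
    struct subpage *next;          /* singly linked chain, replaces list_head usage */
    unsigned int first_obj_offset; /* offset of the first object in this sub-page */
};

/* Link n sub-pages stored in an array into one chain and return the head. */
static struct subpage *chain_subpages(struct subpage *pages, size_t n)
{
    assert(pages != NULL);
    if (n == 0)
        return NULL;
    for (size_t i = 0; i + 1 < n; i++)
        pages[i].next = &pages[i + 1];
    pages[n - 1].next = NULL;      /* the last sub-page terminates the chain */
    return &pages[0];
}

/* Walk the chain from the head and count the sub-pages. */
static size_t chain_length(const struct subpage *head)
{
    size_t len = 0;
    for (; head != NULL; head = head->next)
        len++;
    return len;
}
```

A forward-only walk is all the sketch needs, which is why one pointer per sub-page suffices here.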

[PATCH v1 12/19] zsmalloc: move struct zs_meta from mapping to freelist

2016-03-10 Thread Minchan Kim
For supporting migration from VM, we need to have address_space on every page so zsmalloc shouldn't use page->mapping. So, this patch moves zs_meta from mapping to freelist. Signed-off-by: Minchan Kim --- mm/zsmalloc.c | 23 --- 1 file changed, 12

[PATCH v1 14/19] zsmalloc: separate free_zspage from putback_zspage

2016-03-10 Thread Minchan Kim
Currently, putback_zspage frees the zspage under class->lock if fullness becomes ZS_EMPTY, but that makes it hard to implement a locking scheme for the new zspage migration. So, this patch separates free_zspage from putback_zspage and frees the zspage outside class->lock, which is preparation for zspage

[PATCH v1 11/19] zsmalloc: squeeze freelist into page->mapping

2016-03-10 Thread Minchan Kim
Zsmalloc stores the first free object's position in first_page->freelist in each zspage. If we replace the location with an object index from first_page, we could squeeze it into page->mapping because the number of bits we need to store the offset is at most 11 bits. Signed-off-by: Minchan Kim

[PATCH v1 13/19] zsmalloc: factor page chain functionality out

2016-03-10 Thread Minchan Kim
For migration, we need to create the sub-page chain of a zspage dynamically, so this patch factors it out of alloc_zspage. As a minor refactoring, it makes the OBJ_ALLOCATED_TAG assignment clearer in obj_malloc (it could be another patch, but it's trivial so I want to put it together in this patch).

[PATCH v1 10/19] zsmalloc: squeeze inuse into page->mapping

2016-03-10 Thread Minchan Kim
Currently, we store class:fullness in page->mapping. The number of classes we can support is 255 and there are 4 fullness groups, so (8 + 2 = 10 bits) is enough to represent them. Meanwhile, 11 bits are enough to store the number of in-use objects in a zspage. For example, if we assume a 64K PAGE_SIZE,
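The bit budget described here (8 bits of class, 2 bits of fullness, 11 bits of in-use count) fits in 21 bits of a single word. A minimal sketch of such packing follows; the macro names, field order and helper functions are assumptions made for illustration and are not taken from the patch.

```c
#include <assert.h>

/* Field widths from the patch description; the layout itself is hypothetical. */
#define CLASS_BITS      8
#define FULLNESS_BITS   2
#define INUSE_BITS      11

#define CLASS_SHIFT     0
#define FULLNESS_SHIFT  (CLASS_SHIFT + CLASS_BITS)
#define INUSE_SHIFT     (FULLNESS_SHIFT + FULLNESS_BITS)

#define FIELD_MASK(bits) ((1UL << (bits)) - 1)

/* Pack the three fields into one word, as one would into a pointer-sized slot. */
static unsigned long pack_meta(unsigned int class_idx, unsigned int fullness,
                               unsigned int inuse)
{
    assert(class_idx <= FIELD_MASK(CLASS_BITS));
    assert(fullness <= FIELD_MASK(FULLNESS_BITS));
    assert(inuse <= FIELD_MASK(INUSE_BITS));
    return ((unsigned long)class_idx << CLASS_SHIFT) |
           ((unsigned long)fullness << FULLNESS_SHIFT) |
           ((unsigned long)inuse << INUSE_SHIFT);
}

/* Accessors: shift the field down and mask off its neighbours. */
static unsigned int meta_class(unsigned long m)
{
    return (m >> CLASS_SHIFT) & FIELD_MASK(CLASS_BITS);
}

static unsigned int meta_fullness(unsigned long m)
{
    return (m >> FULLNESS_SHIFT) & FIELD_MASK(FULLNESS_BITS);
}

static unsigned int meta_inuse(unsigned long m)
{
    return (m >> INUSE_SHIFT) & FIELD_MASK(INUSE_BITS);
}
```

Even with every field at its maximum, the packed value stays below 2^21, so it fits in a 32-bit as well as a 64-bit word.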

[PATCH v1 02/19] mm/compaction: support non-lru movable page migration

2016-03-10 Thread Minchan Kim
We have allowed migration for only LRU pages until now, and it was enough to make high-order pages. But recently, embedded systems (e.g., webOS, Android) use lots of non-movable pages (e.g., zram, GPU memory), so we have seen several reports about trouble with small high-order allocations. For fixing

[PATCH v1 09/19] zsmalloc: keep max_object in size_class

2016-03-10 Thread Minchan Kim
Every zspage in a size_class has same number of max objects so we could move it to a size_class. Signed-off-by: Minchan Kim --- mm/zsmalloc.c | 29 ++--- 1 file changed, 14 insertions(+), 15 deletions(-) diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c

[PATCH v1 04/19] mm/balloon: use general movable page feature into balloon

2016-03-10 Thread Minchan Kim
Now, VM has a feature to migrate non-lru movable pages so balloon doesn't need custom migration hooks in migrate.c and compact.c. Instead, this patch implements page->mapping ->{isolate|migrate|putback} functions. With that, we could remove hooks for ballooning in general migration functions and

[PATCH v1 03/19] fs/anon_inodes: new interface to create new inode

2016-03-10 Thread Minchan Kim
From: Gioh Kim The anon_inodes code already has complete interfaces to create and manage many anonymous inodes but doesn't have an interface to get a new inode. Other sub-modules can create an anonymous inode without creating and mounting their own pseudo filesystem. Acked-by: Rafael Aquini

[PATCH v1 07/19] zsmalloc: reordering function parameter

2016-03-10 Thread Minchan Kim
This patch cleans up function parameter ordering to order higher data structure first. Signed-off-by: Minchan Kim --- mm/zsmalloc.c | 50 ++ 1 file changed, 26 insertions(+), 24 deletions(-) diff --git a/mm/zsmalloc.c

[PATCH v1 06/19] zsmalloc: clean up many BUG_ON

2016-03-10 Thread Minchan Kim
There are many BUG_ON in zsmalloc.c, which is not recommended, so change them to alternatives. The normal rule is as follows: 1. avoid BUG_ON if possible. Instead, use VM_BUG_ON or VM_BUG_ON_PAGE 2. use VM_BUG_ON_PAGE if we need to see struct page's fields 3. use those assertions in primitive functions

[PATCH v1 05/19] zsmalloc: use first_page rather than page

2016-03-10 Thread Minchan Kim
This patch cleans up the function parameter "struct page". Many functions of zsmalloc expect that the page parameter is "first_page", so use "first_page" rather than "page" for code readability. Signed-off-by: Minchan Kim --- mm/zsmalloc.c | 62

[PATCH v1 08/19] zsmalloc: remove unused pool param in obj_free

2016-03-10 Thread Minchan Kim
Let's remove unused pool param in obj_free Signed-off-by: Minchan Kim --- mm/zsmalloc.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index 156edf909046..b4fb11831acb 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@

[PATCH v1 01/19] mm: use put_page to free page instead of putback_lru_page

2016-03-10 Thread Minchan Kim
The procedure of page migration is as follows: First of all, it should isolate a page from the LRU and try to migrate the page. If it is successful, it releases the page for freeing. Otherwise, it should put the page back on the LRU list. For LRU pages, we have used putback_lru_page for both freeing and

[PATCH v1 00/19] Support non-lru page migration

2016-03-10 Thread Minchan Kim
Recently, I got many reports about performance degradation in embedded systems (Android mobile phones, webOS TVs and so on) and about fork failing easily. The problem was fragmentation caused by zram and GPU driver pages. Their pages cannot be migrated, so compaction cannot work well either, so reclaimer

RE: [RFC qemu 0/4] A PV solution for live migration optimization

2016-03-10 Thread Li, Liang Z
> > Hi, > I'm just catching back up on this thread; so without reference to any > particular previous mail in the thread. > > 1) How many of the free pages do we tell the host about? > Your main change is telling the host about all the > free pages. Yes, all the guest's free

Re: virtio-vsock live migration

2016-03-10 Thread Michael S. Tsirkin
On Thu, Mar 03, 2016 at 03:37:37PM +, Stefan Hajnoczi wrote: > Michael pointed out that the virtio-vsock draft specification does not > address live migration and in fact currently precludes migration. > > Migration is fundamental so the device specification at least mustn't > preclude it.

Re: [RFC -next 2/2] virtio_net: Read and use the advised MTU

2016-03-10 Thread Sergei Shtylyov
Hello. On 03/10/2016 05:28 PM, Aaron Conole wrote: This patch checks the feature bit for the VIRTIO_NET_F_MTU feature. If it exists, read the advised MTU and use it. No proper error handling is provided for the case where a user changes the negotiated MTU. A future commit will add proper

KVM Forum 2016: Call For Participation

2016-03-10 Thread Paolo Bonzini
KVM Forum 2016: Call For Participation August 24-26, 2016 - Westin Harbor Castle - Toronto, Canada (All submissions must be received before midnight May 1, 2016)

Re: [RFC -next 2/2] virtio_net: Read and use the advised MTU

2016-03-10 Thread Paolo Abeni
On Thu, 2016-03-10 at 09:28 -0500, Aaron Conole wrote: > This patch checks the feature bit for the VIRTIO_NET_F_MTU feature. If it > exists, read the advised MTU and use it. > > No proper error handling is provided for the case where a user changes the > negotiated MTU. A future commit will add

[RFC -next 0/2] virtio-net: Advised MTU feature

2016-03-10 Thread Aaron Conole
The following series adds the ability for a hypervisor to set an MTU on the guest during the feature negotiation phase. This is useful for VM orchestration when, for instance, tunneling is involved and the MTU of the various systems should be homogeneous. The first patch adds the feature bit as

[RFC -next 1/2] virtio: Start the advised MTU feature support

2016-03-10 Thread Aaron Conole
This commit adds the feature bit and associated mtu device entry for the virtio network device. Future commits will make use of these bits to support negotiated MTU. Signed-off-by: Aaron Conole --- include/uapi/linux/virtio_net.h | 3 +++ 1 file changed, 3 insertions(+)

Re: [Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization

2016-03-10 Thread Michael S. Tsirkin
On Thu, Mar 10, 2016 at 01:41:16AM +, Li, Liang Z wrote: > > > > > > Yes, we really can teach qemu to skip these pages and it's not hard. > > > > > > The problem is the poor performance, this PV solution > > > > > > > > > > Balloon is always PV. And do not call patches solutions please. > > >

Re: [RFC qemu 0/4] A PV solution for live migration optimization

2016-03-10 Thread Dr. David Alan Gilbert
Hi, I'm just catching back up on this thread; so without reference to any particular previous mail in the thread. 1) How many of the free pages do we tell the host about? Your main change is telling the host about all the free pages. If we tell the host about all the free

Re: [Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization

2016-03-10 Thread Roman Kagan
On Wed, Mar 09, 2016 at 07:39:18PM +0200, Michael S. Tsirkin wrote: > On Wed, Mar 09, 2016 at 08:04:39PM +0300, Roman Kagan wrote: > > On Wed, Mar 09, 2016 at 05:41:39PM +0200, Michael S. Tsirkin wrote: > > > On Wed, Mar 09, 2016 at 05:28:54PM +0300, Roman Kagan wrote: > > > > For (1) I've been

Re: [Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization

2016-03-10 Thread Roman Kagan
On Wed, Mar 09, 2016 at 02:38:52PM -0500, Rik van Riel wrote: > On Wed, 2016-03-09 at 20:04 +0300, Roman Kagan wrote: > > On Wed, Mar 09, 2016 at 05:41:39PM +0200, Michael S. Tsirkin wrote: > > > On Wed, Mar 09, 2016 at 05:28:54PM +0300, Roman Kagan wrote: > > > > For (1) I've been trying to make

RE: [RFC qemu 0/4] A PV solution for live migration optimization

2016-03-10 Thread Li, Liang Z
> > Could provide more information on how to use virtio-serial to exchange > data? Thread , Wiki or code are all OK. > > I have not find some useful information yet. > > See this commit in the Linux sources: > > 108fc82596e3b66b819df9d28c1ebbc9ab5de14c > > that adds a way to send guest trace