This patch enables tail page migration of zspage.
At this point, I tested zsmalloc for regressions with a micro-benchmark
which does zs_malloc/map/unmap/zs_free for every size class
on every CPU (my system has 12) for 20 seconds.
It shows a 1% regression, which is really small when we consider
the benefit of
Currently, we rely on class->lock to prevent zspage destruction.
It was okay until now because the critical section is short, but
with run-time migration it could be long, so class->lock is not
a good approach any more.
So, this patch introduces [un]freeze_zspage functions which
freeze allocated
This patch introduces run-time migration feature for zspage.
To begin with, it supports only head page migration for
ease of review (later patches will support tail page migration).
For migration, it supports three functions
* zs_page_isolate
It isolates a zspage which includes a subpage the VM wants to
Zsmalloc is ready for page migration so zram can use __GFP_MOVABLE
from now on.
I ran a test to see how it helps to form higher-order pages.
The test scenario is as follows.
KVM guest, 1G memory, ext4-formatted zram block device,
for i in `seq 1 8`;
do
dd if=/dev/vda1 of=mnt/test$i.txt bs=128M
For tail page migration, we shouldn't use page->lru, which
was used for page chaining, because the VM will use it for its own
purposes, so we need another field for chaining.
For chaining, a singly linked list is enough and page->index
of tail page to point first object offset in the page could
be replaced
For supporting migration from VM, we need to have address_space
on every page so zsmalloc shouldn't use page->mapping. So,
this patch moves zs_meta from mapping to freelist.
Signed-off-by: Minchan Kim
---
mm/zsmalloc.c | 23 ---
1 file changed, 12
Currently, putback_zspage frees the zspage under class->lock
if fullness becomes ZS_EMPTY, but that makes it hard to implement
the locking scheme for the new zspage migration.
So, this patch separates free_zspage from putback_zspage
and frees the zspage outside class->lock, which is preparation for
zspage
Zsmalloc stores the first free object's position in first_page->freelist
in each zspage. If we store the object index from first_page
instead of the location, we can squeeze it into page->mapping because
the number of bits we need to store the offset is at most 11.
Signed-off-by: Minchan Kim
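
As a rough illustration of the idea above (not the patch's actual code), the sketch below stores an object index in the low 11 bits of a word and converts an index back into a page number and offset. The 4096-byte page size, the SKETCH_* names, and the assumption that objects are laid out back to back from the first page are all hypothetical.

```c
#include <assert.h>

/* Assumptions for illustration only; these names are not from the patch. */
#define SKETCH_PAGE_SIZE   4096UL
#define SKETCH_INDEX_BITS  11
#define SKETCH_INDEX_MASK  ((1UL << SKETCH_INDEX_BITS) - 1)

struct sketch_location {
	unsigned long page_nr;	/* which subpage of the zspage */
	unsigned long offset;	/* byte offset inside that subpage */
};

/* Store an object index in the low 11 bits, keeping the upper bits. */
static unsigned long sketch_store_index(unsigned long word,
					unsigned long obj_idx)
{
	assert(obj_idx <= SKETCH_INDEX_MASK);
	return (word & ~SKETCH_INDEX_MASK) | obj_idx;
}

/* Read the object index back out of the low 11 bits. */
static unsigned long sketch_load_index(unsigned long word)
{
	return word & SKETCH_INDEX_MASK;
}

/* Turn an index into a location, assuming objects are packed contiguously. */
static struct sketch_location sketch_index_to_location(unsigned long obj_idx,
							unsigned long obj_size)
{
	unsigned long byte_off = obj_idx * obj_size;
	struct sketch_location loc = {
		byte_off / SKETCH_PAGE_SIZE,
		byte_off % SKETCH_PAGE_SIZE,
	};

	return loc;
}
```

The point of the index form is that it needs far fewer bits than a pointer-sized location, which leaves the rest of the word free for other metadata.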
For migration, we need to create the sub-page chain of a zspage
dynamically, so this patch factors it out from alloc_zspage.
As a minor refactoring, it makes the OBJ_ALLOCATED_TAG assignment
clearer in obj_malloc (it could be a separate patch, but it's
trivial so I want to put it together in this patch).
Currently, we store class:fullness into page->mapping.
The number of classes we can support is 255 and there are 4 fullness
groups, so 8 + 2 = 10 bits is enough to represent them.
Meanwhile, 11 bits is enough to store the number of in-use objects
in a zspage.
For example, if we assume a 64K PAGE_SIZE,
We have allowed migration for only LRU pages until now and it was
enough to make high-order pages. But recently, embedded systems (e.g.,
webOS, Android) use lots of non-movable pages (e.g., zram, GPU memory),
so we have seen several reports about trouble with small high-order
allocations. For fixing
Every zspage in a size_class has the same maximum number of objects, so
we could move it to the size_class.
Signed-off-by: Minchan Kim
---
mm/zsmalloc.c | 29 ++---
1 file changed, 14 insertions(+), 15 deletions(-)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
Now that the VM has a feature to migrate non-LRU movable pages,
the balloon driver doesn't need custom migration hooks in migrate.c
and compaction.c. Instead, this patch implements page->mapping
->{isolate|migrate|putback} functions.
With that, we could remove hooks for ballooning in general
migration functions and
From: Gioh Kim
anon_inodes already has complete interfaces to create and manage
many anonymous inodes but doesn't have an interface to get
a new inode. Other sub-modules can create an anonymous inode
without creating and mounting their own pseudo filesystem.
Acked-by: Rafael Aquini
This patch cleans up function parameter ordering to put the
higher-level data structure first.
Signed-off-by: Minchan Kim
---
mm/zsmalloc.c | 50 ++
1 file changed, 26 insertions(+), 24 deletions(-)
diff --git a/mm/zsmalloc.c
There are many BUG_ONs in zsmalloc.c, which is not recommended, so
replace them with alternatives.
The general rules are as follows:
1. avoid BUG_ON if possible. Instead, use VM_BUG_ON or VM_BUG_ON_PAGE
2. use VM_BUG_ON_PAGE if we need to see struct page's fields
3. use those assertions in primitive functions
This patch cleans up the function parameter "struct page".
Many zsmalloc functions expect that the page parameter is "first_page",
so use "first_page" rather than "page" for code readability.
Signed-off-by: Minchan Kim
---
mm/zsmalloc.c | 62
Let's remove the unused pool param in obj_free.
Signed-off-by: Minchan Kim
---
mm/zsmalloc.c | 7 +++
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 156edf909046..b4fb11831acb 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@
Procedure of page migration is as follows:
First of all, it should isolate a page from LRU and try to
migrate the page. If it is successful, it releases the page
for freeing. Otherwise, it should put the page back to LRU
list.
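
The procedure above can be sketched in user-space C as follows. This is only a toy model under stated assumptions: struct sketch_page, the sketch_* functions, and the try_migrate callback are hypothetical names for illustration and are not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; not the kernel's struct page. */
struct sketch_page {
	bool on_lru;	/* page currently sits on the (toy) LRU list */
	bool freed;	/* old page was released after a successful move */
};

enum sketch_result {
	SKETCH_NOT_ISOLATED,
	SKETCH_MIGRATED,
	SKETCH_PUT_BACK,
};

/* Step 1: take the page off the LRU so nobody else handles it. */
static bool sketch_isolate(struct sketch_page *page)
{
	if (!page->on_lru)
		return false;
	page->on_lru = false;
	return true;
}

/* Steps 2-3: try to migrate; release on success, put back on failure. */
static enum sketch_result
sketch_migrate_one(struct sketch_page *page,
		   bool (*try_migrate)(struct sketch_page *))
{
	assert(page != NULL && try_migrate != NULL);

	if (!sketch_isolate(page))
		return SKETCH_NOT_ISOLATED;

	if (try_migrate(page)) {
		page->freed = true;	/* old page is released for freeing */
		return SKETCH_MIGRATED;
	}

	page->on_lru = true;		/* migration failed: back to the LRU */
	return SKETCH_PUT_BACK;
}

/* Two toy migration callbacks used to exercise both outcomes. */
static bool sketch_copy_ok(struct sketch_page *page)
{
	(void)page;
	return true;
}

static bool sketch_copy_fail(struct sketch_page *page)
{
	(void)page;
	return false;
}
```

Calling sketch_migrate_one() with sketch_copy_ok leaves the page freed and off the list, while sketch_copy_fail leaves it back on the list and not freed.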
For LRU pages, we have used putback_lru_page for both freeing
and
Recently, I got many reports about performance degradation
in embedded systems (Android mobile phones, webOS TVs and so on)
and about fork failing easily.
The problem was fragmentation caused by zram and GPU driver
pages. Their pages cannot be migrated, so compaction cannot
work well either, so the reclaimer
>
> Hi,
> I'm just catching back up on this thread; so without reference to any
> particular previous mail in the thread.
>
> 1) How many of the free pages do we tell the host about?
> Your main change is telling the host about all the
> free pages.
Yes, all the guest's free
On Thu, Mar 03, 2016 at 03:37:37PM +, Stefan Hajnoczi wrote:
> Michael pointed out that the virtio-vsock draft specification does not
> address live migration and in fact currently precludes migration.
>
> Migration is fundamental so the device specification at least mustn't
> preclude it.
Hello.
On 03/10/2016 05:28 PM, Aaron Conole wrote:
This patch checks the feature bit for the VIRTIO_NET_F_MTU feature. If it
exists, read the advised MTU and use it.
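
A minimal user-space sketch of the check described above, assuming a feature bitmap and a config struct with an mtu field. The SKETCH_NET_F_MTU bit position, struct sketch_net_config, and sketch_pick_mtu() are hypothetical names chosen for illustration; they are not taken from the patch or from the virtio headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit position, for illustration only. */
#define SKETCH_NET_F_MTU 3

/* Hypothetical device config layout holding the advised MTU. */
struct sketch_net_config {
	uint16_t mtu;
};

/*
 * If the MTU feature bit was negotiated, use the MTU advised by the
 * device; otherwise fall back to the caller's default.
 */
static uint16_t sketch_pick_mtu(uint64_t features,
				const struct sketch_net_config *cfg,
				uint16_t default_mtu)
{
	assert(cfg != NULL);

	if (features & (1ULL << SKETCH_NET_F_MTU))
		return cfg->mtu;

	return default_mtu;
}
```

The sketch does nothing about a user changing the MTU later, which matches the limitation the message itself points out.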
No proper error handling is provided for the case where a user changes the
negotiated MTU. A future commit will add proper
=
KVM Forum 2016: Call For Participation
August 24-26, 2016 - Westin Harbor Castle - Toronto, Canada
(All submissions must be received before midnight May 1, 2016)
=
On Thu, 2016-03-10 at 09:28 -0500, Aaron Conole wrote:
> This patch checks the feature bit for the VIRTIO_NET_F_MTU feature. If it
> exists, read the advised MTU and use it.
>
> No proper error handling is provided for the case where a user changes the
> negotiated MTU. A future commit will add
The following series adds the ability for a hypervisor to set an MTU on the
guest during feature negotiation phase. This is useful for VM orchestration
when, for instance, tunneling is involved and the MTU of the various systems
should be homogeneous.
The first patch adds the feature bit as
This commit adds the feature bit and associated mtu device entry for the
virtio network device. Future commits will make use of these bits to support
negotiated MTU.
Signed-off-by: Aaron Conole
---
include/uapi/linux/virtio_net.h | 3 +++
1 file changed, 3 insertions(+)
On Thu, Mar 10, 2016 at 01:41:16AM +, Li, Liang Z wrote:
> > > > > > Yes, we really can teach qemu to skip these pages and it's not hard.
> > > > > > The problem is the poor performance, this PV solution
> > > > >
> > > > > Balloon is always PV. And do not call patches solutions please.
> > >
Hi,
I'm just catching back up on this thread; so without reference to any
particular previous mail in the thread.
1) How many of the free pages do we tell the host about?
Your main change is telling the host about all the
free pages.
If we tell the host about all the free
On Wed, Mar 09, 2016 at 07:39:18PM +0200, Michael S. Tsirkin wrote:
> On Wed, Mar 09, 2016 at 08:04:39PM +0300, Roman Kagan wrote:
> > On Wed, Mar 09, 2016 at 05:41:39PM +0200, Michael S. Tsirkin wrote:
> > > On Wed, Mar 09, 2016 at 05:28:54PM +0300, Roman Kagan wrote:
> > > > For (1) I've been
On Wed, Mar 09, 2016 at 02:38:52PM -0500, Rik van Riel wrote:
> On Wed, 2016-03-09 at 20:04 +0300, Roman Kagan wrote:
> > On Wed, Mar 09, 2016 at 05:41:39PM +0200, Michael S. Tsirkin wrote:
> > > On Wed, Mar 09, 2016 at 05:28:54PM +0300, Roman Kagan wrote:
> > > > For (1) I've been trying to make
> > Could you provide more information on how to use virtio-serial to exchange
> data? A thread, wiki or code are all OK.
> > I have not found any useful information yet.
>
> See this commit in the Linux sources:
>
> 108fc82596e3b66b819df9d28c1ebbc9ab5de14c
>
> that adds a way to send guest trace