Re: [PATCH v6v3 02/12] mm: migrate: support non-lru movable page migration

2016-06-30 Thread Minchan Kim
On Thu, Jun 30, 2016 at 11:26:45AM +0530, Anshuman Khandual wrote:



> >> Did you get a chance to test the driver out? I am still concerned about
> >> how to handle the struct address_space override problem within the
> >> struct page.
> > 
> > Hi Anshuman,
> > 
> > Slow but I am working on that. :) However, as I said, I want to do it
> 
> I really appreciate it. I was just curious about the problem and any potential
> solution we can look into.
> 
> > after soft landing of current non-lru-no-mapped page migration to solve
> > current real field issues.
> 
> yeah it makes sense.
> 
> > 
> > About the overriding problem of non-lru-mapped-page, I implemented a dummy
> > driver as a miscellaneous device, and in test_mmap (file_operations.mmap)
> > I replaced a_ops with my own address_space_operations.
> > 
> > int test_mmap(struct file *filp, struct vm_area_struct *vma)
> > {
> > 	filp->f_mapping->a_ops = &test_aops;
> > 	vma->vm_ops = &test_vm_ops;
> > 	vma->vm_private_data = filp->private_data;
> > 	return 0;
> > }
> > 
> 
> Okay.
> 
> > test_aops should override *set_page_dirty*.
> > 
> > static int test_set_page_dirty(struct page *page)
> > {
> > 	if (!PageDirty(page))
> > 		SetPageDirty(page);
> > 	return 0;
> > }
> > 
> > Otherwise, it hits a BUG_ON during a radix tree operation, because
> > try_to_unmap is currently designed for file-LRU pages, which live in
> > the page cache, so it propagates the page table dirty bit to the
> > PG_dirty flag of struct page via set_page_dirty. set_page_dirty then
> > wants to set the dirty tag in the radix tree node, but this is a
> > character driver, so the page cache has no entry for the page. That's
> > why we hit the BUG_ON in the radix tree operation. Anyway, to test, I
> > implemented set_page_dirty in my dummy driver.
> 
> Okay and the above test_set_page_dirty() example is sufficient ?

I guess just returning 0 is sufficient, without dirtying the page at all.
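
To make the override concrete, here is a minimal userspace sketch (not
kernel code) of how a mapping-supplied set_page_dirty short-circuits the
default path. Every toy_* name is a hypothetical stand-in invented for this
illustration, and the -1 return only marks the spot where, per the
description above, the real kernel would hit the BUG_ON.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; none of this is kernel API. */
struct toy_page;

struct toy_aops {
	/* Optional override, in the role of a_ops->set_page_dirty. */
	int (*set_page_dirty)(struct toy_page *page);
};

struct toy_mapping {
	const struct toy_aops *a_ops;
	bool has_page_cache_entry;	/* false for a char-device page */
};

struct toy_page {
	struct toy_mapping *mapping;
	bool dirty;
};

/* Default path: tags the page-cache slot, so the slot must exist. */
int toy_default_set_page_dirty(struct toy_page *page)
{
	if (!page->mapping->has_page_cache_entry)
		return -1;	/* models the failing radix tree operation */
	page->dirty = true;
	return 1;
}

/* Driver override in the spirit of "just return 0". */
int toy_driver_set_page_dirty(struct toy_page *page)
{
	(void)page;
	return 0;
}

/* Dispatcher: use the override when the mapping supplies one. */
int toy_set_page_dirty(struct toy_page *page)
{
	const struct toy_aops *ops = page->mapping->a_ops;

	if (ops && ops->set_page_dirty)
		return ops->set_page_dirty(page);
	return toy_default_set_page_dirty(page);
}
```

With no a_ops installed, toy_set_page_dirty() on a page without a page-cache
entry takes the failing default path; once the driver's override is
installed, the same call returns 0 and never touches the missing entry.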

> 
> > 
> > That alone doesn't work: I also need to modify migrate.c to handle
> > non-lru-mapped pages, and to change the PG_isolated flag, which aliases
> > PG_reclaim, which in turn is cleared in set_page_dirty.
> 
> Got it, so what changes did you make? Did you implement PG_isolated
> differently, not by overriding PG_reclaim, or something else? Yes,
> set_page_dirty indeed clears the PG_reclaim flag.
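
The flag clash discussed here can be reproduced in a few lines of userspace
C. This is only a sketch under stated assumptions: the bit positions and all
toy_* names are made up for illustration, and the only behaviour taken from
the thread is that PG_isolated shares its bit with PG_reclaim and that
set_page_dirty clears PG_reclaim.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values; they are not the kernel's actual positions. */
#define TOY_PG_DIRTY	(1UL << 0)
#define TOY_PG_RECLAIM	(1UL << 1)
#define TOY_PG_ISOLATED	TOY_PG_RECLAIM	/* alias: same bit, other meaning */

struct toy_page {
	unsigned long flags;
};

void toy_mark_isolated(struct toy_page *p)
{
	p->flags |= TOY_PG_ISOLATED;
}

bool toy_is_isolated(const struct toy_page *p)
{
	return (p->flags & TOY_PG_ISOLATED) != 0;
}

/* Models set_page_dirty dropping PG_reclaim as a side effect. */
void toy_set_page_dirty(struct toy_page *p)
{
	p->flags &= ~TOY_PG_RECLAIM;
	p->flags |= TOY_PG_DIRTY;
}
```

Marking a page isolated and then dirtying it leaves toy_is_isolated()
returning false, which is the lost-state problem that forces either a
different bit for the isolated state or a set_page_dirty that leaves the
shared bit alone.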
> 
> > 
> > With that, it seems to work. But I'm not saying it's the right model now
> 
> So the mapped pages migration was successful? Even after overloading
> filp->f_mapping->a_ops = &test_aops, we still have the RMAP information
> intact with the interval tree pointed to by filp->f_mapping. But I would
> really like to see the code changes.
> 
> > for device drivers. Replacing filp->f_mapping->a_ops at runtime with
> > the driver's own a_ops seems hacky to me.
> 
> Yeah I thought so.
> 
> > So, I'm now considering a new pseudo fs, "movable_inode", which will
> > support:
> > 
> > struct file *movable_inode_getfile(const char *name,
> > 			const struct file_operations *fop,
> > 			const struct address_space_operations *a_ops)
> > {
> > 	struct path path;
> > 	struct qstr this;
> > 	struct inode *inode;
> > 	struct super_block *sb;
> > 
> > 	this.name = name;
> > 	this.len = strlen(name);
> > 	this.hash = 0;
> > 	sb = movable_inode_mnt->mnt_sb;
> > 	path.dentry = d_alloc_pseudo(movable_inode_mnt->mnt_sb, &this);
> > 	path.mnt = mntget(movable_inode_mnt);
> > 
> > 	inode = new_inode(sb);
> > 	..
> > 	..
> > 	inode->i_mapping->a_ops = a_ops;
> > 	d_instantiate(path.dentry, inode);
> > 
> > 	return alloc_file(&path, FMODE_WRITE | FMODE_READ, fop);
> > }
> > 
> > And in our driver, we can replace vma->vm_file with the new one.
> > 
> > int test_mmap(struct file *filp, struct vm_area_struct *vma)
> > {
> > 	struct file *newfile = movable_inode_getfile("[test]",
> > 				filp->f_op, &test_aops);
> > 	vma->vm_file = newfile;
> > 	..
> > 	..
> > }
> > 
> > From reading mmap_region in mm/mmap.c, it looks like a reasonable use case
> > for a driver's mmap to replace vma->vm_file with its own file.
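
The difference between the two approaches (patching the a_ops of the mapping
the device file already uses, versus pointing the VMA at a separate file that
carries its own a_ops) can be modelled in userspace. This is a rough sketch,
not kernel code; all toy_* types and functions are hypothetical and reduce
each kernel structure to the one pointer that matters here.

```c
#include <assert.h>
#include <stddef.h>

struct toy_aops {
	const char *name;
};

struct toy_mapping {
	const struct toy_aops *a_ops;
};

struct toy_file {
	struct toy_mapping *f_mapping;
};

struct toy_vma {
	struct toy_file *vm_file;
};

/*
 * Approach 1: patch the a_ops of the mapping the file already points at.
 * Every other file sharing that mapping sees the change as well.
 */
void toy_mmap_patch_shared(struct toy_file *filp, const struct toy_aops *ops)
{
	filp->f_mapping->a_ops = ops;
}

/*
 * Approach 2: point the VMA at a separate file whose mapping already
 * carries the driver's a_ops; the original mapping is left untouched.
 */
void toy_mmap_swap_file(struct toy_vma *vma, struct toy_file *newfile)
{
	vma->vm_file = newfile;
}
```

In this model, swapping vm_file changes what the VMA sees without affecting
any other opener of the device, while patching the shared mapping changes
a_ops for every file that points at it.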
> 
> I will look into these details.
> 
> > Anyway, it needs many subtle changes on the mm/vfs/driver side, so it
> > needs review from the maintainers of each related subsystem, and I
> > don't want to hurry.
> 
> Sure, makes sense. Meanwhile, it would be really great if you could share
> your code changes as described above, so that I can try them out.
> 

The draft version is almost done and I'm stress testing it now;
fortunately, I haven't seen any problems so far.

I will send it to you when it's ready.

Thanks.


Re: [PATCH v6v3 02/12] mm: migrate: support non-lru movable page migration

2016-06-29 Thread Anshuman Khandual
On 06/28/2016 12:09 PM, Minchan Kim wrote:
> On Mon, Jun 27, 2016 at 11:21:01AM +0530, Anshuman Khandual wrote:
>> On 06/16/2016 11:07 AM, Minchan Kim wrote:
>>> On Thu, Jun 16, 2016 at 09:12:07AM +0530, Anshuman Khandual wrote:
 On 06/16/2016 05:56 AM, Minchan Kim wrote:
> On Wed, Jun 15, 2016 at 12:15:04PM +0530, Anshuman Khandual wrote:
>> On 06/15/2016 08:02 AM, Minchan Kim wrote:
>>> Hi,
>>>
>>> On Mon, Jun 13, 2016 at 03:08:19PM +0530, Anshuman Khandual wrote:
> On 05/31/2016 05:31 AM, Minchan Kim wrote:
>>> @@ -791,6 +921,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>>> int rc = -EAGAIN;
>>> int page_was_mapped = 0;
>>> struct anon_vma *anon_vma = NULL;
>>> +   bool is_lru = !__PageMovable(page);
>>>  
>>> if (!trylock_page(page)) {
>>> if (!force || mode == MIGRATE_ASYNC)
>>> @@ -871,6 +1002,11 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>>> goto out_unlock_both;
>>> }
>>>  
>>> +   if (unlikely(!is_lru)) {
>>> +   rc = move_to_new_page(newpage, page, mode);
>>> +   goto out_unlock_both;
>>> +   }
>>> +
>
> Hello Minchan,
>
> I might be missing something here but does this implementation 
> support the
> scenario where these non LRU pages owned by the driver mapped as PTE 
> into
> process page table ? Because the "goto out_unlock_both" statement 
> above
> skips all the PTE unmap, putting a migration PTE and removing the 
> migration
> PTE steps.
>>> You're right. Unfortunately, it doesn't support right now but surely,
>>> it's my TODO after landing this work.
>>>
>>> Could you share your usecase?
>>
>> Sure.
>
> Thanks a lot!
>
>>
>> My driver has privately managed non LRU pages which gets mapped into 
>> user space
>> process page table through f_ops->mmap() and vmops->fault() which then 
>> updates
>> the file RMAP (page->mapping->i_mmap) through page_add_file_rmap(page). 
>> One thing
>
> Hmm, page_add_file_rmap is not exported function. How does your driver 
> can use it?

 Its not using the function directly, I just re-iterated the sequence of 
 functions
 above. (do_set_pte -> page_add_file_rmap) gets called after we grab the 
 page from
 driver through (__do_fault->vma->vm_ops->fault()).

> Do you use vm_insert_pfn?
> What type your vma is? VM_PFNMMAP or VM_MIXEDMAP?

 I dont use vm_insert_pfn(). Here is the sequence of events how the user 
 space
 VMA gets the non LRU pages from the driver.

 - Driver registers a character device with 'struct file_operations' binding
 - Then the 'fops->mmap()' just binds the incoming 'struct vma' with a 
 'struct
   vm_operations_struct' which provides the 'vmops->fault()' routine which
   basically traps all page faults on the VMA and provides one page at a 
 time
   through a driver specific allocation routine which hands over non LRU 
 pages

 The VMA is not anything special as such. Its what we get when we try to do 
 a
 simple mmap() on a file descriptor pointing to a character device. I can
 figure out all the VM_* flags it holds after creation.

>
> I want to make dummy driver to simulate your case.

 Sure. I hope the above mentioned steps will help you but in case you need 
 more
 information, please do let me know.
>>>
>>> I got understood now. :)
>>> I will test it with dummy driver and will Cc'ed when I send a patch.
>>
>> Hello Minchan,
>>
>> Do you have any updates on this ? The V7 of the series still has this 
>> limitation.
>> Did you get a chance to test the driver out ? I am still concerned about how 
>> to
>> handle the struct address_space override problem within the struct page.
> 
> Hi Anshuman,
> 
> Slow but I am working on that. :) However, as I said, I want to do it

I really appreciate it. I was just curious about the problem and any potential
solution we can look into.

> after soft landing of current non-lru-no-mapped page migration to solve
> current real field issues.

yeah it makes sense.

> 
> About the overriding problem of non-lru-mapped-page, I implemented a dummy
> driver as a miscellaneous device, and in test_mmap (file_operations.mmap)
> I replaced a_ops with my own address_space_operations.
> 
> int test_mmap(struct file *filp, struct vm_area_struct *vma)
> {
> 	filp->f_mapping->a_ops = &test_aops;
> 	vma->vm_ops = &test_vm_ops;
> 	vma->vm_private_data = filp->private_data;
> 	return 0;
> }
> 

Okay.

> test_aops should override *set_page_dirty*.
> 
> static int test_set_page_dirty(struct page *page)

Re: [PATCH v6v3 02/12] mm: migrate: support non-lru movable page migration

2016-06-28 Thread Minchan Kim
On Mon, Jun 27, 2016 at 11:21:01AM +0530, Anshuman Khandual wrote:
> On 06/16/2016 11:07 AM, Minchan Kim wrote:
> > On Thu, Jun 16, 2016 at 09:12:07AM +0530, Anshuman Khandual wrote:
> >> On 06/16/2016 05:56 AM, Minchan Kim wrote:
> >>> On Wed, Jun 15, 2016 at 12:15:04PM +0530, Anshuman Khandual wrote:
>  On 06/15/2016 08:02 AM, Minchan Kim wrote:
> > Hi,
> >
> > On Mon, Jun 13, 2016 at 03:08:19PM +0530, Anshuman Khandual wrote:
> >>> On 05/31/2016 05:31 AM, Minchan Kim wrote:
> > @@ -791,6 +921,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
> > int rc = -EAGAIN;
> > int page_was_mapped = 0;
> > struct anon_vma *anon_vma = NULL;
> > +   bool is_lru = !__PageMovable(page);
> >  
> > if (!trylock_page(page)) {
> > if (!force || mode == MIGRATE_ASYNC)
> > @@ -871,6 +1002,11 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
> > goto out_unlock_both;
> > }
> >  
> > +   if (unlikely(!is_lru)) {
> > +   rc = move_to_new_page(newpage, page, mode);
> > +   goto out_unlock_both;
> > +   }
> > +
> >>>
> >>> Hello Minchan,
> >>>
> >>> I might be missing something here but does this implementation 
> >>> support the
> >>> scenario where these non LRU pages owned by the driver mapped as PTE 
> >>> into
> >>> process page table ? Because the "goto out_unlock_both" statement 
> >>> above
> >>> skips all the PTE unmap, putting a migration PTE and removing the 
> >>> migration
> >>> PTE steps.
> > You're right. Unfortunately, it doesn't support right now but surely,
> > it's my TODO after landing this work.
> >
> > Could you share your usecase?
> 
>  Sure.
> >>>
> >>> Thanks a lot!
> >>>
> 
>  My driver has privately managed non LRU pages which gets mapped into 
>  user space
>  process page table through f_ops->mmap() and vmops->fault() which then 
>  updates
>  the file RMAP (page->mapping->i_mmap) through page_add_file_rmap(page). 
>  One thing
> >>>
> >>> Hmm, page_add_file_rmap is not exported function. How does your driver 
> >>> can use it?
> >>
> >> Its not using the function directly, I just re-iterated the sequence of 
> >> functions
> >> above. (do_set_pte -> page_add_file_rmap) gets called after we grab the 
> >> page from
> >> driver through (__do_fault->vma->vm_ops->fault()).
> >>
> >>> Do you use vm_insert_pfn?
> >>> What type your vma is? VM_PFNMMAP or VM_MIXEDMAP?
> >>
> >> I dont use vm_insert_pfn(). Here is the sequence of events how the user 
> >> space
> >> VMA gets the non LRU pages from the driver.
> >>
> >> - Driver registers a character device with 'struct file_operations' binding
> >> - Then the 'fops->mmap()' just binds the incoming 'struct vma' with a 
> >> 'struct
> >>   vm_operations_struct' which provides the 'vmops->fault()' routine which
> >>   basically traps all page faults on the VMA and provides one page at a 
> >> time
> >>   through a driver specific allocation routine which hands over non LRU 
> >> pages
> >>
> >> The VMA is not anything special as such. Its what we get when we try to do 
> >> a
> >> simple mmap() on a file descriptor pointing to a character device. I can
> >> figure out all the VM_* flags it holds after creation.
> >>
> >>>
> >>> I want to make dummy driver to simulate your case.
> >>
> >> Sure. I hope the above mentioned steps will help you but in case you need 
> >> more
> >> information, please do let me know.
> > 
> > I got understood now. :)
> > I will test it with dummy driver and will Cc'ed when I send a patch.
> 
> Hello Minchan,
> 
> Do you have any updates on this ? The V7 of the series still has this 
> limitation.
> Did you get a chance to test the driver out ? I am still concerned about how 
> to
> handle the struct address_space override problem within the struct page.

Hi Anshuman,

Slow but I am working on that. :) However, as I said, I want to do it
after soft landing of current non-lru-no-mapped page migration to solve
current real field issues.
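
For reference, the limitation raised earlier in the thread (the non-LRU
branch in __unmap_and_move goes straight to move_to_new_page and skips the
PTE unmap step) can be modelled as a tiny userspace sketch. It is not the
kernel function: the toy_* names, the mapcount field and the trace struct
are assumptions made purely to show which steps run on each path.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_page {
	bool movable;	/* stands in for __PageMovable(page) */
	int mapcount;	/* number of PTEs still pointing at the page */
};

struct toy_trace {
	bool unmapped;	/* did the unmap step run? */
	bool moved;	/* did the move step run? */
};

/* Simplified shape of the quoted hunk: non-LRU pages take the early exit. */
struct toy_trace toy_unmap_and_move(struct toy_page *page)
{
	struct toy_trace t = { false, false };
	bool is_lru = !page->movable;

	if (!is_lru) {
		t.moved = true;		/* move_to_new_page() */
		return t;		/* early exit: the unmap step is skipped */
	}

	if (page->mapcount > 0) {
		page->mapcount = 0;	/* models unmapping the PTEs */
		t.unmapped = true;
	}
	t.moved = true;
	return t;
}
```

A mapped LRU page comes out unmapped and moved; a mapped driver page comes
out moved with its mapcount unchanged, which is why mapped non-LRU pages
cannot be handled by this path yet.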

About the overriding problem of non-lru-mapped-page, I implemented a dummy
driver as a miscellaneous device, and in test_mmap (file_operations.mmap)
I replaced a_ops with my own address_space_operations.

int test_mmap(struct file *filp, struct vm_area_struct *vma)
{
	filp->f_mapping->a_ops = &test_aops;
	vma->vm_ops = &test_vm_ops;
	vma->vm_private_data = filp->private_data;
	return 0;
}

test_aops should override *set_page_dirty*.

static int test_set_page_dirty(struct page *page)
{
	if (!PageDirty(page))
		SetPageDirty(page);
	return 0;
}

Otherwise, it hits a BUG_ON during a radix tree operation, because
try_to_unmap is currently designed for file-LRU pages, which live
in page 

Re: [PATCH v6v3 02/12] mm: migrate: support non-lru movable page migration

2016-06-26 Thread Anshuman Khandual
On 06/16/2016 11:07 AM, Minchan Kim wrote:
> On Thu, Jun 16, 2016 at 09:12:07AM +0530, Anshuman Khandual wrote:
>> On 06/16/2016 05:56 AM, Minchan Kim wrote:
>>> On Wed, Jun 15, 2016 at 12:15:04PM +0530, Anshuman Khandual wrote:
 On 06/15/2016 08:02 AM, Minchan Kim wrote:
> Hi,
>
> On Mon, Jun 13, 2016 at 03:08:19PM +0530, Anshuman Khandual wrote:
>>> On 05/31/2016 05:31 AM, Minchan Kim wrote:
> @@ -791,6 +921,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>   int rc = -EAGAIN;
>   int page_was_mapped = 0;
>   struct anon_vma *anon_vma = NULL;
> + bool is_lru = !__PageMovable(page);
>  
>   if (!trylock_page(page)) {
>   if (!force || mode == MIGRATE_ASYNC)
> @@ -871,6 +1002,11 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>   goto out_unlock_both;
>   }
>  
> + if (unlikely(!is_lru)) {
> + rc = move_to_new_page(newpage, page, mode);
> + goto out_unlock_both;
> + }
> +
>>>
>>> Hello Minchan,
>>>
>>> I might be missing something here but does this implementation support 
>>> the
>>> scenario where these non LRU pages owned by the driver mapped as PTE 
>>> into
>>> process page table ? Because the "goto out_unlock_both" statement above
>>> skips all the PTE unmap, putting a migration PTE and removing the 
>>> migration
>>> PTE steps.
> You're right. Unfortunately, it doesn't support right now but surely,
> it's my TODO after landing this work.
>
> Could you share your usecase?

 Sure.
>>>
>>> Thanks a lot!
>>>

 My driver has privately managed non LRU pages which gets mapped into user 
 space
 process page table through f_ops->mmap() and vmops->fault() which then 
 updates
 the file RMAP (page->mapping->i_mmap) through page_add_file_rmap(page). 
 One thing
>>>
>>> Hmm, page_add_file_rmap is not exported function. How does your driver can 
>>> use it?
>>
>> Its not using the function directly, I just re-iterated the sequence of 
>> functions
>> above. (do_set_pte -> page_add_file_rmap) gets called after we grab the page 
>> from
>> driver through (__do_fault->vma->vm_ops->fault()).
>>
>>> Do you use vm_insert_pfn?
>>> What type your vma is? VM_PFNMMAP or VM_MIXEDMAP?
>>
>> I dont use vm_insert_pfn(). Here is the sequence of events how the user space
>> VMA gets the non LRU pages from the driver.
>>
>> - Driver registers a character device with 'struct file_operations' binding
>> - Then the 'fops->mmap()' just binds the incoming 'struct vma' with a 'struct
>>   vm_operations_struct' which provides the 'vmops->fault()' routine which
>>   basically traps all page faults on the VMA and provides one page at a time
>>   through a driver specific allocation routine which hands over non LRU pages
>>
>> The VMA is not anything special as such. Its what we get when we try to do a
>> simple mmap() on a file descriptor pointing to a character device. I can
>> figure out all the VM_* flags it holds after creation.
>>
>>>
>>> I want to make dummy driver to simulate your case.
>>
>> Sure. I hope the above mentioned steps will help you but in case you need 
>> more
>> information, please do let me know.
> 
> I got understood now. :)
> I will test it with dummy driver and will Cc'ed when I send a patch.

Hello Minchan,

Do you have any updates on this? V7 of the series still has this limitation.
Did you get a chance to test the driver out? I am still concerned about how to
handle the struct address_space override problem within the struct page.
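
In case it helps with the dummy driver, the mmap/fault sequence quoted above
can be sketched in userspace roughly as follows. This is only a model under
stated assumptions: the toy_* names, the fixed four-page VMA and the calloc()
based allocator are all hypothetical, standing in for the driver-specific
routine that hands out non-LRU pages one fault at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_NPAGES	4
#define TOY_PAGE_SIZE	4096

struct toy_vma;

struct toy_vm_ops {
	void *(*fault)(struct toy_vma *vma, size_t pgoff);
};

struct toy_vma {
	const struct toy_vm_ops *vm_ops;
	void *pages[TOY_NPAGES];	/* pages handed out so far */
};

/* Fault handler: provides one driver-allocated page per faulting offset. */
void *toy_fault(struct toy_vma *vma, size_t pgoff)
{
	if (pgoff >= TOY_NPAGES)
		return NULL;		/* outside the mapping */
	if (!vma->pages[pgoff])
		vma->pages[pgoff] = calloc(1, TOY_PAGE_SIZE);
	return vma->pages[pgoff];
}

const struct toy_vm_ops toy_driver_vm_ops = { toy_fault };

/* mmap handler: only binds the VMA to the driver's vm_ops. */
int toy_mmap(struct toy_vma *vma)
{
	vma->vm_ops = &toy_driver_vm_ops;
	return 0;
}

/* Models a user access that traps into the fault handler. */
void *toy_touch(struct toy_vma *vma, size_t pgoff)
{
	return vma->vm_ops->fault(vma, pgoff);
}
```

Nothing is allocated at mmap time in this model; a page only appears when
its offset is first touched, and touching the same offset again returns the
same page.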

- Anshuman



Re: [PATCH v6v3 02/12] mm: migrate: support non-lru movable page migration

2016-06-15 Thread Minchan Kim
On Thu, Jun 16, 2016 at 09:12:07AM +0530, Anshuman Khandual wrote:
> On 06/16/2016 05:56 AM, Minchan Kim wrote:
> > On Wed, Jun 15, 2016 at 12:15:04PM +0530, Anshuman Khandual wrote:
> >> On 06/15/2016 08:02 AM, Minchan Kim wrote:
> >>> Hi,
> >>>
> >>> On Mon, Jun 13, 2016 at 03:08:19PM +0530, Anshuman Khandual wrote:
> >>>> On 05/31/2016 05:31 AM, Minchan Kim wrote:
> >>>>> @@ -791,6 +921,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
> >>>>>  	int rc = -EAGAIN;
> >>>>>  	int page_was_mapped = 0;
> >>>>>  	struct anon_vma *anon_vma = NULL;
> >>>>> +	bool is_lru = !__PageMovable(page);
> >>>>>  
> >>>>>  	if (!trylock_page(page)) {
> >>>>>  		if (!force || mode == MIGRATE_ASYNC)
> >>>>> @@ -871,6 +1002,11 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
> >>>>>  		goto out_unlock_both;
> >>>>>  	}
> >>>>>  
> >>>>> +	if (unlikely(!is_lru)) {
> >>>>> +		rc = move_to_new_page(newpage, page, mode);
> >>>>> +		goto out_unlock_both;
> >>>>> +	}
> >>>>> +
> >>>>
> >>>> Hello Minchan,
> >>>>
> >>>> I might be missing something here but does this implementation support the
> >>>> scenario where these non LRU pages owned by the driver mapped as PTE into
> >>>> process page table ? Because the "goto out_unlock_both" statement above
> >>>> skips all the PTE unmap, putting a migration PTE and removing the migration
> >>>> PTE steps.
> >>> You're right. Unfortunately, it doesn't support right now but surely,
> >>> it's my TODO after landing this work.
> >>>
> >>> Could you share your usecase?
> >>
> >> Sure.
> > 
> > Thanks a lot!
> > 
> >>
> >> My driver has privately managed non LRU pages which get mapped into user space
> >> process page table through f_ops->mmap() and vmops->fault() which then updates
> >> the file RMAP (page->mapping->i_mmap) through page_add_file_rmap(page). One thing
> > 
> > Hmm, page_add_file_rmap is not an exported function. How can your driver use it?
> 
> It's not using the function directly; I just reiterated the sequence of functions
> above. (do_set_pte -> page_add_file_rmap) gets called after we grab the page from
> the driver through (__do_fault -> vma->vm_ops->fault()).
> 
> > Do you use vm_insert_pfn?
> > What type is your vma? VM_PFNMAP or VM_MIXEDMAP?
> 
> I don't use vm_insert_pfn(). Here is the sequence of events by which the user
> space VMA gets the non LRU pages from the driver.
> 
> - Driver registers a character device with 'struct file_operations' binding
> - Then the 'fops->mmap()' just binds the incoming 'struct vm_area_struct' with a
>   'struct vm_operations_struct' which provides the 'vmops->fault()' routine which
>   basically traps all page faults on the VMA and provides one page at a time
>   through a driver specific allocation routine which hands over non LRU pages
> 
> The VMA is not anything special as such. It's what we get when we try to do a
> simple mmap() on a file descriptor pointing to a character device. I can
> figure out all the VM_* flags it holds after creation.
> 
> > 
> > I want to make a dummy driver to simulate your case.
> 
> Sure. I hope the above mentioned steps will help you but in case you need more
> information, please do let me know.

I understand now. :)
I will test it with a dummy driver and will Cc you when I send a patch.

Thanks.
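
The driver flow described in the quoted steps (mmap only binds the fault
handler, each fault hands over one privately managed page, and the core fault
path then accounts the mapping) can be sketched as a userspace model. All of the
type and function names below are hypothetical stand-ins invented for
illustration; they are not the kernel's struct vm_area_struct, struct
vm_operations_struct or fault API, and the mapcount increment merely stands in
for the do_set_pte -> page_add_file_rmap step mentioned in the thread.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_PAGES 4

/* Simplified stand-ins for the kernel types; not the real definitions. */
struct model_page { int mapcount; };
struct model_vma;
struct model_vm_ops {
	struct model_page *(*fault)(struct model_vma *vma, unsigned long addr);
};
struct model_vma {
	const struct model_vm_ops *ops;
	void *private_data;
};

/* Hypothetical driver pool of privately managed (non-LRU) pages. */
struct drv_pool {
	struct model_page pages[POOL_PAGES];
	int next;
};

/* Driver fault handler: hands over one page per fault from its own pool. */
static struct model_page *drv_fault(struct model_vma *vma, unsigned long addr)
{
	struct drv_pool *pool = vma->private_data;

	(void)addr;	/* this model ignores the faulting address */
	if (pool->next >= POOL_PAGES)
		return NULL;
	return &pool->pages[pool->next++];
}

static const struct model_vm_ops drv_vm_ops = { .fault = drv_fault };

/* Models fops->mmap(): binds the ops and private data, maps nothing yet. */
static int drv_mmap(struct model_vma *vma, struct drv_pool *pool)
{
	vma->ops = &drv_vm_ops;
	vma->private_data = pool;
	return 0;
}

/* Models the core fault path: call the driver, then account the mapping. */
static int handle_fault(struct model_vma *vma, unsigned long addr)
{
	struct model_page *page;

	assert(vma->ops && vma->ops->fault);
	page = vma->ops->fault(vma, addr);
	if (!page)
		return -1;
	page->mapcount++;	/* stands in for the rmap update */
	return 0;
}
```

Calling drv_mmap() once and handle_fault() once per touched address reproduces
the "one page at a time" behaviour described above.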


Re: [PATCH v6v3 02/12] mm: migrate: support non-lru movable page migration

2016-06-15 Thread Anshuman Khandual
On 06/16/2016 05:56 AM, Minchan Kim wrote:
> On Wed, Jun 15, 2016 at 12:15:04PM +0530, Anshuman Khandual wrote:
>> On 06/15/2016 08:02 AM, Minchan Kim wrote:
>>> Hi,
>>>
>>> On Mon, Jun 13, 2016 at 03:08:19PM +0530, Anshuman Khandual wrote:
>>>> On 05/31/2016 05:31 AM, Minchan Kim wrote:
>>>>> @@ -791,6 +921,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>>>>>  	int rc = -EAGAIN;
>>>>>  	int page_was_mapped = 0;
>>>>>  	struct anon_vma *anon_vma = NULL;
>>>>> +	bool is_lru = !__PageMovable(page);
>>>>>  
>>>>>  	if (!trylock_page(page)) {
>>>>>  		if (!force || mode == MIGRATE_ASYNC)
>>>>> @@ -871,6 +1002,11 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>>>>>  		goto out_unlock_both;
>>>>>  	}
>>>>>  
>>>>> +	if (unlikely(!is_lru)) {
>>>>> +		rc = move_to_new_page(newpage, page, mode);
>>>>> +		goto out_unlock_both;
>>>>> +	}
>>>>> +
>>>>
>>>> Hello Minchan,
>>>>
>>>> I might be missing something here but does this implementation support the
>>>> scenario where these non LRU pages owned by the driver mapped as PTE into
>>>> process page table ? Because the "goto out_unlock_both" statement above
>>>> skips all the PTE unmap, putting a migration PTE and removing the migration
>>>> PTE steps.
>>> You're right. Unfortunately, it doesn't support right now but surely,
>>> it's my TODO after landing this work.
>>>
>>> Could you share your usecase?
>>
>> Sure.
> 
> Thanks a lot!
> 
>>
>> My driver has privately managed non LRU pages which get mapped into user space
>> process page table through f_ops->mmap() and vmops->fault() which then updates
>> the file RMAP (page->mapping->i_mmap) through page_add_file_rmap(page). One thing
> 
> Hmm, page_add_file_rmap is not an exported function. How can your driver use it?

It's not using the function directly; I just reiterated the sequence of functions
above. (do_set_pte -> page_add_file_rmap) gets called after we grab the page from
the driver through (__do_fault -> vma->vm_ops->fault()).

> Do you use vm_insert_pfn?
> What type is your vma? VM_PFNMAP or VM_MIXEDMAP?

I don't use vm_insert_pfn(). Here is the sequence of events by which the user
space VMA gets the non LRU pages from the driver.

- Driver registers a character device with 'struct file_operations' binding
- Then the 'fops->mmap()' just binds the incoming 'struct vm_area_struct' with a
  'struct vm_operations_struct' which provides the 'vmops->fault()' routine which
  basically traps all page faults on the VMA and provides one page at a time
  through a driver specific allocation routine which hands over non LRU pages

The VMA is not anything special as such. It's what we get when we try to do a
simple mmap() on a file descriptor pointing to a character device. I can
figure out all the VM_* flags it holds after creation.

> 
> I want to make a dummy driver to simulate your case.

Sure. I hope the above mentioned steps will help you but in case you need more
information, please do let me know.

> It would be very helpful to implement/test pte-mapped non-lru page
> migration feature. That's why I ask now.
> 
>> to note here is that the page->mapping eventually points to struct address_space
>> (file->f_mapping) which belongs to the character device file (created using mknod)
>> which we are using for establishing the mmap() regions in the user space.
>>
>> Now as per this new framework, all the pages are to be made __SetPageMovable before
>> passing the list down to migrate_pages(). Now __SetPageMovable() takes *new* struct
>> address_space as an argument and replaces the existing page->mapping. Now that's the
>> problem, we have lost all our connection to the existing file RMAP information. This
> 
> We could change __SetPageMovable so that it doesn't need the mapping argument.
> Instead, it just marks PAGE_MAPPING_MOVABLE into page->mapping.
> For that, user should take care of setting page->mapping earlier than
> marking the flag.

Sounds like a good idea; that way we don't lose the reverse mapping information.

> 
>> stands as a problem when we try to migrate these non LRU pages which are PTE mapped.
>> The rmap_walk_file() never finds them in the VMA, skips all the migrate PTE steps and
>> then the migration eventually fails.
>>
>> Seems like assigning a new struct address_space to the page through __SetPageMovable()
>> is the source of the problem. Can it take the existing (file->f_mapping) as an argument

> We can set existing file->f_mapping under the page_lock.

That's another option along with what you mentioned above.

> 
>> in there ? Sure, but then can we override file system generic ->isolate(), ->putback(),
> 
> I don't get it. Why does it override file system generic functions?

Sure, it does not; it was just a wild idea to overcome the problem.



Re: [PATCH v6v3 02/12] mm: migrate: support non-lru movable page migration

2016-06-15 Thread Minchan Kim
On Wed, Jun 15, 2016 at 12:15:04PM +0530, Anshuman Khandual wrote:
> On 06/15/2016 08:02 AM, Minchan Kim wrote:
> > Hi,
> > 
> > On Mon, Jun 13, 2016 at 03:08:19PM +0530, Anshuman Khandual wrote:
> >> > On 05/31/2016 05:31 AM, Minchan Kim wrote:
> >>> > > @@ -791,6 +921,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
> >>> > >  	int rc = -EAGAIN;
> >>> > >  	int page_was_mapped = 0;
> >>> > >  	struct anon_vma *anon_vma = NULL;
> >>> > > +	bool is_lru = !__PageMovable(page);
> >>> > >  
> >>> > >  	if (!trylock_page(page)) {
> >>> > >  		if (!force || mode == MIGRATE_ASYNC)
> >>> > > @@ -871,6 +1002,11 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
> >>> > >  		goto out_unlock_both;
> >>> > >  	}
> >>> > >  
> >>> > > +	if (unlikely(!is_lru)) {
> >>> > > +		rc = move_to_new_page(newpage, page, mode);
> >>> > > +		goto out_unlock_both;
> >>> > > +	}
> >>> > > +
> >> > 
> >> > Hello Minchan,
> >> > 
> >> > I might be missing something here but does this implementation support the
> >> > scenario where these non LRU pages owned by the driver mapped as PTE into
> >> > process page table ? Because the "goto out_unlock_both" statement above
> >> > skips all the PTE unmap, putting a migration PTE and removing the migration
> >> > PTE steps.
> > You're right. Unfortunately, it doesn't support right now but surely,
> > it's my TODO after landing this work.
> > 
> > Could you share your usecase?
> 
> Sure.

Thanks a lot!

> 
> My driver has privately managed non LRU pages which get mapped into user space
> process page table through f_ops->mmap() and vmops->fault() which then updates
> the file RMAP (page->mapping->i_mmap) through page_add_file_rmap(page). One thing

Hmm, page_add_file_rmap is not an exported function. How can your driver use it?
Do you use vm_insert_pfn?
What type is your vma? VM_PFNMAP or VM_MIXEDMAP?

I want to make a dummy driver to simulate your case.
It would be very helpful to implement/test pte-mapped non-lru page
migration feature. That's why I ask now.

> to note here is that the page->mapping eventually points to struct address_space
> (file->f_mapping) which belongs to the character device file (created using mknod)
> which we are using for establishing the mmap() regions in the user space.
> 
> Now as per this new framework, all the pages are to be made __SetPageMovable before
> passing the list down to migrate_pages(). Now __SetPageMovable() takes *new* struct
> address_space as an argument and replaces the existing page->mapping. Now that's the
> problem, we have lost all our connection to the existing file RMAP information. This

We could change __SetPageMovable so that it doesn't need the mapping argument.
Instead, it just marks PAGE_MAPPING_MOVABLE into page->mapping.
For that, user should take care of setting page->mapping earlier than
marking the flag.

> stands as a problem when we try to migrate these non LRU pages which are PTE mapped.
> The rmap_walk_file() never finds them in the VMA, skips all the migrate PTE steps and
> then the migration eventually fails.
> 
> Seems like assigning a new struct address_space to the page through __SetPageMovable()
> is the source of the problem. Can it take the existing (file->f_mapping) as an argument

We can set existing file->f_mapping under the page_lock.

> in there ? Sure, but then can we override file system generic ->isolate(), ->putback(),

I don't get it. Why does it override file system generic functions?

> ->migratepages() functions ? I don't think so. I am sure, there must be some work around
> to fix this problem for the driver. But we need to rethink this framework from the
> point of view of supporting these mapped non LRU pages.
> 
> I might be missing something here, feel free to point out.
> 
> - Anshuman
> 
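
The suggestion above, marking the movable flag in page->mapping instead of
replacing the pointer, relies on keeping a flag in the low bits of an aligned
pointer. A minimal userspace sketch of that idea follows. The flag value 0x2,
the MODEL_ prefix, the helper names and the use of uintptr_t for the mapping
field are assumptions made for this model only; the kernel's own definitions
and helpers may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins; the real kernel structures are far larger. */
struct address_space { int nr_vmas; };
struct page { uintptr_t mapping; };	/* pointer value plus low flag bits */

/* Assumed flag bit for this model. */
#define MODEL_MAPPING_MOVABLE ((uintptr_t)0x2)

/* The owner sets page->mapping first, as the mail above suggests. */
static void set_mapping(struct page *page, struct address_space *as)
{
	/* The flag bit must be clear in the pointer so it can carry the flag. */
	assert(((uintptr_t)as & MODEL_MAPPING_MOVABLE) == 0);
	page->mapping = (uintptr_t)as;
}

/* Marks the page movable without touching the pointer bits. */
static void set_page_movable(struct page *page)
{
	page->mapping |= MODEL_MAPPING_MOVABLE;
}

static bool page_movable(const struct page *page)
{
	return (page->mapping & MODEL_MAPPING_MOVABLE) != 0;
}

/* Recovers the original address_space by masking the flag off. */
static struct address_space *page_mapping(const struct page *page)
{
	return (struct address_space *)(page->mapping & ~MODEL_MAPPING_MOVABLE);
}
```

With this layout the page is marked movable and the original address_space,
and with it the reverse mapping information, stays reachable.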


Re: [PATCH v6v3 02/12] mm: migrate: support non-lru movable page migration

2016-06-15 Thread Anshuman Khandual
On 06/15/2016 08:02 AM, Minchan Kim wrote:
> Hi,
> 
> On Mon, Jun 13, 2016 at 03:08:19PM +0530, Anshuman Khandual wrote:
>> > On 05/31/2016 05:31 AM, Minchan Kim wrote:
>>> > > @@ -791,6 +921,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>>> > >  	int rc = -EAGAIN;
>>> > >  	int page_was_mapped = 0;
>>> > >  	struct anon_vma *anon_vma = NULL;
>>> > > +	bool is_lru = !__PageMovable(page);
>>> > >  
>>> > >  	if (!trylock_page(page)) {
>>> > >  		if (!force || mode == MIGRATE_ASYNC)
>>> > > @@ -871,6 +1002,11 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>>> > >  		goto out_unlock_both;
>>> > >  	}
>>> > >  
>>> > > +	if (unlikely(!is_lru)) {
>>> > > +		rc = move_to_new_page(newpage, page, mode);
>>> > > +		goto out_unlock_both;
>>> > > +	}
>>> > > +
>> > 
>> > Hello Minchan,
>> > 
>> > I might be missing something here but does this implementation support the
>> > scenario where these non LRU pages owned by the driver mapped as PTE into
>> > process page table ? Because the "goto out_unlock_both" statement above
>> > skips all the PTE unmap, putting a migration PTE and removing the migration
>> > PTE steps.
> You're right. Unfortunately, it doesn't support right now but surely,
> it's my TODO after landing this work.
> 
> Could you share your usecase?

Sure.

My driver has privately managed non LRU pages which get mapped into user space
process page table through f_ops->mmap() and vmops->fault() which then updates
the file RMAP (page->mapping->i_mmap) through page_add_file_rmap(page). One thing
to note here is that the page->mapping eventually points to struct address_space
(file->f_mapping) which belongs to the character device file (created using mknod)
which we are using for establishing the mmap() regions in the user space.

Now as per this new framework, all the pages are to be made __SetPageMovable before
passing the list down to migrate_pages(). Now __SetPageMovable() takes *new* struct
address_space as an argument and replaces the existing page->mapping. Now that's the
problem, we have lost all our connection to the existing file RMAP information. This
stands as a problem when we try to migrate these non LRU pages which are PTE mapped.
The rmap_walk_file() never finds them in the VMA, skips all the migrate PTE steps and
then the migration eventually fails.

Seems like assigning a new struct address_space to the page through __SetPageMovable()
is the source of the problem. Can it take the existing (file->f_mapping) as an argument
in there? Sure, but then can we override file system generic ->isolate(), ->putback(),
->migratepages() functions? I don't think so. I am sure, there must be some work around
to fix this problem for the driver. But we need to rethink this framework from the
point of view of supporting these mapped non LRU pages.

I might be missing something here, feel free to point out.

- Anshuman
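
The problem described above can be illustrated with a toy userspace model in
which the reverse-map walk starts from page->mapping. Everything here is a
hypothetical simplification: the structures are stand-ins, the array replaces
the kernel's i_mmap data structure, and rmap_walk() in this sketch only counts
the VMAs that cover the page's offset.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VMAS 4

/* Simplified stand-ins, not the kernel definitions. */
struct vma { unsigned long pgoff, nr_pages; };
struct address_space { struct vma *i_mmap[MAX_VMAS]; size_t nr; };
struct page { struct address_space *mapping; unsigned long index; };

/* Records that a VMA maps the file behind this address_space. */
static void map_vma(struct address_space *as, struct vma *vma)
{
	assert(as->nr < MAX_VMAS);
	as->i_mmap[as->nr++] = vma;
}

/* Counts the VMAs reachable through page->mapping that cover the page. */
static size_t rmap_walk(const struct page *page)
{
	size_t i, found = 0;

	if (!page->mapping)
		return 0;
	for (i = 0; i < page->mapping->nr; i++) {
		const struct vma *v = page->mapping->i_mmap[i];

		if (page->index >= v->pgoff &&
		    page->index - v->pgoff < v->nr_pages)
			found++;
	}
	return found;
}
```

Pointing page->mapping at a fresh address_space with no VMAs attached makes the
walk come back empty, which corresponds to the failure mode reported above.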



Re: [PATCH v6v3 02/12] mm: migrate: support non-lru movable page migration

2016-06-14 Thread Minchan Kim
Hi,

On Mon, Jun 13, 2016 at 03:08:19PM +0530, Anshuman Khandual wrote:
> On 05/31/2016 05:31 AM, Minchan Kim wrote:
> > @@ -791,6 +921,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
> >  	int rc = -EAGAIN;
> >  	int page_was_mapped = 0;
> >  	struct anon_vma *anon_vma = NULL;
> > +	bool is_lru = !__PageMovable(page);
> >  
> >  	if (!trylock_page(page)) {
> >  		if (!force || mode == MIGRATE_ASYNC)
> > @@ -871,6 +1002,11 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
> >  		goto out_unlock_both;
> >  	}
> >  
> > +	if (unlikely(!is_lru)) {
> > +		rc = move_to_new_page(newpage, page, mode);
> > +		goto out_unlock_both;
> > +	}
> > +
> 
> Hello Minchan,
> 
> I might be missing something here, but does this implementation support the
> scenario where these non-LRU pages owned by the driver are mapped as PTEs
> into a process page table? The "goto out_unlock_both" statement above
> skips all of the PTE unmap, migration PTE installation and migration PTE
> removal steps.

You're right. Unfortunately, it doesn't support that right now, but it's
surely on my TODO list after landing this work.

Could you share your use case?

It would be helpful for merging when I send the patchset.

Thanks!


Re: [PATCH v6v3 02/12] mm: migrate: support non-lru movable page migration

2016-06-13 Thread Anshuman Khandual
On 05/31/2016 05:31 AM, Minchan Kim wrote:
> @@ -791,6 +921,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>  	int rc = -EAGAIN;
>  	int page_was_mapped = 0;
>  	struct anon_vma *anon_vma = NULL;
> +	bool is_lru = !__PageMovable(page);
>  
>  	if (!trylock_page(page)) {
>  		if (!force || mode == MIGRATE_ASYNC)
> @@ -871,6 +1002,11 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>  		goto out_unlock_both;
>  	}
>  
> +	if (unlikely(!is_lru)) {
> +		rc = move_to_new_page(newpage, page, mode);
> +		goto out_unlock_both;
> +	}
> +

Hello Minchan,

I might be missing something here, but does this implementation support the
scenario where these non-LRU pages owned by the driver are mapped as PTEs
into a process page table? The "goto out_unlock_both" statement above
skips all of the PTE unmap, migration PTE installation and migration PTE
removal steps.
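
To make the concern concrete, the control flow can be modelled in user space
roughly as follows. This is only a sketch: model_unmap_and_move() and
struct step_trace are hypothetical names, not kernel code, and the real
__unmap_and_move() does far more (locking, writeback and anon_vma handling).

```c
#include <stdbool.h>

/* Records which of the three migration steps were executed. */
struct step_trace {
	bool unmapped;               /* PTEs replaced by migration PTEs */
	bool moved;                  /* move_to_new_page() was called */
	bool migration_ptes_removed; /* migration PTEs turned back into PTEs */
};

/* Hypothetical model of the tail of __unmap_and_move() with the patch. */
struct step_trace model_unmap_and_move(bool is_lru, bool page_mapped)
{
	struct step_trace t = { false, false, false };

	if (!is_lru) {
		/* New early exit: the driver page is moved right away. */
		t.moved = true;
		return t;	/* models "goto out_unlock_both" */
	}

	if (page_mapped)
		t.unmapped = true;	/* models try_to_unmap() */

	t.moved = true;			/* models move_to_new_page() */

	if (t.unmapped)
		t.migration_ptes_removed = true; /* models remove_migration_ptes() */

	return t;
}
```

In this model a non-LRU page that is mapped into a process never has its PTEs
touched, which is the case asked about above.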

Regards
Anshuman



Re: [PATCH v6v3 02/12] mm: migrate: support non-lru movable page migration

2016-05-31 Thread Minchan Kim
On Tue, May 31, 2016 at 09:52:48AM +0200, Vlastimil Babka wrote:
> On 05/31/2016 02:01 AM, Minchan Kim wrote:
> >Per Vlastimil's review comments.
> >
> >Thanks for the detailed review, Vlastimil!
> >If you have any other concerns, feel free to raise them.
> 
> I don't for now :)
> 
> [...]
> 
> >Cc: Rik van Riel 
> >Cc: Vlastimil Babka 
> >Cc: Joonsoo Kim 
> >Cc: Mel Gorman 
> >Cc: Hugh Dickins 
> >Cc: Rafael Aquini 
> >Cc: virtualizat...@lists.linux-foundation.org
> >Cc: Jonathan Corbet 
> >Cc: John Einar Reitan 
> >Cc: dri-de...@lists.freedesktop.org
> >Cc: Sergey Senozhatsky 
> >Signed-off-by: Gioh Kim 
> >Signed-off-by: Minchan Kim 
> 
> Acked-by: Vlastimil Babka 

Thanks for the review, Vlastimil!


Re: [PATCH v6v3 02/12] mm: migrate: support non-lru movable page migration

2016-05-31 Thread Vlastimil Babka

On 05/31/2016 02:01 AM, Minchan Kim wrote:
> Per Vlastimil's review comments.
>
> Thanks for the detailed review, Vlastimil!
> If you have any other concerns, feel free to raise them.

I don't for now :)

[...]

> Cc: Rik van Riel 
> Cc: Vlastimil Babka 
> Cc: Joonsoo Kim 
> Cc: Mel Gorman 
> Cc: Hugh Dickins 
> Cc: Rafael Aquini 
> Cc: virtualizat...@lists.linux-foundation.org
> Cc: Jonathan Corbet 
> Cc: John Einar Reitan 
> Cc: dri-de...@lists.freedesktop.org
> Cc: Sergey Senozhatsky 
> Signed-off-by: Gioh Kim 
> Signed-off-by: Minchan Kim 

Acked-by: Vlastimil Babka 



[PATCH v6v3 02/12] mm: migrate: support non-lru movable page migration

2016-05-30 Thread Minchan Kim
Per Vlastimil's review comments.

Thanks for the detailed review, Vlastimil!
If you have any other concerns, feel free to raise them.
Once I have resolved everything, I will send v7 rebased on a recent mmotm.

From b14aab2d3ac0702c7b2eec36409d74406d43 Mon Sep 17 00:00:00 2001
From: Minchan Kim 
Date: Fri, 8 Apr 2016 10:34:49 +0900
Subject: [PATCH] mm: migrate: support non-lru movable page migration

Until now we have allowed migration only for LRU pages, and that was
enough to make high-order pages. But recently, embedded systems (e.g.,
webOS, Android) use lots of non-movable pages (e.g., zram, GPU memory),
so we have seen several reports of trouble with small high-order
allocations. There have been several efforts to fix the problem
(e.g., enhancing the compaction algorithm, SLUB fallback to 0-order pages,
reserved memory, vmalloc and so on), but if there are lots of
non-movable pages in the system, those solutions are void in the long run.

So this patch adds a facility to turn non-movable pages into movable
ones. For this feature, the patch adds migration-related functions to
address_space_operations, as well as some page flags.

If a driver wants to make its own pages movable, it should define three
functions, which are function pointers in struct address_space_operations.

1. bool (*isolate_page) (struct page *page, isolate_mode_t mode);

What the VM expects of the driver's isolate_page function is that it returns
*true* if the driver isolates the page successfully. On returning true, the VM
marks the page as PG_isolated so that concurrent isolation on several CPUs
skips the page. If the driver cannot isolate the page, it should return *false*.

Once a page is successfully isolated, the VM uses the page.lru fields, so the
driver shouldn't expect the values in those fields to be preserved.

2. int (*migratepage) (struct address_space *mapping,
struct page *newpage, struct page *oldpage, enum migrate_mode);

After isolation, the VM calls the driver's migratepage with the isolated page.
The job of migratepage is to move the content of the old page to the new page
and set up the fields of struct page newpage. Keep in mind that if you migrated
the oldpage successfully and return 0, you should indicate to the VM that the
oldpage is no longer movable via __ClearPageMovable() under page_lock.
If the driver cannot migrate the page at the moment, it can return -EAGAIN.
On -EAGAIN, the VM will retry page migration after a short time because it
interprets -EAGAIN as a "temporary migration failure". On any error other than
-EAGAIN, the VM will give up on migrating the page without retrying this time.

The driver shouldn't touch the page.lru field in these functions, because the
VM is using it.

3. void (*putback_page)(struct page *);

If migration fails on an isolated page, the VM should return the isolated page
to the driver, so the VM calls the driver's putback_page with the page that
failed to migrate. In this function, the driver should put the isolated page
back into its own data structure.
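
The contract of the three callbacks can be sketched as a small user-space
model. This is not kernel code: struct model_page, the model_* functions and
the max_retries parameter are all hypothetical stand-ins, and locking is
omitted. It only illustrates who changes which state and how -EAGAIN differs
from success.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE 64

struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
	bool movable;        /* stands in for PG_movable */
	bool isolated;       /* stands in for PG_isolated */
	bool on_driver_list; /* page sits in the driver's own data structure */
	int busy;            /* how many times migratepage reports -EAGAIN */
};

/* Driver side: take the page out of the driver's data structure. */
bool model_isolate_page(struct model_page *page)
{
	if (!page->on_driver_list)
		return false;
	page->on_driver_list = false;
	return true;
}

/* Driver side: copy the content, set up newpage, retire oldpage. */
int model_migratepage(struct model_page *newpage, struct model_page *oldpage)
{
	if (oldpage->busy > 0) {
		oldpage->busy--;
		return -EAGAIN;		/* temporary failure, VM may retry */
	}
	memcpy(newpage->data, oldpage->data, MODEL_PAGE_SIZE);
	newpage->movable = true;
	newpage->on_driver_list = true;
	oldpage->movable = false;	/* models __ClearPageMovable() */
	return 0;
}

/* Driver side: migration failed, take the page back. */
void model_putback_page(struct model_page *page)
{
	page->on_driver_list = true;
}

/* VM side: isolate, try to migrate, put back on failure. */
int model_migrate_one(struct model_page *newpage, struct model_page *oldpage,
		      int max_retries)
{
	int rc = -EAGAIN;
	int pass;

	assert(newpage != oldpage);

	if (!oldpage->movable || oldpage->isolated)
		return -EBUSY;		/* not ours, or another CPU has it */
	if (!model_isolate_page(oldpage))
		return -EBUSY;
	oldpage->isolated = true;

	/* Only -EAGAIN is retried; any other result ends the loop. */
	for (pass = 0; pass < max_retries && rc == -EAGAIN; pass++)
		rc = model_migratepage(newpage, oldpage);

	oldpage->isolated = false;
	if (rc != 0)
		model_putback_page(oldpage);
	return rc;
}
```

In this model, a page that stays busy for the whole retry budget ends up back
on the driver's list and still movable, so a later pass can pick it up again.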

4. non-lru movable page flags

There are two page flags for supporting non-lru movable pages.

* PG_movable

The driver should use the function below to make a page movable under page_lock.

void __SetPageMovable(struct page *page, struct address_space *mapping)

It needs an address_space argument for registering the family of migration
functions that the VM will call. Strictly speaking, PG_movable is not a real
flag of struct page. Rather, the VM reuses the lower bits of page->mapping to
represent it.

#define PAGE_MAPPING_MOVABLE 0x2
page->mapping = page->mapping | PAGE_MAPPING_MOVABLE;

so the driver shouldn't access page->mapping directly. Instead, the driver
should use page_mapping, which masks off the low two bits of page->mapping, so
it can get the right struct address_space.
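
The low-bit tagging can be illustrated with a stand-alone user-space sketch.
The struct definitions, the MODEL_* names and the three helpers below are
hypothetical stand-ins rather than the kernel's; the 0x3 mask is an assumption
derived from "the low two bits" above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real kernel structures are far larger. */
struct address_space { int dummy; };
struct page { uintptr_t mapping; };

#define MODEL_MAPPING_MOVABLE ((uintptr_t)0x2) /* value quoted above */
#define MODEL_MAPPING_MASK    ((uintptr_t)0x3) /* assumed: low two bits */

/* Store the mapping pointer with the movable bit folded into it. */
void model_set_page_movable(struct page *page, struct address_space *mapping)
{
	/* The trick only works if the pointer's low two bits are free. */
	assert(((uintptr_t)mapping & MODEL_MAPPING_MASK) == 0);
	page->mapping = (uintptr_t)mapping | MODEL_MAPPING_MOVABLE;
}

/* Cheap, lockless test of the tag bits only. */
bool model_page_movable(const struct page *page)
{
	return (page->mapping & MODEL_MAPPING_MASK) == MODEL_MAPPING_MOVABLE;
}

/* Recover the real pointer by masking off the tag bits. */
struct address_space *model_page_mapping(const struct page *page)
{
	return (struct address_space *)(page->mapping & ~MODEL_MAPPING_MASK);
}
```

Reading page->mapping directly would yield the tagged value, which is not a
valid pointer; that is why the accessor has to mask the bits off first.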

To test for a non-lru movable page, the VM provides the __PageMovable function.
However, it doesn't guarantee identifying a non-lru movable page because the
page->mapping field is unified with other variables in struct page.
Also, if the driver releases the page after isolation by the VM, page->mapping
doesn't have a stable value although it has PAGE_MAPPING_MOVABLE set
(look at __ClearPageMovable). But __PageMovable is a cheap way to tell whether
a page is LRU or non-lru movable once the page has been isolated, because
LRU pages can never have PAGE_MAPPING_MOVABLE in page->mapping. It is also
good for just peeking to test for non-lru movable pages before the more
expensive check with lock_page during pfn scanning to select a victim.

To guarantee a non-lru movable page, the VM provides the PageMovable function.
Unlike __PageMovable, PageMovable validates page->mapping and
mapping->a_ops->isolate_page under lock_page. The lock_page prevents
page->mapping from being destroyed suddenly.

A driver using __SetPageMovable should clear the flag via __ClearPageMovable
under page_lock before releasing the page.

* PG_isolated

To prevent concurrent isolation among several CPUs, the VM marks an isolated
page as PG_isolated under lock_page. So if a CPU encounters a PG_isolated non-lru
movable page, it can skip it. Driver doesn't need to manipulate the