Re: [PATCH 2/2] thp: support split page table lock

2013-09-09 Thread Alex Thorlton
On Mon, Sep 09, 2013 at 10:34:45AM +0800, Daniel J Blueman wrote:
> On Saturday, 7 September 2013 02:10:02 UTC+8, Naoya Horiguchi  wrote:
> >Hi Alex,
> >
> >On Fri, Sep 06, 2013 at 11:04:23AM -0500, Alex Thorlton wrote:
> >> On Thu, Sep 05, 2013 at 05:27:46PM -0400, Naoya Horiguchi wrote:
> >> > Thp related code also uses per process mm->page_table_lock now.
> >> > So making it fine-grained can provide better performance.
> >> >
> >> > This patch makes thp support split page table lock by using page->ptl
> >> > of the pages storing "pmd_trans_huge" pmds.
> >> >
> >> > Some functions like pmd_trans_huge_lock() and
> page_check_address_pmd()
> >> > are expected by their caller to pass back the pointer of ptl, so this
> >> > patch adds to those functions new arguments for that. Rather than
> that,
> >> > this patch gives only straightforward replacement.
> >> >
> >> > ChangeLog v3:
> >> >  - fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
> >> >  - added missing declaration of ptl in do_huge_pmd_anonymous_page()
> >>
> >> I've applied these and tested them using the same tests program that I
> >> used when I was working on the same issue, and I'm running into some
> >> bugs.  Here's a stack trace:
> >
> >Thank you for helping testing. This bug is new to me.
> 
> With 3.11, this patch series and CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS,
> I consistently hit the same failure when exiting one of my
> stress-testers [1] when using eg 24 cores.
> 
> Doesn't happen with 8 cores, so likely needs enough virtual memory
> to use multiple split locks. Otherwise, this is very promising work!

Daniel,

Hillf Danton (dhi...@gmail.com) suggested putting the big
page_table_lock back into the two functions seen below.  I re-tested
with this change and it seems to resolve the issue.  

- Alex

--- a/mm/pgtable-generic.c  Sat Sep  7 15:17:52 2013
+++ b/mm/pgtable-generic.c  Sat Sep  7 15:20:28 2013
@@ -127,12 +127,14 @@ void pmdp_splitting_flush(struct vm_area
 void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
pgtable_t pgtable)
 {
+   spin_lock(>page_table_lock);
/* FIFO */
if (!mm->pmd_huge_pte)
INIT_LIST_HEAD(>lru);
else
list_add(>lru, >pmd_huge_pte->lru);
mm->pmd_huge_pte = pgtable;
+   spin_unlock(>page_table_lock);
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 #endif
@@ -144,6 +146,7 @@ pgtable_t pgtable_trans_huge_withdraw(st
 {
pgtable_t pgtable;

+   spin_lock(>page_table_lock);
/* FIFO */
pgtable = mm->pmd_huge_pte;
if (list_empty(>lru))
@@ -153,6 +156,7 @@ pgtable_t pgtable_trans_huge_withdraw(st
  struct page, lru);
list_del(>lru);
}
+   spin_unlock(>page_table_lock);
return pgtable;
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] thp: support split page table lock

2013-09-09 Thread Alex Thorlton
On Mon, Sep 09, 2013 at 10:34:45AM +0800, Daniel J Blueman wrote:
 On Saturday, 7 September 2013 02:10:02 UTC+8, Naoya Horiguchi  wrote:
 Hi Alex,
 
 On Fri, Sep 06, 2013 at 11:04:23AM -0500, Alex Thorlton wrote:
  On Thu, Sep 05, 2013 at 05:27:46PM -0400, Naoya Horiguchi wrote:
   Thp related code also uses per process mm-page_table_lock now.
   So making it fine-grained can provide better performance.
  
   This patch makes thp support split page table lock by using page-ptl
   of the pages storing pmd_trans_huge pmds.
  
   Some functions like pmd_trans_huge_lock() and
 page_check_address_pmd()
   are expected by their caller to pass back the pointer of ptl, so this
   patch adds to those functions new arguments for that. Rather than
 that,
   this patch gives only straightforward replacement.
  
   ChangeLog v3:
- fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
- added missing declaration of ptl in do_huge_pmd_anonymous_page()
 
  I've applied these and tested them using the same tests program that I
  used when I was working on the same issue, and I'm running into some
  bugs.  Here's a stack trace:
 
 Thank you for helping testing. This bug is new to me.
 
 With 3.11, this patch series and CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS,
 I consistently hit the same failure when exiting one of my
 stress-testers [1] when using eg 24 cores.
 
 Doesn't happen with 8 cores, so likely needs enough virtual memory
 to use multiple split locks. Otherwise, this is very promising work!

Daniel,

Hillf Danton (dhi...@gmail.com) suggested putting the big
page_table_lock back into the two functions seen below.  I re-tested
with this change and it seems to resolve the issue.  

- Alex

--- a/mm/pgtable-generic.c  Sat Sep  7 15:17:52 2013
+++ b/mm/pgtable-generic.c  Sat Sep  7 15:20:28 2013
@@ -127,12 +127,14 @@ void pmdp_splitting_flush(struct vm_area
 void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
pgtable_t pgtable)
 {
+   spin_lock(mm-page_table_lock);
/* FIFO */
if (!mm-pmd_huge_pte)
INIT_LIST_HEAD(pgtable-lru);
else
list_add(pgtable-lru, mm-pmd_huge_pte-lru);
mm-pmd_huge_pte = pgtable;
+   spin_unlock(mm-page_table_lock);
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 #endif
@@ -144,6 +146,7 @@ pgtable_t pgtable_trans_huge_withdraw(st
 {
pgtable_t pgtable;

+   spin_lock(mm-page_table_lock);
/* FIFO */
pgtable = mm-pmd_huge_pte;
if (list_empty(pgtable-lru))
@@ -153,6 +156,7 @@ pgtable_t pgtable_trans_huge_withdraw(st
  struct page, lru);
list_del(pgtable-lru);
}
+   spin_unlock(mm-page_table_lock);
return pgtable;
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
--
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] thp: support split page table lock

2013-09-08 Thread Daniel J Blueman

On Saturday, 7 September 2013 02:10:02 UTC+8, Naoya Horiguchi  wrote:

Hi Alex,

On Fri, Sep 06, 2013 at 11:04:23AM -0500, Alex Thorlton wrote:
> On Thu, Sep 05, 2013 at 05:27:46PM -0400, Naoya Horiguchi wrote:
> > Thp related code also uses per process mm->page_table_lock now.
> > So making it fine-grained can provide better performance.
> >
> > This patch makes thp support split page table lock by using page->ptl
> > of the pages storing "pmd_trans_huge" pmds.
> >
> > Some functions like pmd_trans_huge_lock() and

page_check_address_pmd()

> > are expected by their caller to pass back the pointer of ptl, so this
> > patch adds to those functions new arguments for that. Rather than

that,

> > this patch gives only straightforward replacement.
> >
> > ChangeLog v3:
> >  - fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
> >  - added missing declaration of ptl in do_huge_pmd_anonymous_page()
>
> I've applied these and tested them using the same tests program that I
> used when I was working on the same issue, and I'm running into some
> bugs.  Here's a stack trace:

Thank you for helping testing. This bug is new to me.


With 3.11, this patch series and CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS, I 
consistently hit the same failure when exiting one of my stress-testers 
[1] when using eg 24 cores.


Doesn't happen with 8 cores, so likely needs enough virtual memory to 
use multiple split locks. Otherwise, this is very promising work!


[1] http://quora.org/2013/fft3d.c
--
Daniel J Blueman
Principal Software Engineer, Numascale
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 2/2] thp: support split page table lock

2013-09-08 Thread Aneesh Kumar K.V
"Kirill A. Shutemov"  writes:

> Naoya Horiguchi wrote:
>> Thp related code also uses per process mm->page_table_lock now.
>> So making it fine-grained can provide better performance.
>> 
>> This patch makes thp support split page table lock by using page->ptl
>> of the pages storing "pmd_trans_huge" pmds.
>> 
>> Some functions like pmd_trans_huge_lock() and page_check_address_pmd()
>> are expected by their caller to pass back the pointer of ptl, so this
>> patch adds to those functions new arguments for that. Rather than that,
>> this patch gives only straightforward replacement.
>> 
>> ChangeLog v3:
>>  - fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
>>  - added missing declaration of ptl in do_huge_pmd_anonymous_page()
>> 
>> Signed-off-by: Naoya Horiguchi 
>
> Generally, looks good. Few notes:
>
> I believe you need to convert __pte_alloc() to new locking. Not sure about
> __pte_alloc_kernel().
> Have you check all rest mm->page_table_lock, that they shouldn't be
> converted to new locking?

May be we can have a CONFIG_DEBUG_VM version of pmd_populate that check 
check with assert_spin_locked ?


-aneesh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 2/2] thp: support split page table lock

2013-09-08 Thread Aneesh Kumar K.V
Kirill A. Shutemov kirill.shute...@linux.intel.com writes:

 Naoya Horiguchi wrote:
 Thp related code also uses per process mm-page_table_lock now.
 So making it fine-grained can provide better performance.
 
 This patch makes thp support split page table lock by using page-ptl
 of the pages storing pmd_trans_huge pmds.
 
 Some functions like pmd_trans_huge_lock() and page_check_address_pmd()
 are expected by their caller to pass back the pointer of ptl, so this
 patch adds to those functions new arguments for that. Rather than that,
 this patch gives only straightforward replacement.
 
 ChangeLog v3:
  - fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
  - added missing declaration of ptl in do_huge_pmd_anonymous_page()
 
 Signed-off-by: Naoya Horiguchi n-horigu...@ah.jp.nec.com

 Generally, looks good. Few notes:

 I believe you need to convert __pte_alloc() to new locking. Not sure about
 __pte_alloc_kernel().
 Have you check all rest mm-page_table_lock, that they shouldn't be
 converted to new locking?

May be we can have a CONFIG_DEBUG_VM version of pmd_populate that check 
check with assert_spin_locked ?


-aneesh

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] thp: support split page table lock

2013-09-08 Thread Daniel J Blueman

On Saturday, 7 September 2013 02:10:02 UTC+8, Naoya Horiguchi  wrote:

Hi Alex,

On Fri, Sep 06, 2013 at 11:04:23AM -0500, Alex Thorlton wrote:
 On Thu, Sep 05, 2013 at 05:27:46PM -0400, Naoya Horiguchi wrote:
  Thp related code also uses per process mm-page_table_lock now.
  So making it fine-grained can provide better performance.
 
  This patch makes thp support split page table lock by using page-ptl
  of the pages storing pmd_trans_huge pmds.
 
  Some functions like pmd_trans_huge_lock() and

page_check_address_pmd()

  are expected by their caller to pass back the pointer of ptl, so this
  patch adds to those functions new arguments for that. Rather than

that,

  this patch gives only straightforward replacement.
 
  ChangeLog v3:
   - fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
   - added missing declaration of ptl in do_huge_pmd_anonymous_page()

 I've applied these and tested them using the same tests program that I
 used when I was working on the same issue, and I'm running into some
 bugs.  Here's a stack trace:

Thank you for helping testing. This bug is new to me.


With 3.11, this patch series and CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS, I 
consistently hit the same failure when exiting one of my stress-testers 
[1] when using eg 24 cores.


Doesn't happen with 8 cores, so likely needs enough virtual memory to 
use multiple split locks. Otherwise, this is very promising work!


[1] http://quora.org/2013/fft3d.c
--
Daniel J Blueman
Principal Software Engineer, Numascale
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] thp: support split page table lock

2013-09-06 Thread Naoya Horiguchi
Hi Alex,

On Fri, Sep 06, 2013 at 11:04:23AM -0500, Alex Thorlton wrote:
> On Thu, Sep 05, 2013 at 05:27:46PM -0400, Naoya Horiguchi wrote:
> > Thp related code also uses per process mm->page_table_lock now.
> > So making it fine-grained can provide better performance.
> > 
> > This patch makes thp support split page table lock by using page->ptl
> > of the pages storing "pmd_trans_huge" pmds.
> > 
> > Some functions like pmd_trans_huge_lock() and page_check_address_pmd()
> > are expected by their caller to pass back the pointer of ptl, so this
> > patch adds to those functions new arguments for that. Rather than that,
> > this patch gives only straightforward replacement.
> > 
> > ChangeLog v3:
> >  - fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
> >  - added missing declaration of ptl in do_huge_pmd_anonymous_page()
> 
> I've applied these and tested them using the same tests program that I
> used when I was working on the same issue, and I'm running into some
> bugs.  Here's a stack trace:

Thank you for helping testing. This bug is new to me.

> general protection fault:  [#1] SMP 
> Modules linked in:
> CPU: 268 PID: 32381 Comm: memscale Not tainted
> 3.11.0-medusa-03121-g757f8ca #184
> Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS
> 01/15/2013
> task: 880fbdd82180 ti: 880fc0c5a000 task.ti: 880fc0c5a000
> RIP: 0010:[]  []
> pgtable_trans_huge_withdraw+0x38/0x60
> RSP: 0018:880fc0c5bc88  EFLAGS: 00010297
> RAX: ea17cebe8838 RBX: 0015309bd000 RCX: ea01f623b028
> RDX: dead00100100 RSI: 8dcf77d84c30 RDI: 880fbda67580
> RBP: 880fc0c5bc88 R08: 0013 R09: 00014da0
> R10: 880fc0c5bc88 R11: 888f7efda000 R12: 8dcf77d84c30
> R13: 880fc0c5bdf8 R14: 85cf401ff067 R15: 8b4de5fabff8
> FS:  () GS:880fffd8()
> knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 7768b0b8 CR3: 01a0b000 CR4: 000407e0
> Stack:
>  880fc0c5bcc8 810f7643 880fc0c5bcc8 810d8297
>  ea1456237510 7fc7b0e0  7fc7b0c0
>  880fc0c5bda8 810d85ba 880fc0c5bd48 880fc0c5bd68
> Call Trace:
>  [] zap_huge_pmd+0x4c/0x101
>  [] ? tlb_flush_mmu+0x58/0x75
>  [] unmap_single_vma+0x306/0x7d6
>  [] unmap_vmas+0x4f/0x82
>  [] exit_mmap+0x8b/0x113
>  [] ? __delayacct_add_tsk+0x170/0x182
>  [] mmput+0x3e/0xc4
>  [] do_exit+0x380/0x907
>  [] ? vfs_write+0x149/0x1a3
>  [] do_group_exit+0x72/0x9b
>  [] SyS_exit_group+0x12/0x16
>  [] system_call_fastpath+0x16/0x1b
> Code: 51 20 48 8d 41 20 48 39 c2 75 0d 48 c7 87 28 03 00 00 00 00 00 00
> eb 36 48 8d 42 e0 48 89 87 28 03 00 00 48 8b 51 20 48 8b 41 28 <48> 89
> 42 08 48 89 10 48 ba 00 01 10 00 00 00 ad de 48 b8 00 02 
> RIP  [] pgtable_trans_huge_withdraw+0x38/0x60
>  RSP 
> ---[ end trace e5413b388b6ea448 ]---
> Fixing recursive fault but reboot is needed!
> general protection fault:  [#2] SMP 
> Modules linked in:
> CPU: 268 PID: 1722 Comm: kworker/268:1 Tainted: G  D
> 3.11.0-medusa-03121-g757f8ca #184
> Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS
> 01/15/2013
> Workqueue: events vmstat_update
> task: 880fc1a74280 ti: 880fc1a76000 task.ti: 880fc1a76000
> RIP: 0010:[]  []
> free_pcppages_bulk+0x97/0x329
> RSP: 0018:880fc1a77c98  EFLAGS: 00010082
> RAX: 880fffd94d68 RBX: dead002001e0 RCX: 880fffd94d50
> RDX: 880fffd94d68 RSI: 001f RDI: 888f7efdac68
> RBP: 880fc1a77cf8 R08: 0400 R09: 81a8bf00
> R10: 884f7efdac00 R11: 81009bae R12: dead00200200
> R13: 888f7efdac00 R14: 001f R15: 
> FS:  () GS:880fffd8()
> knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 7768b0b8 CR3: 01a0b000 CR4: 000407e0
> Stack:
>  880fc1a77ce8 880fffd94d68 0010 880fffd94d50
>  001ff9276a68 880fffd94d60  001f
>  880fffd94d50 0292 880fc1a77d38 880fffd95d05
> Call Trace:
>  [] drain_zone_pages+0x33/0x42
>  [] refresh_cpu_vm_stats+0xcc/0x11e
>  [] vmstat_update+0x11/0x43
>  [] process_one_work+0x260/0x389
>  [] worker_thread+0x1e2/0x332
>  [] ? process_one_work+0x389/0x389
>  [] kthread+0xb3/0xbd
>  [] ? process_one_work+0x389/0x389
>  [] ? kthread_freezable_should_stop+0x5b/0x5b
>  [] ret_from_fork+0x7c/0xb0
>  [] ? kthread_freezable_should_stop+0x5b/0x5b
> Code: 48 89 55 c8 48 39 14 08 74 ce 41 83 fe 03 44 0f 44 75 c4 48 83 c2
> 08 48 89 45 b0 48 89 55 a8 48 8b 45 a8 4c 8b 20 49 8d 5c 24 e0 <48> 8b
> 53 20 48 8b 43 28 48 89 42 08 48 89 10 48 ba 00 01 10 00 
> RIP  [] free_pcppages_bulk+0x97/0x329
>  RSP 
> ---[ end trace e5413b388b6ea449 ]---
> BUG: unable to handle kernel paging request at ffd8
> IP: [] kthread_data+0xb/0x11
> PGD 1a0c067 

Re: [PATCH 2/2] thp: support split page table lock

2013-09-06 Thread Naoya Horiguchi
On Fri, Sep 06, 2013 at 07:46:25PM +0300, Kirill A. Shutemov wrote:
> Naoya Horiguchi wrote:
> > Hi Kirill,
> > 
> > On Fri, Sep 06, 2013 at 01:48:03PM +0300, Kirill A. Shutemov wrote:
> > > Naoya Horiguchi wrote:
> > > > Thp related code also uses per process mm->page_table_lock now.
> > > > So making it fine-grained can provide better performance.
> > > > 
> > > > This patch makes thp support split page table lock by using page->ptl
> > > > of the pages storing "pmd_trans_huge" pmds.
> > > > 
> > > > Some functions like pmd_trans_huge_lock() and page_check_address_pmd()
> > > > are expected by their caller to pass back the pointer of ptl, so this
> > > > patch adds to those functions new arguments for that. Rather than that,
> > > > this patch gives only straightforward replacement.
> > > > 
> > > > ChangeLog v3:
> > > >  - fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
> > > >  - added missing declaration of ptl in do_huge_pmd_anonymous_page()
> > > > 
> > > > Signed-off-by: Naoya Horiguchi 
> > > 
> > > Generally, looks good. Few notes:
> > > 
> > > I believe you need to convert __pte_alloc() to new locking. Not sure about
> > > __pte_alloc_kernel().
> > > Have you check all rest mm->page_table_lock, that they shouldn't be
> > > converted to new locking?
> > 
> > I thought that keeping __pte_alloc() using mm->page_table_lock was safe
> > because it uses bare mm->page_table_lock instead of pte_lockptr() even
> > before this patchset, but not 100% sure.
> > __pte_alloc() (and its family) are used in normal page path, so if it's
> > not safe, we've lived with unsafe code for very long (maybe since 2005).
> > Anyway, converting __pte_alloc() into split ptl could improve performance
> > (though we need testing to know what amount), so I'll try that.
> 
> No, before the patch mm->page_table_lock is what we need: it serializes
> setting up pmd, not adding pte to pmd and it's subject to change with new
> locking model.

OK, I see. This is a todo for the next post.
Thank you for explanation.

Naoya
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] thp: support split page table lock

2013-09-06 Thread Kirill A. Shutemov
Naoya Horiguchi wrote:
> Hi Kirill,
> 
> On Fri, Sep 06, 2013 at 01:48:03PM +0300, Kirill A. Shutemov wrote:
> > Naoya Horiguchi wrote:
> > > Thp related code also uses per process mm->page_table_lock now.
> > > So making it fine-grained can provide better performance.
> > > 
> > > This patch makes thp support split page table lock by using page->ptl
> > > of the pages storing "pmd_trans_huge" pmds.
> > > 
> > > Some functions like pmd_trans_huge_lock() and page_check_address_pmd()
> > > are expected by their caller to pass back the pointer of ptl, so this
> > > patch adds to those functions new arguments for that. Rather than that,
> > > this patch gives only straightforward replacement.
> > > 
> > > ChangeLog v3:
> > >  - fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
> > >  - added missing declaration of ptl in do_huge_pmd_anonymous_page()
> > > 
> > > Signed-off-by: Naoya Horiguchi 
> > 
> > Generally, looks good. Few notes:
> > 
> > I believe you need to convert __pte_alloc() to new locking. Not sure about
> > __pte_alloc_kernel().
> > Have you check all rest mm->page_table_lock, that they shouldn't be
> > converted to new locking?
> 
> I thought that keeping __pte_alloc() using mm->page_table_lock was safe
> because it uses bare mm->page_table_lock instead of pte_lockptr() even
> before this patchset, but not 100% sure.
> __pte_alloc() (and its family) are used in normal page path, so if it's
> not safe, we've lived with unsafe code for very long (maybe since 2005).
> Anyway, converting __pte_alloc() into split ptl could improve performance
> (though we need testing to know what amount), so I'll try that.

No, before the patch mm->page_table_lock is what we need: it serializes
setting up pmd, not adding pte to pmd and it's subject to change with new
locking model.

-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] thp: support split page table lock

2013-09-06 Thread Naoya Horiguchi
Hi Kirill,

On Fri, Sep 06, 2013 at 01:48:03PM +0300, Kirill A. Shutemov wrote:
> Naoya Horiguchi wrote:
> > Thp related code also uses per process mm->page_table_lock now.
> > So making it fine-grained can provide better performance.
> > 
> > This patch makes thp support split page table lock by using page->ptl
> > of the pages storing "pmd_trans_huge" pmds.
> > 
> > Some functions like pmd_trans_huge_lock() and page_check_address_pmd()
> > are expected by their caller to pass back the pointer of ptl, so this
> > patch adds to those functions new arguments for that. Rather than that,
> > this patch gives only straightforward replacement.
> > 
> > ChangeLog v3:
> >  - fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
> >  - added missing declaration of ptl in do_huge_pmd_anonymous_page()
> > 
> > Signed-off-by: Naoya Horiguchi 
> 
> Generally, looks good. Few notes:
> 
> I believe you need to convert __pte_alloc() to new locking. Not sure about
> __pte_alloc_kernel().
> Have you check all rest mm->page_table_lock, that they shouldn't be
> converted to new locking?

I thought that keeping __pte_alloc() using mm->page_table_lock was safe
because it uses bare mm->page_table_lock instead of pte_lockptr() even
before this patchset, but not 100% sure.
__pte_alloc() (and its family) are used in normal page path, so if it's
not safe, we've lived with unsafe code for very long (maybe since 2005).
Anyway, converting __pte_alloc() into split ptl could improve performance
(though we need testing to know what amount), so I'll try that.

> You use uninitialized_var() a lot. It's ugly. I've check few places
> (task_mmu.c, copy_huge_pmd) and have found a reason why we need it there.
> Why?

I got a compile warning of uninitialized usage when developing and added
to suppress it, but in the final form I never get such a warning.
So I'll remove this uninitialized_var()s.

> You often do
> 
> +   ptl = huge_pmd_lockptr(mm, pmd);
> +   spin_lock(ptl);
> 
> Should we have a helper to combine them? huge_pmd_lock()?

OK, I'll do it.

Thanks,
Naoya Horiguchi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] thp: support split page table lock

2013-09-06 Thread Alex Thorlton
On Thu, Sep 05, 2013 at 05:27:46PM -0400, Naoya Horiguchi wrote:
> Thp related code also uses per process mm->page_table_lock now.
> So making it fine-grained can provide better performance.
> 
> This patch makes thp support split page table lock by using page->ptl
> of the pages storing "pmd_trans_huge" pmds.
> 
> Some functions like pmd_trans_huge_lock() and page_check_address_pmd()
> are expected by their caller to pass back the pointer of ptl, so this
> patch adds to those functions new arguments for that. Rather than that,
> this patch gives only straightforward replacement.
> 
> ChangeLog v3:
>  - fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
>  - added missing declaration of ptl in do_huge_pmd_anonymous_page()

I've applied these and tested them using the same tests program that I
used when I was working on the same issue, and I'm running into some
bugs.  Here's a stack trace:

general protection fault:  [#1] SMP 
Modules linked in:
CPU: 268 PID: 32381 Comm: memscale Not tainted
3.11.0-medusa-03121-g757f8ca #184
Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS
01/15/2013
task: 880fbdd82180 ti: 880fc0c5a000 task.ti: 880fc0c5a000
RIP: 0010:[]  []
pgtable_trans_huge_withdraw+0x38/0x60
RSP: 0018:880fc0c5bc88  EFLAGS: 00010297
RAX: ea17cebe8838 RBX: 0015309bd000 RCX: ea01f623b028
RDX: dead00100100 RSI: 8dcf77d84c30 RDI: 880fbda67580
RBP: 880fc0c5bc88 R08: 0013 R09: 00014da0
R10: 880fc0c5bc88 R11: 888f7efda000 R12: 8dcf77d84c30
R13: 880fc0c5bdf8 R14: 85cf401ff067 R15: 8b4de5fabff8
FS:  () GS:880fffd8()
knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7768b0b8 CR3: 01a0b000 CR4: 000407e0
Stack:
 880fc0c5bcc8 810f7643 880fc0c5bcc8 810d8297
 ea1456237510 7fc7b0e0  7fc7b0c0
 880fc0c5bda8 810d85ba 880fc0c5bd48 880fc0c5bd68
Call Trace:
 [] zap_huge_pmd+0x4c/0x101
 [] ? tlb_flush_mmu+0x58/0x75
 [] unmap_single_vma+0x306/0x7d6
 [] unmap_vmas+0x4f/0x82
 [] exit_mmap+0x8b/0x113
 [] ? __delayacct_add_tsk+0x170/0x182
 [] mmput+0x3e/0xc4
 [] do_exit+0x380/0x907
 [] ? vfs_write+0x149/0x1a3
 [] do_group_exit+0x72/0x9b
 [] SyS_exit_group+0x12/0x16
 [] system_call_fastpath+0x16/0x1b
Code: 51 20 48 8d 41 20 48 39 c2 75 0d 48 c7 87 28 03 00 00 00 00 00 00
eb 36 48 8d 42 e0 48 89 87 28 03 00 00 48 8b 51 20 48 8b 41 28 <48> 89
42 08 48 89 10 48 ba 00 01 10 00 00 00 ad de 48 b8 00 02 
RIP  [] pgtable_trans_huge_withdraw+0x38/0x60
 RSP 
---[ end trace e5413b388b6ea448 ]---
Fixing recursive fault but reboot is needed!
general protection fault:  [#2] SMP 
Modules linked in:
CPU: 268 PID: 1722 Comm: kworker/268:1 Tainted: G  D
3.11.0-medusa-03121-g757f8ca #184
Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS
01/15/2013
Workqueue: events vmstat_update
task: 880fc1a74280 ti: 880fc1a76000 task.ti: 880fc1a76000
RIP: 0010:[]  []
free_pcppages_bulk+0x97/0x329
RSP: 0018:880fc1a77c98  EFLAGS: 00010082
RAX: 880fffd94d68 RBX: dead002001e0 RCX: 880fffd94d50
RDX: 880fffd94d68 RSI: 001f RDI: 888f7efdac68
RBP: 880fc1a77cf8 R08: 0400 R09: 81a8bf00
R10: 884f7efdac00 R11: 81009bae R12: dead00200200
R13: 888f7efdac00 R14: 001f R15: 
FS:  () GS:880fffd8()
knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7768b0b8 CR3: 01a0b000 CR4: 000407e0
Stack:
 880fc1a77ce8 880fffd94d68 0010 880fffd94d50
 001ff9276a68 880fffd94d60  001f
 880fffd94d50 0292 880fc1a77d38 880fffd95d05
Call Trace:
 [] drain_zone_pages+0x33/0x42
 [] refresh_cpu_vm_stats+0xcc/0x11e
 [] vmstat_update+0x11/0x43
 [] process_one_work+0x260/0x389
 [] worker_thread+0x1e2/0x332
 [] ? process_one_work+0x389/0x389
 [] kthread+0xb3/0xbd
 [] ? process_one_work+0x389/0x389
 [] ? kthread_freezable_should_stop+0x5b/0x5b
 [] ret_from_fork+0x7c/0xb0
 [] ? kthread_freezable_should_stop+0x5b/0x5b
Code: 48 89 55 c8 48 39 14 08 74 ce 41 83 fe 03 44 0f 44 75 c4 48 83 c2
08 48 89 45 b0 48 89 55 a8 48 8b 45 a8 4c 8b 20 49 8d 5c 24 e0 <48> 8b
53 20 48 8b 43 28 48 89 42 08 48 89 10 48 ba 00 01 10 00 
RIP  [] free_pcppages_bulk+0x97/0x329
 RSP 
---[ end trace e5413b388b6ea449 ]---
BUG: unable to handle kernel paging request at ffd8
IP: [] kthread_data+0xb/0x11
PGD 1a0c067 PUD 1a0e067 PMD 0 
Oops:  [#3] SMP 
Modules linked in:
CPU: 268 PID: 1722 Comm: kworker/268:1 Tainted: G  D
3.11.0-medusa-03121-g757f8ca #184
Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS
01/15/2013
task: 880fc1a74280 ti: 880fc1a76000 task.ti: 880fc1a76000
RIP: 0010:[]  []
kthread_data+0xb/0x11
RSP: 

RE: [PATCH 2/2] thp: support split page table lock

2013-09-06 Thread Kirill A. Shutemov
Naoya Horiguchi wrote:
> Thp related code also uses per process mm->page_table_lock now.
> So making it fine-grained can provide better performance.
> 
> This patch makes thp support split page table lock by using page->ptl
> of the pages storing "pmd_trans_huge" pmds.
> 
> Some functions like pmd_trans_huge_lock() and page_check_address_pmd()
> are expected by their caller to pass back the pointer of ptl, so this
> patch adds to those functions new arguments for that. Rather than that,
> this patch gives only straightforward replacement.
> 
> ChangeLog v3:
>  - fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
>  - added missing declaration of ptl in do_huge_pmd_anonymous_page()
> 
> Signed-off-by: Naoya Horiguchi 

Generally, looks good. Few notes:

I believe you need to convert __pte_alloc() to new locking. Not sure about
__pte_alloc_kernel().
Have you check all rest mm->page_table_lock, that they shouldn't be
converted to new locking?

You use uninitialized_var() a lot. It's ugly. I've check few places
(task_mmu.c, copy_huge_pmd) and have found a reason why we need it there.
Why?

You often do

+   ptl = huge_pmd_lockptr(mm, pmd);
+   spin_lock(ptl);

Should we have a helper to combine them? huge_pmd_lock()?

-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 2/2] thp: support split page table lock

2013-09-06 Thread Kirill A. Shutemov
Naoya Horiguchi wrote:
 Thp related code also uses per process mm-page_table_lock now.
 So making it fine-grained can provide better performance.
 
 This patch makes thp support split page table lock by using page-ptl
 of the pages storing pmd_trans_huge pmds.
 
 Some functions like pmd_trans_huge_lock() and page_check_address_pmd()
 are expected by their caller to pass back the pointer of ptl, so this
 patch adds to those functions new arguments for that. Rather than that,
 this patch gives only straightforward replacement.
 
 ChangeLog v3:
  - fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
  - added missing declaration of ptl in do_huge_pmd_anonymous_page()
 
 Signed-off-by: Naoya Horiguchi n-horigu...@ah.jp.nec.com

Generally, looks good. Few notes:

I believe you need to convert __pte_alloc() to new locking. Not sure about
__pte_alloc_kernel().
Have you check all rest mm-page_table_lock, that they shouldn't be
converted to new locking?

You use uninitialized_var() a lot. It's ugly. I've check few places
(task_mmu.c, copy_huge_pmd) and have found a reason why we need it there.
Why?

You often do

+   ptl = huge_pmd_lockptr(mm, pmd);
+   spin_lock(ptl);

Should we have a helper to combine them? huge_pmd_lock()?

-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] thp: support split page table lock

2013-09-06 Thread Alex Thorlton
On Thu, Sep 05, 2013 at 05:27:46PM -0400, Naoya Horiguchi wrote:
 Thp related code also uses per process mm-page_table_lock now.
 So making it fine-grained can provide better performance.
 
 This patch makes thp support split page table lock by using page-ptl
 of the pages storing pmd_trans_huge pmds.
 
 Some functions like pmd_trans_huge_lock() and page_check_address_pmd()
 are expected by their caller to pass back the pointer of ptl, so this
 patch adds to those functions new arguments for that. Rather than that,
 this patch gives only straightforward replacement.
 
 ChangeLog v3:
  - fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
  - added missing declaration of ptl in do_huge_pmd_anonymous_page()

I've applied these and tested them using the same tests program that I
used when I was working on the same issue, and I'm running into some
bugs.  Here's a stack trace:

general protection fault:  [#1] SMP 
Modules linked in:
CPU: 268 PID: 32381 Comm: memscale Not tainted
3.11.0-medusa-03121-g757f8ca #184
Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS
01/15/2013
task: 880fbdd82180 ti: 880fc0c5a000 task.ti: 880fc0c5a000
RIP: 0010:[810e3eef]  [810e3eef]
pgtable_trans_huge_withdraw+0x38/0x60
RSP: 0018:880fc0c5bc88  EFLAGS: 00010297
RAX: ea17cebe8838 RBX: 0015309bd000 RCX: ea01f623b028
RDX: dead00100100 RSI: 8dcf77d84c30 RDI: 880fbda67580
RBP: 880fc0c5bc88 R08: 0013 R09: 00014da0
R10: 880fc0c5bc88 R11: 888f7efda000 R12: 8dcf77d84c30
R13: 880fc0c5bdf8 R14: 85cf401ff067 R15: 8b4de5fabff8
FS:  () GS:880fffd8()
knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7768b0b8 CR3: 01a0b000 CR4: 000407e0
Stack:
 880fc0c5bcc8 810f7643 880fc0c5bcc8 810d8297
 ea1456237510 7fc7b0e0  7fc7b0c0
 880fc0c5bda8 810d85ba 880fc0c5bd48 880fc0c5bd68
Call Trace:
 [810f7643] zap_huge_pmd+0x4c/0x101
 [810d8297] ? tlb_flush_mmu+0x58/0x75
 [810d85ba] unmap_single_vma+0x306/0x7d6
 [810d8ad9] unmap_vmas+0x4f/0x82
 [810dab5e] exit_mmap+0x8b/0x113
 [810a9743] ? __delayacct_add_tsk+0x170/0x182
 [8103c609] mmput+0x3e/0xc4
 [8104088c] do_exit+0x380/0x907
 [810fb89c] ? vfs_write+0x149/0x1a3
 [81040e85] do_group_exit+0x72/0x9b
 [81040ec0] SyS_exit_group+0x12/0x16
 [814f52d2] system_call_fastpath+0x16/0x1b
Code: 51 20 48 8d 41 20 48 39 c2 75 0d 48 c7 87 28 03 00 00 00 00 00 00
eb 36 48 8d 42 e0 48 89 87 28 03 00 00 48 8b 51 20 48 8b 41 28 48 89
42 08 48 89 10 48 ba 00 01 10 00 00 00 ad de 48 b8 00 02 
RIP  [810e3eef] pgtable_trans_huge_withdraw+0x38/0x60
 RSP 880fc0c5bc88
---[ end trace e5413b388b6ea448 ]---
Fixing recursive fault but reboot is needed!
general protection fault:  [#2] SMP 
Modules linked in:
CPU: 268 PID: 1722 Comm: kworker/268:1 Tainted: G  D
3.11.0-medusa-03121-g757f8ca #184
Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS
01/15/2013
Workqueue: events vmstat_update
task: 880fc1a74280 ti: 880fc1a76000 task.ti: 880fc1a76000
RIP: 0010:[810bcdcb]  [810bcdcb]
free_pcppages_bulk+0x97/0x329
RSP: 0018:880fc1a77c98  EFLAGS: 00010082
RAX: 880fffd94d68 RBX: dead002001e0 RCX: 880fffd94d50
RDX: 880fffd94d68 RSI: 001f RDI: 888f7efdac68
RBP: 880fc1a77cf8 R08: 0400 R09: 81a8bf00
R10: 884f7efdac00 R11: 81009bae R12: dead00200200
R13: 888f7efdac00 R14: 001f R15: 
FS:  () GS:880fffd8()
knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7768b0b8 CR3: 01a0b000 CR4: 000407e0
Stack:
 880fc1a77ce8 880fffd94d68 0010 880fffd94d50
 001ff9276a68 880fffd94d60  001f
 880fffd94d50 0292 880fc1a77d38 880fffd95d05
Call Trace:
 [810bd149] drain_zone_pages+0x33/0x42
 [810cd5a6] refresh_cpu_vm_stats+0xcc/0x11e
 [810cd609] vmstat_update+0x11/0x43
 [8105350f] process_one_work+0x260/0x389
 [8105381a] worker_thread+0x1e2/0x332
 [81053638] ? process_one_work+0x389/0x389
 [810579df] kthread+0xb3/0xbd
 [81053638] ? process_one_work+0x389/0x389
 [8105792c] ? kthread_freezable_should_stop+0x5b/0x5b
 [814f522c] ret_from_fork+0x7c/0xb0
 [8105792c] ? kthread_freezable_should_stop+0x5b/0x5b
Code: 48 89 55 c8 48 39 14 08 74 ce 41 83 fe 03 44 0f 44 75 c4 48 83 c2
08 48 89 45 b0 48 89 55 a8 48 8b 45 a8 4c 8b 20 49 8d 5c 24 e0 48 8b
53 20 48 8b 43 28 48 89 42 08 48 89 10 48 ba 00 01 10 00 
RIP  [810bcdcb] free_pcppages_bulk+0x97/0x329
 RSP 880fc1a77c98
---[ end 

Re: [PATCH 2/2] thp: support split page table lock

2013-09-06 Thread Naoya Horiguchi
Hi Kirill,

On Fri, Sep 06, 2013 at 01:48:03PM +0300, Kirill A. Shutemov wrote:
 Naoya Horiguchi wrote:
  Thp related code also uses per process mm-page_table_lock now.
  So making it fine-grained can provide better performance.
  
  This patch makes thp support split page table lock by using page-ptl
  of the pages storing pmd_trans_huge pmds.
  
  Some functions like pmd_trans_huge_lock() and page_check_address_pmd()
  are expected by their caller to pass back the pointer of ptl, so this
  patch adds to those functions new arguments for that. Rather than that,
  this patch gives only straightforward replacement.
  
  ChangeLog v3:
   - fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
   - added missing declaration of ptl in do_huge_pmd_anonymous_page()
  
  Signed-off-by: Naoya Horiguchi n-horigu...@ah.jp.nec.com
 
 Generally, looks good. Few notes:
 
 I believe you need to convert __pte_alloc() to new locking. Not sure about
 __pte_alloc_kernel().
 Have you check all rest mm-page_table_lock, that they shouldn't be
 converted to new locking?

I thought that keeping __pte_alloc() using mm-page_table_lock was safe
because it uses bare mm-page_table_lock instead of pte_lockptr() even
before this patchset, but not 100% sure.
__pte_alloc() (and its family) are used in normal page path, so if it's
not safe, we've lived with unsafe code for very long (maybe since 2005).
Anyway, converting __pte_alloc() into split ptl could improve performance
(though we need testing to know what amount), so I'll try that.

 You use uninitialized_var() a lot. It's ugly. I've check few places
 (task_mmu.c, copy_huge_pmd) and have found a reason why we need it there.
 Why?

I got a compile warning of uninitialized usage when developing and added
to suppress it, but in the final form I never get such a warning.
So I'll remove this uninitialized_var()s.

 You often do
 
 +   ptl = huge_pmd_lockptr(mm, pmd);
 +   spin_lock(ptl);
 
 Should we have a helper to combine them? huge_pmd_lock()?

OK, I'll do it.

Thanks,
Naoya Horiguchi
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] thp: support split page table lock

2013-09-06 Thread Kirill A. Shutemov
Naoya Horiguchi wrote:
 Hi Kirill,
 
 On Fri, Sep 06, 2013 at 01:48:03PM +0300, Kirill A. Shutemov wrote:
  Naoya Horiguchi wrote:
   Thp related code also uses per process mm-page_table_lock now.
   So making it fine-grained can provide better performance.
   
   This patch makes thp support split page table lock by using page-ptl
   of the pages storing pmd_trans_huge pmds.
   
   Some functions like pmd_trans_huge_lock() and page_check_address_pmd()
   are expected by their caller to pass back the pointer of ptl, so this
   patch adds to those functions new arguments for that. Rather than that,
   this patch gives only straightforward replacement.
   
   ChangeLog v3:
- fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
- added missing declaration of ptl in do_huge_pmd_anonymous_page()
   
   Signed-off-by: Naoya Horiguchi n-horigu...@ah.jp.nec.com
  
  Generally, looks good. Few notes:
  
  I believe you need to convert __pte_alloc() to new locking. Not sure about
  __pte_alloc_kernel().
  Have you check all rest mm-page_table_lock, that they shouldn't be
  converted to new locking?
 
 I thought that keeping __pte_alloc() using mm-page_table_lock was safe
 because it uses bare mm-page_table_lock instead of pte_lockptr() even
 before this patchset, but not 100% sure.
 __pte_alloc() (and its family) are used in normal page path, so if it's
 not safe, we've lived with unsafe code for very long (maybe since 2005).
 Anyway, converting __pte_alloc() into split ptl could improve performance
 (though we need testing to know what amount), so I'll try that.

No, before the patch mm-page_table_lock is what we need: it serializes
setting up pmd, not adding pte to pmd and it's subject to change with new
locking model.

-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] thp: support split page table lock

2013-09-06 Thread Naoya Horiguchi
On Fri, Sep 06, 2013 at 07:46:25PM +0300, Kirill A. Shutemov wrote:
 Naoya Horiguchi wrote:
  Hi Kirill,
  
  On Fri, Sep 06, 2013 at 01:48:03PM +0300, Kirill A. Shutemov wrote:
   Naoya Horiguchi wrote:
Thp related code also uses per process mm-page_table_lock now.
So making it fine-grained can provide better performance.

This patch makes thp support split page table lock by using page-ptl
of the pages storing pmd_trans_huge pmds.

Some functions like pmd_trans_huge_lock() and page_check_address_pmd()
are expected by their caller to pass back the pointer of ptl, so this
patch adds to those functions new arguments for that. Rather than that,
this patch gives only straightforward replacement.

ChangeLog v3:
 - fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
 - added missing declaration of ptl in do_huge_pmd_anonymous_page()

Signed-off-by: Naoya Horiguchi n-horigu...@ah.jp.nec.com
   
   Generally, looks good. Few notes:
   
   I believe you need to convert __pte_alloc() to new locking. Not sure about
   __pte_alloc_kernel().
   Have you check all rest mm-page_table_lock, that they shouldn't be
   converted to new locking?
  
  I thought that keeping __pte_alloc() using mm-page_table_lock was safe
  because it uses bare mm-page_table_lock instead of pte_lockptr() even
  before this patchset, but not 100% sure.
  __pte_alloc() (and its family) are used in normal page path, so if it's
  not safe, we've lived with unsafe code for very long (maybe since 2005).
  Anyway, converting __pte_alloc() into split ptl could improve performance
  (though we need testing to know what amount), so I'll try that.
 
 No, before the patch mm-page_table_lock is what we need: it serializes
 setting up pmd, not adding pte to pmd and it's subject to change with new
 locking model.

OK, I see. This is a todo for the next post.
Thank you for explanation.

Naoya
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] thp: support split page table lock

2013-09-06 Thread Naoya Horiguchi
Hi Alex,

On Fri, Sep 06, 2013 at 11:04:23AM -0500, Alex Thorlton wrote:
 On Thu, Sep 05, 2013 at 05:27:46PM -0400, Naoya Horiguchi wrote:
  Thp related code also uses per process mm-page_table_lock now.
  So making it fine-grained can provide better performance.
  
  This patch makes thp support split page table lock by using page-ptl
  of the pages storing pmd_trans_huge pmds.
  
  Some functions like pmd_trans_huge_lock() and page_check_address_pmd()
  are expected by their caller to pass back the pointer of ptl, so this
  patch adds to those functions new arguments for that. Rather than that,
  this patch gives only straightforward replacement.
  
  ChangeLog v3:
   - fixed argument of huge_pmd_lockptr() in copy_huge_pmd()
   - added missing declaration of ptl in do_huge_pmd_anonymous_page()
 
 I've applied these and tested them using the same tests program that I
 used when I was working on the same issue, and I'm running into some
 bugs.  Here's a stack trace:

Thank you for helping testing. This bug is new to me.

 general protection fault:  [#1] SMP 
 Modules linked in:
 CPU: 268 PID: 32381 Comm: memscale Not tainted
 3.11.0-medusa-03121-g757f8ca #184
 Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS
 01/15/2013
 task: 880fbdd82180 ti: 880fc0c5a000 task.ti: 880fc0c5a000
 RIP: 0010:[810e3eef]  [810e3eef]
 pgtable_trans_huge_withdraw+0x38/0x60
 RSP: 0018:880fc0c5bc88  EFLAGS: 00010297
 RAX: ea17cebe8838 RBX: 0015309bd000 RCX: ea01f623b028
 RDX: dead00100100 RSI: 8dcf77d84c30 RDI: 880fbda67580
 RBP: 880fc0c5bc88 R08: 0013 R09: 00014da0
 R10: 880fc0c5bc88 R11: 888f7efda000 R12: 8dcf77d84c30
 R13: 880fc0c5bdf8 R14: 85cf401ff067 R15: 8b4de5fabff8
 FS:  () GS:880fffd8()
 knlGS:
 CS:  0010 DS:  ES:  CR0: 80050033
 CR2: 7768b0b8 CR3: 01a0b000 CR4: 000407e0
 Stack:
  880fc0c5bcc8 810f7643 880fc0c5bcc8 810d8297
  ea1456237510 7fc7b0e0  7fc7b0c0
  880fc0c5bda8 810d85ba 880fc0c5bd48 880fc0c5bd68
 Call Trace:
  [810f7643] zap_huge_pmd+0x4c/0x101
  [810d8297] ? tlb_flush_mmu+0x58/0x75
  [810d85ba] unmap_single_vma+0x306/0x7d6
  [810d8ad9] unmap_vmas+0x4f/0x82
  [810dab5e] exit_mmap+0x8b/0x113
  [810a9743] ? __delayacct_add_tsk+0x170/0x182
  [8103c609] mmput+0x3e/0xc4
  [8104088c] do_exit+0x380/0x907
  [810fb89c] ? vfs_write+0x149/0x1a3
  [81040e85] do_group_exit+0x72/0x9b
  [81040ec0] SyS_exit_group+0x12/0x16
  [814f52d2] system_call_fastpath+0x16/0x1b
 Code: 51 20 48 8d 41 20 48 39 c2 75 0d 48 c7 87 28 03 00 00 00 00 00 00
 eb 36 48 8d 42 e0 48 89 87 28 03 00 00 48 8b 51 20 48 8b 41 28 48 89
 42 08 48 89 10 48 ba 00 01 10 00 00 00 ad de 48 b8 00 02 
 RIP  [810e3eef] pgtable_trans_huge_withdraw+0x38/0x60
  RSP 880fc0c5bc88
 ---[ end trace e5413b388b6ea448 ]---
 Fixing recursive fault but reboot is needed!
 general protection fault:  [#2] SMP 
 Modules linked in:
 CPU: 268 PID: 1722 Comm: kworker/268:1 Tainted: G  D
 3.11.0-medusa-03121-g757f8ca #184
 Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS
 01/15/2013
 Workqueue: events vmstat_update
 task: 880fc1a74280 ti: 880fc1a76000 task.ti: 880fc1a76000
 RIP: 0010:[810bcdcb]  [810bcdcb]
 free_pcppages_bulk+0x97/0x329
 RSP: 0018:880fc1a77c98  EFLAGS: 00010082
 RAX: 880fffd94d68 RBX: dead002001e0 RCX: 880fffd94d50
 RDX: 880fffd94d68 RSI: 001f RDI: 888f7efdac68
 RBP: 880fc1a77cf8 R08: 0400 R09: 81a8bf00
 R10: 884f7efdac00 R11: 81009bae R12: dead00200200
 R13: 888f7efdac00 R14: 001f R15: 
 FS:  () GS:880fffd8()
 knlGS:
 CS:  0010 DS:  ES:  CR0: 80050033
 CR2: 7768b0b8 CR3: 01a0b000 CR4: 000407e0
 Stack:
  880fc1a77ce8 880fffd94d68 0010 880fffd94d50
  001ff9276a68 880fffd94d60  001f
  880fffd94d50 0292 880fc1a77d38 880fffd95d05
 Call Trace:
  [810bd149] drain_zone_pages+0x33/0x42
  [810cd5a6] refresh_cpu_vm_stats+0xcc/0x11e
  [810cd609] vmstat_update+0x11/0x43
  [8105350f] process_one_work+0x260/0x389
  [8105381a] worker_thread+0x1e2/0x332
  [81053638] ? process_one_work+0x389/0x389
  [810579df] kthread+0xb3/0xbd
  [81053638] ? process_one_work+0x389/0x389
  [8105792c] ? kthread_freezable_should_stop+0x5b/0x5b
  [814f522c] ret_from_fork+0x7c/0xb0
  [8105792c] ? kthread_freezable_should_stop+0x5b/0x5b
 Code: 48 89 55 c8 48 39 14 08 74 ce 41 83 fe 03 44 0f 44 75 

Re: [PATCH 2/2] thp: support split page table lock

2013-09-03 Thread Naoya Horiguchi
> diff --git v3.11-rc3.orig/mm/huge_memory.c v3.11-rc3/mm/huge_memory.c
> index 243e710..3cb29e1 100644
> --- v3.11-rc3.orig/mm/huge_memory.c
> +++ v3.11-rc3/mm/huge_memory.c
...
> @@ -864,14 +868,17 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct 
> mm_struct *src_mm,
>   pmd_t pmd;
>   pgtable_t pgtable;
>   int ret;
> + spinlock_t *uninitialized_var(dst_ptl), *uninitialized_var(src_ptl);
>  
>   ret = -ENOMEM;
>   pgtable = pte_alloc_one(dst_mm, addr);
>   if (unlikely(!pgtable))
>   goto out;
>  
> - spin_lock(_mm->page_table_lock);
> - spin_lock_nested(_mm->page_table_lock, SINGLE_DEPTH_NESTING);
> + dst_ptl = huge_pmd_lockptr(dst_mm, dst_ptl);
> + src_ptl = huge_pmd_lockptr(src_mm, src_ptl);

I found one mistake. This should be:

+   dst_ptl = huge_pmd_lockptr(dst_mm, dst_pmd);
+   src_ptl = huge_pmd_lockptr(src_mm, src_pmd);

Thanks,
Naoya Horiguchi

> + spin_lock(dst_ptl);
> + spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
>  
>   ret = -EAGAIN;
>   pmd = *src_pmd;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] thp: support split page table lock

2013-09-03 Thread Naoya Horiguchi
 diff --git v3.11-rc3.orig/mm/huge_memory.c v3.11-rc3/mm/huge_memory.c
 index 243e710..3cb29e1 100644
 --- v3.11-rc3.orig/mm/huge_memory.c
 +++ v3.11-rc3/mm/huge_memory.c
...
 @@ -864,14 +868,17 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct 
 mm_struct *src_mm,
   pmd_t pmd;
   pgtable_t pgtable;
   int ret;
 + spinlock_t *uninitialized_var(dst_ptl), *uninitialized_var(src_ptl);
  
   ret = -ENOMEM;
   pgtable = pte_alloc_one(dst_mm, addr);
   if (unlikely(!pgtable))
   goto out;
  
 - spin_lock(dst_mm-page_table_lock);
 - spin_lock_nested(src_mm-page_table_lock, SINGLE_DEPTH_NESTING);
 + dst_ptl = huge_pmd_lockptr(dst_mm, dst_ptl);
 + src_ptl = huge_pmd_lockptr(src_mm, src_ptl);

I found one mistake. This should be:

+   dst_ptl = huge_pmd_lockptr(dst_mm, dst_pmd);
+   src_ptl = huge_pmd_lockptr(src_mm, src_pmd);

Thanks,
Naoya Horiguchi

 + spin_lock(dst_ptl);
 + spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
  
   ret = -EAGAIN;
   pmd = *src_pmd;
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] thp: support split page table lock

2013-09-02 Thread Naoya Horiguchi
Kirill, thank you for the comment.

On Mon, Sep 02, 2013 at 01:53:27PM +0300, Kirill A. Shutemov wrote:
> Naoya Horiguchi wrote:
> > Thp related code also uses per process mm->page_table_lock now. So making
> > it fine-grained can provide better performance.
> > 
> > This patch makes thp support split page table lock which makes us use
> > page->ptl of the pages storing "pmd_trans_huge" pmds.
> 
> Hm. So, you use page->ptl only when you deal with thp pages, otherwise
> mm->page_table_lock, right?

Maybe it's not enough.
We use page->ptl for both of thp and normal depending on USE_SPLIT_PTLOCKS.
And regardless of USE_SPLIT_PTLOCKS, mm->page_table_lock is still used
by other contexts like memory initialization code or driver code for their
specific usage.

> It looks inconsistent to me. Does it mean we have to take both locks on
> split and collapse paths?

This patch includes the replacement with page->ptl for split/collapse path.

> I'm not sure if it's safe to take only
> page->ptl for alloc path. Probably not.

Right, it's not safe.

> Why not to use new locking for pmd everywhere?

So I already do this.

Thanks,
Naoya Horiguchi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 2/2] thp: support split page table lock

2013-09-02 Thread Kirill A. Shutemov
Naoya Horiguchi wrote:
> Thp related code also uses per process mm->page_table_lock now. So making
> it fine-grained can provide better performance.
> 
> This patch makes thp support split page table lock which makes us use
> page->ptl of the pages storing "pmd_trans_huge" pmds.

Hm. So, you use page->ptl only when you deal with thp pages, otherwise
mm->page_table_lock, right?

It looks inconsistent to me. Does it mean we have to take both locks on
split and collapse paths? I'm not sure if it's safe to take only
page->ptl for alloc path. Probably not.

Why not to use new locking for pmd everywhere?

-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 2/2] thp: support split page table lock

2013-09-02 Thread Kirill A. Shutemov
Naoya Horiguchi wrote:
 Thp related code also uses per process mm-page_table_lock now. So making
 it fine-grained can provide better performance.
 
 This patch makes thp support split page table lock which makes us use
 page-ptl of the pages storing pmd_trans_huge pmds.

Hm. So, you use page-ptl only when you deal with thp pages, otherwise
mm-page_table_lock, right?

It looks inconsistent to me. Does it mean we have to take both locks on
split and collapse paths? I'm not sure if it's safe to take only
page-ptl for alloc path. Probably not.

Why not to use new locking for pmd everywhere?

-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] thp: support split page table lock

2013-09-02 Thread Naoya Horiguchi
Kirill, thank you for the comment.

On Mon, Sep 02, 2013 at 01:53:27PM +0300, Kirill A. Shutemov wrote:
 Naoya Horiguchi wrote:
  Thp related code also uses per process mm-page_table_lock now. So making
  it fine-grained can provide better performance.
  
  This patch makes thp support split page table lock which makes us use
  page-ptl of the pages storing pmd_trans_huge pmds.
 
 Hm. So, you use page-ptl only when you deal with thp pages, otherwise
 mm-page_table_lock, right?

Maybe it's not enough.
We use page-ptl for both of thp and normal depending on USE_SPLIT_PTLOCKS.
And regardless of USE_SPLIT_PTLOCKS, mm-page_table_lock is still used
by other contexts like memory initialization code or driver code for their
specific usage.

 It looks inconsistent to me. Does it mean we have to take both locks on
 split and collapse paths?

This patch includes the replacement with page-ptl for split/collapse path.

 I'm not sure if it's safe to take only
 page-ptl for alloc path. Probably not.

Right, it's not safe.

 Why not to use new locking for pmd everywhere?

So I already do this.

Thanks,
Naoya Horiguchi
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/