Re: [PATCH 2/2] thp: support split page table lock
On Mon, Sep 09, 2013 at 10:34:45AM +0800, Daniel J Blueman wrote: > On Saturday, 7 September 2013 02:10:02 UTC+8, Naoya Horiguchi wrote: > >Hi Alex, > > > >On Fri, Sep 06, 2013 at 11:04:23AM -0500, Alex Thorlton wrote: > >> On Thu, Sep 05, 2013 at 05:27:46PM -0400, Naoya Horiguchi wrote: > >> > Thp related code also uses per process mm->page_table_lock now. > >> > So making it fine-grained can provide better performance. > >> > > >> > This patch makes thp support split page table lock by using page->ptl > >> > of the pages storing "pmd_trans_huge" pmds. > >> > > >> > Some functions like pmd_trans_huge_lock() and > page_check_address_pmd() > >> > are expected by their caller to pass back the pointer of ptl, so this > >> > patch adds to those functions new arguments for that. Rather than > that, > >> > this patch gives only straightforward replacement. > >> > > >> > ChangeLog v3: > >> > - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() > >> > - added missing declaration of ptl in do_huge_pmd_anonymous_page() > >> > >> I've applied these and tested them using the same tests program that I > >> used when I was working on the same issue, and I'm running into some > >> bugs. Here's a stack trace: > > > >Thank you for helping testing. This bug is new to me. > > With 3.11, this patch series and CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS, > I consistently hit the same failure when exiting one of my > stress-testers [1] when using eg 24 cores. > > Doesn't happen with 8 cores, so likely needs enough virtual memory > to use multiple split locks. Otherwise, this is very promising work! Daniel, Hillf Danton (dhi...@gmail.com) suggested putting the big page_table_lock back into the two functions seen below. I re-tested with this change and it seems to resolve the issue. - Alex --- a/mm/pgtable-generic.c Sat Sep 7 15:17:52 2013 +++ b/mm/pgtable-generic.c Sat Sep 7 15:20:28 2013 @@ -127,12 +127,14 @@ void pmdp_splitting_flush(struct vm_area void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp, pgtable_t pgtable) { + spin_lock(>page_table_lock); /* FIFO */ if (!mm->pmd_huge_pte) INIT_LIST_HEAD(>lru); else list_add(>lru, >pmd_huge_pte->lru); mm->pmd_huge_pte = pgtable; + spin_unlock(>page_table_lock); } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ #endif @@ -144,6 +146,7 @@ pgtable_t pgtable_trans_huge_withdraw(st { pgtable_t pgtable; + spin_lock(>page_table_lock); /* FIFO */ pgtable = mm->pmd_huge_pte; if (list_empty(>lru)) @@ -153,6 +156,7 @@ pgtable_t pgtable_trans_huge_withdraw(st struct page, lru); list_del(>lru); } + spin_unlock(>page_table_lock); return pgtable; } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ -- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] thp: support split page table lock
On Mon, Sep 09, 2013 at 10:34:45AM +0800, Daniel J Blueman wrote: On Saturday, 7 September 2013 02:10:02 UTC+8, Naoya Horiguchi wrote: Hi Alex, On Fri, Sep 06, 2013 at 11:04:23AM -0500, Alex Thorlton wrote: On Thu, Sep 05, 2013 at 05:27:46PM -0400, Naoya Horiguchi wrote: Thp related code also uses per process mm-page_table_lock now. So making it fine-grained can provide better performance. This patch makes thp support split page table lock by using page-ptl of the pages storing pmd_trans_huge pmds. Some functions like pmd_trans_huge_lock() and page_check_address_pmd() are expected by their caller to pass back the pointer of ptl, so this patch adds to those functions new arguments for that. Rather than that, this patch gives only straightforward replacement. ChangeLog v3: - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() - added missing declaration of ptl in do_huge_pmd_anonymous_page() I've applied these and tested them using the same tests program that I used when I was working on the same issue, and I'm running into some bugs. Here's a stack trace: Thank you for helping testing. This bug is new to me. With 3.11, this patch series and CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS, I consistently hit the same failure when exiting one of my stress-testers [1] when using eg 24 cores. Doesn't happen with 8 cores, so likely needs enough virtual memory to use multiple split locks. Otherwise, this is very promising work! Daniel, Hillf Danton (dhi...@gmail.com) suggested putting the big page_table_lock back into the two functions seen below. I re-tested with this change and it seems to resolve the issue. - Alex --- a/mm/pgtable-generic.c Sat Sep 7 15:17:52 2013 +++ b/mm/pgtable-generic.c Sat Sep 7 15:20:28 2013 @@ -127,12 +127,14 @@ void pmdp_splitting_flush(struct vm_area void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp, pgtable_t pgtable) { + spin_lock(mm-page_table_lock); /* FIFO */ if (!mm-pmd_huge_pte) INIT_LIST_HEAD(pgtable-lru); else list_add(pgtable-lru, mm-pmd_huge_pte-lru); mm-pmd_huge_pte = pgtable; + spin_unlock(mm-page_table_lock); } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ #endif @@ -144,6 +146,7 @@ pgtable_t pgtable_trans_huge_withdraw(st { pgtable_t pgtable; + spin_lock(mm-page_table_lock); /* FIFO */ pgtable = mm-pmd_huge_pte; if (list_empty(pgtable-lru)) @@ -153,6 +156,7 @@ pgtable_t pgtable_trans_huge_withdraw(st struct page, lru); list_del(pgtable-lru); } + spin_unlock(mm-page_table_lock); return pgtable; } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ -- -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] thp: support split page table lock
On Saturday, 7 September 2013 02:10:02 UTC+8, Naoya Horiguchi wrote: Hi Alex, On Fri, Sep 06, 2013 at 11:04:23AM -0500, Alex Thorlton wrote: > On Thu, Sep 05, 2013 at 05:27:46PM -0400, Naoya Horiguchi wrote: > > Thp related code also uses per process mm->page_table_lock now. > > So making it fine-grained can provide better performance. > > > > This patch makes thp support split page table lock by using page->ptl > > of the pages storing "pmd_trans_huge" pmds. > > > > Some functions like pmd_trans_huge_lock() and page_check_address_pmd() > > are expected by their caller to pass back the pointer of ptl, so this > > patch adds to those functions new arguments for that. Rather than that, > > this patch gives only straightforward replacement. > > > > ChangeLog v3: > > - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() > > - added missing declaration of ptl in do_huge_pmd_anonymous_page() > > I've applied these and tested them using the same tests program that I > used when I was working on the same issue, and I'm running into some > bugs. Here's a stack trace: Thank you for helping testing. This bug is new to me. With 3.11, this patch series and CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS, I consistently hit the same failure when exiting one of my stress-testers [1] when using eg 24 cores. Doesn't happen with 8 cores, so likely needs enough virtual memory to use multiple split locks. Otherwise, this is very promising work! [1] http://quora.org/2013/fft3d.c -- Daniel J Blueman Principal Software Engineer, Numascale -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 2/2] thp: support split page table lock
"Kirill A. Shutemov" writes: > Naoya Horiguchi wrote: >> Thp related code also uses per process mm->page_table_lock now. >> So making it fine-grained can provide better performance. >> >> This patch makes thp support split page table lock by using page->ptl >> of the pages storing "pmd_trans_huge" pmds. >> >> Some functions like pmd_trans_huge_lock() and page_check_address_pmd() >> are expected by their caller to pass back the pointer of ptl, so this >> patch adds to those functions new arguments for that. Rather than that, >> this patch gives only straightforward replacement. >> >> ChangeLog v3: >> - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() >> - added missing declaration of ptl in do_huge_pmd_anonymous_page() >> >> Signed-off-by: Naoya Horiguchi > > Generally, looks good. Few notes: > > I believe you need to convert __pte_alloc() to new locking. Not sure about > __pte_alloc_kernel(). > Have you check all rest mm->page_table_lock, that they shouldn't be > converted to new locking? May be we can have a CONFIG_DEBUG_VM version of pmd_populate that check check with assert_spin_locked ? -aneesh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 2/2] thp: support split page table lock
Kirill A. Shutemov kirill.shute...@linux.intel.com writes: Naoya Horiguchi wrote: Thp related code also uses per process mm-page_table_lock now. So making it fine-grained can provide better performance. This patch makes thp support split page table lock by using page-ptl of the pages storing pmd_trans_huge pmds. Some functions like pmd_trans_huge_lock() and page_check_address_pmd() are expected by their caller to pass back the pointer of ptl, so this patch adds to those functions new arguments for that. Rather than that, this patch gives only straightforward replacement. ChangeLog v3: - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() - added missing declaration of ptl in do_huge_pmd_anonymous_page() Signed-off-by: Naoya Horiguchi n-horigu...@ah.jp.nec.com Generally, looks good. Few notes: I believe you need to convert __pte_alloc() to new locking. Not sure about __pte_alloc_kernel(). Have you check all rest mm-page_table_lock, that they shouldn't be converted to new locking? May be we can have a CONFIG_DEBUG_VM version of pmd_populate that check check with assert_spin_locked ? -aneesh -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] thp: support split page table lock
On Saturday, 7 September 2013 02:10:02 UTC+8, Naoya Horiguchi wrote: Hi Alex, On Fri, Sep 06, 2013 at 11:04:23AM -0500, Alex Thorlton wrote: On Thu, Sep 05, 2013 at 05:27:46PM -0400, Naoya Horiguchi wrote: Thp related code also uses per process mm-page_table_lock now. So making it fine-grained can provide better performance. This patch makes thp support split page table lock by using page-ptl of the pages storing pmd_trans_huge pmds. Some functions like pmd_trans_huge_lock() and page_check_address_pmd() are expected by their caller to pass back the pointer of ptl, so this patch adds to those functions new arguments for that. Rather than that, this patch gives only straightforward replacement. ChangeLog v3: - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() - added missing declaration of ptl in do_huge_pmd_anonymous_page() I've applied these and tested them using the same tests program that I used when I was working on the same issue, and I'm running into some bugs. Here's a stack trace: Thank you for helping testing. This bug is new to me. With 3.11, this patch series and CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS, I consistently hit the same failure when exiting one of my stress-testers [1] when using eg 24 cores. Doesn't happen with 8 cores, so likely needs enough virtual memory to use multiple split locks. Otherwise, this is very promising work! [1] http://quora.org/2013/fft3d.c -- Daniel J Blueman Principal Software Engineer, Numascale -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] thp: support split page table lock
Hi Alex, On Fri, Sep 06, 2013 at 11:04:23AM -0500, Alex Thorlton wrote: > On Thu, Sep 05, 2013 at 05:27:46PM -0400, Naoya Horiguchi wrote: > > Thp related code also uses per process mm->page_table_lock now. > > So making it fine-grained can provide better performance. > > > > This patch makes thp support split page table lock by using page->ptl > > of the pages storing "pmd_trans_huge" pmds. > > > > Some functions like pmd_trans_huge_lock() and page_check_address_pmd() > > are expected by their caller to pass back the pointer of ptl, so this > > patch adds to those functions new arguments for that. Rather than that, > > this patch gives only straightforward replacement. > > > > ChangeLog v3: > > - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() > > - added missing declaration of ptl in do_huge_pmd_anonymous_page() > > I've applied these and tested them using the same tests program that I > used when I was working on the same issue, and I'm running into some > bugs. Here's a stack trace: Thank you for helping testing. This bug is new to me. > general protection fault: [#1] SMP > Modules linked in: > CPU: 268 PID: 32381 Comm: memscale Not tainted > 3.11.0-medusa-03121-g757f8ca #184 > Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS > 01/15/2013 > task: 880fbdd82180 ti: 880fc0c5a000 task.ti: 880fc0c5a000 > RIP: 0010:[] [] > pgtable_trans_huge_withdraw+0x38/0x60 > RSP: 0018:880fc0c5bc88 EFLAGS: 00010297 > RAX: ea17cebe8838 RBX: 0015309bd000 RCX: ea01f623b028 > RDX: dead00100100 RSI: 8dcf77d84c30 RDI: 880fbda67580 > RBP: 880fc0c5bc88 R08: 0013 R09: 00014da0 > R10: 880fc0c5bc88 R11: 888f7efda000 R12: 8dcf77d84c30 > R13: 880fc0c5bdf8 R14: 85cf401ff067 R15: 8b4de5fabff8 > FS: () GS:880fffd8() > knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 7768b0b8 CR3: 01a0b000 CR4: 000407e0 > Stack: > 880fc0c5bcc8 810f7643 880fc0c5bcc8 810d8297 > ea1456237510 7fc7b0e0 7fc7b0c0 > 880fc0c5bda8 810d85ba 880fc0c5bd48 880fc0c5bd68 > Call Trace: > [] zap_huge_pmd+0x4c/0x101 > [] ? tlb_flush_mmu+0x58/0x75 > [] unmap_single_vma+0x306/0x7d6 > [] unmap_vmas+0x4f/0x82 > [] exit_mmap+0x8b/0x113 > [] ? __delayacct_add_tsk+0x170/0x182 > [] mmput+0x3e/0xc4 > [] do_exit+0x380/0x907 > [] ? vfs_write+0x149/0x1a3 > [] do_group_exit+0x72/0x9b > [] SyS_exit_group+0x12/0x16 > [] system_call_fastpath+0x16/0x1b > Code: 51 20 48 8d 41 20 48 39 c2 75 0d 48 c7 87 28 03 00 00 00 00 00 00 > eb 36 48 8d 42 e0 48 89 87 28 03 00 00 48 8b 51 20 48 8b 41 28 <48> 89 > 42 08 48 89 10 48 ba 00 01 10 00 00 00 ad de 48 b8 00 02 > RIP [] pgtable_trans_huge_withdraw+0x38/0x60 > RSP > ---[ end trace e5413b388b6ea448 ]--- > Fixing recursive fault but reboot is needed! > general protection fault: [#2] SMP > Modules linked in: > CPU: 268 PID: 1722 Comm: kworker/268:1 Tainted: G D > 3.11.0-medusa-03121-g757f8ca #184 > Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS > 01/15/2013 > Workqueue: events vmstat_update > task: 880fc1a74280 ti: 880fc1a76000 task.ti: 880fc1a76000 > RIP: 0010:[] [] > free_pcppages_bulk+0x97/0x329 > RSP: 0018:880fc1a77c98 EFLAGS: 00010082 > RAX: 880fffd94d68 RBX: dead002001e0 RCX: 880fffd94d50 > RDX: 880fffd94d68 RSI: 001f RDI: 888f7efdac68 > RBP: 880fc1a77cf8 R08: 0400 R09: 81a8bf00 > R10: 884f7efdac00 R11: 81009bae R12: dead00200200 > R13: 888f7efdac00 R14: 001f R15: > FS: () GS:880fffd8() > knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 7768b0b8 CR3: 01a0b000 CR4: 000407e0 > Stack: > 880fc1a77ce8 880fffd94d68 0010 880fffd94d50 > 001ff9276a68 880fffd94d60 001f > 880fffd94d50 0292 880fc1a77d38 880fffd95d05 > Call Trace: > [] drain_zone_pages+0x33/0x42 > [] refresh_cpu_vm_stats+0xcc/0x11e > [] vmstat_update+0x11/0x43 > [] process_one_work+0x260/0x389 > [] worker_thread+0x1e2/0x332 > [] ? process_one_work+0x389/0x389 > [] kthread+0xb3/0xbd > [] ? process_one_work+0x389/0x389 > [] ? kthread_freezable_should_stop+0x5b/0x5b > [] ret_from_fork+0x7c/0xb0 > [] ? kthread_freezable_should_stop+0x5b/0x5b > Code: 48 89 55 c8 48 39 14 08 74 ce 41 83 fe 03 44 0f 44 75 c4 48 83 c2 > 08 48 89 45 b0 48 89 55 a8 48 8b 45 a8 4c 8b 20 49 8d 5c 24 e0 <48> 8b > 53 20 48 8b 43 28 48 89 42 08 48 89 10 48 ba 00 01 10 00 > RIP [] free_pcppages_bulk+0x97/0x329 > RSP > ---[ end trace e5413b388b6ea449 ]--- > BUG: unable to handle kernel paging request at ffd8 > IP: [] kthread_data+0xb/0x11 > PGD 1a0c067
Re: [PATCH 2/2] thp: support split page table lock
On Fri, Sep 06, 2013 at 07:46:25PM +0300, Kirill A. Shutemov wrote: > Naoya Horiguchi wrote: > > Hi Kirill, > > > > On Fri, Sep 06, 2013 at 01:48:03PM +0300, Kirill A. Shutemov wrote: > > > Naoya Horiguchi wrote: > > > > Thp related code also uses per process mm->page_table_lock now. > > > > So making it fine-grained can provide better performance. > > > > > > > > This patch makes thp support split page table lock by using page->ptl > > > > of the pages storing "pmd_trans_huge" pmds. > > > > > > > > Some functions like pmd_trans_huge_lock() and page_check_address_pmd() > > > > are expected by their caller to pass back the pointer of ptl, so this > > > > patch adds to those functions new arguments for that. Rather than that, > > > > this patch gives only straightforward replacement. > > > > > > > > ChangeLog v3: > > > > - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() > > > > - added missing declaration of ptl in do_huge_pmd_anonymous_page() > > > > > > > > Signed-off-by: Naoya Horiguchi > > > > > > Generally, looks good. Few notes: > > > > > > I believe you need to convert __pte_alloc() to new locking. Not sure about > > > __pte_alloc_kernel(). > > > Have you check all rest mm->page_table_lock, that they shouldn't be > > > converted to new locking? > > > > I thought that keeping __pte_alloc() using mm->page_table_lock was safe > > because it uses bare mm->page_table_lock instead of pte_lockptr() even > > before this patchset, but not 100% sure. > > __pte_alloc() (and its family) are used in normal page path, so if it's > > not safe, we've lived with unsafe code for very long (maybe since 2005). > > Anyway, converting __pte_alloc() into split ptl could improve performance > > (though we need testing to know what amount), so I'll try that. > > No, before the patch mm->page_table_lock is what we need: it serializes > setting up pmd, not adding pte to pmd and it's subject to change with new > locking model. OK, I see. This is a todo for the next post. Thank you for explanation. Naoya -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] thp: support split page table lock
Naoya Horiguchi wrote: > Hi Kirill, > > On Fri, Sep 06, 2013 at 01:48:03PM +0300, Kirill A. Shutemov wrote: > > Naoya Horiguchi wrote: > > > Thp related code also uses per process mm->page_table_lock now. > > > So making it fine-grained can provide better performance. > > > > > > This patch makes thp support split page table lock by using page->ptl > > > of the pages storing "pmd_trans_huge" pmds. > > > > > > Some functions like pmd_trans_huge_lock() and page_check_address_pmd() > > > are expected by their caller to pass back the pointer of ptl, so this > > > patch adds to those functions new arguments for that. Rather than that, > > > this patch gives only straightforward replacement. > > > > > > ChangeLog v3: > > > - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() > > > - added missing declaration of ptl in do_huge_pmd_anonymous_page() > > > > > > Signed-off-by: Naoya Horiguchi > > > > Generally, looks good. Few notes: > > > > I believe you need to convert __pte_alloc() to new locking. Not sure about > > __pte_alloc_kernel(). > > Have you check all rest mm->page_table_lock, that they shouldn't be > > converted to new locking? > > I thought that keeping __pte_alloc() using mm->page_table_lock was safe > because it uses bare mm->page_table_lock instead of pte_lockptr() even > before this patchset, but not 100% sure. > __pte_alloc() (and its family) are used in normal page path, so if it's > not safe, we've lived with unsafe code for very long (maybe since 2005). > Anyway, converting __pte_alloc() into split ptl could improve performance > (though we need testing to know what amount), so I'll try that. No, before the patch mm->page_table_lock is what we need: it serializes setting up pmd, not adding pte to pmd and it's subject to change with new locking model. -- Kirill A. Shutemov -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] thp: support split page table lock
Hi Kirill, On Fri, Sep 06, 2013 at 01:48:03PM +0300, Kirill A. Shutemov wrote: > Naoya Horiguchi wrote: > > Thp related code also uses per process mm->page_table_lock now. > > So making it fine-grained can provide better performance. > > > > This patch makes thp support split page table lock by using page->ptl > > of the pages storing "pmd_trans_huge" pmds. > > > > Some functions like pmd_trans_huge_lock() and page_check_address_pmd() > > are expected by their caller to pass back the pointer of ptl, so this > > patch adds to those functions new arguments for that. Rather than that, > > this patch gives only straightforward replacement. > > > > ChangeLog v3: > > - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() > > - added missing declaration of ptl in do_huge_pmd_anonymous_page() > > > > Signed-off-by: Naoya Horiguchi > > Generally, looks good. Few notes: > > I believe you need to convert __pte_alloc() to new locking. Not sure about > __pte_alloc_kernel(). > Have you check all rest mm->page_table_lock, that they shouldn't be > converted to new locking? I thought that keeping __pte_alloc() using mm->page_table_lock was safe because it uses bare mm->page_table_lock instead of pte_lockptr() even before this patchset, but not 100% sure. __pte_alloc() (and its family) are used in normal page path, so if it's not safe, we've lived with unsafe code for very long (maybe since 2005). Anyway, converting __pte_alloc() into split ptl could improve performance (though we need testing to know what amount), so I'll try that. > You use uninitialized_var() a lot. It's ugly. I've check few places > (task_mmu.c, copy_huge_pmd) and have found a reason why we need it there. > Why? I got a compile warning of uninitialized usage when developing and added to suppress it, but in the final form I never get such a warning. So I'll remove this uninitialized_var()s. > You often do > > + ptl = huge_pmd_lockptr(mm, pmd); > + spin_lock(ptl); > > Should we have a helper to combine them? huge_pmd_lock()? OK, I'll do it. Thanks, Naoya Horiguchi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] thp: support split page table lock
On Thu, Sep 05, 2013 at 05:27:46PM -0400, Naoya Horiguchi wrote: > Thp related code also uses per process mm->page_table_lock now. > So making it fine-grained can provide better performance. > > This patch makes thp support split page table lock by using page->ptl > of the pages storing "pmd_trans_huge" pmds. > > Some functions like pmd_trans_huge_lock() and page_check_address_pmd() > are expected by their caller to pass back the pointer of ptl, so this > patch adds to those functions new arguments for that. Rather than that, > this patch gives only straightforward replacement. > > ChangeLog v3: > - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() > - added missing declaration of ptl in do_huge_pmd_anonymous_page() I've applied these and tested them using the same tests program that I used when I was working on the same issue, and I'm running into some bugs. Here's a stack trace: general protection fault: [#1] SMP Modules linked in: CPU: 268 PID: 32381 Comm: memscale Not tainted 3.11.0-medusa-03121-g757f8ca #184 Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS 01/15/2013 task: 880fbdd82180 ti: 880fc0c5a000 task.ti: 880fc0c5a000 RIP: 0010:[] [] pgtable_trans_huge_withdraw+0x38/0x60 RSP: 0018:880fc0c5bc88 EFLAGS: 00010297 RAX: ea17cebe8838 RBX: 0015309bd000 RCX: ea01f623b028 RDX: dead00100100 RSI: 8dcf77d84c30 RDI: 880fbda67580 RBP: 880fc0c5bc88 R08: 0013 R09: 00014da0 R10: 880fc0c5bc88 R11: 888f7efda000 R12: 8dcf77d84c30 R13: 880fc0c5bdf8 R14: 85cf401ff067 R15: 8b4de5fabff8 FS: () GS:880fffd8() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7768b0b8 CR3: 01a0b000 CR4: 000407e0 Stack: 880fc0c5bcc8 810f7643 880fc0c5bcc8 810d8297 ea1456237510 7fc7b0e0 7fc7b0c0 880fc0c5bda8 810d85ba 880fc0c5bd48 880fc0c5bd68 Call Trace: [] zap_huge_pmd+0x4c/0x101 [] ? tlb_flush_mmu+0x58/0x75 [] unmap_single_vma+0x306/0x7d6 [] unmap_vmas+0x4f/0x82 [] exit_mmap+0x8b/0x113 [] ? __delayacct_add_tsk+0x170/0x182 [] mmput+0x3e/0xc4 [] do_exit+0x380/0x907 [] ? vfs_write+0x149/0x1a3 [] do_group_exit+0x72/0x9b [] SyS_exit_group+0x12/0x16 [] system_call_fastpath+0x16/0x1b Code: 51 20 48 8d 41 20 48 39 c2 75 0d 48 c7 87 28 03 00 00 00 00 00 00 eb 36 48 8d 42 e0 48 89 87 28 03 00 00 48 8b 51 20 48 8b 41 28 <48> 89 42 08 48 89 10 48 ba 00 01 10 00 00 00 ad de 48 b8 00 02 RIP [] pgtable_trans_huge_withdraw+0x38/0x60 RSP ---[ end trace e5413b388b6ea448 ]--- Fixing recursive fault but reboot is needed! general protection fault: [#2] SMP Modules linked in: CPU: 268 PID: 1722 Comm: kworker/268:1 Tainted: G D 3.11.0-medusa-03121-g757f8ca #184 Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS 01/15/2013 Workqueue: events vmstat_update task: 880fc1a74280 ti: 880fc1a76000 task.ti: 880fc1a76000 RIP: 0010:[] [] free_pcppages_bulk+0x97/0x329 RSP: 0018:880fc1a77c98 EFLAGS: 00010082 RAX: 880fffd94d68 RBX: dead002001e0 RCX: 880fffd94d50 RDX: 880fffd94d68 RSI: 001f RDI: 888f7efdac68 RBP: 880fc1a77cf8 R08: 0400 R09: 81a8bf00 R10: 884f7efdac00 R11: 81009bae R12: dead00200200 R13: 888f7efdac00 R14: 001f R15: FS: () GS:880fffd8() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7768b0b8 CR3: 01a0b000 CR4: 000407e0 Stack: 880fc1a77ce8 880fffd94d68 0010 880fffd94d50 001ff9276a68 880fffd94d60 001f 880fffd94d50 0292 880fc1a77d38 880fffd95d05 Call Trace: [] drain_zone_pages+0x33/0x42 [] refresh_cpu_vm_stats+0xcc/0x11e [] vmstat_update+0x11/0x43 [] process_one_work+0x260/0x389 [] worker_thread+0x1e2/0x332 [] ? process_one_work+0x389/0x389 [] kthread+0xb3/0xbd [] ? process_one_work+0x389/0x389 [] ? kthread_freezable_should_stop+0x5b/0x5b [] ret_from_fork+0x7c/0xb0 [] ? kthread_freezable_should_stop+0x5b/0x5b Code: 48 89 55 c8 48 39 14 08 74 ce 41 83 fe 03 44 0f 44 75 c4 48 83 c2 08 48 89 45 b0 48 89 55 a8 48 8b 45 a8 4c 8b 20 49 8d 5c 24 e0 <48> 8b 53 20 48 8b 43 28 48 89 42 08 48 89 10 48 ba 00 01 10 00 RIP [] free_pcppages_bulk+0x97/0x329 RSP ---[ end trace e5413b388b6ea449 ]--- BUG: unable to handle kernel paging request at ffd8 IP: [] kthread_data+0xb/0x11 PGD 1a0c067 PUD 1a0e067 PMD 0 Oops: [#3] SMP Modules linked in: CPU: 268 PID: 1722 Comm: kworker/268:1 Tainted: G D 3.11.0-medusa-03121-g757f8ca #184 Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS 01/15/2013 task: 880fc1a74280 ti: 880fc1a76000 task.ti: 880fc1a76000 RIP: 0010:[] [] kthread_data+0xb/0x11 RSP:
RE: [PATCH 2/2] thp: support split page table lock
Naoya Horiguchi wrote: > Thp related code also uses per process mm->page_table_lock now. > So making it fine-grained can provide better performance. > > This patch makes thp support split page table lock by using page->ptl > of the pages storing "pmd_trans_huge" pmds. > > Some functions like pmd_trans_huge_lock() and page_check_address_pmd() > are expected by their caller to pass back the pointer of ptl, so this > patch adds to those functions new arguments for that. Rather than that, > this patch gives only straightforward replacement. > > ChangeLog v3: > - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() > - added missing declaration of ptl in do_huge_pmd_anonymous_page() > > Signed-off-by: Naoya Horiguchi Generally, looks good. Few notes: I believe you need to convert __pte_alloc() to new locking. Not sure about __pte_alloc_kernel(). Have you check all rest mm->page_table_lock, that they shouldn't be converted to new locking? You use uninitialized_var() a lot. It's ugly. I've check few places (task_mmu.c, copy_huge_pmd) and have found a reason why we need it there. Why? You often do + ptl = huge_pmd_lockptr(mm, pmd); + spin_lock(ptl); Should we have a helper to combine them? huge_pmd_lock()? -- Kirill A. Shutemov -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 2/2] thp: support split page table lock
Naoya Horiguchi wrote: Thp related code also uses per process mm-page_table_lock now. So making it fine-grained can provide better performance. This patch makes thp support split page table lock by using page-ptl of the pages storing pmd_trans_huge pmds. Some functions like pmd_trans_huge_lock() and page_check_address_pmd() are expected by their caller to pass back the pointer of ptl, so this patch adds to those functions new arguments for that. Rather than that, this patch gives only straightforward replacement. ChangeLog v3: - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() - added missing declaration of ptl in do_huge_pmd_anonymous_page() Signed-off-by: Naoya Horiguchi n-horigu...@ah.jp.nec.com Generally, looks good. Few notes: I believe you need to convert __pte_alloc() to new locking. Not sure about __pte_alloc_kernel(). Have you check all rest mm-page_table_lock, that they shouldn't be converted to new locking? You use uninitialized_var() a lot. It's ugly. I've check few places (task_mmu.c, copy_huge_pmd) and have found a reason why we need it there. Why? You often do + ptl = huge_pmd_lockptr(mm, pmd); + spin_lock(ptl); Should we have a helper to combine them? huge_pmd_lock()? -- Kirill A. Shutemov -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] thp: support split page table lock
On Thu, Sep 05, 2013 at 05:27:46PM -0400, Naoya Horiguchi wrote: Thp related code also uses per process mm-page_table_lock now. So making it fine-grained can provide better performance. This patch makes thp support split page table lock by using page-ptl of the pages storing pmd_trans_huge pmds. Some functions like pmd_trans_huge_lock() and page_check_address_pmd() are expected by their caller to pass back the pointer of ptl, so this patch adds to those functions new arguments for that. Rather than that, this patch gives only straightforward replacement. ChangeLog v3: - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() - added missing declaration of ptl in do_huge_pmd_anonymous_page() I've applied these and tested them using the same tests program that I used when I was working on the same issue, and I'm running into some bugs. Here's a stack trace: general protection fault: [#1] SMP Modules linked in: CPU: 268 PID: 32381 Comm: memscale Not tainted 3.11.0-medusa-03121-g757f8ca #184 Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS 01/15/2013 task: 880fbdd82180 ti: 880fc0c5a000 task.ti: 880fc0c5a000 RIP: 0010:[810e3eef] [810e3eef] pgtable_trans_huge_withdraw+0x38/0x60 RSP: 0018:880fc0c5bc88 EFLAGS: 00010297 RAX: ea17cebe8838 RBX: 0015309bd000 RCX: ea01f623b028 RDX: dead00100100 RSI: 8dcf77d84c30 RDI: 880fbda67580 RBP: 880fc0c5bc88 R08: 0013 R09: 00014da0 R10: 880fc0c5bc88 R11: 888f7efda000 R12: 8dcf77d84c30 R13: 880fc0c5bdf8 R14: 85cf401ff067 R15: 8b4de5fabff8 FS: () GS:880fffd8() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7768b0b8 CR3: 01a0b000 CR4: 000407e0 Stack: 880fc0c5bcc8 810f7643 880fc0c5bcc8 810d8297 ea1456237510 7fc7b0e0 7fc7b0c0 880fc0c5bda8 810d85ba 880fc0c5bd48 880fc0c5bd68 Call Trace: [810f7643] zap_huge_pmd+0x4c/0x101 [810d8297] ? tlb_flush_mmu+0x58/0x75 [810d85ba] unmap_single_vma+0x306/0x7d6 [810d8ad9] unmap_vmas+0x4f/0x82 [810dab5e] exit_mmap+0x8b/0x113 [810a9743] ? __delayacct_add_tsk+0x170/0x182 [8103c609] mmput+0x3e/0xc4 [8104088c] do_exit+0x380/0x907 [810fb89c] ? vfs_write+0x149/0x1a3 [81040e85] do_group_exit+0x72/0x9b [81040ec0] SyS_exit_group+0x12/0x16 [814f52d2] system_call_fastpath+0x16/0x1b Code: 51 20 48 8d 41 20 48 39 c2 75 0d 48 c7 87 28 03 00 00 00 00 00 00 eb 36 48 8d 42 e0 48 89 87 28 03 00 00 48 8b 51 20 48 8b 41 28 48 89 42 08 48 89 10 48 ba 00 01 10 00 00 00 ad de 48 b8 00 02 RIP [810e3eef] pgtable_trans_huge_withdraw+0x38/0x60 RSP 880fc0c5bc88 ---[ end trace e5413b388b6ea448 ]--- Fixing recursive fault but reboot is needed! general protection fault: [#2] SMP Modules linked in: CPU: 268 PID: 1722 Comm: kworker/268:1 Tainted: G D 3.11.0-medusa-03121-g757f8ca #184 Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS 01/15/2013 Workqueue: events vmstat_update task: 880fc1a74280 ti: 880fc1a76000 task.ti: 880fc1a76000 RIP: 0010:[810bcdcb] [810bcdcb] free_pcppages_bulk+0x97/0x329 RSP: 0018:880fc1a77c98 EFLAGS: 00010082 RAX: 880fffd94d68 RBX: dead002001e0 RCX: 880fffd94d50 RDX: 880fffd94d68 RSI: 001f RDI: 888f7efdac68 RBP: 880fc1a77cf8 R08: 0400 R09: 81a8bf00 R10: 884f7efdac00 R11: 81009bae R12: dead00200200 R13: 888f7efdac00 R14: 001f R15: FS: () GS:880fffd8() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7768b0b8 CR3: 01a0b000 CR4: 000407e0 Stack: 880fc1a77ce8 880fffd94d68 0010 880fffd94d50 001ff9276a68 880fffd94d60 001f 880fffd94d50 0292 880fc1a77d38 880fffd95d05 Call Trace: [810bd149] drain_zone_pages+0x33/0x42 [810cd5a6] refresh_cpu_vm_stats+0xcc/0x11e [810cd609] vmstat_update+0x11/0x43 [8105350f] process_one_work+0x260/0x389 [8105381a] worker_thread+0x1e2/0x332 [81053638] ? process_one_work+0x389/0x389 [810579df] kthread+0xb3/0xbd [81053638] ? process_one_work+0x389/0x389 [8105792c] ? kthread_freezable_should_stop+0x5b/0x5b [814f522c] ret_from_fork+0x7c/0xb0 [8105792c] ? kthread_freezable_should_stop+0x5b/0x5b Code: 48 89 55 c8 48 39 14 08 74 ce 41 83 fe 03 44 0f 44 75 c4 48 83 c2 08 48 89 45 b0 48 89 55 a8 48 8b 45 a8 4c 8b 20 49 8d 5c 24 e0 48 8b 53 20 48 8b 43 28 48 89 42 08 48 89 10 48 ba 00 01 10 00 RIP [810bcdcb] free_pcppages_bulk+0x97/0x329 RSP 880fc1a77c98 ---[ end
Re: [PATCH 2/2] thp: support split page table lock
Hi Kirill, On Fri, Sep 06, 2013 at 01:48:03PM +0300, Kirill A. Shutemov wrote: Naoya Horiguchi wrote: Thp related code also uses per process mm-page_table_lock now. So making it fine-grained can provide better performance. This patch makes thp support split page table lock by using page-ptl of the pages storing pmd_trans_huge pmds. Some functions like pmd_trans_huge_lock() and page_check_address_pmd() are expected by their caller to pass back the pointer of ptl, so this patch adds to those functions new arguments for that. Rather than that, this patch gives only straightforward replacement. ChangeLog v3: - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() - added missing declaration of ptl in do_huge_pmd_anonymous_page() Signed-off-by: Naoya Horiguchi n-horigu...@ah.jp.nec.com Generally, looks good. Few notes: I believe you need to convert __pte_alloc() to new locking. Not sure about __pte_alloc_kernel(). Have you check all rest mm-page_table_lock, that they shouldn't be converted to new locking? I thought that keeping __pte_alloc() using mm-page_table_lock was safe because it uses bare mm-page_table_lock instead of pte_lockptr() even before this patchset, but not 100% sure. __pte_alloc() (and its family) are used in normal page path, so if it's not safe, we've lived with unsafe code for very long (maybe since 2005). Anyway, converting __pte_alloc() into split ptl could improve performance (though we need testing to know what amount), so I'll try that. You use uninitialized_var() a lot. It's ugly. I've check few places (task_mmu.c, copy_huge_pmd) and have found a reason why we need it there. Why? I got a compile warning of uninitialized usage when developing and added to suppress it, but in the final form I never get such a warning. So I'll remove this uninitialized_var()s. You often do + ptl = huge_pmd_lockptr(mm, pmd); + spin_lock(ptl); Should we have a helper to combine them? huge_pmd_lock()? OK, I'll do it. Thanks, Naoya Horiguchi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] thp: support split page table lock
Naoya Horiguchi wrote: Hi Kirill, On Fri, Sep 06, 2013 at 01:48:03PM +0300, Kirill A. Shutemov wrote: Naoya Horiguchi wrote: Thp related code also uses per process mm-page_table_lock now. So making it fine-grained can provide better performance. This patch makes thp support split page table lock by using page-ptl of the pages storing pmd_trans_huge pmds. Some functions like pmd_trans_huge_lock() and page_check_address_pmd() are expected by their caller to pass back the pointer of ptl, so this patch adds to those functions new arguments for that. Rather than that, this patch gives only straightforward replacement. ChangeLog v3: - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() - added missing declaration of ptl in do_huge_pmd_anonymous_page() Signed-off-by: Naoya Horiguchi n-horigu...@ah.jp.nec.com Generally, looks good. Few notes: I believe you need to convert __pte_alloc() to new locking. Not sure about __pte_alloc_kernel(). Have you check all rest mm-page_table_lock, that they shouldn't be converted to new locking? I thought that keeping __pte_alloc() using mm-page_table_lock was safe because it uses bare mm-page_table_lock instead of pte_lockptr() even before this patchset, but not 100% sure. __pte_alloc() (and its family) are used in normal page path, so if it's not safe, we've lived with unsafe code for very long (maybe since 2005). Anyway, converting __pte_alloc() into split ptl could improve performance (though we need testing to know what amount), so I'll try that. No, before the patch mm-page_table_lock is what we need: it serializes setting up pmd, not adding pte to pmd and it's subject to change with new locking model. -- Kirill A. Shutemov -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] thp: support split page table lock
On Fri, Sep 06, 2013 at 07:46:25PM +0300, Kirill A. Shutemov wrote: Naoya Horiguchi wrote: Hi Kirill, On Fri, Sep 06, 2013 at 01:48:03PM +0300, Kirill A. Shutemov wrote: Naoya Horiguchi wrote: Thp related code also uses per process mm-page_table_lock now. So making it fine-grained can provide better performance. This patch makes thp support split page table lock by using page-ptl of the pages storing pmd_trans_huge pmds. Some functions like pmd_trans_huge_lock() and page_check_address_pmd() are expected by their caller to pass back the pointer of ptl, so this patch adds to those functions new arguments for that. Rather than that, this patch gives only straightforward replacement. ChangeLog v3: - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() - added missing declaration of ptl in do_huge_pmd_anonymous_page() Signed-off-by: Naoya Horiguchi n-horigu...@ah.jp.nec.com Generally, looks good. Few notes: I believe you need to convert __pte_alloc() to new locking. Not sure about __pte_alloc_kernel(). Have you check all rest mm-page_table_lock, that they shouldn't be converted to new locking? I thought that keeping __pte_alloc() using mm-page_table_lock was safe because it uses bare mm-page_table_lock instead of pte_lockptr() even before this patchset, but not 100% sure. __pte_alloc() (and its family) are used in normal page path, so if it's not safe, we've lived with unsafe code for very long (maybe since 2005). Anyway, converting __pte_alloc() into split ptl could improve performance (though we need testing to know what amount), so I'll try that. No, before the patch mm-page_table_lock is what we need: it serializes setting up pmd, not adding pte to pmd and it's subject to change with new locking model. OK, I see. This is a todo for the next post. Thank you for explanation. Naoya -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] thp: support split page table lock
Hi Alex, On Fri, Sep 06, 2013 at 11:04:23AM -0500, Alex Thorlton wrote: On Thu, Sep 05, 2013 at 05:27:46PM -0400, Naoya Horiguchi wrote: Thp related code also uses per process mm-page_table_lock now. So making it fine-grained can provide better performance. This patch makes thp support split page table lock by using page-ptl of the pages storing pmd_trans_huge pmds. Some functions like pmd_trans_huge_lock() and page_check_address_pmd() are expected by their caller to pass back the pointer of ptl, so this patch adds to those functions new arguments for that. Rather than that, this patch gives only straightforward replacement. ChangeLog v3: - fixed argument of huge_pmd_lockptr() in copy_huge_pmd() - added missing declaration of ptl in do_huge_pmd_anonymous_page() I've applied these and tested them using the same tests program that I used when I was working on the same issue, and I'm running into some bugs. Here's a stack trace: Thank you for helping testing. This bug is new to me. general protection fault: [#1] SMP Modules linked in: CPU: 268 PID: 32381 Comm: memscale Not tainted 3.11.0-medusa-03121-g757f8ca #184 Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS 01/15/2013 task: 880fbdd82180 ti: 880fc0c5a000 task.ti: 880fc0c5a000 RIP: 0010:[810e3eef] [810e3eef] pgtable_trans_huge_withdraw+0x38/0x60 RSP: 0018:880fc0c5bc88 EFLAGS: 00010297 RAX: ea17cebe8838 RBX: 0015309bd000 RCX: ea01f623b028 RDX: dead00100100 RSI: 8dcf77d84c30 RDI: 880fbda67580 RBP: 880fc0c5bc88 R08: 0013 R09: 00014da0 R10: 880fc0c5bc88 R11: 888f7efda000 R12: 8dcf77d84c30 R13: 880fc0c5bdf8 R14: 85cf401ff067 R15: 8b4de5fabff8 FS: () GS:880fffd8() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7768b0b8 CR3: 01a0b000 CR4: 000407e0 Stack: 880fc0c5bcc8 810f7643 880fc0c5bcc8 810d8297 ea1456237510 7fc7b0e0 7fc7b0c0 880fc0c5bda8 810d85ba 880fc0c5bd48 880fc0c5bd68 Call Trace: [810f7643] zap_huge_pmd+0x4c/0x101 [810d8297] ? tlb_flush_mmu+0x58/0x75 [810d85ba] unmap_single_vma+0x306/0x7d6 [810d8ad9] unmap_vmas+0x4f/0x82 [810dab5e] exit_mmap+0x8b/0x113 [810a9743] ? __delayacct_add_tsk+0x170/0x182 [8103c609] mmput+0x3e/0xc4 [8104088c] do_exit+0x380/0x907 [810fb89c] ? vfs_write+0x149/0x1a3 [81040e85] do_group_exit+0x72/0x9b [81040ec0] SyS_exit_group+0x12/0x16 [814f52d2] system_call_fastpath+0x16/0x1b Code: 51 20 48 8d 41 20 48 39 c2 75 0d 48 c7 87 28 03 00 00 00 00 00 00 eb 36 48 8d 42 e0 48 89 87 28 03 00 00 48 8b 51 20 48 8b 41 28 48 89 42 08 48 89 10 48 ba 00 01 10 00 00 00 ad de 48 b8 00 02 RIP [810e3eef] pgtable_trans_huge_withdraw+0x38/0x60 RSP 880fc0c5bc88 ---[ end trace e5413b388b6ea448 ]--- Fixing recursive fault but reboot is needed! general protection fault: [#2] SMP Modules linked in: CPU: 268 PID: 1722 Comm: kworker/268:1 Tainted: G D 3.11.0-medusa-03121-g757f8ca #184 Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS 01/15/2013 Workqueue: events vmstat_update task: 880fc1a74280 ti: 880fc1a76000 task.ti: 880fc1a76000 RIP: 0010:[810bcdcb] [810bcdcb] free_pcppages_bulk+0x97/0x329 RSP: 0018:880fc1a77c98 EFLAGS: 00010082 RAX: 880fffd94d68 RBX: dead002001e0 RCX: 880fffd94d50 RDX: 880fffd94d68 RSI: 001f RDI: 888f7efdac68 RBP: 880fc1a77cf8 R08: 0400 R09: 81a8bf00 R10: 884f7efdac00 R11: 81009bae R12: dead00200200 R13: 888f7efdac00 R14: 001f R15: FS: () GS:880fffd8() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7768b0b8 CR3: 01a0b000 CR4: 000407e0 Stack: 880fc1a77ce8 880fffd94d68 0010 880fffd94d50 001ff9276a68 880fffd94d60 001f 880fffd94d50 0292 880fc1a77d38 880fffd95d05 Call Trace: [810bd149] drain_zone_pages+0x33/0x42 [810cd5a6] refresh_cpu_vm_stats+0xcc/0x11e [810cd609] vmstat_update+0x11/0x43 [8105350f] process_one_work+0x260/0x389 [8105381a] worker_thread+0x1e2/0x332 [81053638] ? process_one_work+0x389/0x389 [810579df] kthread+0xb3/0xbd [81053638] ? process_one_work+0x389/0x389 [8105792c] ? kthread_freezable_should_stop+0x5b/0x5b [814f522c] ret_from_fork+0x7c/0xb0 [8105792c] ? kthread_freezable_should_stop+0x5b/0x5b Code: 48 89 55 c8 48 39 14 08 74 ce 41 83 fe 03 44 0f 44 75
Re: [PATCH 2/2] thp: support split page table lock
> diff --git v3.11-rc3.orig/mm/huge_memory.c v3.11-rc3/mm/huge_memory.c > index 243e710..3cb29e1 100644 > --- v3.11-rc3.orig/mm/huge_memory.c > +++ v3.11-rc3/mm/huge_memory.c ... > @@ -864,14 +868,17 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct > mm_struct *src_mm, > pmd_t pmd; > pgtable_t pgtable; > int ret; > + spinlock_t *uninitialized_var(dst_ptl), *uninitialized_var(src_ptl); > > ret = -ENOMEM; > pgtable = pte_alloc_one(dst_mm, addr); > if (unlikely(!pgtable)) > goto out; > > - spin_lock(_mm->page_table_lock); > - spin_lock_nested(_mm->page_table_lock, SINGLE_DEPTH_NESTING); > + dst_ptl = huge_pmd_lockptr(dst_mm, dst_ptl); > + src_ptl = huge_pmd_lockptr(src_mm, src_ptl); I found one mistake. This should be: + dst_ptl = huge_pmd_lockptr(dst_mm, dst_pmd); + src_ptl = huge_pmd_lockptr(src_mm, src_pmd); Thanks, Naoya Horiguchi > + spin_lock(dst_ptl); > + spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING); > > ret = -EAGAIN; > pmd = *src_pmd; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] thp: support split page table lock
diff --git v3.11-rc3.orig/mm/huge_memory.c v3.11-rc3/mm/huge_memory.c index 243e710..3cb29e1 100644 --- v3.11-rc3.orig/mm/huge_memory.c +++ v3.11-rc3/mm/huge_memory.c ... @@ -864,14 +868,17 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm, pmd_t pmd; pgtable_t pgtable; int ret; + spinlock_t *uninitialized_var(dst_ptl), *uninitialized_var(src_ptl); ret = -ENOMEM; pgtable = pte_alloc_one(dst_mm, addr); if (unlikely(!pgtable)) goto out; - spin_lock(dst_mm-page_table_lock); - spin_lock_nested(src_mm-page_table_lock, SINGLE_DEPTH_NESTING); + dst_ptl = huge_pmd_lockptr(dst_mm, dst_ptl); + src_ptl = huge_pmd_lockptr(src_mm, src_ptl); I found one mistake. This should be: + dst_ptl = huge_pmd_lockptr(dst_mm, dst_pmd); + src_ptl = huge_pmd_lockptr(src_mm, src_pmd); Thanks, Naoya Horiguchi + spin_lock(dst_ptl); + spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING); ret = -EAGAIN; pmd = *src_pmd; -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] thp: support split page table lock
Kirill, thank you for the comment. On Mon, Sep 02, 2013 at 01:53:27PM +0300, Kirill A. Shutemov wrote: > Naoya Horiguchi wrote: > > Thp related code also uses per process mm->page_table_lock now. So making > > it fine-grained can provide better performance. > > > > This patch makes thp support split page table lock which makes us use > > page->ptl of the pages storing "pmd_trans_huge" pmds. > > Hm. So, you use page->ptl only when you deal with thp pages, otherwise > mm->page_table_lock, right? Maybe it's not enough. We use page->ptl for both of thp and normal depending on USE_SPLIT_PTLOCKS. And regardless of USE_SPLIT_PTLOCKS, mm->page_table_lock is still used by other contexts like memory initialization code or driver code for their specific usage. > It looks inconsistent to me. Does it mean we have to take both locks on > split and collapse paths? This patch includes the replacement with page->ptl for split/collapse path. > I'm not sure if it's safe to take only > page->ptl for alloc path. Probably not. Right, it's not safe. > Why not to use new locking for pmd everywhere? So I already do this. Thanks, Naoya Horiguchi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 2/2] thp: support split page table lock
Naoya Horiguchi wrote: > Thp related code also uses per process mm->page_table_lock now. So making > it fine-grained can provide better performance. > > This patch makes thp support split page table lock which makes us use > page->ptl of the pages storing "pmd_trans_huge" pmds. Hm. So, you use page->ptl only when you deal with thp pages, otherwise mm->page_table_lock, right? It looks inconsistent to me. Does it mean we have to take both locks on split and collapse paths? I'm not sure if it's safe to take only page->ptl for alloc path. Probably not. Why not to use new locking for pmd everywhere? -- Kirill A. Shutemov -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 2/2] thp: support split page table lock
Naoya Horiguchi wrote: Thp related code also uses per process mm-page_table_lock now. So making it fine-grained can provide better performance. This patch makes thp support split page table lock which makes us use page-ptl of the pages storing pmd_trans_huge pmds. Hm. So, you use page-ptl only when you deal with thp pages, otherwise mm-page_table_lock, right? It looks inconsistent to me. Does it mean we have to take both locks on split and collapse paths? I'm not sure if it's safe to take only page-ptl for alloc path. Probably not. Why not to use new locking for pmd everywhere? -- Kirill A. Shutemov -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] thp: support split page table lock
Kirill, thank you for the comment. On Mon, Sep 02, 2013 at 01:53:27PM +0300, Kirill A. Shutemov wrote: Naoya Horiguchi wrote: Thp related code also uses per process mm-page_table_lock now. So making it fine-grained can provide better performance. This patch makes thp support split page table lock which makes us use page-ptl of the pages storing pmd_trans_huge pmds. Hm. So, you use page-ptl only when you deal with thp pages, otherwise mm-page_table_lock, right? Maybe it's not enough. We use page-ptl for both of thp and normal depending on USE_SPLIT_PTLOCKS. And regardless of USE_SPLIT_PTLOCKS, mm-page_table_lock is still used by other contexts like memory initialization code or driver code for their specific usage. It looks inconsistent to me. Does it mean we have to take both locks on split and collapse paths? This patch includes the replacement with page-ptl for split/collapse path. I'm not sure if it's safe to take only page-ptl for alloc path. Probably not. Right, it's not safe. Why not to use new locking for pmd everywhere? So I already do this. Thanks, Naoya Horiguchi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/