Re: [PATCH 34/34] mm: dynamically allocate page->ptl if it cannot be embedded to struct page

2013-11-20 Thread Andrey Wagin
2013/11/20 Kirill A. Shutemov <kirill.shute...@linux.intel.com>:
> Andrey Wagin wrote:
>> 2013/11/20 Kirill A. Shutemov <kirill.shute...@linux.intel.com>:
>> > Andrey Wagin wrote:
>> >> Hi Kirill,
>> >>
>> >> Looks like this patch adds memory leaks.
>> >> [  116.188310] kmemleak: 15672 new suspected memory leaks (see
>> >> /sys/kernel/debug/kmemleak)
>> >> unreferenced object 0x8800da45a350 (size 96):
>> >>   comm "dracut-initqueu", pid 93, jiffies 4294671391 (age 362.277s)
>> >>   hex dump (first 32 bytes):
>> >> 07 00 07 00 ad 4e ad de ff ff ff ff 6b 6b 6b 6b  .N..
>> >> ff ff ff ff ff ff ff ff 80 24 b4 82 ff ff ff ff  .$..
>> >>   backtrace:
>> >> [] kmemleak_alloc+0x5e/0xc0
>> >> [] kmem_cache_alloc_trace+0x113/0x290
>> >> [] __ptlock_alloc+0x27/0x50
>> >> [] __pmd_alloc+0x59/0x170
>> >> [] copy_page_range+0x38a/0x3e0
>> >> [] dup_mm+0x313/0x540
>> >> [] copy_process+0x161a/0x1880
>> >> [] do_fork+0x8b/0x360
>> >> [] SyS_clone+0x16/0x20
>> >> [] stub_clone+0x69/0x90
>> >> [] 0x
>> >>
>> >> It's quite serious: my test host panicked within a few hours.
>> >
>> > Sorry for that.
>> >
>> > Could you test the patch below?
>>
>> Yes, it works.
>>
>> I found this too a few minutes ago:)
>
> Nice
>
> Tested-by ?

Tested-by: Andrey Vagin <ava...@openvz.org>

>
> --
>  Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 34/34] mm: dynamically allocate page->ptl if it cannot be embedded to struct page

2013-11-20 Thread Kirill A. Shutemov
Andrey Wagin wrote:
> 2013/11/20 Kirill A. Shutemov <kirill.shute...@linux.intel.com>:
> > Andrey Wagin wrote:
> >> Hi Kirill,
> >>
> >> Looks like this patch adds memory leaks.
> >> [  116.188310] kmemleak: 15672 new suspected memory leaks (see
> >> /sys/kernel/debug/kmemleak)
> >> unreferenced object 0x8800da45a350 (size 96):
> >>   comm "dracut-initqueu", pid 93, jiffies 4294671391 (age 362.277s)
> >>   hex dump (first 32 bytes):
> >> 07 00 07 00 ad 4e ad de ff ff ff ff 6b 6b 6b 6b  .N..
> >> ff ff ff ff ff ff ff ff 80 24 b4 82 ff ff ff ff  .$..
> >>   backtrace:
> >> [] kmemleak_alloc+0x5e/0xc0
> >> [] kmem_cache_alloc_trace+0x113/0x290
> >> [] __ptlock_alloc+0x27/0x50
> >> [] __pmd_alloc+0x59/0x170
> >> [] copy_page_range+0x38a/0x3e0
> >> [] dup_mm+0x313/0x540
> >> [] copy_process+0x161a/0x1880
> >> [] do_fork+0x8b/0x360
> >> [] SyS_clone+0x16/0x20
> >> [] stub_clone+0x69/0x90
> >> [] 0x
> >>
> >> It's quite serious: my test host panicked within a few hours.
> >
> > Sorry for that.
> >
> > Could you test the patch below?
> 
> Yes, it works.
> 
> I found this too a few minutes ago:)

Nice

Tested-by ?

-- 
 Kirill A. Shutemov


Re: [PATCH 34/34] mm: dynamically allocate page->ptl if it cannot be embedded to struct page

2013-11-20 Thread Andrey Wagin
2013/11/20 Kirill A. Shutemov <kirill.shute...@linux.intel.com>:
> Andrey Wagin wrote:
>> Hi Kirill,
>>
>> Looks like this patch adds memory leaks.
>> [  116.188310] kmemleak: 15672 new suspected memory leaks (see
>> /sys/kernel/debug/kmemleak)
>> unreferenced object 0x8800da45a350 (size 96):
>>   comm "dracut-initqueu", pid 93, jiffies 4294671391 (age 362.277s)
>>   hex dump (first 32 bytes):
>> 07 00 07 00 ad 4e ad de ff ff ff ff 6b 6b 6b 6b  .N..
>> ff ff ff ff ff ff ff ff 80 24 b4 82 ff ff ff ff  .$..
>>   backtrace:
>> [] kmemleak_alloc+0x5e/0xc0
>> [] kmem_cache_alloc_trace+0x113/0x290
>> [] __ptlock_alloc+0x27/0x50
>> [] __pmd_alloc+0x59/0x170
>> [] copy_page_range+0x38a/0x3e0
>> [] dup_mm+0x313/0x540
>> [] copy_process+0x161a/0x1880
>> [] do_fork+0x8b/0x360
>> [] SyS_clone+0x16/0x20
>> [] stub_clone+0x69/0x90
>> [] 0x
>>
>> It's quite serious: my test host panicked within a few hours.
>
> Sorry for that.
>
> Could you test the patch below?

Yes, it works.

I found this too a few minutes ago:)

diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index a7cccb6..44c366c 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -62,6 +62,7 @@ void ___pte_free_tlb(struct mmu_gather *tlb, struct page *pte)
 void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
 {
    paravirt_release_pmd(__pa(pmd) >> PAGE_SHIFT);
+   pgtable_pmd_page_dtor(virt_to_page(pmd));
    /*
     * NOTE! For PAE, any changes to the top page-directory-pointer-table
     * entries need a full cr3 reload to flush.

Thanks.

>
> diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> index a7cccb6d7fec..7be5809754cf 100644
> --- a/arch/x86/mm/pgtable.c
> +++ b/arch/x86/mm/pgtable.c
> @@ -61,6 +61,7 @@ void ___pte_free_tlb(struct mmu_gather *tlb, struct page *pte)
>  #if PAGETABLE_LEVELS > 2
>  void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
>  {
> +   struct page *page = virt_to_page(pmd);
>     paravirt_release_pmd(__pa(pmd) >> PAGE_SHIFT);
>     /*
>      * NOTE! For PAE, any changes to the top page-directory-pointer-table
> @@ -69,7 +70,8 @@ void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
>  #ifdef CONFIG_X86_PAE
>     tlb->need_flush_all = 1;
>  #endif
> -   tlb_remove_page(tlb, virt_to_page(pmd));
> +   pgtable_pmd_page_dtor(page);
> +   tlb_remove_page(tlb, page);
>  }
>
>  #if PAGETABLE_LEVELS > 3
> --
>  Kirill A. Shutemov


Re: [PATCH 34/34] mm: dynamically allocate page->ptl if it cannot be embedded to struct page

2013-11-20 Thread Kirill A. Shutemov
Andrey Wagin wrote:
> Hi Kirill,
> 
> Looks like this patch adds memory leaks.
> [  116.188310] kmemleak: 15672 new suspected memory leaks (see
> /sys/kernel/debug/kmemleak)
> unreferenced object 0x8800da45a350 (size 96):
>   comm "dracut-initqueu", pid 93, jiffies 4294671391 (age 362.277s)
>   hex dump (first 32 bytes):
> 07 00 07 00 ad 4e ad de ff ff ff ff 6b 6b 6b 6b  .N..
> ff ff ff ff ff ff ff ff 80 24 b4 82 ff ff ff ff  .$..
>   backtrace:
> [] kmemleak_alloc+0x5e/0xc0
> [] kmem_cache_alloc_trace+0x113/0x290
> [] __ptlock_alloc+0x27/0x50
> [] __pmd_alloc+0x59/0x170
> [] copy_page_range+0x38a/0x3e0
> [] dup_mm+0x313/0x540
> [] copy_process+0x161a/0x1880
> [] do_fork+0x8b/0x360
> [] SyS_clone+0x16/0x20
> [] stub_clone+0x69/0x90
> [] 0x
> 
> It's quite serious: my test host panicked within a few hours.

Sorry for that.

Could you test the patch below?

diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index a7cccb6d7fec..7be5809754cf 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -61,6 +61,7 @@ void ___pte_free_tlb(struct mmu_gather *tlb, struct page *pte)
 #if PAGETABLE_LEVELS > 2
 void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
 {
+   struct page *page = virt_to_page(pmd);
    paravirt_release_pmd(__pa(pmd) >> PAGE_SHIFT);
    /*
     * NOTE! For PAE, any changes to the top page-directory-pointer-table
@@ -69,7 +70,8 @@ void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
 #ifdef CONFIG_X86_PAE
    tlb->need_flush_all = 1;
 #endif
-   tlb_remove_page(tlb, virt_to_page(pmd));
+   pgtable_pmd_page_dtor(page);
+   tlb_remove_page(tlb, page);
 }
 
 #if PAGETABLE_LEVELS > 3
-- 
 Kirill A. Shutemov
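
The leak in the kmemleak report and the fix in the patch above come down to a constructor/destructor pairing: the lock is allocated when the PMD page is set up, so the free path has to release it. A minimal userspace sketch of that pairing follows. Every `toy_*` name is a hypothetical stand-in invented for illustration, not a kernel definition, and the model ignores TLB batching entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace stand-ins; none of these are the kernel's real definitions. */
struct toy_lock { int locked; };
struct toy_page { struct toy_lock *ptl; };  /* lock lives outside the page */

static int live_locks;  /* number of lock allocations not yet freed */

/* Constructor: allocate the out-of-line lock for a page-table page. */
static bool toy_pmd_page_ctor(struct toy_page *page)
{
    page->ptl = calloc(1, sizeof(*page->ptl));
    if (!page->ptl)
        return false;
    live_locks++;
    return true;
}

/* Destructor: release the lock. Skipping this call is the leak. */
static void toy_pmd_page_dtor(struct toy_page *page)
{
    assert(page->ptl != NULL);
    free(page->ptl);
    page->ptl = NULL;
    live_locks--;
}

/* Free path; run_dtor == false models the code before the fix. */
static void toy_pmd_free(struct toy_page *page, bool run_dtor)
{
    if (run_dtor)
        toy_pmd_page_dtor(page);
    /* the page itself would be handed back to the allocator here */
}

/* Returns how many locks are still allocated after n alloc/free cycles. */
static int toy_leaked_after(int n, bool run_dtor)
{
    struct toy_page page;

    live_locks = 0;
    for (int i = 0; i < n; i++) {
        if (!toy_pmd_page_ctor(&page))
            break;
        toy_pmd_free(&page, run_dtor);
    }
    return live_locks;
}
```

In this model, `toy_leaked_after(n, false)` grows with every simulated fork-style cycle, while `toy_leaked_after(n, true)` stays at zero, which mirrors the steadily rising leak counts in the report.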


Re: [PATCH 34/34] mm: dynamically allocate page->ptl if it cannot be embedded to struct page

2013-11-20 Thread Andrey Wagin
Hi Kirill,

Looks like this patch adds memory leaks.
[  116.188310] kmemleak: 15672 new suspected memory leaks (see
/sys/kernel/debug/kmemleak)
unreferenced object 0x8800da45a350 (size 96):
  comm "dracut-initqueu", pid 93, jiffies 4294671391 (age 362.277s)
  hex dump (first 32 bytes):
07 00 07 00 ad 4e ad de ff ff ff ff 6b 6b 6b 6b  .N..
ff ff ff ff ff ff ff ff 80 24 b4 82 ff ff ff ff  .$..
  backtrace:
[] kmemleak_alloc+0x5e/0xc0
[] kmem_cache_alloc_trace+0x113/0x290
[] __ptlock_alloc+0x27/0x50
[] __pmd_alloc+0x59/0x170
[] copy_page_range+0x38a/0x3e0
[] dup_mm+0x313/0x540
[] copy_process+0x161a/0x1880
[] do_fork+0x8b/0x360
[] SyS_clone+0x16/0x20
[] stub_clone+0x69/0x90
[] 0x

It's quite serious: my test host panicked within a few hours.

[12000.632734] kmemleak: 74155 new suspected memory leaks (see
/sys/kernel/debug/kmemleak)
[12080.734075] zombie00[29282]: segfault at 0 ip 00401862 sp
7fffc509bc20 error 6 in zombie00[40+5000]
[12619.799052] BUG: unable to handle kernel paging request at 7aa9e3a0
[12619.800044] IP: [] cpuacct_charge+0x97/0x1e0
[12619.800044] PGD 0
[12619.800044] Thread overran stack, or stack corrupted
[12619.800044] Oops:  [#1] SMP
[12619.800044] Modules linked in: binfmt_misc ip6table_filter
ip6_tables tun netlink_diag af_packet_diag udp_diag tcp_diag inet_diag
unix_diag joydev microcode pcspkr i2c_piix4 virtio_balloon virtio_net
i2c_core virtio_blk floppy
[12619.800044] CPU: 1 PID: 1324 Comm: kworker/u4:2 Not tainted 3.12.0+ #142
[12619.800044] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[12619.800044] Workqueue: writeback bdi_writeback_workfn (flush-252:0)
[12619.800044] task: 88001f1a8000 ti: 880096f26000 task.ti:
880096f26000
[12619.800044] RIP: 0010:[]  []
cpuacct_charge+0x97/0x1e0
[12619.800044] RSP: 0018:88011b403ce8  EFLAGS: 00010002
[12619.800044] RAX: d580 RBX: 000f11b1 RCX: 0003
[12619.800044] RDX: 81c49e40 RSI: 81c4bb00 RDI: 88001f1a8c68
[12619.800044] RBP: 88011b403d18 R08: 0001 R09: 0001
[12619.800044] R10: 0001 R11: 0007 R12: 88001f1a8000
[12619.800044] R13: 1f1a8000 R14: 82a86320 R15: 06b1bda1e433
[12619.800044] FS:  () GS:88011b40()
knlGS:
[12619.800044] CS:  0010 DS:  ES:  CR0: 8005003b
[12619.800044] CR2: 7aa9e3a0 CR3: 01c0b000 CR4: 06e0
[12619.800044] Stack:
[12619.800044]  810b2b70 0002 88011b5d40c0
000f11b1
[12619.800044]  88001f1a8068 88001f1a8000 88011b403d58
810a108f
[12619.800044]  88011b403d88 88001f1a8068 88011b5d40c0
88011b5d4000
[12619.800044] Call Trace:
[12619.800044]  <IRQ>
[12619.800044]  [] ? cpuacct_css_alloc+0xb0/0xb0
[12619.800044]  [] update_curr+0x13f/0x230
[12619.800044]  [] task_tick_fair+0x2d7/0x650
[12619.800044]  [] ? sched_clock_cpu+0xb8/0x120
[12619.800044]  [] scheduler_tick+0x6d/0xf0
[12619.800044]  [] update_process_times+0x61/0x80
[12619.800044]  [] tick_sched_handle+0x37/0x80
[12619.800044]  [] tick_sched_timer+0x54/0x90
[12619.800044]  [] __run_hrtimer+0x71/0x2d0
[12619.800044]  [] ? tick_nohz_handler+0xc0/0xc0
[12619.800044]  [] hrtimer_interrupt+0x116/0x2a0
[12619.800044]  [] ? __local_bh_enable+0x49/0x70
[12619.800044]  [] local_apic_timer_interrupt+0x3b/0x60
[12619.800044]  [] smp_apic_timer_interrupt+0x45/0x60
[12619.800044]  [] apic_timer_interrupt+0x6f/0x80
[12619.800044]  <EOI>
[12619.800044]  [] ? mark_held_locks+0x90/0x150
[12619.800044]  [] ? _raw_spin_unlock_irqrestore+0x42/0x70
[12619.800044]  [] virtio_queue_rq+0xdb/0x1b0 [virtio_blk]
[12619.800044]  [] __blk_mq_run_hw_queue+0x1ca/0x520
[12619.800044]  [] blk_mq_run_hw_queue+0x35/0x40
[12619.800044]  [] blk_mq_insert_requests+0xe2/0x190
[12619.800044]  [] blk_mq_flush_plug_list+0x134/0x150
[12619.800044]  [] blk_flush_plug_list+0xbd/0x220
[12619.800044]  [] blk_mq_make_request+0x3da/0x4d0
[12619.800044]  [] generic_make_request+0xca/0x100
[12619.800044]  [] submit_bio+0x76/0x160
[12619.800044]  [] ? test_set_page_writeback+0x36/0x2b0
[12619.800044]  [] ? end_swap_bio_read+0xc0/0xc0
[12619.800044]  [] __swap_writepage+0x198/0x230
[12619.800044]  [] ? _raw_spin_unlock+0x2b/0x40
[12619.800044]  [] ? page_swapcount+0x53/0x70
[12619.800044]  [] swap_writepage+0x43/0x90
[12619.800044]  [] shrink_page_list+0x6cf/0xaa0
[12619.800044]  [] shrink_inactive_list+0x1c2/0x5b0
[12619.800044]  [] ? __lock_acquire+0x23f/0x1810
[12619.800044]  [] shrink_lruvec+0x335/0x600
[12619.800044]  [] ? mem_cgroup_iter+0x1f5/0x510
[12619.800044]  [] shrink_zone+0x96/0x1d0
[12619.800044]  [] do_try_to_free_pages+0x103/0x600
[12619.800044]  [] ? sched_clock_local+0x25/0x90
[12619.800044]  [] try_to_free_pages+0x222/0x440
[12619.800044]  [] __alloc_pages_nodemask+0x8af/0xc70

Re: [PATCH 34/34] mm: dynamically allocate page->ptl if it cannot be embedded to struct page

2013-10-14 Thread Kirill A. Shutemov
Christoph Lameter wrote:
> On Tue, 15 Oct 2013, Kirill A. Shutemov wrote:
> 
> > Feel free to propose a patch. I don't see much point.
> 
> Right now you are using a long to stand in for a spinlock_t or a pointer
> to a spinlock_t. An #ifdef would allow defining the proper type and
> therefore let the compiler check that the ptl is used correctly.

You should not use it directly anyway: page->ptl is not there at all if
USE_SPLIT_PTE_PTLOCKS is 0. The compiler checks would cover only a few
helpers, and using a Kbuild hack for that is overkill to me.

-- 
 Kirill A. Shutemov


Re: [PATCH 34/34] mm: dynamically allocate page->ptl if it cannot be embedded to struct page

2013-10-14 Thread Christoph Lameter
On Tue, 15 Oct 2013, Kirill A. Shutemov wrote:

> Feel free to propose a patch. I don't see much point.

Right now you are using a long to stand in for a spinlock_t or a pointer
to a spinlock_t. An #ifdef would allow defining the proper type and
therefore let the compiler check that the ptl is used correctly.
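
A rough userspace sketch of the typed-field idea, under stated assumptions: `TOY_PTLOCK_OUT_OF_LINE` is a hypothetical switch that some build step would have to provide, and `toy_spinlock_t`, `toy_page` and `toy_ptlock_ptr` are stand-ins rather than kernel definitions. With the switch in place, the field has its real type in both configurations, so the compiler can reject misuse.

```c
#include <assert.h>

/* Hypothetical knobs: a real build would derive these, not hard-code them. */
#ifndef TOY_SPINLOCK_SIZE
#define TOY_SPINLOCK_SIZE 4          /* assumed size of the lock in bytes */
#endif
#ifndef TOY_PTLOCK_OUT_OF_LINE
#define TOY_PTLOCK_OUT_OF_LINE 0     /* 0: lock embedded, 1: lock behind a pointer */
#endif

/* Stand-in for spinlock_t. */
typedef struct { unsigned char raw[TOY_SPINLOCK_SIZE]; } toy_spinlock_t;

struct toy_page {
    unsigned long flags;
#if TOY_PTLOCK_OUT_OF_LINE
    toy_spinlock_t *ptl;             /* separately allocated lock */
#else
    toy_spinlock_t ptl;              /* lock embedded in the page descriptor */
#endif
};

#if !TOY_PTLOCK_OUT_OF_LINE
/* Catch a wrong switch setting at build time instead of at run time. */
_Static_assert(sizeof(toy_spinlock_t) <= sizeof(long),
               "lock does not fit the embedded slot");
#endif

/* The one accessor callers would use; it returns a properly typed pointer. */
static inline toy_spinlock_t *toy_ptlock_ptr(struct toy_page *page)
{
#if TOY_PTLOCK_OUT_OF_LINE
    return page->ptl;
#else
    return &page->ptl;
#endif
}
```

The sketch also shows the cost being debated: the preprocessor cannot evaluate `sizeof`, so the switch has to come from outside the C source.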



Re: [PATCH 34/34] mm: dynamically allocate page->ptl if it cannot be embedded to struct page

2013-10-14 Thread Kirill A. Shutemov
Christoph Lameter wrote:
> On Mon, 14 Oct 2013, Kirill A. Shutemov wrote:
> 
> > > > > Could you make the check a CONFIG option? 
> > > > > CONFIG_PTLOCK_DOES_NOT_FIT_IN_PAGE_STRUCT or
> > > > > so?
> > > >
> > > > No. We will have to track what affects sizeof(spinlock_t) manually.
> > > > Not fun, and error-prone.
> > >
> > > You can generate a config option depending on the size of the object via
> > > Kbuild. Kbuild will determine the setting before building the kernel as a
> > > whole by running some small C program.
> >
> > I don't think it's any better than what we have there now.
> 
> Well, with the CONFIG options we can then create macros etc. that handle
> things differently depending on whether the ptl is in the page or not.

Feel free to propose a patch. I don't see much point.

-- 
 Kirill A. Shutemov


Re: [PATCH 34/34] mm: dynamically allocate page->ptl if it cannot be embedded to struct page

2013-10-14 Thread Christoph Lameter
On Mon, 14 Oct 2013, Kirill A. Shutemov wrote:

> > > > Could you make the check a CONFIG option? 
> > > > CONFIG_PTLOCK_DOES_NOT_FIT_IN_PAGE_STRUCT or
> > > > so?
> > >
> > > No. We will have to track what affects sizeof(spinlock_t) manually.
> > > Not fun, and error-prone.
> >
> > You can generate a config option depending on the size of the object via
> > Kbuild. Kbuild will determine the setting before building the kernel as a
> > whole by running some small C program.
>
> I don't think it's any better than what we have there now.

Well, with the CONFIG options we can then create macros etc. that handle
things differently depending on whether the ptl is in the page or not.
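
As an illustration of the "small C program" approach, here is a sketch of the comparison such a generator might perform and the line it might write into a generated header. The function name `toy_emit_ptlock_config` and the macro name `TOY_PTLOCK_OUT_OF_LINE` are invented for this example; how Kbuild would actually run and consume such a program is not shown and is not specified in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format the line a generator could emit: 1 if the lock is larger than
 * the slot reserved for it in the page descriptor, 0 otherwise.
 * Returns the number of characters written, as snprintf() reports it.
 */
static int toy_emit_ptlock_config(char *buf, size_t len,
                                  size_t lock_size, size_t slot_size)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "#define TOY_PTLOCK_OUT_OF_LINE %d\n",
                    lock_size > slot_size ? 1 : 0);
}
```

A generator's `main()` would call this with `sizeof(spinlock_t)` and the slot size as seen under the target configuration, then write the buffer to a header that the rest of the build includes.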



Re: [PATCH 34/34] mm: dynamically allocate page->ptl if it cannot be embedded to struct page

2013-10-14 Thread Kirill A. Shutemov
Christoph Lameter wrote:
> On Thu, 10 Oct 2013, Kirill A. Shutemov wrote:
> 
> > Christoph Lameter wrote:
> > > On Thu, 10 Oct 2013, Kirill A. Shutemov wrote:
> > >
> > > > +static inline bool ptlock_alloc(struct page *page)
> > > > +{
> > > > +   if (sizeof(spinlock_t) > sizeof(page->ptl))
> > > > +   return __ptlock_alloc(page);
> > > > +   return true;
> > > > +}
> > >
> > > Could you make the check a CONFIG option? 
> > > CONFIG_PTLOCK_DOES_NOT_FIT_IN_PAGE_STRUCT or
> > > so?
> >
> > No. We will have to track what affects sizeof(spinlock_t) manually.
> > Not fun, and error-prone.
> 
> You can generate a config option depending on the size of the object via
> Kbuild. Kbuild will determine the setting before building the kernel as a
> whole by running some small C program.

I don't think it's any better than what we have there now.

-- 
 Kirill A. Shutemov
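
For comparison, the `sizeof`-based check quoted in this message can be modelled in userspace as follows. This is a sketch only: the `toy_*` names are hypothetical, `TOY_SPINLOCK_SIZE` defaults to an arbitrary 64 bytes to force the out-of-line path, and `uintptr_t` is used for the slot so the pointer round-trip is portable (the thread describes the slot as a long). Because the condition is a compile-time constant, the compiler can discard the branch that does not apply, with no config option involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#ifndef TOY_SPINLOCK_SIZE
#define TOY_SPINLOCK_SIZE 64   /* assumed: a debug-enlarged lock */
#endif

/* Stand-in for spinlock_t. */
typedef struct { unsigned char raw[TOY_SPINLOCK_SIZE]; } toy_spinlock_t;

/* The slot holds either the lock itself or a pointer to it. */
struct toy_page { uintptr_t ptl; };

/* Slow path: allocate the lock separately and store its address. */
static bool toy_ptlock_alloc_slow(struct toy_page *page)
{
    toy_spinlock_t *lock = calloc(1, sizeof(*lock));

    if (!lock)
        return false;
    page->ptl = (uintptr_t)lock;
    return true;
}

/* Same shape as the quoted helper: the condition is constant-folded. */
static inline bool toy_ptlock_alloc(struct toy_page *page)
{
    if (sizeof(toy_spinlock_t) > sizeof(page->ptl))
        return toy_ptlock_alloc_slow(page);
    return true;
}

/* Resolve the slot to a lock pointer in either layout. */
static inline toy_spinlock_t *toy_ptlock_ptr(struct toy_page *page)
{
    if (sizeof(toy_spinlock_t) > sizeof(page->ptl))
        return (toy_spinlock_t *)page->ptl;
    return (toy_spinlock_t *)&page->ptl;
}

/* Release the lock if it was allocated out of line. */
static inline void toy_ptlock_free(struct toy_page *page)
{
    if (sizeof(toy_spinlock_t) > sizeof(page->ptl))
        free((void *)page->ptl);
}
```

The trade-off raised in the thread is visible here: the slot carries no type information, so correctness depends on callers going through the helpers.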


Re: [PATCH 34/34] mm: dynamically allocate page-ptl if it cannot be embedded to struct page

2013-10-14 Thread Kirill A. Shutemov
Christoph Lameter wrote:
 On Thu, 10 Oct 2013, Kirill A. Shutemov wrote:
 
  Christoph Lameter wrote:
   On Thu, 10 Oct 2013, Kirill A. Shutemov wrote:
  
+static inline bool ptlock_alloc(struct page *page)
+{
+   if (sizeof(spinlock_t)  sizeof(page-ptl))
+   return __ptlock_alloc(page);
+   return true;
+}
  
   Could you make the check a CONFIG option? 
   CONFIG_PTLOCK_DOES_NOT_FIT_IN_PAGE_STRUCT or
   so?
 
  No. We will have to track what affects sizeof(spinlock_t) manually.
  Not a fun and error prune.
 
 You can generate a config option depending on the size of the object via
 Kbuild. Kbuild will determine the setting before building the kernel as a
 whole by runing some small C program.

I don't think it's any better than what we have there now.

-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 34/34] mm: dynamically allocate page-ptl if it cannot be embedded to struct page

2013-10-14 Thread Christoph Lameter
On Mon, 14 Oct 2013, Kirill A. Shutemov wrote:

Could you make the check a CONFIG option? 
CONFIG_PTLOCK_DOES_NOT_FIT_IN_PAGE_STRUCT or
so?
  
   No. We will have to track what affects sizeof(spinlock_t) manually.
   Not a fun and error prune.
 
  You can generate a config option depending on the size of the object via
  Kbuild. Kbuild will determine the setting before building the kernel as a
  whole by runing some small C program.

 I don't think it's any better than what we have there now.

Well with the CONFIG options we can then create macros etc that handle
things differently depending on the ptl being in the page or not.



Re: [PATCH 34/34] mm: dynamically allocate page->ptl if it cannot be embedded to struct page

2013-10-14 Thread Kirill A. Shutemov
Christoph Lameter wrote:
> On Mon, 14 Oct 2013, Kirill A. Shutemov wrote:
>
> > > > > Could you make the check a CONFIG option?
> > > > > CONFIG_PTLOCK_DOES_NOT_FIT_IN_PAGE_STRUCT or
> > > > > so?
> > > >
> > > > No. We will have to track what affects sizeof(spinlock_t) manually.
> > > > Not fun and error-prone.
> > >
> > > You can generate a config option depending on the size of the object via
> > > Kbuild. Kbuild will determine the setting before building the kernel as a
> > > whole by running some small C program.
> >
> > I don't think it's any better than what we have there now.
>
> Well with the CONFIG options we can then create macros etc that handle
> things differently depending on the ptl being in the page or not.

Feel free to propose a patch. I don't see much point.

-- 
 Kirill A. Shutemov


Re: [PATCH 34/34] mm: dynamically allocate page->ptl if it cannot be embedded to struct page

2013-10-14 Thread Christoph Lameter
On Tue, 15 Oct 2013, Kirill A. Shutemov wrote:

> Feel free to propose a patch. I don't see much point.

Right now you are using a long to stand in for a spinlock_t or a pointer
to a spinlock_t. An #ifdef would allow defining the proper type, so the
compiler could check that the ptl is used correctly.



Re: [PATCH 34/34] mm: dynamically allocate page->ptl if it cannot be embedded to struct page

2013-10-14 Thread Kirill A. Shutemov
Christoph Lameter wrote:
> On Tue, 15 Oct 2013, Kirill A. Shutemov wrote:
>
> > Feel free to propose a patch. I don't see much point.
>
> Right now you are using a long to stand in for a spinlock_t or a pointer
> to a spinlock_t. An #ifdef would allow defining the proper type, so the
> compiler could check that the ptl is used correctly.

You should not use it directly anyway: page->ptl is not there at all if
USE_SPLIT_PTE_PTLOCKS is 0. The compiler checks would be limited to a few
helpers, and using a kbuild hack for that is overkill to me.
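
As a rough user-space illustration of why callers are expected to go
through helpers rather than name the field themselves, consider the sketch
below. It is not kernel code: fake_spinlock_t, struct fake_mm, struct
fake_page and fake_pte_lockptr() are hypothetical stand-ins invented for
this example, and only the USE_SPLIT_PTE_PTLOCKS name comes from the
thread.

```c
#include <assert.h>
#include <stddef.h>

/* Set to 0 to model a build where split PTE locks are disabled. */
#ifndef USE_SPLIT_PTE_PTLOCKS
#define USE_SPLIT_PTE_PTLOCKS 1
#endif

/* Hypothetical stand-in for spinlock_t. */
typedef struct { int locked; } fake_spinlock_t;

struct fake_mm {
	fake_spinlock_t page_table_lock; /* one lock for the whole address space */
};

struct fake_page {
#if USE_SPLIT_PTE_PTLOCKS
	fake_spinlock_t ptl;             /* per-table lock; absent otherwise */
#endif
	int unused;                      /* keeps the struct non-empty in both builds */
};

/* Callers ask the helper for the lock and never touch page->ptl directly. */
static fake_spinlock_t *fake_pte_lockptr(struct fake_mm *mm,
					 struct fake_page *page)
{
	assert(mm != NULL && page != NULL);
#if USE_SPLIT_PTE_PTLOCKS
	return &page->ptl;
#else
	return &mm->page_table_lock;
#endif
}
```

With the macro set to 0 the same call compiles unchanged and returns the
mm-wide lock, whereas any code that spelled out page->ptl would fail to
build.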

-- 
 Kirill A. Shutemov


Re: [PATCH 34/34] mm: dynamically allocate page->ptl if it cannot be embedded to struct page

2013-10-11 Thread Christoph Lameter
On Thu, 10 Oct 2013, Kirill A. Shutemov wrote:

> Christoph Lameter wrote:
> > On Thu, 10 Oct 2013, Kirill A. Shutemov wrote:
> >
> > > +static inline bool ptlock_alloc(struct page *page)
> > > +{
> > > + if (sizeof(spinlock_t) > sizeof(page->ptl))
> > > + return __ptlock_alloc(page);
> > > + return true;
> > > +}
> >
> > Could you make the check a CONFIG option? 
> > CONFIG_PTLOCK_DOES_NOT_FIT_IN_PAGE_STRUCT or
> > so?
>
> No. We will have to track what affects sizeof(spinlock_t) manually.
> Not fun and error-prone.

You can generate a config option depending on the size of the object via
Kbuild. Kbuild will determine the setting before building the kernel as a
whole by running some small C program.
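
One way to picture the "small C program" idea is a generator that compares
two sizes and prints a config fragment. The sketch below is only an
assumption about what such a program could look like; it does not claim to
show how Kbuild does this. emit_ptlock_config() is a hypothetical name, and
the option name CONFIG_PTLOCK_NOT_FITTING is borrowed from the proposal
made elsewhere in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical generator: decide whether a lock of lock_size bytes fits
 * into a slot of slot_size bytes and render the matching config fragment
 * into buf. Returns whatever snprintf() returns.
 */
static int emit_ptlock_config(char *buf, size_t buflen,
			      size_t lock_size, size_t slot_size)
{
	assert(buf != NULL && buflen > 0);
	if (lock_size > slot_size)
		return snprintf(buf, buflen, "CONFIG_PTLOCK_NOT_FITTING=y\n");
	return snprintf(buf, buflen,
			"# CONFIG_PTLOCK_NOT_FITTING is not set\n");
}
```

A real generator would pass sizeof(spinlock_t) and sizeof(unsigned long)
as measured with the kernel's own headers and configuration, which is the
part that needs build-system support.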



Re: [PATCH 34/34] mm: dynamically allocate page->ptl if it cannot be embedded to struct page

2013-10-10 Thread Kirill A. Shutemov
Christoph Lameter wrote:
> On Thu, 10 Oct 2013, Kirill A. Shutemov wrote:
> 
> > +static inline bool ptlock_alloc(struct page *page)
> > +{
> > +   if (sizeof(spinlock_t) > sizeof(page->ptl))
> > +   return __ptlock_alloc(page);
> > +   return true;
> > +}
> 
> Could you make the check a CONFIG option? 
> CONFIG_PTLOCK_DOES_NOT_FIT_IN_PAGE_STRUCT or
> so?

No. We will have to track what affects sizeof(spinlock_t) manually.
Not fun and error-prone.

C sucks. ;)

-- 
 Kirill A. Shutemov


Re: [PATCH 34/34] mm: dynamically allocate page->ptl if it cannot be embedded to struct page

2013-10-10 Thread Christoph Lameter
On Thu, 10 Oct 2013, Kirill A. Shutemov wrote:

> +static inline bool ptlock_alloc(struct page *page)
> +{
> + if (sizeof(spinlock_t) > sizeof(page->ptl))
> + return __ptlock_alloc(page);
> + return true;
> +}

Could you make the check a CONFIG option? 
CONFIG_PTLOCK_DOES_NOT_FIT_IN_PAGE_STRUCT or
so?

> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -147,7 +147,10 @@ struct page {
>* system if PG_buddy is set.
>*/
>  #if USE_SPLIT_PTE_PTLOCKS
> - spinlock_t ptl;
> + unsigned long ptl; /* It's spinlock_t if it fits into long,
> + * otherwise it's a pointer to dynamically
> + * allocated spinlock_t.
> + */

If you had such a CONFIG option then you could use the proper type here.

#ifdef CONFIG_PTLOCK_NOT_FITTING
spinlock_t *ptl;
#else
spinlock_t ptl;
#endif

Or some such thing?
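
A compilable user-space sketch of that typed layout, for illustration
only: fake_spinlock_t, struct fake_page and fake_ptlock_ptr() are
hypothetical stand-ins rather than kernel definitions, and
CONFIG_PTLOCK_NOT_FITTING is the option name suggested above, which does
not exist as a real option.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for spinlock_t. */
typedef struct { int locked; } fake_spinlock_t;

struct fake_page {
#ifdef CONFIG_PTLOCK_NOT_FITTING
	fake_spinlock_t *ptl;   /* lock lives elsewhere, allocated separately */
#else
	fake_spinlock_t ptl;    /* lock embedded directly in the page */
#endif
};

/* Single accessor so callers need not care which layout was selected. */
static fake_spinlock_t *fake_ptlock_ptr(struct fake_page *page)
{
	assert(page != NULL);
#ifdef CONFIG_PTLOCK_NOT_FITTING
	return page->ptl;
#else
	return &page->ptl;
#endif
}
```

Because the field carries its real type in both layouts, assigning an
integer or a pointer of the wrong type to it would be diagnosed by the
compiler, which is the benefit being argued for here.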


[PATCH 34/34] mm: dynamically allocate page->ptl if it cannot be embedded to struct page

2013-10-10 Thread Kirill A. Shutemov
If split page table lock is in use, we embed the lock into struct page
of table's page. We have to disable split lock, if spinlock_t is too big
to be embedded, like when DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC is enabled.

This patch adds support for dynamic allocation of split page table lock
if we can't embed it into struct page.

page->ptl is unsigned long now and we use it as spinlock_t if
sizeof(spinlock_t) <= sizeof(long), otherwise it's a pointer to
spinlock_t.

The spinlock_t is allocated in pgtable_page_ctor() for PTE table and in
pgtable_pmd_page_ctor() for PMD table. All other helpers are converted to
support dynamically allocated page->ptl.

Signed-off-by: Kirill A. Shutemov 
---
 Documentation/vm/split_page_table_lock | 90 ++
 arch/x86/xen/mmu.c |  2 +-
 include/linux/mm.h | 72 +++
 include/linux/mm_types.h   |  5 +-
 mm/Kconfig |  2 -
 mm/memory.c| 19 +++
 6 files changed, 166 insertions(+), 24 deletions(-)
 create mode 100644 Documentation/vm/split_page_table_lock

diff --git a/Documentation/vm/split_page_table_lock b/Documentation/vm/split_page_table_lock
new file mode 100644
index 00..e2f617b732
--- /dev/null
+++ b/Documentation/vm/split_page_table_lock
@@ -0,0 +1,90 @@
+Split page table lock
+=====================
+
+Originally, mm->page_table_lock spinlock protected all page tables of the
+mm_struct. But this approach leads to poor page fault scalability of
+multi-threaded applications due to high contention on the lock. To improve
+scalability, split page table lock was introduced.
+
+With split page table lock we have separate per-table lock to serialize
+access to the table. At the moment we use split lock for PTE and PMD
+tables. Access to higher level tables is protected by mm->page_table_lock.
+
+There are helpers to lock/unlock a table and other accessor functions:
+ - pte_offset_map_lock()
+   maps pte and takes PTE table lock, returns pointer to the taken
+   lock;
+ - pte_unmap_unlock()
+   unlocks and unmaps PTE table;
+ - pte_alloc_map_lock()
+   allocates PTE table if needed and takes the lock, returns pointer
+   to the taken lock or NULL if allocation failed;
+ - pte_lockptr()
+   returns pointer to PTE table lock;
+ - pmd_lock()
+   takes PMD table lock, returns pointer to taken lock;
+ - pmd_lockptr()
+   returns pointer to PMD table lock;
+
+Split page table lock for PTE tables is enabled at compile time if
+CONFIG_SPLIT_PTLOCK_CPUS (usually 4) is less than or equal to NR_CPUS.
+If split lock is disabled, all tables are guarded by mm->page_table_lock.
+
+Split page table lock for PMD tables is enabled, if it's enabled for PTE
+tables and the architecture supports it (see below).
+
+Hugetlb and split page table lock
+---------------------------------
+
+Hugetlb can support several page sizes. We use split lock only for PMD
+level, but not for PUD.
+
+Hugetlb-specific helpers:
+ - huge_pte_lock()
+   takes pmd split lock for PMD_SIZE page, mm->page_table_lock
+   otherwise;
+ - huge_pte_lockptr()
+   returns pointer to table lock;
+
+Support of split page table lock by an architecture
+---------------------------------------------------
+
+No special enabling is needed for PTE split page table lock:
+everything required is done by pgtable_page_ctor() and pgtable_page_dtor(),
+which must be called on PTE table allocation / freeing.
+
+PMD split lock only makes sense if you have more than two page table
+levels.
+
+PMD split lock enabling requires pgtable_pmd_page_ctor() call on PMD table
+allocation and pgtable_pmd_page_dtor() on freeing.
+
+Allocation usually happens in pmd_alloc_one(), freeing in pmd_free(), but
+make sure you cover all PMD table allocation / freeing paths: e.g. X86_PAE
+preallocates a few PMDs in pgd_alloc().
+
+With everything in place you can set CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK.
+
+NOTE: pgtable_page_ctor() and pgtable_pmd_page_ctor() can fail -- it must
+be handled properly.
+
+page->ptl
+---------
+
+page->ptl is used to access split page table lock, where 'page' is struct
+page of page containing the table. It shares storage with page->private
+(and a few other fields in the union).
+
+To avoid increasing size of struct page and have best performance, we use a
+trick:
+ - if spinlock_t fits into long, we use page->ptl as spinlock, so we
+   can avoid indirect access and save a cache line;
+ - if size of spinlock_t is bigger than size of long, we use page->ptl as
+   pointer to spinlock_t and allocate it dynamically. This allows using
+   split lock with DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC enabled, but costs
+   one more cache line for indirect access.
+
+The spinlock_t is allocated in pgtable_page_ctor() for PTE table and in
+pgtable_pmd_page_ctor() for PMD table.
+
+Please never access page->ptl directly -- use the appropriate helper.
diff --git a/arch/x86/xen/mmu.c 
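
The page->ptl trick described in the documentation above can be
approximated in user space as follows. This is a sketch under stated
assumptions, not the patch's implementation: fake_spinlock_t is a
deliberately oversized stand-in for a debug-enabled spinlock_t, all fake_*
names are hypothetical, calloc()/free() stand in for the kernel allocator,
and the code assumes a platform where a pointer fits in unsigned long.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in lock, sized like a debug build so it does not fit in a long. */
typedef struct {
	int locked;
	void *debug_owner;
	unsigned int debug_magic;
} fake_spinlock_t;

struct fake_page {
	unsigned long ptl; /* the lock itself if it fits, else a pointer to it */
};

/* Constructor side: allocate the lock only when it cannot be embedded. */
static bool fake_ptlock_alloc(struct fake_page *page)
{
	assert(page != NULL);
	page->ptl = 0;
	if (sizeof(fake_spinlock_t) > sizeof(page->ptl)) {
		fake_spinlock_t *ptl = calloc(1, sizeof(*ptl));

		if (!ptl)
			return false; /* callers must handle this failure */
		page->ptl = (unsigned long)ptl;
	}
	return true;
}

/* Accessor: hide whether the field holds the lock or points to it. */
static fake_spinlock_t *fake_ptlock_ptr(struct fake_page *page)
{
	if (sizeof(fake_spinlock_t) > sizeof(page->ptl))
		return (fake_spinlock_t *)page->ptl;
	return (fake_spinlock_t *)&page->ptl;
}

/* Destructor side: release the separately allocated lock, if any. */
static void fake_ptlock_free(struct fake_page *page)
{
	if (sizeof(fake_spinlock_t) > sizeof(page->ptl))
		free((void *)page->ptl);
	page->ptl = 0;
}
```

Both sizeof comparisons are compile-time constants, so the compiler can
drop the unused branch. Every successful fake_ptlock_alloc() needs a
matching fake_ptlock_free(), otherwise the separately allocated lock is
lost when the table page goes away.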
