RE: Error in frreing hugepages with preemption enabled
-Original Message- From: Andrea Arcangeli [mailto:aarca...@redhat.com] Sent: Wednesday, December 04, 2013 3:51 AM To: Alexander Graf Cc: Bhushan Bharat-R65777; linuxppc-dev@lists.ozlabs.org; kvm- p...@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Ben Herrenschmidt Subject: Re: Error in frreing hugepages with preemption enabled Hi everyone, On Fri, Nov 29, 2013 at 12:13:03PM +0100, Alexander Graf wrote: On 29.11.2013, at 05:38, Bharat Bhushan bharat.bhus...@freescale.com wrote: Hi Alex, I am running KVM guest with host kernel having CONFIG_PREEMPT enabled. With allocated pages things seems to work fine but I uses hugepages for guest I see below prints when quit from qemu. (qemu) QEMU waiting for connection on: telnet:0.0.0.0:,server qemu-system-ppc64: pci_add_option_rom: failed to find romfile efi- virtio.rom q debug_smp_processor_id: 15 callbacks suppressed BUG: using smp_processor_id() in preemptible [] code: qemu-system-ppc/2504 caller is .free_hugepd_range+0xb0/0x21c CPU: 1 PID: 2504 Comm: qemu-system-ppc Not tainted 3.12.0-rc3-07733-gabf4907 #175 Call Trace: [c000fb433400] [c0007d38] .show_stack+0x7c/0x1cc (unreliable) [c000fb4334d0] [c05e8ce0] .dump_stack+0x9c/0xf4 [c000fb433560] [c02de5ec] .debug_smp_processor_id+0x108/0x11c [c000fb4335f0] [c0025e10] .free_hugepd_range+0xb0/0x21c [c000fb433680] [c00265bc] .hugetlb_free_pgd_range+0x2c8/0x3b0 [c000fb4337a0] [c00e428c] .free_pgtables+0x14c/0x158 [c000fb433840] [c00ef320] .exit_mmap+0xec/0x194 [c000fb433960] [c004d780] .mmput+0x64/0x124 [c000fb4339e0] [c0051f40] .do_exit+0x29c/0x9c8 [c000fb433ae0] [c00527c8] .do_group_exit+0x50/0xc4 [c000fb433b70] [c00606a0] .get_signal_to_deliver+0x21c/0x5d8 [c000fb433c70] [c0009b08] .do_signal+0x54/0x278 [c000fb433db0] [c0009e50] .do_notify_resume+0x64/0x78 [c000fb433e30] [cb44] .ret_from_except_lite+0x70/0x74 This mean that free_hugepd_range() must be called with preemption enabled. with preemption disabled. I tried below change and this seems to work fine (I am not having expertise in this area so not sure this is correct way) Not sure - the scope looks odd to me. Let's ask Andrea - I'm sure he knows what to do :). :) So I had a look at the top of this function (0xb0) in the upstream kernel and no smp_processor_id() call is apparent, is this stock git or a ppc tree? The first few calls seem not to call it but I may have overlooked something. It's just quicker if somebody with vmlinux finds the location of it. static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int pdshift, unsigned long start, unsigned long end, unsigned long floor, unsigned long ceiling) { pte_t *hugepte = hugepd_page(*hpdp); int i; unsigned long pdmask = ~((1UL pdshift) - 1); unsigned int num_hugepd = 1; #ifdef CONFIG_PPC_FSL_BOOK3E /* Note: On fsl the hpdp may be the first of several */ num_hugepd = (1 (hugepd_shift(*hpdp) - pdshift)); #else unsigned int shift = hugepd_shift(*hpdp); #endif start = pdmask; if (start floor) return; if (ceiling) { ceiling = pdmask; if (! ceiling) return; } if (end - 1 ceiling - 1) return; for (i = 0; i num_hugepd; i++, hpdp++) hpdp-pd = 0; tlb-need_flush = 1; #ifdef CONFIG_PPC_FSL_BOOK3E hugepd_free(tlb, hugepte); #else pgtable_free_tlb(tlb, hugepte, pdshift - shift); #endif } Generally smp_processor_id should never be used, exactly to avoid problems like above with preempion enabled in .config. Instead it should be replaced with a get_cpu()/put_cpu() pair that is exactly meant to fix bugs like this and define proper critical sections around the per- cpu variables. #define get_cpu() ({ preempt_disable(); smp_processor_id(); }) #define put_cpu() preempt_enable() After you find where that smp_processor_id() is located, you should simply replace it with a get_cpu() and then you should insert a put_cpu immediately after the cpu info is not used anymore. That will define a proper and strict critical section around the use of the per-cpu variables. With a ppc vmlinux it should be immediate to find the location of smp_processor_id but I don't have the ppc vmlinux here. Thanks Andrea for the details description. It is really helpful I will look into this. Thanks -Bharat Thanks! Andrea Alex diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index d67db4b..6bf8459
Re: Error in frreing hugepages with preemption enabled
Hi everyone, On Fri, Nov 29, 2013 at 12:13:03PM +0100, Alexander Graf wrote: On 29.11.2013, at 05:38, Bharat Bhushan bharat.bhus...@freescale.com wrote: Hi Alex, I am running KVM guest with host kernel having CONFIG_PREEMPT enabled. With allocated pages things seems to work fine but I uses hugepages for guest I see below prints when quit from qemu. (qemu) QEMU waiting for connection on: telnet:0.0.0.0:,server qemu-system-ppc64: pci_add_option_rom: failed to find romfile efi-virtio.rom q debug_smp_processor_id: 15 callbacks suppressed BUG: using smp_processor_id() in preemptible [] code: qemu-system-ppc/2504 caller is .free_hugepd_range+0xb0/0x21c CPU: 1 PID: 2504 Comm: qemu-system-ppc Not tainted 3.12.0-rc3-07733-gabf4907 #175 Call Trace: [c000fb433400] [c0007d38] .show_stack+0x7c/0x1cc (unreliable) [c000fb4334d0] [c05e8ce0] .dump_stack+0x9c/0xf4 [c000fb433560] [c02de5ec] .debug_smp_processor_id+0x108/0x11c [c000fb4335f0] [c0025e10] .free_hugepd_range+0xb0/0x21c [c000fb433680] [c00265bc] .hugetlb_free_pgd_range+0x2c8/0x3b0 [c000fb4337a0] [c00e428c] .free_pgtables+0x14c/0x158 [c000fb433840] [c00ef320] .exit_mmap+0xec/0x194 [c000fb433960] [c004d780] .mmput+0x64/0x124 [c000fb4339e0] [c0051f40] .do_exit+0x29c/0x9c8 [c000fb433ae0] [c00527c8] .do_group_exit+0x50/0xc4 [c000fb433b70] [c00606a0] .get_signal_to_deliver+0x21c/0x5d8 [c000fb433c70] [c0009b08] .do_signal+0x54/0x278 [c000fb433db0] [c0009e50] .do_notify_resume+0x64/0x78 [c000fb433e30] [cb44] .ret_from_except_lite+0x70/0x74 This mean that free_hugepd_range() must be called with preemption enabled. with preemption disabled. I tried below change and this seems to work fine (I am not having expertise in this area so not sure this is correct way) Not sure - the scope looks odd to me. Let's ask Andrea - I'm sure he knows what to do :). :) So I had a look at the top of this function (0xb0) in the upstream kernel and no smp_processor_id() call is apparent, is this stock git or a ppc tree? The first few calls seem not to call it but I may have overlooked something. It's just quicker if somebody with vmlinux finds the location of it. static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int pdshift, unsigned long start, unsigned long end, unsigned long floor, unsigned long ceiling) { pte_t *hugepte = hugepd_page(*hpdp); int i; unsigned long pdmask = ~((1UL pdshift) - 1); unsigned int num_hugepd = 1; #ifdef CONFIG_PPC_FSL_BOOK3E /* Note: On fsl the hpdp may be the first of several */ num_hugepd = (1 (hugepd_shift(*hpdp) - pdshift)); #else unsigned int shift = hugepd_shift(*hpdp); #endif start = pdmask; if (start floor) return; if (ceiling) { ceiling = pdmask; if (! ceiling) return; } if (end - 1 ceiling - 1) return; for (i = 0; i num_hugepd; i++, hpdp++) hpdp-pd = 0; tlb-need_flush = 1; #ifdef CONFIG_PPC_FSL_BOOK3E hugepd_free(tlb, hugepte); #else pgtable_free_tlb(tlb, hugepte, pdshift - shift); #endif } Generally smp_processor_id should never be used, exactly to avoid problems like above with preempion enabled in .config. Instead it should be replaced with a get_cpu()/put_cpu() pair that is exactly meant to fix bugs like this and define proper critical sections around the per-cpu variables. #define get_cpu() ({ preempt_disable(); smp_processor_id(); }) #define put_cpu() preempt_enable() After you find where that smp_processor_id() is located, you should simply replace it with a get_cpu() and then you should insert a put_cpu immediately after the cpu info is not used anymore. That will define a proper and strict critical section around the use of the per-cpu variables. With a ppc vmlinux it should be immediate to find the location of smp_processor_id but I don't have the ppc vmlinux here. Thanks! Andrea Alex diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index d67db4b..6bf8459 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -563,8 +563,10 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud, */ next = addr + (1 hugepd_shift(*(hugepd_t *)pmd)); #endif + preempt_disable(); free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT, addr, next, floor, ceiling); + preempt_enable(); } while (addr = next, addr != end);
Re: Error in frreing hugepages with preemption enabled
On Tue, 2013-12-03 at 23:21 +0100, Andrea Arcangeli wrote: #ifdef CONFIG_PPC_FSL_BOOK3E hugepd_free(tlb, hugepte); ^^ This is the culprit (Alex, you didn't specify this was embedded or did I miss it ?) #else pgtable_free_tlb(tlb, hugepte, pdshift - shift); #endif } That function does: batchp = __get_cpu_var(hugepd_freelist_cur); IE, it tries to use a per-CPU batch. Basically, it's duplicating the logic in mm/memory.c for RCU freeing using a per-cpu freelist. I suppose it assumes being called under something like the page table lock ? This code also never flushes the batch, which is a concern... Alex, this is Freescale stuff, can you followup with them ? Cheers, Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: Error in frreing hugepages with preemption enabled
On 29.11.2013, at 05:38, Bharat Bhushan bharat.bhus...@freescale.com wrote: Hi Alex, I am running KVM guest with host kernel having CONFIG_PREEMPT enabled. With allocated pages things seems to work fine but I uses hugepages for guest I see below prints when quit from qemu. (qemu) QEMU waiting for connection on: telnet:0.0.0.0:,server qemu-system-ppc64: pci_add_option_rom: failed to find romfile efi-virtio.rom q debug_smp_processor_id: 15 callbacks suppressed BUG: using smp_processor_id() in preemptible [] code: qemu-system-ppc/2504 caller is .free_hugepd_range+0xb0/0x21c CPU: 1 PID: 2504 Comm: qemu-system-ppc Not tainted 3.12.0-rc3-07733-gabf4907 #175 Call Trace: [c000fb433400] [c0007d38] .show_stack+0x7c/0x1cc (unreliable) [c000fb4334d0] [c05e8ce0] .dump_stack+0x9c/0xf4 [c000fb433560] [c02de5ec] .debug_smp_processor_id+0x108/0x11c [c000fb4335f0] [c0025e10] .free_hugepd_range+0xb0/0x21c [c000fb433680] [c00265bc] .hugetlb_free_pgd_range+0x2c8/0x3b0 [c000fb4337a0] [c00e428c] .free_pgtables+0x14c/0x158 [c000fb433840] [c00ef320] .exit_mmap+0xec/0x194 [c000fb433960] [c004d780] .mmput+0x64/0x124 [c000fb4339e0] [c0051f40] .do_exit+0x29c/0x9c8 [c000fb433ae0] [c00527c8] .do_group_exit+0x50/0xc4 [c000fb433b70] [c00606a0] .get_signal_to_deliver+0x21c/0x5d8 [c000fb433c70] [c0009b08] .do_signal+0x54/0x278 [c000fb433db0] [c0009e50] .do_notify_resume+0x64/0x78 [c000fb433e30] [cb44] .ret_from_except_lite+0x70/0x74 This mean that free_hugepd_range() must be called with preemption enabled. with preemption disabled. I tried below change and this seems to work fine (I am not having expertise in this area so not sure this is correct way) Not sure - the scope looks odd to me. Let's ask Andrea - I'm sure he knows what to do :). Alex diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index d67db4b..6bf8459 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -563,8 +563,10 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud, */ next = addr + (1 hugepd_shift(*(hugepd_t *)pmd)); #endif + preempt_disable(); free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT, addr, next, floor, ceiling); + preempt_enable(); } while (addr = next, addr != end); start = PUD_MASK; Thanks -Bharat -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Error in frreing hugepages with preemption enabled
Hi Alex, I am running KVM guest with host kernel having CONFIG_PREEMPT enabled. With allocated pages things seems to work fine but I uses hugepages for guest I see below prints when quit from qemu. (qemu) QEMU waiting for connection on: telnet:0.0.0.0:,server qemu-system-ppc64: pci_add_option_rom: failed to find romfile efi-virtio.rom q debug_smp_processor_id: 15 callbacks suppressed BUG: using smp_processor_id() in preemptible [] code: qemu-system-ppc/2504 caller is .free_hugepd_range+0xb0/0x21c CPU: 1 PID: 2504 Comm: qemu-system-ppc Not tainted 3.12.0-rc3-07733-gabf4907 #175 Call Trace: [c000fb433400] [c0007d38] .show_stack+0x7c/0x1cc (unreliable) [c000fb4334d0] [c05e8ce0] .dump_stack+0x9c/0xf4 [c000fb433560] [c02de5ec] .debug_smp_processor_id+0x108/0x11c [c000fb4335f0] [c0025e10] .free_hugepd_range+0xb0/0x21c [c000fb433680] [c00265bc] .hugetlb_free_pgd_range+0x2c8/0x3b0 [c000fb4337a0] [c00e428c] .free_pgtables+0x14c/0x158 [c000fb433840] [c00ef320] .exit_mmap+0xec/0x194 [c000fb433960] [c004d780] .mmput+0x64/0x124 [c000fb4339e0] [c0051f40] .do_exit+0x29c/0x9c8 [c000fb433ae0] [c00527c8] .do_group_exit+0x50/0xc4 [c000fb433b70] [c00606a0] .get_signal_to_deliver+0x21c/0x5d8 [c000fb433c70] [c0009b08] .do_signal+0x54/0x278 [c000fb433db0] [c0009e50] .do_notify_resume+0x64/0x78 [c000fb433e30] [cb44] .ret_from_except_lite+0x70/0x74 This mean that free_hugepd_range() must be called with preemption enabled. I tried below change and this seems to work fine (I am not having expertise in this area so not sure this is correct way) diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index d67db4b..6bf8459 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -563,8 +563,10 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud, */ next = addr + (1 hugepd_shift(*(hugepd_t *)pmd)); #endif + preempt_disable(); free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT, addr, next, floor, ceiling); + preempt_enable(); } while (addr = next, addr != end); start = PUD_MASK; Thanks -Bharat ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev