RE: Error in frreing hugepages with preemption enabled

2013-12-05 Thread Bharat Bhushan


 -Original Message-
 From: Andrea Arcangeli [mailto:aarca...@redhat.com]
 Sent: Wednesday, December 04, 2013 3:51 AM
 To: Alexander Graf
 Cc: Bhushan Bharat-R65777; linuxppc-dev@lists.ozlabs.org; kvm-
 p...@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Ben 
 Herrenschmidt
 Subject: Re: Error in frreing hugepages with preemption enabled
 
 Hi everyone,
 
 On Fri, Nov 29, 2013 at 12:13:03PM +0100, Alexander Graf wrote:
 
  On 29.11.2013, at 05:38, Bharat Bhushan bharat.bhus...@freescale.com 
  wrote:
 
   Hi Alex,
  
   I am running KVM guest with host kernel having CONFIG_PREEMPT enabled. 
   With
 allocated pages things seems to work fine but I uses hugepages for guest I see
 below prints when quit from qemu.
  
   (qemu) QEMU waiting for connection on: telnet:0.0.0.0:,server
   qemu-system-ppc64: pci_add_option_rom: failed to find romfile efi-
 virtio.rom
   q
   debug_smp_processor_id: 15 callbacks suppressed
   BUG: using smp_processor_id() in preemptible [] code:
   qemu-system-ppc/2504 caller is .free_hugepd_range+0xb0/0x21c
   CPU: 1 PID: 2504 Comm: qemu-system-ppc Not tainted
   3.12.0-rc3-07733-gabf4907 #175 Call Trace:
   [c000fb433400] [c0007d38] .show_stack+0x7c/0x1cc
   (unreliable) [c000fb4334d0] [c05e8ce0]
   .dump_stack+0x9c/0xf4 [c000fb433560] [c02de5ec]
   .debug_smp_processor_id+0x108/0x11c
   [c000fb4335f0] [c0025e10] .free_hugepd_range+0xb0/0x21c
   [c000fb433680] [c00265bc]
   .hugetlb_free_pgd_range+0x2c8/0x3b0
   [c000fb4337a0] [c00e428c] .free_pgtables+0x14c/0x158
   [c000fb433840] [c00ef320] .exit_mmap+0xec/0x194
   [c000fb433960] [c004d780] .mmput+0x64/0x124
   [c000fb4339e0] [c0051f40] .do_exit+0x29c/0x9c8
   [c000fb433ae0] [c00527c8] .do_group_exit+0x50/0xc4
   [c000fb433b70] [c00606a0]
   .get_signal_to_deliver+0x21c/0x5d8
   [c000fb433c70] [c0009b08] .do_signal+0x54/0x278
   [c000fb433db0] [c0009e50] .do_notify_resume+0x64/0x78
   [c000fb433e30] [cb44]
   .ret_from_except_lite+0x70/0x74
  
  
   This mean that free_hugepd_range() must be called with preemption enabled.
 
  with preemption disabled.
 
   I tried below change and this seems to work fine (I am not having
   expertise in this area so not sure this is correct way)
 
  Not sure - the scope looks odd to me. Let's ask Andrea - I'm sure he knows
 what to do :).
 
 :) So I had a look at the top of this function (0xb0) in the upstream kernel 
 and
 no smp_processor_id() call is apparent, is this stock git or a ppc tree? The
 first few calls seem not to call it but I may have overlooked something. It's
 just quicker if somebody with vmlinux finds the location of it.
 
 static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int
 pdshift,
 unsigned long start, unsigned long end,
 unsigned long floor, unsigned long ceiling) {
   pte_t *hugepte = hugepd_page(*hpdp);
   int i;
 
   unsigned long pdmask = ~((1UL  pdshift) - 1);
   unsigned int num_hugepd = 1;
 
 #ifdef CONFIG_PPC_FSL_BOOK3E
   /* Note: On fsl the hpdp may be the first of several */
   num_hugepd = (1  (hugepd_shift(*hpdp) - pdshift)); #else
   unsigned int shift = hugepd_shift(*hpdp); #endif
 
   start = pdmask;
   if (start  floor)
   return;
   if (ceiling) {
   ceiling = pdmask;
   if (! ceiling)
   return;
   }
   if (end - 1  ceiling - 1)
   return;
 
   for (i = 0; i  num_hugepd; i++, hpdp++)
   hpdp-pd = 0;
 
   tlb-need_flush = 1;
 
 #ifdef CONFIG_PPC_FSL_BOOK3E
   hugepd_free(tlb, hugepte);
 #else
   pgtable_free_tlb(tlb, hugepte, pdshift - shift); #endif }
 
 Generally smp_processor_id should never be used, exactly to avoid problems 
 like
 above with preempion enabled in .config.
 
 Instead it should be replaced with a get_cpu()/put_cpu() pair that is exactly
 meant to fix bugs like this and define proper critical sections around the 
 per-
 cpu variables.
 
 #define get_cpu() ({ preempt_disable(); smp_processor_id(); })
 #define put_cpu() preempt_enable()
 
 After you find where that smp_processor_id() is located, you should simply
 replace it with a get_cpu() and then you should insert a put_cpu immediately
 after the cpu info is not used anymore. That will define a proper and strict
 critical section around the use of the per-cpu variables.
 
 With a ppc vmlinux it should be immediate to find the location of
 smp_processor_id but I don't have the ppc vmlinux here.

Thanks Andrea for the details description. It is really helpful

I will look into this.

Thanks
-Bharat

 
 Thanks!
 Andrea
 
 
 
  Alex
 
  
   diff --git a/arch/powerpc/mm/hugetlbpage.c
   b/arch/powerpc/mm/hugetlbpage.c index d67db4b..6bf8459

Re: Error in frreing hugepages with preemption enabled

2013-12-03 Thread Andrea Arcangeli
Hi everyone,

On Fri, Nov 29, 2013 at 12:13:03PM +0100, Alexander Graf wrote:
 
 On 29.11.2013, at 05:38, Bharat Bhushan bharat.bhus...@freescale.com wrote:
 
  Hi Alex,
  
  I am running KVM guest with host kernel having CONFIG_PREEMPT enabled. With 
  allocated pages things seems to work fine but I uses hugepages for guest I 
  see below prints when quit from qemu.
  
  (qemu) QEMU waiting for connection on: telnet:0.0.0.0:,server
  qemu-system-ppc64: pci_add_option_rom: failed to find romfile 
  efi-virtio.rom
  q
  debug_smp_processor_id: 15 callbacks suppressed
  BUG: using smp_processor_id() in preemptible [] code: 
  qemu-system-ppc/2504
  caller is .free_hugepd_range+0xb0/0x21c
  CPU: 1 PID: 2504 Comm: qemu-system-ppc Not tainted 
  3.12.0-rc3-07733-gabf4907 #175
  Call Trace:
  [c000fb433400] [c0007d38] .show_stack+0x7c/0x1cc (unreliable)
  [c000fb4334d0] [c05e8ce0] .dump_stack+0x9c/0xf4
  [c000fb433560] [c02de5ec] .debug_smp_processor_id+0x108/0x11c
  [c000fb4335f0] [c0025e10] .free_hugepd_range+0xb0/0x21c
  [c000fb433680] [c00265bc] .hugetlb_free_pgd_range+0x2c8/0x3b0
  [c000fb4337a0] [c00e428c] .free_pgtables+0x14c/0x158
  [c000fb433840] [c00ef320] .exit_mmap+0xec/0x194
  [c000fb433960] [c004d780] .mmput+0x64/0x124
  [c000fb4339e0] [c0051f40] .do_exit+0x29c/0x9c8
  [c000fb433ae0] [c00527c8] .do_group_exit+0x50/0xc4
  [c000fb433b70] [c00606a0] .get_signal_to_deliver+0x21c/0x5d8
  [c000fb433c70] [c0009b08] .do_signal+0x54/0x278
  [c000fb433db0] [c0009e50] .do_notify_resume+0x64/0x78
  [c000fb433e30] [cb44] .ret_from_except_lite+0x70/0x74
  
  
  This mean that free_hugepd_range() must be called with preemption enabled.
 
 with preemption disabled.
 
  I tried below change and this seems to work fine (I am not having expertise 
  in this area so not sure this is correct way)
 
 Not sure - the scope looks odd to me. Let's ask Andrea - I'm sure he knows 
 what to do :).

:) So I had a look at the top of this function (0xb0) in the upstream
kernel and no smp_processor_id() call is apparent, is this stock git
or a ppc tree? The first few calls seem not to call it but I may have
overlooked something. It's just quicker if somebody with vmlinux finds
the location of it.

static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int 
pdshift,
  unsigned long start, unsigned long end,
  unsigned long floor, unsigned long ceiling)
{
pte_t *hugepte = hugepd_page(*hpdp);
int i;

unsigned long pdmask = ~((1UL  pdshift) - 1);
unsigned int num_hugepd = 1;

#ifdef CONFIG_PPC_FSL_BOOK3E
/* Note: On fsl the hpdp may be the first of several */
num_hugepd = (1  (hugepd_shift(*hpdp) - pdshift));
#else
unsigned int shift = hugepd_shift(*hpdp);
#endif

start = pdmask;
if (start  floor)
return;
if (ceiling) {
ceiling = pdmask;
if (! ceiling)
return;
}
if (end - 1  ceiling - 1)
return;

for (i = 0; i  num_hugepd; i++, hpdp++)
hpdp-pd = 0;

tlb-need_flush = 1;

#ifdef CONFIG_PPC_FSL_BOOK3E
hugepd_free(tlb, hugepte);
#else
pgtable_free_tlb(tlb, hugepte, pdshift - shift);
#endif
}

Generally smp_processor_id should never be used, exactly to avoid
problems like above with preempion enabled in .config.

Instead it should be replaced with a get_cpu()/put_cpu() pair that is
exactly meant to fix bugs like this and define proper critical
sections around the per-cpu variables.

#define get_cpu()   ({ preempt_disable(); smp_processor_id(); })
#define put_cpu()   preempt_enable()

After you find where that smp_processor_id() is located, you should
simply replace it with a get_cpu() and then you should insert a
put_cpu immediately after the cpu info is not used anymore. That
will define a proper and strict critical section around the use of the
per-cpu variables.

With a ppc vmlinux it should be immediate to find the location of
smp_processor_id but I don't have the ppc vmlinux here.

Thanks!
Andrea

 
 
 Alex
 
  
  diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
  index d67db4b..6bf8459 100644
  --- a/arch/powerpc/mm/hugetlbpage.c
  +++ b/arch/powerpc/mm/hugetlbpage.c
  @@ -563,8 +563,10 @@ static void hugetlb_free_pmd_range(struct mmu_gather 
  *tlb, pud_t *pud,
  */
 next = addr + (1  hugepd_shift(*(hugepd_t *)pmd));
  #endif
  +   preempt_disable();
 free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT,
   addr, next, floor, ceiling);
  +   preempt_enable();
 } while (addr = next, addr != end);
  

Re: Error in frreing hugepages with preemption enabled

2013-12-03 Thread Benjamin Herrenschmidt
On Tue, 2013-12-03 at 23:21 +0100, Andrea Arcangeli wrote:
 #ifdef CONFIG_PPC_FSL_BOOK3E
 hugepd_free(tlb, hugepte);
 ^^

This is the culprit

(Alex, you didn't specify this was embedded or did I miss it ?)

 #else
 pgtable_free_tlb(tlb, hugepte, pdshift - shift);
 #endif
 }

That function does:

batchp = __get_cpu_var(hugepd_freelist_cur);

IE, it tries to use a per-CPU batch. Basically, it's duplicating the
logic in mm/memory.c for RCU freeing using a per-cpu freelist. I suppose
it assumes being called under something like the page table lock ?

This code also never flushes the batch, which is a concern...

Alex, this is Freescale stuff, can you followup with them ?

Cheers,
Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: Error in frreing hugepages with preemption enabled

2013-11-29 Thread Alexander Graf

On 29.11.2013, at 05:38, Bharat Bhushan bharat.bhus...@freescale.com wrote:

 Hi Alex,
 
 I am running KVM guest with host kernel having CONFIG_PREEMPT enabled. With 
 allocated pages things seems to work fine but I uses hugepages for guest I 
 see below prints when quit from qemu.
 
 (qemu) QEMU waiting for connection on: telnet:0.0.0.0:,server
 qemu-system-ppc64: pci_add_option_rom: failed to find romfile efi-virtio.rom
 q
 debug_smp_processor_id: 15 callbacks suppressed
 BUG: using smp_processor_id() in preemptible [] code: 
 qemu-system-ppc/2504
 caller is .free_hugepd_range+0xb0/0x21c
 CPU: 1 PID: 2504 Comm: qemu-system-ppc Not tainted 3.12.0-rc3-07733-gabf4907 
 #175
 Call Trace:
 [c000fb433400] [c0007d38] .show_stack+0x7c/0x1cc (unreliable)
 [c000fb4334d0] [c05e8ce0] .dump_stack+0x9c/0xf4
 [c000fb433560] [c02de5ec] .debug_smp_processor_id+0x108/0x11c
 [c000fb4335f0] [c0025e10] .free_hugepd_range+0xb0/0x21c
 [c000fb433680] [c00265bc] .hugetlb_free_pgd_range+0x2c8/0x3b0
 [c000fb4337a0] [c00e428c] .free_pgtables+0x14c/0x158
 [c000fb433840] [c00ef320] .exit_mmap+0xec/0x194
 [c000fb433960] [c004d780] .mmput+0x64/0x124
 [c000fb4339e0] [c0051f40] .do_exit+0x29c/0x9c8
 [c000fb433ae0] [c00527c8] .do_group_exit+0x50/0xc4
 [c000fb433b70] [c00606a0] .get_signal_to_deliver+0x21c/0x5d8
 [c000fb433c70] [c0009b08] .do_signal+0x54/0x278
 [c000fb433db0] [c0009e50] .do_notify_resume+0x64/0x78
 [c000fb433e30] [cb44] .ret_from_except_lite+0x70/0x74
 
 
 This mean that free_hugepd_range() must be called with preemption enabled.

with preemption disabled.

 I tried below change and this seems to work fine (I am not having expertise 
 in this area so not sure this is correct way)

Not sure - the scope looks odd to me. Let's ask Andrea - I'm sure he knows what 
to do :).


Alex

 
 diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
 index d67db4b..6bf8459 100644
 --- a/arch/powerpc/mm/hugetlbpage.c
 +++ b/arch/powerpc/mm/hugetlbpage.c
 @@ -563,8 +563,10 @@ static void hugetlb_free_pmd_range(struct mmu_gather 
 *tlb, pud_t *pud,
 */
next = addr + (1  hugepd_shift(*(hugepd_t *)pmd));
 #endif
 +   preempt_disable();
free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT,
  addr, next, floor, ceiling);
 +   preempt_enable();
} while (addr = next, addr != end);
 
start = PUD_MASK;
 
 
 Thanks
 -Bharat
 
 --
 To unsubscribe from this list: send the line unsubscribe kvm-ppc in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Error in frreing hugepages with preemption enabled

2013-11-28 Thread Bharat Bhushan
Hi Alex,

I am running KVM guest with host kernel having CONFIG_PREEMPT enabled. With 
allocated pages things seems to work fine but I uses hugepages for guest I see 
below prints when quit from qemu.

(qemu) QEMU waiting for connection on: telnet:0.0.0.0:,server
qemu-system-ppc64: pci_add_option_rom: failed to find romfile efi-virtio.rom
q
debug_smp_processor_id: 15 callbacks suppressed
BUG: using smp_processor_id() in preemptible [] code: 
qemu-system-ppc/2504
caller is .free_hugepd_range+0xb0/0x21c
CPU: 1 PID: 2504 Comm: qemu-system-ppc Not tainted 3.12.0-rc3-07733-gabf4907 
#175
Call Trace:
[c000fb433400] [c0007d38] .show_stack+0x7c/0x1cc (unreliable)
[c000fb4334d0] [c05e8ce0] .dump_stack+0x9c/0xf4
[c000fb433560] [c02de5ec] .debug_smp_processor_id+0x108/0x11c
[c000fb4335f0] [c0025e10] .free_hugepd_range+0xb0/0x21c
[c000fb433680] [c00265bc] .hugetlb_free_pgd_range+0x2c8/0x3b0
[c000fb4337a0] [c00e428c] .free_pgtables+0x14c/0x158
[c000fb433840] [c00ef320] .exit_mmap+0xec/0x194
[c000fb433960] [c004d780] .mmput+0x64/0x124
[c000fb4339e0] [c0051f40] .do_exit+0x29c/0x9c8
[c000fb433ae0] [c00527c8] .do_group_exit+0x50/0xc4
[c000fb433b70] [c00606a0] .get_signal_to_deliver+0x21c/0x5d8
[c000fb433c70] [c0009b08] .do_signal+0x54/0x278
[c000fb433db0] [c0009e50] .do_notify_resume+0x64/0x78
[c000fb433e30] [cb44] .ret_from_except_lite+0x70/0x74


This mean that free_hugepd_range() must be called with preemption enabled.
I tried below change and this seems to work fine (I am not having expertise in 
this area so not sure this is correct way)

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index d67db4b..6bf8459 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -563,8 +563,10 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, 
pud_t *pud,
 */
next = addr + (1  hugepd_shift(*(hugepd_t *)pmd));
 #endif
+   preempt_disable();
free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT,
  addr, next, floor, ceiling);
+   preempt_enable();
} while (addr = next, addr != end);
 
start = PUD_MASK;


Thanks
-Bharat

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev