Re: KVM guests freeze under upstream kernel

2017-07-27 Thread Michael Ellerman
Suraj Jitindar Singh  writes:
>
...
> kernel BUG at 
> /scratch/surajjs/linux/arch/powerpc/include/asm/book3s/64/radix.h:260!

Next thing to try would be something like below.

cheers

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index d1da415e283c..c749a757738e 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -1016,6 +1016,7 @@ static inline unsigned long
 pmd_hugepage_update(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp,
 		    unsigned long clr, unsigned long set)
 {
+	BUG_ON(set & _PAGE_DEVMAP);
 	if (radix_enabled())
 		return radix__pmd_hugepage_update(mm, addr, pmdp, clr, set);
 	return hash__pmd_hugepage_update(mm, addr, pmdp, clr, set);
diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index 31eed8fa8e99..55c443a3dd5b 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -31,6 +31,7 @@ int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address,
 			  pmd_t *pmdp, pmd_t entry, int dirty)
 {
 	int changed;
+	BUG_ON(pmd_val(entry) & _PAGE_DEVMAP);
 #ifdef CONFIG_DEBUG_VM
 	WARN_ON(!pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
 	assert_spin_locked(&vma->vm_mm->page_table_lock);
@@ -56,6 +57,7 @@ int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 void set_pmd_at(struct mm_struct *mm, unsigned long addr,
 		pmd_t *pmdp, pmd_t pmd)
 {
+	BUG_ON(pmd_val(pmd) & _PAGE_DEVMAP);
 #ifdef CONFIG_DEBUG_VM
 	WARN_ON(pte_present(pmd_pte(*pmdp)) && !pte_protnone(pmd_pte(*pmdp)));
 	assert_spin_locked(&mm->page_table_lock);




Re: KVM guests freeze under upstream kernel

2017-07-27 Thread Suraj Jitindar Singh
On Thu, 2017-07-27 at 13:14 +1000, Michael Ellerman wrote:
> jos...@linux.vnet.ibm.com writes:
> > On Thu, Jul 20, 2017 at 10:18:18PM -0300, jos...@linux.vnet.ibm.com
> >  wrote:
> > > On Thu, Jul 20, 2017 at 03:21:59PM +1000, Paul Mackerras wrote:
> > > > 
> > > > Did you check the host kernel logs for any oops messages?
> > > 
> > > > dmesg was clean, but after some time waiting (I forgot QEMU running
> > > > in another terminal) I got the oops below (after rebooting the host
> > > > I couldn't reproduce it again).
> > > > 
> > > > Another test that I did was:
> > > > Compile with transparent huge pages disabled: KVM works fine
> > > > Compile with transparent huge pages enabled: doesn't work
> > > >   + disabling it in /sys/kernel/mm/transparent_hugepage: doesn't work
> > > 
> > > Just out of my own curiosity I made this small change:
> > > 
> > > diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h
> > > b/arch/powerpc/include
> > > index c0737c8..f94a3b6 100644
> > > --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> > > +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> > > @@ -80,7 +80,7 @@
> > >  
> > >   #define _PAGE_SOFT_DIRTY   _RPAGE_SW3 /* software: software
> > > dirty
> > >   tracking 
> > >    #define _PAGE_SPECIAL  _RPAGE_SW2 /* software: special
> > > page */
> > >    -#define _PAGE_DEVMAP   _RPAGE_SW1 /* software:
> > > ZONE_DEVICE page */
> > >    +#define _PAGE_DEVMAP   _RPAGE_RSV3
> > > #define __HAVE_ARCH_PTE_DEVMAP
> > > 
> > > and it works. I chose _RPAGE_RSV3 because it uses the same value
> > > that
> > > x86 uses (0x0400UL) but I don't if it could have any
> > > side
> > > effect
> > > 
> > 
> > Does this change make any sense to you people?
> 
> No :)
> 
> I think it's just hiding the bug somehow. Presumably we have some code
> somewhere that is getting confused by _RPAGE_SW1 being set, or setting
> that bit incorrectly.

kernel BUG at 
/scratch/surajjs/linux/arch/powerpc/include/asm/book3s/64/radix.h:260!
Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=2048 
NUMA 
PowerNV
Modules linked in:
CPU: 3 PID: 2050 Comm: qemu-system-ppc Not tainted 
4.13.0-rc2-1-g2f3013c-dirty #1
task: c00f1ebc task.stack: c00f1ec0
NIP: c0070fd4 LR: c00e2120 CTR: c00e20d0
REGS: c00f1ec036b0 TRAP: 0700   Not tainted  
(4.13.0-rc2-1-g2f3013c-dirty)
MSR: 9282b033 
  CR: 22244824  XER: 
CFAR: c0070e74 SOFTE: 1 
GPR00: 0009 c00f1ec03930 c1067400 19cf0a05 
GPR04: c000 050acf190f80 0005 0800 
GPR08: 0015 800f19cf0a05 c00f1eb64368 0009 
GPR12: 0009 cfd80f00 c00f1eca7a30 4000 
GPR16: 5f9f1780 40002000 7fff5fff 7fff879700a6 
GPR20: 8108 c110bce0 0f61 c00e20d0 
GPR24:  c00f1c7a6008 7fff6f60 7fff5fff 
GPR28: c00f19fd 0da0  c00f1ec03990 
NIP [c0070fd4] __find_linux_pte_or_hugepte+0x1d4/0x350
LR [c00e2120] kvm_unmap_radix+0x50/0x1d0
Call Trace:
[c00f1ec03930] [c00b2554] mark_page_dirty+0x34/0xa0 (unreliable)
[c00f1ec03970] [c00e2120] kvm_unmap_radix+0x50/0x1d0
[c00f1ec039c0] [c00dbea0] kvm_handle_hva_range+0x100/0x170
[c00f1ec03a30] [c00df43c] kvm_unmap_hva_range_hv+0x6c/0x80
[c00f1ec03a70] [c00c7588] kvm_unmap_hva_range+0x48/0x60
[c00f1ec03ab0] [c00bb77c] 
kvm_mmu_notifier_invalidate_range_start+0x8c/0x130
[c00f1ec03b10] [c0316f10] 
__mmu_notifier_invalidate_range_start+0xa0/0xf0
[c00f1ec03b60] [c02e95f0] change_protection+0x840/0xe20
[c00f1ec03cb0] [c0313050] change_prot_numa+0x50/0xd0
[c00f1ec03d00] [c0143f24] task_numa_work+0x2b4/0x3b0
[c00f1ec03dc0] [c0128738] task_work_run+0xf8/0x160
[c00f1ec03e00] [c001db94] do_notify_resume+0xe4/0xf0
[c00f1ec03e30] [c000b744] ret_from_except_lite+0x70/0x74
Instruction dump:
419e00ec 6000 78a70022 54a9403e 50a9c00e 54e3403e 50a9c42e 50e3c00e 
50e3c42e 792907c6 7d291b78 55270528 <0b07> 3ce04000 3c804000 78e707c6 
---[ end trace aecf406c356566bb ]---


The BUG_ON I added was:

arch/powerpc/include/asm/book3s/64/radix.h:260:
258 static inline int radix__pmd_trans_huge(pmd_t pmd)
259 {
260 BUG_ON(pmd_val(pmd) & _PAGE_DEVMAP);
261 return (pmd_val(pmd) & (_PAGE_PTE | _PAGE_DEVMAP)) == _PAGE_PTE;
262 }
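
The check at line 261 treats a PMD as a transparent huge page only when
_PAGE_PTE is set and _PAGE_DEVMAP is clear, so an entry that carries the
devmap bit for any reason stops being seen as a THP. A minimal user-space
sketch of that mask-and-compare follows. The MOCK_* bit positions and the
mock_* function names are illustrative stand-ins, not the kernel's
definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins only; these are NOT the kernel's definitions. */
#define MOCK_PAGE_PTE    (1ULL << 62)
#define MOCK_PAGE_DEVMAP (1ULL << 11)

/* Same shape as the test quoted above: PTE bit set, devmap bit clear. */
bool mock_pmd_trans_huge(uint64_t val)
{
    return (val & (MOCK_PAGE_PTE | MOCK_PAGE_DEVMAP)) == MOCK_PAGE_PTE;
}

/* A bare single-bit test: true whenever the devmap bit is set. */
bool mock_pmd_devmap(uint64_t val)
{
    return (val & MOCK_PAGE_DEVMAP) != 0;
}

/* Mirrors the temporary debug check: abort if the devmap bit is set. */
void mock_debug_check(uint64_t val)
{
    assert(!(val & MOCK_PAGE_DEVMAP));
}
```

With these definitions an entry with only the PTE bit counts as a huge
page, while the same entry with the devmap bit added does not and would
trip the debug check.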

> 
> cheers


Re: KVM guests freeze under upstream kernel

2017-07-26 Thread Michael Ellerman
jos...@linux.vnet.ibm.com writes:
> On Thu, Jul 20, 2017 at 10:18:18PM -0300, jos...@linux.vnet.ibm.com wrote:
>> On Thu, Jul 20, 2017 at 03:21:59PM +1000, Paul Mackerras wrote:
>> > 
>> > Did you check the host kernel logs for any oops messages?
>> 
>> dmesg was clean, but after some time waiting (I forgot QEMU running in
>> another terminal) I got the oops below (after rebooting the host I
>> couldn't reproduce it again).
>> 
>> Another test that I did was:
>> Compile with transparent huge pages disabled: KVM works fine
>> Compile with transparent huge pages enabled: doesn't work
>>   + disabling it in /sys/kernel/mm/transparent_hugepage: doesn't work
>> 
>> Just out of my own curiosity I made this small change:
>> 
>> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h
>> b/arch/powerpc/include
>> index c0737c8..f94a3b6 100644
>> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
>> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> @@ -80,7 +80,7 @@
>>  
>>  #define _PAGE_SOFT_DIRTY   _RPAGE_SW3 /* software: software dirty tracking */
>>  #define _PAGE_SPECIAL      _RPAGE_SW2 /* software: special page */
>> -#define _PAGE_DEVMAP       _RPAGE_SW1 /* software: ZONE_DEVICE page */
>> +#define _PAGE_DEVMAP       _RPAGE_RSV3
>>  #define __HAVE_ARCH_PTE_DEVMAP
>> 
>> and it works. I chose _RPAGE_RSV3 because it uses the same value that
>> x86 uses (0x0400UL) but I don't know if it could have any side effect
>> 
>
> Does this change make any sense to you people?

No :)

I think it's just hiding the bug somehow. Presumably we have some code
somewhere that is getting confused by _RPAGE_SW1 being set, or setting
that bit incorrectly.
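
One way to picture that concern: if some entry ends up with the old bit
set for an unrelated reason, moving _PAGE_DEVMAP to a different bit stops
the devmap test from matching, but whatever set the old bit is still
there. A small user-space sketch follows. The two constants are assumed
values for _RPAGE_SW1 and _RPAGE_RSV3 and should be checked against the
tree in use; looks_devmap() and classification_changes() are hypothetical
helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; verify against the headers of the tree being debugged. */
#define OLD_DEVMAP_BIT 0x0000000000000800ULL /* assumed _RPAGE_SW1 */
#define NEW_DEVMAP_BIT 0x0400000000000000ULL /* assumed _RPAGE_RSV3 */

/* Single-bit test against whichever bit currently means "devmap". */
bool looks_devmap(uint64_t entry, uint64_t devmap_bit)
{
    assert(devmap_bit != 0);
    return (entry & devmap_bit) != 0;
}

/* True if moving the flag changes how this raw entry is classified. */
bool classification_changes(uint64_t entry)
{
    return looks_devmap(entry, OLD_DEVMAP_BIT) !=
           looks_devmap(entry, NEW_DEVMAP_BIT);
}
```

An entry that has only the old bit set is classified as devmap before the
change and as non-devmap after it, which matches the symptom going away
without the stray bit being explained.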

cheers


Re: KVM guests freeze under upstream kernel

2017-07-26 Thread joserz
On Thu, Jul 20, 2017 at 10:18:18PM -0300, jos...@linux.vnet.ibm.com wrote:
> On Thu, Jul 20, 2017 at 03:21:59PM +1000, Paul Mackerras wrote:
> > On Thu, Jul 20, 2017 at 12:02:23AM -0300, jos...@linux.vnet.ibm.com wrote:
> > > On Thu, Jul 20, 2017 at 09:42:50AM +1000, Benjamin Herrenschmidt wrote:
> > > > On Wed, 2017-07-19 at 16:46 -0300, jos...@linux.vnet.ibm.com wrote:
> > > > > Hello!
> > > > > 
> > > > > We're not able to boot any KVM guest using upstream kernel 
> > > > > (cb8c65ccff7f77d0285f1b126c72d37b2572c865 - 4.13.0-rc1+).
> > > > > After reaching the SLOF initial counting, the guest simply freezes:
> > > > 
> > > > > Can you send your .config?
> > > 
> > > Sure,
> > > 
> > > Answering Michael as well:
> > > 
> > > It's a P9 with RHEL kernel 4.11.0-10.el7a.ppc64le installed. The problem
> > > was noticed with kernel > 4.13 (I'm currently running 4.13.0-rc1+).
> > > 
> > > QEMU is https://github.com/dgibson/qemu (ppc-for-2.10) but I gave the
> > > default packaged Qemu a try.
> > > 
> > > For the guest, I tried both a vanilla Ubuntu 17.04 and the host kernel.
> > > But they had never a chance to run since the freezing happened in SLOF.
> > > 
> > > Note that using the 4.11.0-10.el7a.ppc64le kernel it works fine
> > > (for any of these Qemu/Guest setup). With 4.13.0-rc1 I have it run after
> > > reverting that referred commit.
> > 
> > Is the host kernel running in radix mode?
> 
> yes
> 
> > 
> > Did you check the host kernel logs for any oops messages?
> 
> dmesg was clean, but after some time waiting (I forgot QEMU running in
> another terminal) I got the oops below (after rebooting the host I
> couldn't reproduce it again).
> 
> Another test that I did was:
> Compile with transparent huge pages disabled: KVM works fine
> Compile with transparent huge pages enabled: doesn't work
>   + disabling it in /sys/kernel/mm/transparent_hugepage: doesn't work
> 
> Just out of my own curiosity I made this small change:
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h
> b/arch/powerpc/include
> index c0737c8..f94a3b6 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -80,7 +80,7 @@
>  
>  #define _PAGE_SOFT_DIRTY   _RPAGE_SW3 /* software: software dirty tracking */
>  #define _PAGE_SPECIAL      _RPAGE_SW2 /* software: special page */
> -#define _PAGE_DEVMAP       _RPAGE_SW1 /* software: ZONE_DEVICE page */
> +#define _PAGE_DEVMAP       _RPAGE_RSV3
>  #define __HAVE_ARCH_PTE_DEVMAP
> 
> and it works. I chose _RPAGE_RSV3 because it uses the same value that
> x86 uses (0x0400UL) but I don't know if it could have any side effect
> 

Does this change make any sense to you people?
I didn't see any side effects, except that device-backed memory will have
a bigger address space in transparent huge pages, if I understand that
correctly.

If so I can send a patch with this change.

Thank you!!



Re: KVM guests freeze under upstream kernel

2017-07-20 Thread joserz
On Thu, Jul 20, 2017 at 03:21:59PM +1000, Paul Mackerras wrote:
> On Thu, Jul 20, 2017 at 12:02:23AM -0300, jos...@linux.vnet.ibm.com wrote:
> > On Thu, Jul 20, 2017 at 09:42:50AM +1000, Benjamin Herrenschmidt wrote:
> > > On Wed, 2017-07-19 at 16:46 -0300, jos...@linux.vnet.ibm.com wrote:
> > > > Hello!
> > > > 
> > > > We're not able to boot any KVM guest using upstream kernel 
> > > > (cb8c65ccff7f77d0285f1b126c72d37b2572c865 - 4.13.0-rc1+).
> > > > After reaching the SLOF initial counting, the guest simply freezes:
> > > 
> > > > Can you send your .config?
> > 
> > Sure,
> > 
> > Answering Michael as well:
> > 
> > It's a P9 with RHEL kernel 4.11.0-10.el7a.ppc64le installed. The problem
> > was noticed with kernel > 4.13 (I'm currently running 4.13.0-rc1+).
> > 
> > QEMU is https://github.com/dgibson/qemu (ppc-for-2.10) but I gave the
> > default packaged Qemu a try.
> > 
> > For the guest, I tried both a vanilla Ubuntu 17.04 and the host kernel.
> > But they had never a chance to run since the freezing happened in SLOF.
> > 
> > Note that using the 4.11.0-10.el7a.ppc64le kernel it works fine
> > (for any of these Qemu/Guest setup). With 4.13.0-rc1 I have it run after
> > reverting that referred commit.
> 
> Is the host kernel running in radix mode?

yes

> 
> Did you check the host kernel logs for any oops messages?

dmesg was clean, but after some time waiting (I forgot QEMU running in
another terminal) I got the oops below (after rebooting the host I
couldn't reproduce it again).

Another test that I did was:
Compile with transparent huge pages disabled: KVM works fine
Compile with transparent huge pages enabled: doesn't work
  + disabling it in /sys/kernel/mm/transparent_hugepage: doesn't work
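
For anyone repeating the runtime test, it helps to confirm which THP mode
is actually active before starting the guest. Below is a minimal sketch in
C, assuming the standard `enabled` file under the directory named above,
whose content looks like "always [madvise] never" with the active mode in
brackets. The helper names thp_active_mode() and thp_read_active_mode()
are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Directory from the message above plus the standard "enabled" file. */
#define THP_ENABLED_PATH "/sys/kernel/mm/transparent_hugepage/enabled"

/*
 * Copy the bracketed (active) word of a line such as
 * "always [madvise] never" into out. Returns 0 on success, -1 if no
 * bracketed word is found or it does not fit in out.
 */
int thp_active_mode(const char *line, char *out, size_t outlen)
{
    const char *open, *close;
    size_t len;

    assert(line != NULL && out != NULL);
    open = strchr(line, '[');
    close = open ? strchr(open, ']') : NULL;
    if (!open || !close || outlen == 0)
        return -1;
    len = (size_t)(close - open - 1);
    if (len >= outlen)
        return -1;
    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}

/* Read the sysfs file and report the active mode; -1 if unavailable. */
int thp_read_active_mode(char *out, size_t outlen)
{
    char line[128];
    FILE *f = fopen(THP_ENABLED_PATH, "r");
    int ret = -1;

    if (!f)
        return -1;
    if (fgets(line, sizeof(line), f))
        ret = thp_active_mode(line, out, outlen);
    fclose(f);
    return ret;
}
```

Calling thp_read_active_mode() with a small buffer yields "never" when THP
has been disabled at runtime. It fails on kernels built without THP
support, where the file does not exist.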

Just out of my own curiosity I made this small change:

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h
b/arch/powerpc/include
index c0737c8..f94a3b6 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -80,7 +80,7 @@
 
 #define _PAGE_SOFT_DIRTY   _RPAGE_SW3 /* software: software dirty tracking */
 #define _PAGE_SPECIAL      _RPAGE_SW2 /* software: special page */
-#define _PAGE_DEVMAP       _RPAGE_SW1 /* software: ZONE_DEVICE page */
+#define _PAGE_DEVMAP       _RPAGE_RSV3
 #define __HAVE_ARCH_PTE_DEVMAP

and it works. I chose _RPAGE_RSV3 because it uses the same value that
x86 uses (0x0400UL) but I don't know if it could have any side effect


SLOF
**
QEMU Starting
 Build Date = Mar  3 2017 13:29:19
  FW Version = git-66d250ef0fd06bb8
   Press "s" to enter Open Firmware.

   [  105.604333] Unable to handle kernel paging request for data at
   address 0x
   [  105.604448] Faulting instruction address: 0xc0910b28
   [  105.604526] Oops: Kernel access of bad area, sig: 11 [#1]
   [  105.604585] SMP NR_CPUS=2048 
   [  105.604588] NUMA 
   [  105.604633] PowerNV
   [  105.604697] Modules linked in: xt_CHECKSUM ipt_MASQUERADE
   nf_nat_masquerade_ipv4 tun ip6t_rpfilter ipt_REJECT nf_reject_ipv4
   ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat
   ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6
   nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security
   ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
   nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw
   ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter
   kvm_hv kvm i2c_dev at24 ghash_generic ses enclosure gf128mul
   scsi_transport_sas xts sg ctr ipmi_powernv ipmi_devintf shpchp
   opal_prd vmx_crypto ipmi_msghandler uio_pdrv_genirq uio ofpart
   powernv_flash i2c_opal ibmpowernv mtd nfsd auth_rpcgss nfs_acl lockd
   grace sunrpc ip_tables xfs libcrc32c
   [  105.605561]  sd_mod ast i2c_algo_bit drm_kms_helper syscopyarea
   sysfillrect sysimgblt fb_sys_fops ttm drm i40e i2c_core aacraid ptp
   pps_core dm_mirror dm_region_hash dm_log dm_mod
   [  105.605759] CPU: 0 PID: 6 Comm: kworker/u32:0 Not tainted
   4.13.0-rc1+ #57
   [  105.605836] Workqueue: netns cleanup_net
   [  105.605880] task: c00ff6404200 task.stack: c00ff648c000
   [  105.605947] NIP: c0910b28 LR: c07cd6ec CTR:
   c07cd5d0
   [  105.606026] REGS: c00ff648f7d0 TRAP: 0300   Not tainted
   (4.13.0-rc1+)
   [  105.606090] MSR: 90009033 
   [  105.606111]   CR: 88002048  XER: 2000
   [  105.606203] CFAR: c07cd6e8 DAR:  DSISR:
   4000 SOFTE: 1 
   [  105.606203] GPR00: c07cd6ec c00ff648fa50
   c0f5c600  
   [  105.606203] GPR04: c00ff6404cc0 c00ff6404280
   782ccd5c cc908fe7 
   [  105.606203] GPR08:  c00ff648c000
   8000  
   [  105.606203] GPR12: c07cd5d0 cfb0
   c01050f8 c00ffa150ec0 

Re: KVM guests freeze under upstream kernel

2017-07-19 Thread Paul Mackerras
On Thu, Jul 20, 2017 at 12:02:23AM -0300, jos...@linux.vnet.ibm.com wrote:
> On Thu, Jul 20, 2017 at 09:42:50AM +1000, Benjamin Herrenschmidt wrote:
> > On Wed, 2017-07-19 at 16:46 -0300, jos...@linux.vnet.ibm.com wrote:
> > > Hello!
> > > 
> > > We're not able to boot any KVM guest using upstream kernel 
> > > (cb8c65ccff7f77d0285f1b126c72d37b2572c865 - 4.13.0-rc1+).
> > > After reaching the SLOF initial counting, the guest simply freezes:
> > 
> > > Can you send your .config?
> 
> Sure,
> 
> Answering Michael as well:
> 
> It's a P9 with RHEL kernel 4.11.0-10.el7a.ppc64le installed. The problem
> was noticed with kernel > 4.13 (I'm currently running 4.13.0-rc1+).
> 
> QEMU is https://github.com/dgibson/qemu (ppc-for-2.10) but I gave the
> default packaged Qemu a try.
> 
> For the guest, I tried both a vanilla Ubuntu 17.04 and the host kernel.
> But they had never a chance to run since the freezing happened in SLOF.
> 
> Note that using the 4.11.0-10.el7a.ppc64le kernel it works fine
> (for any of these Qemu/Guest setup). With 4.13.0-rc1 I have it run after
> reverting that referred commit.

Is the host kernel running in radix mode?

Did you check the host kernel logs for any oops messages?

Paul.


Re: KVM guests freeze under upstream kernel

2017-07-19 Thread joserz
On Thu, Jul 20, 2017 at 09:42:50AM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2017-07-19 at 16:46 -0300, jos...@linux.vnet.ibm.com wrote:
> > Hello!
> > 
> > We're not able to boot any KVM guest using upstream kernel 
> > (cb8c65ccff7f77d0285f1b126c72d37b2572c865 - 4.13.0-rc1+).
> > After reaching the SLOF initial counting, the guest simply freezes:
> 
> > Can you send your .config?

Sure,

Answering Michael as well:

It's a P9 with RHEL kernel 4.11.0-10.el7a.ppc64le installed. The problem
was noticed with kernel > 4.13 (I'm currently running 4.13.0-rc1+).

QEMU is https://github.com/dgibson/qemu (ppc-for-2.10), but I also gave the
default packaged QEMU a try.

For the guest, I tried both a vanilla Ubuntu 17.04 and the host kernel,
but neither ever had a chance to run, since the freeze happened in SLOF.

Note that with the 4.11.0-10.el7a.ppc64le kernel it works fine
(for any of these QEMU/guest setups). With 4.13.0-rc1 I can get it to run
after reverting the commit referred to above.

Thanks!

(config attached)
--
#
# Automatically generated file; DO NOT EDIT.
# Linux/powerpc 4.13.0-rc1 Kernel Configuration
#
CONFIG_PPC64=y

#
# Processor support
#
CONFIG_PPC_BOOK3S_64=y
# CONFIG_PPC_BOOK3E_64 is not set
# CONFIG_POWER7_CPU is not set
CONFIG_POWER8_CPU=y
CONFIG_PPC_BOOK3S=y
CONFIG_PPC_FPU=y
CONFIG_ALTIVEC=y
CONFIG_VSX=y
CONFIG_PPC_ICSWX=y
# CONFIG_PPC_ICSWX_PID is not set
# CONFIG_PPC_ICSWX_USE_SIGILL is not set
CONFIG_PPC_STD_MMU=y
CONFIG_PPC_STD_MMU_64=y
CONFIG_PPC_RADIX_MMU=y
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
CONFIG_PPC_MM_SLICES=y
CONFIG_PPC_HAVE_PMU_SUPPORT=y
CONFIG_PPC_PERF_CTRS=y
CONFIG_FORCE_SMP=y
CONFIG_SMP=y
CONFIG_NR_CPUS=2048
CONFIG_PPC_DOORBELL=y
# CONFIG_CPU_BIG_ENDIAN is not set
CONFIG_CPU_LITTLE_ENDIAN=y
CONFIG_PPC64_BOOT_WRAPPER=y
CONFIG_64BIT=y
CONFIG_ARCH_PHYS_ADDR_T_64BIT=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MAX=29
CONFIG_ARCH_MMAP_RND_BITS_MIN=14
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=13
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=7
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NR_IRQS=512
CONFIG_NMI_IPI=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_HAS_DMA_SET_COHERENT_MASK=y
CONFIG_PPC=y
# CONFIG_GENERIC_CSUM is not set
CONFIG_EARLY_PRINTK=y
CONFIG_PANIC_TIMEOUT=180
CONFIG_COMPAT=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_SCHED_OMIT_FRAME_POINTER=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_PPC_UDBG_16550=y
# CONFIG_GENERIC_TBSYNC is not set
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
CONFIG_EPAPR_BOOT=y
# CONFIG_DEFAULT_UIMAGE is not set
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
# CONFIG_PPC_DCR_NATIVE is not set
# CONFIG_PPC_DCR_MMIO is not set
# CONFIG_PPC_OF_PLATFORM_PCI is not set
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_PPC_EMULATE_SSTEP=y
CONFIG_ZONE_DMA32=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_XZ is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_FHANDLE=y
# CONFIG_USELIB is not set
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_SHOW_LEVEL=y
CONFIG_GENERIC_IRQ_MIGRATION=y
CONFIG_HARDIRQS_SW_RESEND=y
CONFIG_IRQ_DOMAIN=y
CONFIG_GENERIC_MSI_IRQ=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
# CONFIG_GENERIC_IRQ_DEBUGFS is not set
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_ARCH_HAS_TICK_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
# CONFIG_NO_HZ_IDLE is not set
CONFIG_NO_HZ_FULL=y
# CONFIG_NO_HZ_FULL_ALL is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y

#
# CPU/Task time and stats accounting
#
CONFIG_VIRT_CPU_ACCOUNTING=y
CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
# CONFIG_IRQ_TIME_ACCOUNTING is not set
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y

#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TREE_SRCU=y
# CONFIG_TASKS_RCU is not set
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
CONFIG_CONTEXT_TRACKING=y
# CONFIG_CONTEXT_TRACKING_FORCE is not set
CONFIG_RCU_NOCB_CPU=y
# CONFIG_BUILD_BIN2C is not set
# 

Re: KVM guests freeze under upstream kernel

2017-07-19 Thread Benjamin Herrenschmidt
On Wed, 2017-07-19 at 16:46 -0300, jos...@linux.vnet.ibm.com wrote:
> Hello!
> 
> We're not able to boot any KVM guest using upstream kernel 
> (cb8c65ccff7f77d0285f1b126c72d37b2572c865 - 4.13.0-rc1+).
> After reaching the SLOF initial counting, the guest simply freezes:

Can you send your .config?

> SLOF
> **
> QEMU Starting
>  Build Date = Mar  3 2017 13:29:19
>   FW Version = git-66d250ef0fd06bb8
>Press "s" to enter Open Firmware.
> 
>C0360
> 
> After bisecting I found the commit:
> 
> https://github.com/torvalds/linux/commit/ebd3119
> 
> powerpc/mm: Add devmap support for ppc64
> 
> Add support for the devmap bit on PTEs and PMDs for PPC64 Book3S.  This
> is used to differentiate device backed memory from transparent huge
> pages since they are handled in more or less the same manner by the core
> mm code.
> 
> Reverting the commit and rebuilding 4.13.0-rc1+ was enough to make a 
> workaround.
> But I'll need some help from you guys in order to solve it.
> 
> Thanks!
> 
> Jose Ziviani
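
For readers unfamiliar with how a bisection narrows a regression down to
one commit, here is a rough sketch of the underlying binary search. It
assumes a linear history indexed by position, with one known-good and one
known-bad revision; real histories are graphs, so this is only the idea
and not how the tool is implemented. first_bad() and example_is_bad() are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Returns nonzero if the build at index i shows the failure. */
typedef int (*is_bad_fn)(size_t i);

/*
 * Given that index `good` passes, index `bad` fails, and good < bad,
 * return the first failing index. Each step halves the remaining range,
 * so only about log2(bad - good) revisions need to be built and tested.
 */
size_t first_bad(size_t good, size_t bad, is_bad_fn is_bad)
{
    assert(good < bad);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        if (is_bad(mid))
            bad = mid;   /* failure already present at mid */
        else
            good = mid;  /* mid still works, look later */
    }
    return bad;
}

/* Example predicate: pretend revision 37 introduced the regression. */
int example_is_bad(size_t i)
{
    return i >= 37;
}
```

In practice the predicate is "build this revision, boot the host, and see
whether a guest gets past SLOF", which is why each step is expensive and
the logarithmic number of steps matters.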


Re: KVM guests freeze under upstream kernel

2017-07-19 Thread Michael Ellerman
Thanks for the report.

My Jenkins does a test boot of a KVM guest, so there must be something 
different in our setups. Can you tell us all the details of your setup, eg. 
Hardware, host kernel, qemu, guest kernel etc. Thanks.

cheers

On 20 July 2017 05:46:34 GMT+10:00, jos...@linux.vnet.ibm.com wrote:
>Hello!
>
>We're not able to boot any KVM guest using upstream kernel
>(cb8c65ccff7f77d0285f1b126c72d37b2572c865 - 4.13.0-rc1+).
>After reaching the SLOF initial counting, the guest simply freezes:
>
>SLOF
>**
>QEMU Starting
> Build Date = Mar  3 2017 13:29:19
>  FW Version = git-66d250ef0fd06bb8
>   Press "s" to enter Open Firmware.
>
>   C0360
>
>After bisecting I found the commit:
>
>https://github.com/torvalds/linux/commit/ebd3119
>
>powerpc/mm: Add devmap support for ppc64
>
>Add support for the devmap bit on PTEs and PMDs for PPC64 Book3S.  This
>is used to differentiate device backed memory from transparent huge
>pages since they are handled in more or less the same manner by the
>core
>mm code.
>
>Reverting the commit and rebuilding 4.13.0-rc1+ was enough to make a
>workaround.
>But I'll need some help from you guys in order to solve it.
>
>Thanks!
>
>Jose Ziviani
