Re: sun4v+DMA related boot crash on 4.13-git

2017-07-11 Thread Tushar Dave



On 07/11/2017 09:02 PM, David Miller wrote:

From: Tushar Dave 
Date: Tue, 11 Jul 2017 20:43:39 -0700


Yes, indeed the bug is in Linus's tree. However, 'sparc' tree doesn't
have DMA API change (e.g. commit b02c2b0bfd7ae) yet that introduced
the panic.


You can simply make a note of this when you send the bug fix to me.

:( Yeah, I should have mentioned this when I sent the patch to you. My bad.

I will make sure to leave you a note for this kind of occurrence in future!

-Tushar


--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



Re: sun4v+DMA related boot crash on 4.13-git

2017-07-11 Thread David Miller
From: Tushar Dave 
Date: Tue, 11 Jul 2017 20:43:39 -0700

> Yes, indeed the bug is in Linus's tree. However, 'sparc' tree doesn't
> have DMA API change (e.g. commit b02c2b0bfd7ae) yet that introduced
> the panic.

You can simply make a note of this when you send the bug fix to me.



Re: sun4v+DMA related boot crash on 4.13-git

2017-07-11 Thread Tushar Dave



On 07/11/2017 05:34 PM, David Miller wrote:

From: Tushar Dave 
Date: Tue, 11 Jul 2017 15:38:21 -0700




On 07/11/2017 02:48 PM, Meelis Roos wrote:

I tested 4.12+git on sparc64 yesterday to see if the sparc merge works
fine, and on all of my sun4v machines (T1000, T2000, T5120) it crashed
on boot with a DMA-related stack trace (below). All the machines are sun4v
physical machines, not VMs. Older sun4 machines do not exhibit this
problem.

Maybe DMA API related, maybe sparc64. Will try to bisect when I get
time.

I see what's going on with the panic. I will reproduce locally. Will get
back soon.

This patch should fix the panic. Please give it a try.


Yes, this patch fixes it. Thank you for fixing it quickly!

Thanks for testing. Patch sent for sparc-next.

Why sparc-next - it should go into 4.13 since 4.13 would break all
niagara1 and niagara2 systems otherwise?

This is a sparc arch fix, so I used the sparc tree (in this case,
sparc-next).

I am open to maintainers' suggestions. Thanks.


If the bug is in Linus's tree the fix must target 'sparc' not
'sparc-next'.


Dave,

Yes, indeed the bug is in Linus's tree. However, 'sparc' tree doesn't 
have DMA API change (e.g. commit b02c2b0bfd7ae) yet that introduced the 
panic. It looks like the DMA API changes have not been merged into the
'sparc' tree yet. In other words, the 'sparc' tree doesn't have the panic
issue, so there is nothing to fix there!

However, 'sparc-next' is up to date with (or at least closer to) Linus's
tree and has the DMA API change that causes the panic. So I sent the
patch targeted at sparc-next.


Let me know which tree would be best for this fix, and I will
send v2.


Thanks.

-Tushar




Re: sun4v+DMA related boot crash on 4.13-git

2017-07-11 Thread David Miller
From: Tushar Dave 
Date: Tue, 11 Jul 2017 15:38:21 -0700

> On 07/11/2017 02:48 PM, Meelis Roos wrote:
>>>>>>> I tested 4.12+git on sparc64 yesterday to see if the sparc merge
>>>>>>> works fine, and on all of my sun4v machines (T1000, T2000, T5120)
>>>>>>> it crashed on boot with a DMA-related stack trace (below). All the
>>>>>>> machines are sun4v physical machines, not VMs. Older sun4 machines
>>>>>>> do not exhibit this problem.
>>>>>>>
>>>>>>> Maybe DMA API related, maybe sparc64. Will try to bisect when I get
>>>>>>> time.
>>>>>> I see what's going on with the panic. I will reproduce locally. Will
>>>>>> get back soon.
>>>>> This patch should fix the panic. Please give it a try.
>>>>
>>>> Yes, this patch fixes it. Thank you for fixing it quickly!
>>> Thanks for testing. Patch sent for sparc-next.
>> Why sparc-next - it should go into 4.13 since 4.13 would break all
>> niagara1 and niagara2 systems otherwise?
> This is a sparc arch fix, so I used the sparc tree (in this case,
> sparc-next).
>
> I am open to maintainers' suggestions. Thanks.

If the bug is in Linus's tree the fix must target 'sparc' not
'sparc-next'.


Re: sun4v+DMA related boot crash on 4.13-git

2017-07-11 Thread David Miller
From: Meelis Roos 
Date: Wed, 12 Jul 2017 00:48:07 +0300 (EEST)

> Why sparc-next - it should go into 4.13 since 4.13 would break all 
> niagara1 and niagara2 systems otherwise?

Absolutely, positively, correct.


Re: sun4v+DMA related boot crash on 4.13-git

2017-07-11 Thread Tushar Dave



On 07/11/2017 02:48 PM, Meelis Roos wrote:

I tested 4.12+git on sparc64 yesterday to see if the sparc merge works
fine, and on all of my sun4v machines (T1000, T2000, T5120) it crashed
on boot with a DMA-related stack trace (below). All the machines are sun4v
physical machines, not VMs. Older sun4 machines do not exhibit this
problem.

Maybe DMA API related, maybe sparc64. Will try to bisect when I get
time.

I see what's going on with the panic. I will reproduce locally. Will get back
soon.

This patch should fix the panic. Please give it a try.


Yes, this patch fixes it. Thank you for fixing it quickly!

Thanks for testing. Patch sent for sparc-next.


Why sparc-next - it should go into 4.13 since 4.13 would break all
niagara1 and niagara2 systems otherwise?

This is a sparc arch fix, so I used the sparc tree (in this case,
sparc-next).

I am open to maintainers' suggestions. Thanks.

-Tushar




-Tushar


commit b02c2b0bfd ("sparc: remove arch specific dma_supported
implementations") introduced code that incorrectly allows dma_supported()
to succeed for a 64-bit DMA mask even if the system doesn't have an ATU
IOMMU. 64-bit DMA is only supported on sun4v equipped with ATU IOMMU HW.

diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index 24f21c7..0a32c57 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -673,12 +673,14 @@ static void dma_4v_unmap_sg(struct device *dev, struct scatterlist *sglist,
 static int dma_4v_supported(struct device *dev, u64 device_mask)
 {
 	struct iommu *iommu = dev->archdata.iommu;
-	u64 dma_addr_mask;
+	u64 dma_addr_mask = iommu->dma_addr_mask;
 
-	if (device_mask > DMA_BIT_MASK(32) && iommu->atu)
-		dma_addr_mask = iommu->atu->dma_addr_mask;
-	else
-		dma_addr_mask = iommu->dma_addr_mask;
+	if (device_mask > DMA_BIT_MASK(32)) {
+		if (iommu->atu)
+			dma_addr_mask = iommu->atu->dma_addr_mask;
+		else
+			return 0;
+	}
 
 	if ((device_mask & dma_addr_mask) == dma_addr_mask)
 		return 1;


-Tushar




-Tushar



[0.24] PROMLIB: Sun IEEE Boot Prom 'OBP 4.30.4.d 2011/07/06
14:29'
[0.33] PROMLIB: Root node compatible: sun4v
[0.79] Linux version 4.12.0-08915-gf263fbb (mroos@t2000) (gcc
version 4.9.2 (Debian 4.9.2-20)) #141 SMP Sun Jul 9 17:51:12 EEST 2017
[0.002047] bootconsole [earlyprom0] enabled
[0.002383] ARCH: SUN4V
[0.002668] Ethernet address: 00:14:4f:86:99:26
[0.003406] MM: PAGE_OFFSET is 0x8000 (max_phys_bits ==
39)
[0.004089] MM: VMALLOC [0x0001 --> 0x6000]
[0.004562] MM: VMEMMAP [0x6000 --> 0xc000]
[0.095699] Kernel: Using 3 locked TLB entries for main kernel
image.
[0.096387] Remapping the kernel...
[0.096400] done.
[1.906342] OF stdout device is: /virtual-devices@100/console@1
[1.907160] PROM: Built device tree with 148821 bytes of memory.
[1.907804] MDESC: Size is 42336 bytes.
[1.910139] PLATFORM: banner-name [Sun Fire T200]
[1.910564] PLATFORM: name [SUNW,Sun-Fire-T200]
[1.910919] PLATFORM: hostid [84869926]
[1.911224] PLATFORM: serial# [00ab4130]
[1.911536] PLATFORM: stick-frequency [3b9aca00]
[1.911894] PLATFORM: mac-address [144f869926]
[1.912241] PLATFORM: watchdog-resolution [1000 ms]
[1.912619] PLATFORM: watchdog-max-timeout [3153600 ms]
[1.913042] PLATFORM: max-cpus [32]
[1.913501] Top of RAM: 0x3ffd34000, Total RAM: 0x3f7918000
[1.913936] Memory hole size: 132MB
[2.279507] Allocated 16384 bytes for kernel page tables.
[2.280578] Zone ranges:
[2.280819]   Normal   [mem 0x0840-0x0003ffd33fff]
[2.281292] Movable zone start for each node
[2.281626] Early memory node ranges
[2.281916]   node   0: [mem 0x0840-0x0003ffc1]
[2.282557]   node   0: [mem 0x0003ffc28000-0x0003ffcfdfff]
[2.283030]   node   0: [mem 0x0003ffd0e000-0x0003ffd27fff]
[2.283514]   node   0: [mem 0x0003ffd2c000-0x0003ffd33fff]
[2.283994] Initmem setup node 0 [mem
0x0840-0x0003ffd33fff]
[2.782262] Booting Linux...
[2.782734] CPU CAPS:
[flush,stbar,swap,muldiv,v9,blkinit,mul32,div32]
[2.783255] CPU CAPS: [v8plus,ASIBlkInit]
[2.897543] percpu: Embedded 12 pages/cpu @8003ff80 s55872
r8192 d34240 u262144
[2.913264] SUN4V: Mondo queue sizes [cpu(4096) dev(16384) r(8192)
nr(256)]
[2.915492] Built 1 zonelists in Node order, mobility grouping on.
Total pages: 2063634
[2.916160] Policy zone: Normal
[2.916420] Kernel command line: root=/dev/sda1 ro
[2.918743] PID hash table entries: 4096 (order: 2, 32768 bytes)
[2.919230] Sorting __ex_table...
[3.220450] Memory: 16497120K/16639072K available (5521K kernel
code,
530K rwdata, 1224K rodata, 336K init, 699K bss, 141952K reserved, 

Re: sun4v+DMA related boot crash on 4.13-git

2017-07-11 Thread Meelis Roos
> > > > > I tested 4.12+git on sparc64 yesterday to see if the sparc merge
> > > > > works fine, and on all of my sun4v machines (T1000, T2000, T5120)
> > > > > it crashed on boot with a DMA-related stack trace (below). All the
> > > > > machines are sun4v physical machines, not VMs. Older sun4 machines
> > > > > do not exhibit this problem.
> > > > >
> > > > > Maybe DMA API related, maybe sparc64. Will try to bisect when I get
> > > > > time.
> > > > I see what's going on with the panic. I will reproduce locally. Will
> > > > get back soon.
> > > This patch should fix the panic. Please give it a try.
> > 
> > Yes, this patch fixes it. Thank you for fixing it quickly!
> Thanks for testing. Patch sent for sparc-next.

Why sparc-next - it should go into 4.13 since 4.13 would break all 
niagara1 and niagara2 systems otherwise?

> 
> -Tushar
> > >
> > > > commit b02c2b0bfd ("sparc: remove arch specific dma_supported
> > > > implementations") introduced code that incorrectly allows
> > > > dma_supported() to succeed for a 64-bit DMA mask even if the system
> > > > doesn't have an ATU IOMMU. 64-bit DMA is only supported on sun4v
> > > > equipped with ATU IOMMU HW.
> > > >
> > > > diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
> > > > index 24f21c7..0a32c57 100644
> > > > --- a/arch/sparc/kernel/pci_sun4v.c
> > > > +++ b/arch/sparc/kernel/pci_sun4v.c
> > > > @@ -673,12 +673,14 @@ static void dma_4v_unmap_sg(struct device *dev, struct scatterlist *sglist,
> > > >  static int dma_4v_supported(struct device *dev, u64 device_mask)
> > > >  {
> > > >  	struct iommu *iommu = dev->archdata.iommu;
> > > > -	u64 dma_addr_mask;
> > > > +	u64 dma_addr_mask = iommu->dma_addr_mask;
> > > >
> > > > -	if (device_mask > DMA_BIT_MASK(32) && iommu->atu)
> > > > -		dma_addr_mask = iommu->atu->dma_addr_mask;
> > > > -	else
> > > > -		dma_addr_mask = iommu->dma_addr_mask;
> > > > +	if (device_mask > DMA_BIT_MASK(32)) {
> > > > +		if (iommu->atu)
> > > > +			dma_addr_mask = iommu->atu->dma_addr_mask;
> > > > +		else
> > > > +			return 0;
> > > > +	}
> > > >
> > > >  	if ((device_mask & dma_addr_mask) == dma_addr_mask)
> > > >  		return 1;
> > >
> > >
> > > -Tushar
> > >
> > >
> > > >
> > > > -Tushar
> > > > >
> > > > >
> > > > > [0.24] PROMLIB: Sun IEEE Boot Prom 'OBP 4.30.4.d 2011/07/06
> > > > > 14:29'
> > > > > [0.33] PROMLIB: Root node compatible: sun4v
> > > > > [0.79] Linux version 4.12.0-08915-gf263fbb (mroos@t2000) (gcc
> > > > > version 4.9.2 (Debian 4.9.2-20)) #141 SMP Sun Jul 9 17:51:12 EEST 2017
> > > > > [0.002047] bootconsole [earlyprom0] enabled
> > > > > [0.002383] ARCH: SUN4V
> > > > > [0.002668] Ethernet address: 00:14:4f:86:99:26
> > > > > [0.003406] MM: PAGE_OFFSET is 0x8000 (max_phys_bits ==
> > > > > 39)
> > > > > [0.004089] MM: VMALLOC [0x0001 --> 0x6000]
> > > > > [0.004562] MM: VMEMMAP [0x6000 --> 0xc000]
> > > > > [0.095699] Kernel: Using 3 locked TLB entries for main kernel
> > > > > image.
> > > > > [0.096387] Remapping the kernel...
> > > > > [0.096400] done.
> > > > > [1.906342] OF stdout device is: /virtual-devices@100/console@1
> > > > > [1.907160] PROM: Built device tree with 148821 bytes of memory.
> > > > > [1.907804] MDESC: Size is 42336 bytes.
> > > > > [1.910139] PLATFORM: banner-name [Sun Fire T200]
> > > > > [1.910564] PLATFORM: name [SUNW,Sun-Fire-T200]
> > > > > [1.910919] PLATFORM: hostid [84869926]
> > > > > [1.911224] PLATFORM: serial# [00ab4130]
> > > > > [1.911536] PLATFORM: stick-frequency [3b9aca00]
> > > > > [1.911894] PLATFORM: mac-address [144f869926]
> > > > > [1.912241] PLATFORM: watchdog-resolution [1000 ms]
> > > > > [1.912619] PLATFORM: watchdog-max-timeout [3153600 ms]
> > > > > [1.913042] PLATFORM: max-cpus [32]
> > > > > [1.913501] Top of RAM: 0x3ffd34000, Total RAM: 0x3f7918000
> > > > > [1.913936] Memory hole size: 132MB
> > > > > [2.279507] Allocated 16384 bytes for kernel page tables.
> > > > > [2.280578] Zone ranges:
> > > > > [2.280819]   Normal   [mem 0x0840-0x0003ffd33fff]
> > > > > [2.281292] Movable zone start for each node
> > > > > [2.281626] Early memory node ranges
> > > > > [2.281916]   node   0: [mem 0x0840-0x0003ffc1]
> > > > > [2.282557]   node   0: [mem 0x0003ffc28000-0x0003ffcfdfff]
> > > > > [2.283030]   node   0: [mem 0x0003ffd0e000-0x0003ffd27fff]
> > > > > [2.283514]   node   0: [mem 0x0003ffd2c000-0x0003ffd33fff]
> > > > > [2.283994] Initmem setup node 0 [mem
> > > > > 0x0840-0x0003ffd33fff]
> > > > > [2.782262] Booting Linux...
> > > > > [2.782734] CPU CAPS:
> > > > > 

Re: sun4v+DMA related boot crash on 4.13-git

2017-07-11 Thread Tushar Dave



On 07/10/2017 10:05 PM, Meelis Roos wrote:

I tested 4.12+git on sparc64 yesterday to see if the sparc merge works
fine, and on all of my sun4v machines (T1000, T2000, T5120) it crashed
on boot with a DMA-related stack trace (below). All the machines are sun4v
physical machines, not VMs. Older sun4 machines do not exhibit this
problem.

Maybe DMA API related, maybe sparc64. Will try to bisect when I get
time.

I see what's going on with the panic. I will reproduce locally. Will get back
soon.

This patch should fix the panic. Please give it a try.


Yes, this patch fixes it. Thank you for fixing it quickly!

Thanks for testing. Patch sent for sparc-next.

-Tushar


commit b02c2b0bfd ("sparc: remove arch specific dma_supported
implementations") introduced code that incorrectly allows dma_supported()
to succeed for a 64-bit DMA mask even if the system doesn't have an ATU
IOMMU. 64-bit DMA is only supported on sun4v equipped with ATU IOMMU HW.

diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index 24f21c7..0a32c57 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -673,12 +673,14 @@ static void dma_4v_unmap_sg(struct device *dev, struct scatterlist *sglist,
 static int dma_4v_supported(struct device *dev, u64 device_mask)
 {
 	struct iommu *iommu = dev->archdata.iommu;
-	u64 dma_addr_mask;
+	u64 dma_addr_mask = iommu->dma_addr_mask;
 
-	if (device_mask > DMA_BIT_MASK(32) && iommu->atu)
-		dma_addr_mask = iommu->atu->dma_addr_mask;
-	else
-		dma_addr_mask = iommu->dma_addr_mask;
+	if (device_mask > DMA_BIT_MASK(32)) {
+		if (iommu->atu)
+			dma_addr_mask = iommu->atu->dma_addr_mask;
+		else
+			return 0;
+	}
 
 	if ((device_mask & dma_addr_mask) == dma_addr_mask)
 		return 1;


-Tushar




-Tushar



[0.24] PROMLIB: Sun IEEE Boot Prom 'OBP 4.30.4.d 2011/07/06 14:29'
[0.33] PROMLIB: Root node compatible: sun4v
[0.79] Linux version 4.12.0-08915-gf263fbb (mroos@t2000) (gcc
version 4.9.2 (Debian 4.9.2-20)) #141 SMP Sun Jul 9 17:51:12 EEST 2017
[0.002047] bootconsole [earlyprom0] enabled
[0.002383] ARCH: SUN4V
[0.002668] Ethernet address: 00:14:4f:86:99:26
[0.003406] MM: PAGE_OFFSET is 0x8000 (max_phys_bits == 39)
[0.004089] MM: VMALLOC [0x0001 --> 0x6000]
[0.004562] MM: VMEMMAP [0x6000 --> 0xc000]
[0.095699] Kernel: Using 3 locked TLB entries for main kernel image.
[0.096387] Remapping the kernel...
[0.096400] done.
[1.906342] OF stdout device is: /virtual-devices@100/console@1
[1.907160] PROM: Built device tree with 148821 bytes of memory.
[1.907804] MDESC: Size is 42336 bytes.
[1.910139] PLATFORM: banner-name [Sun Fire T200]
[1.910564] PLATFORM: name [SUNW,Sun-Fire-T200]
[1.910919] PLATFORM: hostid [84869926]
[1.911224] PLATFORM: serial# [00ab4130]
[1.911536] PLATFORM: stick-frequency [3b9aca00]
[1.911894] PLATFORM: mac-address [144f869926]
[1.912241] PLATFORM: watchdog-resolution [1000 ms]
[1.912619] PLATFORM: watchdog-max-timeout [3153600 ms]
[1.913042] PLATFORM: max-cpus [32]
[1.913501] Top of RAM: 0x3ffd34000, Total RAM: 0x3f7918000
[1.913936] Memory hole size: 132MB
[2.279507] Allocated 16384 bytes for kernel page tables.
[2.280578] Zone ranges:
[2.280819]   Normal   [mem 0x0840-0x0003ffd33fff]
[2.281292] Movable zone start for each node
[2.281626] Early memory node ranges
[2.281916]   node   0: [mem 0x0840-0x0003ffc1]
[2.282557]   node   0: [mem 0x0003ffc28000-0x0003ffcfdfff]
[2.283030]   node   0: [mem 0x0003ffd0e000-0x0003ffd27fff]
[2.283514]   node   0: [mem 0x0003ffd2c000-0x0003ffd33fff]
[2.283994] Initmem setup node 0 [mem
0x0840-0x0003ffd33fff]
[2.782262] Booting Linux...
[2.782734] CPU CAPS: [flush,stbar,swap,muldiv,v9,blkinit,mul32,div32]
[2.783255] CPU CAPS: [v8plus,ASIBlkInit]
[2.897543] percpu: Embedded 12 pages/cpu @8003ff80 s55872
r8192 d34240 u262144
[2.913264] SUN4V: Mondo queue sizes [cpu(4096) dev(16384) r(8192)
nr(256)]
[2.915492] Built 1 zonelists in Node order, mobility grouping on.
Total pages: 2063634
[2.916160] Policy zone: Normal
[2.916420] Kernel command line: root=/dev/sda1 ro
[2.918743] PID hash table entries: 4096 (order: 2, 32768 bytes)
[2.919230] Sorting __ex_table...
[3.220450] Memory: 16497120K/16639072K available (5521K kernel code,
530K rwdata, 1224K rodata, 336K init, 699K bss, 141952K reserved, 0K
cma-reserved)
[3.223109] Hierarchical RCU implementation.
[3.223452] RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=16.
[3.223933] RCU: Adjusting geometry for rcu_fanout_leaf=16,
nr_cpu_ids=16
[3.225508] NR_IRQS: 2048, 

Re: sun4v+DMA related boot crash on 4.13-git

2017-07-10 Thread Meelis Roos
> > > I tested yesterday's 4.12+git on sparc64 to see if the sparc merge works
> > > fine, and on all of my sun4v machines (T1000, T2000, T5120) it crashed
> > > on boot with a DMA-related stack trace (below). All the machines are sun4v
> > > physical machines, not VMs. Older sun4 machines do not exhibit this
> > > problem.
> > >
> > > Maybe DMA API related, maybe sparc64. Will try to bisect when I get
> > > time.
> > I see what's going on with the panic. I will reproduce locally. Will get back
> > soon.
> This patch should fix the panic. Please give it a try.

Yes, this patch fixes it. Thank you for fixing it quickly!
> 
> Commit b02c2b0bfd ("sparc: remove arch specific dma_supported
> implementations") introduced code that incorrectly allows dma_supported() to
> succeed for a 64-bit DMA mask even if the system doesn't have an ATU IOMMU.
> 64-bit DMA is only supported on sun4v systems equipped with ATU IOMMU hardware.
> 
> diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
> index 24f21c7..0a32c57 100644
> --- a/arch/sparc/kernel/pci_sun4v.c
> +++ b/arch/sparc/kernel/pci_sun4v.c
> @@ -673,12 +673,14 @@ static void dma_4v_unmap_sg(struct device *dev, struct scatterlist *sglist,
>  static int dma_4v_supported(struct device *dev, u64 device_mask)
>  {
>  	struct iommu *iommu = dev->archdata.iommu;
> -	u64 dma_addr_mask;
> +	u64 dma_addr_mask = iommu->dma_addr_mask;
>  
> -	if (device_mask > DMA_BIT_MASK(32) && iommu->atu)
> -		dma_addr_mask = iommu->atu->dma_addr_mask;
> -	else
> -		dma_addr_mask = iommu->dma_addr_mask;
> +	if (device_mask > DMA_BIT_MASK(32)) {
> +		if (iommu->atu)
> +			dma_addr_mask = iommu->atu->dma_addr_mask;
> +		else
> +			return 0;
> +	}
>  
>  	if ((device_mask & dma_addr_mask) == dma_addr_mask)
>  		return 1;
> 
> 
> -Tushar
> 
> 
> > 
> > -Tushar
> > >
> > >
> > > [0.24] PROMLIB: Sun IEEE Boot Prom 'OBP 4.30.4.d 2011/07/06 14:29'
> > > [0.33] PROMLIB: Root node compatible: sun4v
> > > [0.79] Linux version 4.12.0-08915-gf263fbb (mroos@t2000) (gcc
> > > version 4.9.2 (Debian 4.9.2-20)) #141 SMP Sun Jul 9 17:51:12 EEST 2017
> > > [0.002047] bootconsole [earlyprom0] enabled
> > > [0.002383] ARCH: SUN4V
> > > [0.002668] Ethernet address: 00:14:4f:86:99:26
> > > [0.003406] MM: PAGE_OFFSET is 0x8000 (max_phys_bits == 39)
> > > [0.004089] MM: VMALLOC [0x0001 --> 0x6000]
> > > [0.004562] MM: VMEMMAP [0x6000 --> 0xc000]
> > > [0.095699] Kernel: Using 3 locked TLB entries for main kernel image.
> > > [0.096387] Remapping the kernel...
> > > [0.096400] done.
> > > [1.906342] OF stdout device is: /virtual-devices@100/console@1
> > > [1.907160] PROM: Built device tree with 148821 bytes of memory.
> > > [1.907804] MDESC: Size is 42336 bytes.
> > > [1.910139] PLATFORM: banner-name [Sun Fire T200]
> > > [1.910564] PLATFORM: name [SUNW,Sun-Fire-T200]
> > > [1.910919] PLATFORM: hostid [84869926]
> > > [1.911224] PLATFORM: serial# [00ab4130]
> > > [1.911536] PLATFORM: stick-frequency [3b9aca00]
> > > [1.911894] PLATFORM: mac-address [144f869926]
> > > [1.912241] PLATFORM: watchdog-resolution [1000 ms]
> > > [1.912619] PLATFORM: watchdog-max-timeout [3153600 ms]
> > > [1.913042] PLATFORM: max-cpus [32]
> > > [1.913501] Top of RAM: 0x3ffd34000, Total RAM: 0x3f7918000
> > > [1.913936] Memory hole size: 132MB
> > > [2.279507] Allocated 16384 bytes for kernel page tables.
> > > [2.280578] Zone ranges:
> > > [2.280819]   Normal   [mem 0x0840-0x0003ffd33fff]
> > > [2.281292] Movable zone start for each node
> > > [2.281626] Early memory node ranges
> > > [2.281916]   node   0: [mem 0x0840-0x0003ffc1]
> > > [2.282557]   node   0: [mem 0x0003ffc28000-0x0003ffcfdfff]
> > > [2.283030]   node   0: [mem 0x0003ffd0e000-0x0003ffd27fff]
> > > [2.283514]   node   0: [mem 0x0003ffd2c000-0x0003ffd33fff]
> > > [2.283994] Initmem setup node 0 [mem
> > > 0x0840-0x0003ffd33fff]
> > > [2.782262] Booting Linux...
> > > [2.782734] CPU CAPS: [flush,stbar,swap,muldiv,v9,blkinit,mul32,div32]
> > > [2.783255] CPU CAPS: [v8plus,ASIBlkInit]
> > > [2.897543] percpu: Embedded 12 pages/cpu @8003ff80 s55872
> > > r8192 d34240 u262144
> > > [2.913264] SUN4V: Mondo queue sizes [cpu(4096) dev(16384) r(8192)
> > > nr(256)]
> > > [2.915492] Built 1 zonelists in Node order, mobility grouping on.
> > > Total pages: 2063634
> > > [2.916160] Policy zone: Normal
> > > [2.916420] Kernel command line: root=/dev/sda1 ro
> > > [2.918743] PID hash table entries: 4096 (order: 2, 32768 bytes)
> > > [2.919230] Sorting __ex_table...
> > > [3.220450] Memory: 16497120K/16639072K 

Re: sun4v+DMA related boot crash on 4.13-git

2017-07-10 Thread tndave



On 07/10/2017 11:47 AM, tndave wrote:
> On 07/10/2017 06:20 AM, Meelis Roos wrote:
> > I tested yesterday's 4.12+git on sparc64 to see if the sparc merge works
> > fine, and on all of my sun4v machines (T1000, T2000, T5120) it crashed
> > on boot with a DMA-related stack trace (below). All the machines are sun4v
> > physical machines, not VMs. Older sun4 machines do not exhibit this
> > problem.
> >
> > Maybe DMA API related, maybe sparc64. Will try to bisect when I get
> > time.
> I see what's going on with the panic. I will reproduce locally. Will get back
> soon.

This patch should fix the panic. Please give it a try.

Commit b02c2b0bfd ("sparc: remove arch specific dma_supported
implementations") introduced code that incorrectly allows
dma_supported() to succeed for a 64-bit DMA mask even if the system
doesn't have an ATU IOMMU. 64-bit DMA is only supported on sun4v systems
equipped with ATU IOMMU hardware.


diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index 24f21c7..0a32c57 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -673,12 +673,14 @@ static void dma_4v_unmap_sg(struct device *dev, struct scatterlist *sglist,
 static int dma_4v_supported(struct device *dev, u64 device_mask)
 {
 	struct iommu *iommu = dev->archdata.iommu;
-	u64 dma_addr_mask;
+	u64 dma_addr_mask = iommu->dma_addr_mask;
 
-	if (device_mask > DMA_BIT_MASK(32) && iommu->atu)
-		dma_addr_mask = iommu->atu->dma_addr_mask;
-	else
-		dma_addr_mask = iommu->dma_addr_mask;
+	if (device_mask > DMA_BIT_MASK(32)) {
+		if (iommu->atu)
+			dma_addr_mask = iommu->atu->dma_addr_mask;
+		else
+			return 0;
+	}
 
 	if ((device_mask & dma_addr_mask) == dma_addr_mask)
 		return 1;
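
To see the effect of the change outside the kernel, here is a minimal userspace sketch of the mask check before and after the patch. It is only an illustration under stated assumptions. The struct names (iommu_sketch, atu_sketch), the function names (supported_before, supported_after, check_examples) and the concrete mask values are hypothetical and not taken from pci_sun4v.c. Only the DMA_BIT_MASK() definition and the branch logic follow the kernel and the diff above. The diff does not show what the real function does after "return 1;", so the sketch simply returns 0 there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same definition as the kernel's DMA_BIT_MASK(). */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical stand-ins holding only the fields the check reads. */
struct atu_sketch {
	uint64_t dma_addr_mask;
};

struct iommu_sketch {
	uint64_t dma_addr_mask;   /* legacy IOMMU mask */
	struct atu_sketch *atu;   /* NULL when the machine has no ATU */
};

/* Logic before the patch: without an ATU, a 64-bit device mask falls
 * through to the legacy mask comparison and is accepted. */
static int supported_before(const struct iommu_sketch *iommu,
			    uint64_t device_mask)
{
	uint64_t dma_addr_mask;

	if (device_mask > DMA_BIT_MASK(32) && iommu->atu)
		dma_addr_mask = iommu->atu->dma_addr_mask;
	else
		dma_addr_mask = iommu->dma_addr_mask;

	return (device_mask & dma_addr_mask) == dma_addr_mask;
}

/* Logic after the patch: a mask wider than 32 bits is rejected
 * outright when there is no ATU. */
static int supported_after(const struct iommu_sketch *iommu,
			   uint64_t device_mask)
{
	uint64_t dma_addr_mask = iommu->dma_addr_mask;

	if (device_mask > DMA_BIT_MASK(32)) {
		if (iommu->atu)
			dma_addr_mask = iommu->atu->dma_addr_mask;
		else
			return 0;
	}

	/* Assumption: anything that does not match is unsupported. */
	return (device_mask & dma_addr_mask) == dma_addr_mask;
}

/* Example masks are assumptions chosen for illustration only. */
void check_examples(void)
{
	struct iommu_sketch legacy = { DMA_BIT_MASK(32), NULL };
	struct atu_sketch atu = { DMA_BIT_MASK(64) };
	struct iommu_sketch with_atu = { DMA_BIT_MASK(32), &atu };

	/* No ATU: the old check wrongly accepts a 64-bit mask. */
	assert(supported_before(&legacy, DMA_BIT_MASK(64)) == 1);
	assert(supported_after(&legacy, DMA_BIT_MASK(64)) == 0);

	/* 32-bit masks behave the same either way. */
	assert(supported_before(&legacy, DMA_BIT_MASK(32)) == 1);
	assert(supported_after(&legacy, DMA_BIT_MASK(32)) == 1);

	/* With an ATU present, 64-bit DMA is still accepted. */
	assert(supported_after(&with_atu, DMA_BIT_MASK(64)) == 1);
}
```

Calling check_examples() from a small main() exercises all three cases. The first pair of assertions is the situation on machines without an ATU: the old check accepted a 64-bit mask there, which is what the patch description says it must not do.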


-Tushar




-Tushar



[0.24] PROMLIB: Sun IEEE Boot Prom 'OBP 4.30.4.d 2011/07/06 
14:29'

[0.33] PROMLIB: Root node compatible: sun4v
[0.79] Linux version 4.12.0-08915-gf263fbb (mroos@t2000) (gcc 
version 4.9.2 (Debian 4.9.2-20)) #141 SMP Sun Jul 9 17:51:12 EEST 2017

[0.002047] bootconsole [earlyprom0] enabled
[0.002383] ARCH: SUN4V
[0.002668] Ethernet address: 00:14:4f:86:99:26
[0.003406] MM: PAGE_OFFSET is 0x8000 (max_phys_bits == 
39)

[0.004089] MM: VMALLOC [0x0001 --> 0x6000]
[0.004562] MM: VMEMMAP [0x6000 --> 0xc000]
[0.095699] Kernel: Using 3 locked TLB entries for main kernel image.
[0.096387] Remapping the kernel...
[0.096400] done.
[1.906342] OF stdout device is: /virtual-devices@100/console@1
[1.907160] PROM: Built device tree with 148821 bytes of memory.
[1.907804] MDESC: Size is 42336 bytes.
[1.910139] PLATFORM: banner-name [Sun Fire T200]
[1.910564] PLATFORM: name [SUNW,Sun-Fire-T200]
[1.910919] PLATFORM: hostid [84869926]
[1.911224] PLATFORM: serial# [00ab4130]
[1.911536] PLATFORM: stick-frequency [3b9aca00]
[1.911894] PLATFORM: mac-address [144f869926]
[1.912241] PLATFORM: watchdog-resolution [1000 ms]
[1.912619] PLATFORM: watchdog-max-timeout [3153600 ms]
[1.913042] PLATFORM: max-cpus [32]
[1.913501] Top of RAM: 0x3ffd34000, Total RAM: 0x3f7918000
[1.913936] Memory hole size: 132MB
[2.279507] Allocated 16384 bytes for kernel page tables.
[2.280578] Zone ranges:
[2.280819]   Normal   [mem 0x0840-0x0003ffd33fff]
[2.281292] Movable zone start for each node
[2.281626] Early memory node ranges
[2.281916]   node   0: [mem 0x0840-0x0003ffc1]
[2.282557]   node   0: [mem 0x0003ffc28000-0x0003ffcfdfff]
[2.283030]   node   0: [mem 0x0003ffd0e000-0x0003ffd27fff]
[2.283514]   node   0: [mem 0x0003ffd2c000-0x0003ffd33fff]
[2.283994] Initmem setup node 0 [mem 
0x0840-0x0003ffd33fff]

[2.782262] Booting Linux...
[2.782734] CPU CAPS: [flush,stbar,swap,muldiv,v9,blkinit,mul32,div32]
[2.783255] CPU CAPS: [v8plus,ASIBlkInit]
[2.897543] percpu: Embedded 12 pages/cpu @8003ff80 s55872 
r8192 d34240 u262144
[2.913264] SUN4V: Mondo queue sizes [cpu(4096) dev(16384) r(8192) 
nr(256)]
[2.915492] Built 1 zonelists in Node order, mobility grouping on.  
Total pages: 2063634

[2.916160] Policy zone: Normal
[2.916420] Kernel command line: root=/dev/sda1 ro
[2.918743] PID hash table entries: 4096 (order: 2, 32768 bytes)
[2.919230] Sorting __ex_table...
[3.220450] Memory: 16497120K/16639072K available (5521K kernel 
code, 530K rwdata, 1224K rodata, 336K init, 699K bss, 141952K 
reserved, 0K cma-reserved)

[3.223109] Hierarchical RCU implementation.
[3.223452] RCU restricting CPUs from NR_CPUS=256 to 
nr_cpu_ids=16.
[3.223933] RCU: Adjusting geometry for rcu_fanout_leaf=16, 
nr_cpu_ids=16

[3.225508] NR_IRQS: 2048, nr_irqs: 2048, preallocated irqs: 1
[3.225975] SUN4V: 

Re: sun4v+DMA related boot crash on 4.13-git

2017-07-10 Thread tndave



On 07/10/2017 06:20 AM, Meelis Roos wrote:
> I tested yesterday's 4.12+git on sparc64 to see if the sparc merge works
> fine, and on all of my sun4v machines (T1000, T2000, T5120) it crashed
> on boot with a DMA-related stack trace (below). All the machines are sun4v
> physical machines, not VMs. Older sun4 machines do not exhibit this
> problem.
>
> Maybe DMA API related, maybe sparc64. Will try to bisect when I get
> time.

I see what's going on with the panic. I will reproduce locally. Will get back
soon.


-Tushar



[0.24] PROMLIB: Sun IEEE Boot Prom 'OBP 4.30.4.d 2011/07/06 14:29'
[0.33] PROMLIB: Root node compatible: sun4v
[0.79] Linux version 4.12.0-08915-gf263fbb (mroos@t2000) (gcc version 
4.9.2 (Debian 4.9.2-20)) #141 SMP Sun Jul 9 17:51:12 EEST 2017
[0.002047] bootconsole [earlyprom0] enabled
[0.002383] ARCH: SUN4V
[0.002668] Ethernet address: 00:14:4f:86:99:26
[0.003406] MM: PAGE_OFFSET is 0x8000 (max_phys_bits == 39)
[0.004089] MM: VMALLOC [0x0001 --> 0x6000]
[0.004562] MM: VMEMMAP [0x6000 --> 0xc000]
[0.095699] Kernel: Using 3 locked TLB entries for main kernel image.
[0.096387] Remapping the kernel...
[0.096400] done.
[1.906342] OF stdout device is: /virtual-devices@100/console@1
[1.907160] PROM: Built device tree with 148821 bytes of memory.
[1.907804] MDESC: Size is 42336 bytes.
[1.910139] PLATFORM: banner-name [Sun Fire T200]
[1.910564] PLATFORM: name [SUNW,Sun-Fire-T200]
[1.910919] PLATFORM: hostid [84869926]
[1.911224] PLATFORM: serial# [00ab4130]
[1.911536] PLATFORM: stick-frequency [3b9aca00]
[1.911894] PLATFORM: mac-address [144f869926]
[1.912241] PLATFORM: watchdog-resolution [1000 ms]
[1.912619] PLATFORM: watchdog-max-timeout [3153600 ms]
[1.913042] PLATFORM: max-cpus [32]
[1.913501] Top of RAM: 0x3ffd34000, Total RAM: 0x3f7918000
[1.913936] Memory hole size: 132MB
[2.279507] Allocated 16384 bytes for kernel page tables.
[2.280578] Zone ranges:
[2.280819]   Normal   [mem 0x0840-0x0003ffd33fff]
[2.281292] Movable zone start for each node
[2.281626] Early memory node ranges
[2.281916]   node   0: [mem 0x0840-0x0003ffc1]
[2.282557]   node   0: [mem 0x0003ffc28000-0x0003ffcfdfff]
[2.283030]   node   0: [mem 0x0003ffd0e000-0x0003ffd27fff]
[2.283514]   node   0: [mem 0x0003ffd2c000-0x0003ffd33fff]
[2.283994] Initmem setup node 0 [mem 0x0840-0x0003ffd33fff]
[2.782262] Booting Linux...
[2.782734] CPU CAPS: [flush,stbar,swap,muldiv,v9,blkinit,mul32,div32]
[2.783255] CPU CAPS: [v8plus,ASIBlkInit]
[2.897543] percpu: Embedded 12 pages/cpu @8003ff80 s55872 r8192 
d34240 u262144
[2.913264] SUN4V: Mondo queue sizes [cpu(4096) dev(16384) r(8192) nr(256)]
[2.915492] Built 1 zonelists in Node order, mobility grouping on.  Total 
pages: 2063634
[2.916160] Policy zone: Normal
[2.916420] Kernel command line: root=/dev/sda1 ro
[2.918743] PID hash table entries: 4096 (order: 2, 32768 bytes)
[2.919230] Sorting __ex_table...
[3.220450] Memory: 16497120K/16639072K available (5521K kernel code, 530K 
rwdata, 1224K rodata, 336K init, 699K bss, 141952K reserved, 0K cma-reserved)
[3.223109] Hierarchical RCU implementation.
[3.223452]  RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=16.
[3.223933] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=16
[3.225508] NR_IRQS: 2048, nr_irqs: 2048, preallocated irqs: 1
[3.225975] SUN4V: Using IRQ API major 1, cookie only virqs disabled
[3.227643] clocksource: stick: mask: 0x max_cycles: 
0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[3.228400] clocksource: mult[80] shift[23]
[3.228755] clockevent: mult[8000] shift[31]
[3.230304] Console: colour dummy device 80x25
[3.230662] console [tty0] enabled
[3.230948] bootconsole [earlyprom0] disabled
[0.24] PROMLIB: Sun IEEE Boot Prom 'OBP 4.30.4.d 2011/07/06 14:29'
[0.33] PROMLIB: Root node compatible: sun4v
[0.79] Linux version 4.12.0-08915-gf263fbb (mroos@t2000) (gcc version 
4.9.2 (Debian 4.9.2-20)) #141 SMP Sun Jul 9 17:51:12 EEST 2017
[0.002047] bootconsole [earlyprom0] enabled
[0.002383] ARCH: SUN4V
[0.002668] Ethernet address: 00:14:4f:86:99:26
[0.003406] MM: PAGE_OFFSET is 0x8000 (max_phys_bits == 39)
[0.004089] MM: VMALLOC [0x0001 --> 0x6000]
[0.004562] MM: VMEMMAP [0x6000 --> 0xc000]
[0.095699] Kernel: Using 3 locked TLB entries for main kernel image.
[0.096387] Remapping the kernel...
[0.096400] done.
[1.906342] OF stdout device is: /virtual-devices@100/console@1
[1.907160] PROM: Built device tree with 148821 bytes of memory.
[1.907804] MDESC: Size is 42336 bytes.
[1.910139] 

sun4v+DMA related boot crash on 4.13-git

2017-07-10 Thread Meelis Roos
I tested yesterday's 4.12+git on sparc64 to see if the sparc merge works
fine, and on all of my sun4v machines (T1000, T2000, T5120) it crashed
on boot with a DMA-related stack trace (below). All the machines are sun4v
physical machines, not VMs. Older sun4 machines do not exhibit this
problem.

Maybe DMA API related, maybe sparc64. Will try to bisect when I get
time.


[0.24] PROMLIB: Sun IEEE Boot Prom 'OBP 4.30.4.d 2011/07/06 14:29'
[0.33] PROMLIB: Root node compatible: sun4v
[0.79] Linux version 4.12.0-08915-gf263fbb (mroos@t2000) (gcc version 
4.9.2 (Debian 4.9.2-20)) #141 SMP Sun Jul 9 17:51:12 EEST 2017
[0.002047] bootconsole [earlyprom0] enabled
[0.002383] ARCH: SUN4V
[0.002668] Ethernet address: 00:14:4f:86:99:26
[0.003406] MM: PAGE_OFFSET is 0x8000 (max_phys_bits == 39)
[0.004089] MM: VMALLOC [0x0001 --> 0x6000]
[0.004562] MM: VMEMMAP [0x6000 --> 0xc000]
[0.095699] Kernel: Using 3 locked TLB entries for main kernel image.
[0.096387] Remapping the kernel... 
[0.096400] done.
[1.906342] OF stdout device is: /virtual-devices@100/console@1
[1.907160] PROM: Built device tree with 148821 bytes of memory.
[1.907804] MDESC: Size is 42336 bytes.
[1.910139] PLATFORM: banner-name [Sun Fire T200]
[1.910564] PLATFORM: name [SUNW,Sun-Fire-T200]
[1.910919] PLATFORM: hostid [84869926]
[1.911224] PLATFORM: serial# [00ab4130]
[1.911536] PLATFORM: stick-frequency [3b9aca00]
[1.911894] PLATFORM: mac-address [144f869926]
[1.912241] PLATFORM: watchdog-resolution [1000 ms]
[1.912619] PLATFORM: watchdog-max-timeout [3153600 ms]
[1.913042] PLATFORM: max-cpus [32]
[1.913501] Top of RAM: 0x3ffd34000, Total RAM: 0x3f7918000
[1.913936] Memory hole size: 132MB
[2.279507] Allocated 16384 bytes for kernel page tables.
[2.280578] Zone ranges:
[2.280819]   Normal   [mem 0x0840-0x0003ffd33fff]
[2.281292] Movable zone start for each node
[2.281626] Early memory node ranges
[2.281916]   node   0: [mem 0x0840-0x0003ffc1]
[2.282557]   node   0: [mem 0x0003ffc28000-0x0003ffcfdfff]
[2.283030]   node   0: [mem 0x0003ffd0e000-0x0003ffd27fff]
[2.283514]   node   0: [mem 0x0003ffd2c000-0x0003ffd33fff]
[2.283994] Initmem setup node 0 [mem 0x0840-0x0003ffd33fff]
[2.782262] Booting Linux...
[2.782734] CPU CAPS: [flush,stbar,swap,muldiv,v9,blkinit,mul32,div32]
[2.783255] CPU CAPS: [v8plus,ASIBlkInit]
[2.897543] percpu: Embedded 12 pages/cpu @8003ff80 s55872 r8192 
d34240 u262144
[2.913264] SUN4V: Mondo queue sizes [cpu(4096) dev(16384) r(8192) nr(256)]
[2.915492] Built 1 zonelists in Node order, mobility grouping on.  Total 
pages: 2063634
[2.916160] Policy zone: Normal
[2.916420] Kernel command line: root=/dev/sda1 ro
[2.918743] PID hash table entries: 4096 (order: 2, 32768 bytes)
[2.919230] Sorting __ex_table...
[3.220450] Memory: 16497120K/16639072K available (5521K kernel code, 530K 
rwdata, 1224K rodata, 336K init, 699K bss, 141952K reserved, 0K cma-reserved)
[3.223109] Hierarchical RCU implementation.
[3.223452]  RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=16.
[3.223933] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=16
[3.225508] NR_IRQS: 2048, nr_irqs: 2048, preallocated irqs: 1
[3.225975] SUN4V: Using IRQ API major 1, cookie only virqs disabled
[3.227643] clocksource: stick: mask: 0x max_cycles: 
0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[3.228400] clocksource: mult[80] shift[23]
[3.228755] clockevent: mult[8000] shift[31]
[3.230304] Console: colour dummy device 80x25
[3.230662] console [tty0] enabled
[3.230948] bootconsole [earlyprom0] disabled
[0.24] PROMLIB: Sun IEEE Boot Prom 'OBP 4.30.4.d 2011/07/06 14:29'
[0.33] PROMLIB: Root node compatible: sun4v
[0.79] Linux version 4.12.0-08915-gf263fbb (mroos@t2000) (gcc version 
4.9.2 (Debian 4.9.2-20)) #141 SMP Sun Jul 9 17:51:12 EEST 2017
[0.002047] bootconsole [earlyprom0] enabled
[0.002383] ARCH: SUN4V
[0.002668] Ethernet address: 00:14:4f:86:99:26
[0.003406] MM: PAGE_OFFSET is 0x8000 (max_phys_bits == 39)
[0.004089] MM: VMALLOC [0x0001 --> 0x6000]
[0.004562] MM: VMEMMAP [0x6000 --> 0xc000]
[0.095699] Kernel: Using 3 locked TLB entries for main kernel image.
[0.096387] Remapping the kernel... 
[0.096400] done.
[1.906342] OF stdout device is: /virtual-devices@100/console@1
[1.907160] PROM: Built device tree with 148821 bytes of memory.
[1.907804] MDESC: Size is 42336 bytes.
[1.910139] PLATFORM: banner-name [Sun Fire T200]
[1.910564] PLATFORM: name [SUNW,Sun-Fire-T200]
[1.910919] PLATFORM: hostid [84869926]
[
