Re: ARM64 boot failure on espressobin with 5.0.0-rc6 (1f947a7a011fcceb14cb912f5481a53b18f1879a)

2019-02-15 Thread John David Anglin
On 2019-02-14 12:58 p.m., Robin Murphy wrote:
> Hmm, having felt brave enough to take a closer look, it might actually be as 
> simple as this - Dave, are you able to give the diff below a spin?
> 
> Robin.
> 
> ----->8-----
> diff --git a/drivers/dma/mv_xor.c b/drivers/dma/mv_xor.c
> index 7f595355fb79..fe4a7c71fede 100644
> --- a/drivers/dma/mv_xor.c
> +++ b/drivers/dma/mv_xor.c
> @@ -1059,6 +1059,7 @@ mv_xor_channel_add(struct mv_xor_device *xordev,
>  mv_chan->op_in_desc = XOR_MODE_IN_DESC;
> 
>  dma_dev = &mv_chan->dmadev;
> +dma_dev->dev = &pdev->dev;
>  mv_chan->xordev = xordev;
> 
>  /*
> @@ -1091,7 +1092,6 @@ mv_xor_channel_add(struct mv_xor_device *xordev,
>  dma_dev->device_free_chan_resources = mv_xor_free_chan_resources;
>  dma_dev->device_tx_status = mv_xor_status;
>  dma_dev->device_issue_pending = mv_xor_issue_pending;
> -dma_dev->dev = &pdev->dev;
> 
>  /* set prep routines based on capability */
>  if (dma_has_cap(DMA_INTERRUPT, dma_dev->cap_mask))

The patch is fine and it fixes the boot failure.

I misapplied it in my previous test.

Thanks,
Dave

-- 
John David Anglin  dave.ang...@bell.net
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: ARM64 boot failure on espressobin with 5.0.0-rc6 (1f947a7a011fcceb14cb912f5481a53b18f1879a)

2019-02-15 Thread John David Anglin
On 2019-02-15 10:22 a.m., John David Anglin wrote:
> On 2019-02-14 12:58 p.m., Robin Murphy wrote:
>> Hmm, having felt brave enough to take a closer look, it might actually be as 
>> simple as this - Dave, are you able to give the diff below a spin?
> Still crashes but in slightly different spot:
>
I think I see what's wrong.

Dave

-- 
John David Anglin  dave.ang...@bell.net


Re: ARM64 boot failure on espressobin with 5.0.0-rc6 (1f947a7a011fcceb14cb912f5481a53b18f1879a)

2019-02-15 Thread John David Anglin
On 2019-02-14 12:58 p.m., Robin Murphy wrote:
> Hmm, having felt brave enough to take a closer look, it might actually be as 
> simple as this - Dave, are you able to give the diff below a spin?
It still crashes, but in a slightly different spot:

[    0.00] Booting Linux on physical CPU 0x00 [0x410fd034]
[    0.00] Linux version 5.0.0-rc6+ (root@espressobin) (gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)) #1 SMP PREEMPT Wed Feb 13 16:17:46 EST 2019
[    0.00] Machine model: Globalscale Marvell ESPRESSOBin Board
[    0.00] earlycon: ar3700_uart0 at MMIO 0xd0012000 (options '')
[    0.00] printk: bootconsole [ar3700_uart0] enabled
[    3.210276] Internal error: Oops: 9645 [#1] PREEMPT SMP
[    3.215932] Modules linked in:
[    3.219072] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc6+ #1
[    3.225519] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)
[    3.232151] pstate: a005 (NzCv daif -PAN -UAO)
[    3.237090] pc : mv_xor_channel_add+0x4c/0xb28
[    3.241650] lr : mv_xor_probe+0x20c/0x4b8
[    3.245768] sp : ff8010033ac0
[    3.249173] x29: ff8010033ac0 x28: 
[    3.254639] x27: ff8010e76068 x26: 0029
[    3.260104] x25:  x24: 
[    3.265570] x23: ffc03fb47400 x22: ff8010ead000
[    3.271035] x21: ffc03fb47410 x20: ffc03bea8d80
[    3.276501] x19: ffc03fb47400 x18: 
[    3.281966] x17: 000c x16: 000a
[    3.287432] x15: ff8010ead6c8 x14: ffc03beaa003
[    3.292898] x13: ffc03beaa002 x12: 0038
[    3.298363] x11: 1fff x10: 0001
[    3.303829] x9 : 0040 x8 : ff8010ec7928
[    3.309294] x7 : ffc03cc003b8 x6 : 
[    3.314760] x5 :  x4 : 0029
[    3.320226] x3 : 0083 x2 : 80c0
[    3.325691] x1 :  x0 : ffc03fb47410
[    3.331158] Process swapper/0 (pid: 1, stack limit = 0x(ptrval))
[    3.338056] Call trace:
[    3.340569]  mv_xor_channel_add+0x4c/0xb28
[    3.344779]  mv_xor_probe+0x20c/0x4b8
[    3.348544]  platform_drv_probe+0x50/0xb0
[    3.352663]  really_probe+0x1fc/0x2c0
[    3.356427]  driver_probe_device+0x58/0x100
[    3.360727]  __driver_attach+0xd8/0xe0
[    3.364580]  bus_for_each_dev+0x68/0xc8
[    3.368522]  driver_attach+0x20/0x28
[    3.372196]  bus_add_driver+0x108/0x228
[    3.376139]  driver_register+0x60/0x110
[    3.380081]  __platform_driver_register+0x44/0x50
[    3.384923]  mv_xor_driver_init+0x18/0x20
[    3.389043]  do_one_initcall+0x58/0x170
[    3.392985]  kernel_init_freeable+0x190/0x234
[    3.397465]  kernel_init+0x10/0x108
[    3.401047]  ret_from_fork+0x10/0x1c
[    3.404723] Code: f90067a5 d285 52901802 aa1503e0 (f9003035)
[    3.411004] ---[ end trace 65be82a62724e328 ]---
[    3.415804] Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b
[    3.423626] SMP: stopping secondary CPUs
[    3.427661] Kernel Offset: disabled
[    3.431243] CPU features: 0x002,2000200c
[    3.435272] Memory Limit: none
[    3.438412] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b ]---

ff8010630440 <mv_xor_channel_add>:
ff8010630440:   a9b37bfd    stp x29, x30, [sp, #-208]!
ff8010630444:   910003fd    mov x29, sp
ff8010630448:   a9025bf5    stp x21, x22, [sp, #32]
ff801063044c:   b00043f6    adrp    x22, ff8010ead000
ff8010630450:   a90363f7    stp x23, x24, [sp, #48]
ff8010630454:   aa0103f7    mov x23, x1
ff8010630458:   a9046bf9    stp x25, x26, [sp, #64]
ff801063045c:   d281    mov x1, #0x0    // #0
ff8010630460:   a90153f3    stp x19, x20, [sp, #16]
ff8010630464:   910042f5    add x21, x23, #0x10
ff8010630468:   a90573fb    stp x27, x28, [sp, #80]
ff801063046c:   aa0003f4    mov x20, x0
ff8010630470:   911b22c0    add x0, x22, #0x6c8
ff8010630474:   2a0203f9    mov w25, w2
ff8010630478:   f945    ldr x5, [x0]
ff801063047c:   f90067a5    str x5, [x29, #200]
ff8010630480:   d285    mov x5, #0x0    // #0
ff8010630484:   52901802    mov w2, #0x80c0 // #32960
ff8010630488:   aa1503e0    mov x0, x21
ff801063048c:   f9003035    str x21, [x1, #96]
ff8010630490:   72a00c02    movk    w2, #0x60, lsl #16
ff8010630494:   d2806a01    mov x1, #0x350
...

Dave

-- 
John David Anglin  dave.ang...@bell.net


Re: ARM64 boot failure on espressobin with 5.0.0-rc6 (1f947a7a011fcceb14cb912f5481a53b18f1879a)

2019-02-14 Thread John David Anglin
On 2019-02-14 12:58 p.m., Robin Murphy wrote:
> Hmm, having felt brave enough to take a closer look, it might actually be as 
> simple as this - Dave, are you able to give the diff below a spin?
Yes.

-- 
John David Anglin  dave.ang...@bell.net


Re: ARM64 boot failure on espressobin with 5.0.0-rc6 (1f947a7a011fcceb14cb912f5481a53b18f1879a)

2019-02-14 Thread Robin Murphy

On 14/02/2019 17:36, Christoph Hellwig wrote:
> On Thu, Feb 14, 2019 at 05:27:41PM +0000, Robin Murphy wrote:
>> Oh wow, that driver has possibly the most inventive way of passing a NULL
>> device to the DMA API that I've ever seen, and on arm64 it will certainly
>> have been failing since 4.2, but of course there's also no error checking
>> for anyone to notice...
>
> I did take a brief look and didn't see how we got the NULL device
> pointer, so it is well hidden for sure.
>
>> This crash will be a fallout from 356da6d0cd (plus the subsequent fix in
>> 9ab91e7c5c51) that's otherwise missed Christoph's big cleanup. Obviously
>> the right thing to do is for someone to try to figure out the steaming pile
>> of mess in that driver, but if necessary I think the quick fix below should
>> probably suffice to mitigate the change in the short term.
>
> The fix looks ok.  And for 5.2 I plan to explicitly reject all uses of
> NULL device arguments in the DMA API.  I've sent patches out for all
> the obviously problematic drivers, and most of them got accepted by the
> maintainers for the 5.1 merge window.
>
> It seems like the mv_xor code is mostly unmaintained as far as I can
> tell unfortunately.

Hmm, having felt brave enough to take a closer look, it might actually 
be as simple as this - Dave, are you able to give the diff below a spin?


Robin.

----->8-----
diff --git a/drivers/dma/mv_xor.c b/drivers/dma/mv_xor.c
index 7f595355fb79..fe4a7c71fede 100644
--- a/drivers/dma/mv_xor.c
+++ b/drivers/dma/mv_xor.c
@@ -1059,6 +1059,7 @@ mv_xor_channel_add(struct mv_xor_device *xordev,
 		mv_chan->op_in_desc = XOR_MODE_IN_DESC;
 
 	dma_dev = &mv_chan->dmadev;
+	dma_dev->dev = &pdev->dev;
 	mv_chan->xordev = xordev;
 
 	/*
@@ -1091,7 +1092,6 @@ mv_xor_channel_add(struct mv_xor_device *xordev,
 	dma_dev->device_free_chan_resources = mv_xor_free_chan_resources;
 	dma_dev->device_tx_status = mv_xor_status;
 	dma_dev->device_issue_pending = mv_xor_issue_pending;
-	dma_dev->dev = &pdev->dev;
 
 	/* set prep routines based on capability */
 	if (dma_has_cap(DMA_INTERRUPT, dma_dev->cap_mask))


Re: ARM64 boot failure on espressobin with 5.0.0-rc6 (1f947a7a011fcceb14cb912f5481a53b18f1879a)

2019-02-14 Thread Christoph Hellwig
On Thu, Feb 14, 2019 at 05:27:41PM +0000, Robin Murphy wrote:
> Oh wow, that driver has possibly the most inventive way of passing a NULL 
> device to the DMA API that I've ever seen, and on arm64 it will certainly 
> have been failing since 4.2, but of course there's also no error checking 
> for anyone to notice...

I did take a brief look and didn't see how we got the NULL device
pointer, so it is well hidden for sure.

> This crash will be a fallout from 356da6d0cd (plus the subsequent fix in 
> 9ab91e7c5c51) that's otherwise missed Christoph's big cleanup. Obviously 
> the right thing to do is for someone to try to figure out the steaming pile 
> of mess in that driver, but if necessary I think the quick fix below should 
> probably suffice to mitigate the change in the short term.

The fix looks ok.  And for 5.2 I plan to explicitly reject all uses of
NULL device arguments in the DMA API.  I've sent patches out for all
the obviously problematic drivers, and most of them got accepted by the
maintainers for the 5.1 merge window.

It seems like the mv_xor code is mostly unmaintained as far as I can
tell unfortunately.