Re: 5.1-rc1: mpt init crash in scsi_map_dma, dma_4v_map_sg on sparc64

2019-04-04 Thread Christoph Hellwig
On Thu, Apr 04, 2019 at 06:38:53PM +0300, Meelis Roos wrote:
> > > Yes, reverting this commit makes my T1000 boot.
> > 
> > Does the patch attached to the last mail work as well?
> 
> Sorry for misreading your mail - tested now and yes, it works.

Thanks, I'll submit it with a proper changelog.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: 5.1-rc1: mpt init crash in scsi_map_dma, dma_4v_map_sg on sparc64

2019-04-04 Thread Meelis Roos

> > Yes, reverting this commit makes my T1000 boot.
>
> Does the patch attached to the last mail work as well?

Sorry for misreading your mail - tested now and yes, it works.

--
Meelis Roos


Re: 5.1-rc1: mpt init crash in scsi_map_dma, dma_4v_map_sg on sparc64

2019-04-04 Thread Christoph Hellwig
On Thu, Apr 04, 2019 at 10:11:43AM +0300, Meelis Roos wrote:
> > I think this might have been this commit:
> > 
> > commit 24132a419c68f1d69eb8ecc91b3c80d730ecbb59
> > Author: Christoph Hellwig 
> > Date:   Fri Feb 15 09:30:28 2019 +0100
> > 
> >  sparc64/pci_sun4v: allow large DMA masks
> > 
> > the patch below adds a few missing checks and hopefully should fix
> > your problem.  If not can you try to revert the commit to check if
> > my theory was correct to start with?
> Yes, reverting this commit makes my T1000 boot.

Does the patch attached to the last mail work as well?


Re: 5.1-rc1: mpt init crash in scsi_map_dma, dma_4v_map_sg on sparc64

2019-04-04 Thread Meelis Roos

> I think this might have been this commit:
>
> commit 24132a419c68f1d69eb8ecc91b3c80d730ecbb59
> Author: Christoph Hellwig
> Date:   Fri Feb 15 09:30:28 2019 +0100
>
>     sparc64/pci_sun4v: allow large DMA masks
>
> the patch below adds a few missing checks and hopefully should fix
> your problem.  If not can you try to revert the commit to check if
> my theory was correct to start with?

Yes, reverting this commit makes my T1000 boot.

--
Meelis Roos 


Re: 5.1-rc1: mpt init crash in scsi_map_dma, dma_4v_map_sg on sparc64

2019-04-03 Thread Christoph Hellwig
I think this might have been this commit:

commit 24132a419c68f1d69eb8ecc91b3c80d730ecbb59
Author: Christoph Hellwig 
Date:   Fri Feb 15 09:30:28 2019 +0100

sparc64/pci_sun4v: allow large DMA masks

the patch below adds a few missing checks and hopefully should fix
your problem.  If not can you try to revert the commit to check if
my theory was correct to start with?


Date:   Wed Apr 3 21:34:34 2019 +0200

diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index a8af6023c126..14b93c5564e3 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -73,6 +73,11 @@ static inline void iommu_batch_start(struct device *dev, unsigned long prot, uns
 	p->npages	= 0;
 }
 
+static inline bool iommu_use_atu(struct iommu *iommu, u64 mask)
+{
+	return iommu->atu && mask > DMA_BIT_MASK(32);
+}
+
 /* Interrupts must be disabled.  */
 static long iommu_batch_flush(struct iommu_batch *p, u64 mask)
 {
@@ -92,7 +97,7 @@ static long iommu_batch_flush(struct iommu_batch *p, u64 mask)
 	prot &= (HV_PCI_MAP_ATTR_READ | HV_PCI_MAP_ATTR_WRITE);
 
 	while (npages != 0) {
-		if (mask <= DMA_BIT_MASK(32) || !pbm->iommu->atu) {
+		if (!iommu_use_atu(pbm->iommu, mask)) {
 			num = pci_sun4v_iommu_map(devhandle,
 						  HV_PCI_TSBID(0, entry),
 						  npages,
@@ -179,7 +184,6 @@ static void *dma_4v_alloc_coherent(struct device *dev, size_t size,
 	unsigned long flags, order, first_page, npages, n;
 	unsigned long prot = 0;
 	struct iommu *iommu;
-	struct atu *atu;
 	struct iommu_map_table *tbl;
 	struct page *page;
 	void *ret;
@@ -205,13 +209,11 @@ static void *dma_4v_alloc_coherent(struct device *dev, size_t size,
 	memset((char *)first_page, 0, PAGE_SIZE << order);
 
 	iommu = dev->archdata.iommu;
-	atu = iommu->atu;
-
 	mask = dev->coherent_dma_mask;
-	if (mask <= DMA_BIT_MASK(32) || !atu)
+	if (!iommu_use_atu(iommu, mask))
 		tbl = &iommu->tbl;
 	else
-		tbl = &atu->tbl;
+		tbl = &iommu->atu->tbl;
 
 	entry = iommu_tbl_range_alloc(dev, tbl, npages, NULL,
 				      (unsigned long)(-1), 0);
@@ -333,7 +335,7 @@ static void dma_4v_free_coherent(struct device *dev, size_t size, void *cpu,
 	atu = iommu->atu;
 	devhandle = pbm->devhandle;
 
-	if (dvma <= DMA_BIT_MASK(32)) {
+	if (!iommu_use_atu(iommu, dvma)) {
 		tbl = &iommu->tbl;
 		iotsb_num = 0; /* we don't care for legacy iommu */
 	} else {
@@ -374,7 +376,7 @@ static dma_addr_t dma_4v_map_page(struct device *dev, struct page *page,
 	npages >>= IO_PAGE_SHIFT;
 
 	mask = *dev->dma_mask;
-	if (mask <= DMA_BIT_MASK(32))
+	if (!iommu_use_atu(iommu, mask))
 		tbl = &iommu->tbl;
 	else
 		tbl = &atu->tbl;
@@ -510,7 +512,7 @@ static int dma_4v_map_sg(struct device *dev, struct scatterlist *sglist,
 				       IO_PAGE_SIZE) >> IO_PAGE_SHIFT;
 
 	mask = *dev->dma_mask;
-	if (mask <= DMA_BIT_MASK(32))
+	if (!iommu_use_atu(iommu, mask))
 		tbl = &iommu->tbl;
 	else
 		tbl = &atu->tbl;


Re: 5.1-rc1: mpt init crash in scsi_map_dma, dma_4v_map_sg on sparc64

2019-04-03 Thread Robin Murphy

On 02/04/2019 23:39, Rob Gardner wrote:

On 4/2/19 2:30 PM, Meelis Roos wrote:
[   17.566584] scsi host0: ioc0: LSISAS1064 A3, FwRev=010ah, Ports=1, MaxQ=511, IRQ=27
[   17.595897] mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 0, phy 0, sas_addr 0x5000c5001799a45d
[   17.598465] Unable to handle kernel NULL pointer dereference
[   17.598623] tsk->{mm,active_mm}->context = 
[   17.598723] tsk->{mm,active_mm}->pgd = 88802000
[   17.598774]   \|/  \|/
[   17.598774]   "@'/ .. \`@"
[   17.598774]   /_| \__/ |_\
[   17.598774]  \__U_/
[   17.598894] swapper/0(1): Oops [#1]
[   17.598937] CPU: 12 PID: 1 Comm: swapper/0 Not tainted 5.1.0-rc1 #118
[   17.598994] TSTATE: 80e01601 TPC: 004483a8 TNPC: 004483ac Y:     Not tainted

[   17.599086] TPC: 


You may use gdb to figure out what the NULL pointer points to:

gdb vmlinux

l *(dma_4v_map_sg+0xe8)


gdb did not parse the file but objdump --disassemble worked and +0xe8 
seems to be 4483a8



Of course that was right there in the panic message, as TPC is the 
address of the instruction that faulted:


ldx  [ %i4 ], %g1

For anyone wishing to dig into this further, here is my off the cuff 
analysis:


I believe the fault is happening on this line:

     base_shift = tbl->table_map_base >> IO_PAGE_SHIFT;

The tbl variable is assigned to one of two values in the statement 
above, but since the register dump shows the value in %i4 was 0x10, that 
strongly suggests that it executed this:


     tbl = &atu->tbl;

Because the offset of the tbl field in struct atu is 0x10, and that was 
computed here:


448384:   b8 07 60 10 add  %i5, 0x10, %i4

(The offset of tbl in struct iommu is 0, so we would have seen that 0 in 
%i4 if it had taken the iommu path.)


From the register dump, the value in %i5 was 0. And that came from this 
instruction:


4482f4:   fa 58 e2 58 ldx  [ %g3 + 0x258 ], %i5

Likewise, %g3 came from here:

4482d4:   c6 5e 22 18 ldx  [ %i0 + 0x218 ], %g3

And %i0 is arg0, struct device *dev. So the code is loading some field 
in struct device at offset 0x218, which is consistent with the source:


iommu = dev->archdata.iommu;

So %g3 points to struct iommu, and the code is trying to load the value 
at offset 0x258 in that structure, probably this:


atu = iommu->atu;

And atu is the NULL pointer.

Now whether this is the problem, I don't know. It may be that mask 
(*dev->dma_mask) was wrong, causing the code to take the &atu->tbl path 
instead of the &iommu->tbl path. We can see from the code that mask is 
in %g7, and the register dump shows the value of %g7 is fff, 
while DMA_BIT_MASK(32) is in %g1 and is 0xffffffff, so this might 
be the result of some confusion over 32 bit vs 64 bit stuff.


Nice deduction! If it was AArch64 asm I might have tried, but I've never 
even seen SPARC asm before :)


FWIW, scripts/faddr2line is your friend when deciphering stacktrace symbols.

In terms of the crash itself, I'd note that there's also been ongoing 
cleanup to fix the remaining places where the DMA API was called with 
NULL instead of the appropriate device - it could be that as a result of 
that, the driver/subsystem here is now taking a path that has not been 
properly exercised before, and/or that it's not quite the right device 
pointer being picked up.



I hope these bits of information help somebody debug further.


Thanks,
Robin.




Rob




004482c0 :
  4482c0:   9d e3 be b0 save  %sp, -336, %sp
  4482c4:   80 a6 e0 03 cmp  %i3, 3
  4482c8:   02 40 00 c1 be,pn   %icc, 4485cc
  4482cc:   92 10 21 e2 mov  0x1e2, %o1
  4482d0:   80 a0 00 1a cmp  %g0, %i2
  4482d4:   c6 5e 22 18 ldx  [ %i0 + 0x218 ], %g3
  4482d8:   82 10 20 00 clr  %g1
  4482dc:   84 60 3f ff subc  %g0, -1, %g2
  4482e0:   83 78 e4 01 movre  %g3, 1, %g1
  4482e4:   80 90 80 01 orcc  %g2, %g1, %g0
  4482e8:   12 40 00 bd bne,pn   %icc, 4485dc
  4482ec:   80 a6 e0 01 cmp  %i3, 1
  4482f0:   84 10 20 03 mov  3, %g2
  4482f4:   fa 58 e2 58 ldx  [ %g3 + 0x258 ], %i5
  4482f8:   85 64 60 01 move  %icc, 1, %g2
  4482fc:   b8 0f 20 02 and  %i4, 2, %i4
  448300:   c0 77 a7 f7 clrx  [ %fp + 0x7f7 ]
  448304:   82 10 a0 04 or  %g2, 4, %g1
  448308:   c0 26 60 18 clr  [ %i1 + 0x18 ]
  44830c:   85 7f 14 01 movrne  %i4, %g1, %g2
  448310:   8f 52 00 00 rdpr  %pil, %g7
  448314:   82 11 e0 0e or  %g7, 0xe, %g1
  448318:   91 90 60 00 wrpr  %g1, 0, %pil
  44831c:   ce 77 a7 bf stx  %g7, [ %fp + 0x7bf ]
  448320:   0f 00 02 00 sethi  %hi(0x8), %g7
  448324:   27 00 00 40 sethi  %hi(0x1), %l3
  448328:   ce 77 a7 df stx  %g7, [ %fp + 0x7df ]
  44832c:   0f 00 28 21 sethi  %hi(0xa08400), %g7
  448330:   

Re: 5.1-rc1: mpt init crash in scsi_map_dma, dma_4v_map_sg on sparc64

2019-04-02 Thread Rob Gardner

On 4/2/19 2:30 PM, Meelis Roos wrote:
[   17.566584] scsi host0: ioc0: LSISAS1064 A3, FwRev=010ah, Ports=1, MaxQ=511, IRQ=27
[   17.595897] mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 0, phy 0, sas_addr 0x5000c5001799a45d
[   17.598465] Unable to handle kernel NULL pointer dereference
[   17.598623] tsk->{mm,active_mm}->context = 
[   17.598723] tsk->{mm,active_mm}->pgd = 88802000
[   17.598774]   \|/  \|/
[   17.598774]   "@'/ .. \`@"
[   17.598774]   /_| \__/ |_\
[   17.598774]  \__U_/
[   17.598894] swapper/0(1): Oops [#1]
[   17.598937] CPU: 12 PID: 1 Comm: swapper/0 Not tainted 5.1.0-rc1 #118
[   17.598994] TSTATE: 80e01601 TPC: 004483a8 TNPC: 004483ac Y:     Not tainted

[   17.599086] TPC: 


You may use gdb to figure out what the NULL pointer points to:

gdb vmlinux

l *(dma_4v_map_sg+0xe8)


gdb did not parse the file but objdump --disassemble worked and +0xe8 
seems to be 4483a8



Of course that was right there in the panic message, as TPC is the 
address of the instruction that faulted:


ldx  [ %i4 ], %g1

For anyone wishing to dig into this further, here is my off the cuff 
analysis:


I believe the fault is happening on this line:

    base_shift = tbl->table_map_base >> IO_PAGE_SHIFT;

The tbl variable is assigned to one of two values in the statement 
above, but since the register dump shows the value in %i4 was 0x10, that 
strongly suggests that it executed this:


    tbl = &atu->tbl;

Because the offset of the tbl field in struct atu is 0x10, and that was 
computed here:


448384:   b8 07 60 10 add  %i5, 0x10, %i4

(The offset of tbl in struct iommu is 0, so we would have seen that 0 in 
%i4 if it had taken the iommu path.)


From the register dump, the value in %i5 was 0. And that came from this 
instruction:


4482f4:   fa 58 e2 58 ldx  [ %g3 + 0x258 ], %i5

Likewise, %g3 came from here:

4482d4:   c6 5e 22 18 ldx  [ %i0 + 0x218 ], %g3

And %i0 is arg0, struct device *dev. So the code is loading some field 
in struct device at offset 0x218, which is consistent with the source:


iommu = dev->archdata.iommu;

So %g3 points to struct iommu, and the code is trying to load the value 
at offset 0x258 in that structure, probably this:


atu = iommu->atu;

And atu is the NULL pointer.

Now whether this is the problem, I don't know. It may be that mask 
(*dev->dma_mask) was wrong, causing the code to take the &atu->tbl path 
instead of the &iommu->tbl path. We can see from the code that mask is 
in %g7, and the register dump shows the value of %g7 is fff, 
while DMA_BIT_MASK(32) is in %g1 and is 0xffffffff, so this might 
be the result of some confusion over 32 bit vs 64 bit stuff.


I hope these bits of information help somebody debug further.


Rob




004482c0 :
  4482c0:   9d e3 be b0 save  %sp, -336, %sp
  4482c4:   80 a6 e0 03 cmp  %i3, 3
  4482c8:   02 40 00 c1 be,pn   %icc, 4485cc
  4482cc:   92 10 21 e2 mov  0x1e2, %o1
  4482d0:   80 a0 00 1a cmp  %g0, %i2
  4482d4:   c6 5e 22 18 ldx  [ %i0 + 0x218 ], %g3
  4482d8:   82 10 20 00 clr  %g1
  4482dc:   84 60 3f ff subc  %g0, -1, %g2
  4482e0:   83 78 e4 01 movre  %g3, 1, %g1
  4482e4:   80 90 80 01 orcc  %g2, %g1, %g0
  4482e8:   12 40 00 bd bne,pn   %icc, 4485dc
  4482ec:   80 a6 e0 01 cmp  %i3, 1
  4482f0:   84 10 20 03 mov  3, %g2
  4482f4:   fa 58 e2 58 ldx  [ %g3 + 0x258 ], %i5
  4482f8:   85 64 60 01 move  %icc, 1, %g2
  4482fc:   b8 0f 20 02 and  %i4, 2, %i4
  448300:   c0 77 a7 f7 clrx  [ %fp + 0x7f7 ]
  448304:   82 10 a0 04 or  %g2, 4, %g1
  448308:   c0 26 60 18 clr  [ %i1 + 0x18 ]
  44830c:   85 7f 14 01 movrne  %i4, %g1, %g2
  448310:   8f 52 00 00 rdpr  %pil, %g7
  448314:   82 11 e0 0e or  %g7, 0xe, %g1
  448318:   91 90 60 00 wrpr  %g1, 0, %pil
  44831c:   ce 77 a7 bf stx  %g7, [ %fp + 0x7bf ]
  448320:   0f 00 02 00 sethi  %hi(0x8), %g7
  448324:   27 00 00 40 sethi  %hi(0x1), %l3
  448328:   ce 77 a7 df stx  %g7, [ %fp + 0x7df ]
  44832c:   0f 00 28 21 sethi  %hi(0xa08400), %g7
  448330:   8e 11 e2 b0 or  %g7, 0x2b0, %g7 ! a086b0
  448334:   f0 71 c0 05 stx  %i0, [ %g7 + %g5 ]
  448338:   82 01 c0 05 add  %g7, %g5, %g1
  44833c:   c4 70 60 08 stx  %g2, [ %g1 + 8 ]
  448340:   84 10 3f ff mov  -1, %g2
  448344:   c0 70 60 20 clrx  [ %g1 + 0x20 ]
  448348:   c4 70 60 10 stx  %g2, [ %g1 + 0x10 ]
  44834c:   c2 5e 22 00 ldx  [ %i0 + 0x200 ], %g1
  448350:   22 c0 40 0d brz,a,pn   %g1, 448384
  448354:   c2 5e 21 e0 ldx  [ %i0 + 0x1e0 ], %g1
  448358:   e6 00 40 00 ld  [ %g1 ], %l3
  44835c:   05 00 00 40 sethi  

Re: 5.1-rc1: mpt init crash in scsi_map_dma, dma_4v_map_sg on sparc64

2019-04-02 Thread Meelis Roos

[   17.566584] scsi host0: ioc0: LSISAS1064 A3, FwRev=010ah, Ports=1, MaxQ=511, IRQ=27
[   17.595897] mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 0, phy 0, sas_addr 0x5000c5001799a45d
[   17.598465] Unable to handle kernel NULL pointer dereference
[   17.598623] tsk->{mm,active_mm}->context = 
[   17.598723] tsk->{mm,active_mm}->pgd = 88802000
[   17.598774]   \|/  \|/
[   17.598774]   "@'/ .. \`@"
[   17.598774]   /_| \__/ |_\
[   17.598774]  \__U_/
[   17.598894] swapper/0(1): Oops [#1]
[   17.598937] CPU: 12 PID: 1 Comm: swapper/0 Not tainted 5.1.0-rc1 #118
[   17.598994] TSTATE: 80e01601 TPC: 004483a8 TNPC: 004483ac Y: Not tainted
[   17.599086] TPC: 


You may use gdb to figure out what the NULL pointer points to:

gdb vmlinux

l *(dma_4v_map_sg+0xe8)


gdb did not parse the file but objdump --disassemble worked and +0xe8 seems to 
be 4483a8

004482c0 :
  4482c0:   9d e3 be b0 save  %sp, -336, %sp
  4482c4:   80 a6 e0 03 cmp  %i3, 3
  4482c8:   02 40 00 c1 be,pn   %icc, 4485cc 
  4482cc:   92 10 21 e2 mov  0x1e2, %o1
  4482d0:   80 a0 00 1a cmp  %g0, %i2
  4482d4:   c6 5e 22 18 ldx  [ %i0 + 0x218 ], %g3
  4482d8:   82 10 20 00 clr  %g1
  4482dc:   84 60 3f ff subc  %g0, -1, %g2
  4482e0:   83 78 e4 01 movre  %g3, 1, %g1
  4482e4:   80 90 80 01 orcc  %g2, %g1, %g0
  4482e8:   12 40 00 bd bne,pn   %icc, 4485dc 
  4482ec:   80 a6 e0 01 cmp  %i3, 1
  4482f0:   84 10 20 03 mov  3, %g2
  4482f4:   fa 58 e2 58 ldx  [ %g3 + 0x258 ], %i5
  4482f8:   85 64 60 01 move  %icc, 1, %g2
  4482fc:   b8 0f 20 02 and  %i4, 2, %i4
  448300:   c0 77 a7 f7 clrx  [ %fp + 0x7f7 ]
  448304:   82 10 a0 04 or  %g2, 4, %g1
  448308:   c0 26 60 18 clr  [ %i1 + 0x18 ]
  44830c:   85 7f 14 01 movrne  %i4, %g1, %g2
  448310:   8f 52 00 00 rdpr  %pil, %g7
  448314:   82 11 e0 0e or  %g7, 0xe, %g1
  448318:   91 90 60 00 wrpr  %g1, 0, %pil
  44831c:   ce 77 a7 bf stx  %g7, [ %fp + 0x7bf ]
  448320:   0f 00 02 00 sethi  %hi(0x8), %g7
  448324:   27 00 00 40 sethi  %hi(0x1), %l3
  448328:   ce 77 a7 df stx  %g7, [ %fp + 0x7df ]
  44832c:   0f 00 28 21 sethi  %hi(0xa08400), %g7
  448330:   8e 11 e2 b0 or  %g7, 0x2b0, %g7 ! a086b0 
  448334:   f0 71 c0 05 stx  %i0, [ %g7 + %g5 ]
  448338:   82 01 c0 05 add  %g7, %g5, %g1
  44833c:   c4 70 60 08 stx  %g2, [ %g1 + 8 ]
  448340:   84 10 3f ff mov  -1, %g2
  448344:   c0 70 60 20 clrx  [ %g1 + 0x20 ]
  448348:   c4 70 60 10 stx  %g2, [ %g1 + 0x10 ]
  44834c:   c2 5e 22 00 ldx  [ %i0 + 0x200 ], %g1
  448350:   22 c0 40 0d brz,a,pn   %g1, 448384 
  448354:   c2 5e 21 e0 ldx  [ %i0 + 0x1e0 ], %g1
  448358:   e6 00 40 00 ld  [ %g1 ], %l3
  44835c:   05 00 00 40 sethi  %hi(0x1), %g2
  448360:   c2 58 60 08 ldx  [ %g1 + 8 ], %g1
  448364:   80 a4 e0 00 cmp  %l3, 0
  448368:   02 c8 40 06 brz  %g1, 448380 
  44836c:   a7 64 40 02 move  %icc, %g2, %l3
  448370:   25 00 00 08 sethi  %hi(0x2000), %l2
  448374:   a4 00 40 12 add  %g1, %l2, %l2
  448378:   a5 34 b0 0d srlx  %l2, 0xd, %l2
  44837c:   e4 77 a7 df stx  %l2, [ %fp + 0x7df ]
  448380:   c2 5e 21 e0 ldx  [ %i0 + 0x1e0 ], %g1
  448384:   b8 07 60 10 add  %i5, 0x10, %i4
  448388:   c2 58 40 00 ldx  [ %g1 ], %g1
  44838c:   c2 77 a7 d7 stx  %g1, [ %fp + 0x7d7 ]
  448390:   82 10 3f ff mov  -1, %g1
  448394:   ce 5f a7 d7 ldx  [ %fp + 0x7d7 ], %g7
  448398:   83 30 70 20 srlx  %g1, 0x20, %g1
  44839c:   80 a1 c0 01 cmp  %g7, %g1
  4483a0:   b9 65 10 03 movleu  %xcc, %g3, %i4
  4483a4:   80 a6 a0 00 cmp  %i2, 0
  4483a8:   c2 5f 00 00 ldx  [ %i4 ], %g1
 
  4483ac:   83 30 70 0d srlx  %g1, 0xd, %g1
  4483b0:   04 40 01 26 ble,pn   %icc, 448848 
  4483b4:   c2 77 a7 9f stx  %g1, [ %fp + 0x79f ]
  4483b8:   c2 5f a7 df ldx  [ %fp + 0x7df ], %g1
  4483bc:   84 10 3f ff mov  -1, %g2
  4483c0:   23 00 28 21 sethi  %hi(0xa08400), %l1
  4483c4:   ce 5f a7 df ldx  [ %fp + 0x7df ], %g7
  4483c8:   a2 14 62 b0 or  %l1, 0x2b0, %l1
  4483cc:   86 10 20 01 mov  1, %g3
  4483d0:   82 00 7f ff add  %g1, -1, %g1
  4483d4:   e6 27 a7 af st  %l3, [ %fp + 0x7af ]
  4483d8:   ab 30 b0 33 srlx  %g2, 0x33, %l5
  4483dc:   8e 08 40 07 and  %g1, %g7, %g7
  4483e0:   c2 77 a7 cf stx  %g1, [ %fp + 0x7cf ]
  4483e4:   a0 10 00 19 mov  %i1, %l0
  4483e8:   f2 77 a7 a7 stx  %i1, [ %fp 

Re: 5.1-rc1: mpt init crash in scsi_map_dma, dma_4v_map_sg on sparc64

2019-04-02 Thread Ming Lei
On Tue, Mar 19, 2019 at 7:20 PM Meelis Roos  wrote:
>
> Tried 5.1-rc1 on a bunch of sparcs, this hits all my sparcs with sun4v and 
> mpt scsi.
>
> [    2.733263] Fusion MPT base driver 3.04.20
> [    2.742995] Copyright (c) 1999-2008 LSI Corporation
> [    2.743052] Fusion MPT SAS Host driver 3.04.20
> [    2.743881] mptbase: ioc0: Initiating bringup
> [    3.737822] ioc0: LSISAS1064 A3: Capabilities={Initiator}
> [   17.566584] scsi host0: ioc0: LSISAS1064 A3, FwRev=010ah, Ports=1, MaxQ=511, IRQ=27
> [   17.595897] mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 0, phy 0, sas_addr 0x5000c5001799a45d
> [   17.598465] Unable to handle kernel NULL pointer dereference
> [   17.598623] tsk->{mm,active_mm}->context = 
> [   17.598723] tsk->{mm,active_mm}->pgd = 88802000
> [   17.598774]   \|/  \|/
> [   17.598774]   "@'/ .. \`@"
> [   17.598774]   /_| \__/ |_\
> [   17.598774]  \__U_/
> [   17.598894] swapper/0(1): Oops [#1]
> [   17.598937] CPU: 12 PID: 1 Comm: swapper/0 Not tainted 5.1.0-rc1 #118
> [   17.598994] TSTATE: 80e01601 TPC: 004483a8 TNPC: 004483ac Y: Not tainted
> [   17.599086] TPC: 

You may use gdb to figure out what the NULL pointer points to:

gdb vmlinux
> l *(dma_4v_map_sg+0xe8)


Thanks,
Ming Lei


Re: 5.1-rc1: mpt init crash in scsi_map_dma, dma_4v_map_sg on sparc64

2019-04-01 Thread Meelis Roos

Still broken in 5.1-rc3.


Tried 5.1-rc1 on a bunch of sparcs, this hits all my sparcs with sun4v and mpt 
scsi.

[    2.733263] Fusion MPT base driver 3.04.20
[    2.742995] Copyright (c) 1999-2008 LSI Corporation
[    2.743052] Fusion MPT SAS Host driver 3.04.20
[    2.743881] mptbase: ioc0: Initiating bringup
[    3.737822] ioc0: LSISAS1064 A3: Capabilities={Initiator}
[   17.566584] scsi host0: ioc0: LSISAS1064 A3, FwRev=010ah, Ports=1, MaxQ=511, IRQ=27
[   17.595897] mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 0, phy 0, sas_addr 0x5000c5001799a45d
[   17.598465] Unable to handle kernel NULL pointer dereference
[   17.598623] tsk->{mm,active_mm}->context = 
[   17.598723] tsk->{mm,active_mm}->pgd = 88802000
[   17.598774]   \|/  \|/
[   17.598774]   "@'/ .. \`@"
[   17.598774]   /_| \__/ |_\
[   17.598774]  \__U_/
[   17.598894] swapper/0(1): Oops [#1]
[   17.598937] CPU: 12 PID: 1 Comm: swapper/0 Not tainted 5.1.0-rc1 #118
[   17.598994] TSTATE: 80e01601 TPC: 004483a8 TNPC: 004483ac Y:     Not tainted
[   17.599086] TPC: 
[   17.599127] g0: 886d1d51 g1:  g2: 0001 g3: 886b8000
[   17.599197] g4: 886c g5: 8001fef78000 g6: 886d g7: 
[   17.599267] o0: 8001f526bc90 o1: 01e2 o2: 8001f4fc2000 o3: 8001f4fc2000
[   17.599337] o4: 8001f4fc1144 o5: 8001f5002800 sp: 886d1db1 ret_pc: 00740720
[   17.599415] RPC: 
[   17.599456] l0: 2400 l1: ff00 l2: 0008 l3: 0001
[   17.599526] l4: 8001f5002830 l5: 00ff l6: 8001f46c7e10 l7: 8001f4fc1000
[   17.599596] i0: 8001f4b350b0 i1: 8001f526be28 i2: 0001 i3: 0002
[   17.599665] i4: 0010 i5:  i6: 886d1f01 i7: 00725570
[   17.599745] I7: 
[   17.599781] Call Trace:
[   17.599824]  [00725570] scsi_dma_map+0x50/0xc0
[   17.599881]  [00740720] mptscsih_qcmd+0x280/0x660
[   17.599940]  [00723dec] scsi_queue_rq+0x6ac/0x880
[   17.65]  [00680198] blk_mq_dispatch_rq_list+0x138/0x540
[   17.600065]  [00685154] blk_mq_do_dispatch_sched+0x54/0x100
[   17.600124]  [0068560c] blk_mq_sched_dispatch_requests+0xec/0x160
[   17.600186]  [0067e83c] __blk_mq_run_hw_queue+0x9c/0x180
[   17.600246]  [0067eaa8] __blk_mq_delay_run_hw_queue+0x188/0x1e0
[   17.600307]  [0067ff74] blk_mq_run_hw_queue+0x54/0x140
[   17.600365]  [00685be0] blk_mq_sched_insert_request+0x120/0x180
[   17.600424]  [0067a394] blk_execute_rq+0x34/0x60
[   17.600483]  [007218cc] __scsi_execute+0xcc/0x1a0
[   17.600543]  [00725f40] scsi_probe_and_add_lun+0x1e0/0xec0
[   17.600603]  [00726e98] __scsi_scan_target+0xb8/0x680
[   17.600663]  [0072757c] scsi_scan_target+0x11c/0x140
[   17.600727]  [0072e9b8] sas_rphy_add+0x138/0x1c0
[   17.600777] Disabling lock debugging due to kernel taint
[   17.600837] Caller[00725570]: scsi_dma_map+0x50/0xc0
[   17.600896] Caller[00740720]: mptscsih_qcmd+0x280/0x660
[   17.600956] Caller[00723dec]: scsi_queue_rq+0x6ac/0x880
[   17.601018] Caller[00680198]: blk_mq_dispatch_rq_list+0x138/0x540
[   17.601078] Caller[00685154]: blk_mq_do_dispatch_sched+0x54/0x100
[   17.601138] Caller[0068560c]: blk_mq_sched_dispatch_requests+0xec/0x160
[   17.601210] Caller[0067e83c]: __blk_mq_run_hw_queue+0x9c/0x180
[   17.601271] Caller[0067eaa8]: __blk_mq_delay_run_hw_queue+0x188/0x1e0
[   17.601333] Caller[0067ff74]: blk_mq_run_hw_queue+0x54/0x140
[   17.601392] Caller[00685be0]: blk_mq_sched_insert_request+0x120/0x180
[   17.601453] Caller[0067a394]: blk_execute_rq+0x34/0x60
[   17.601513] Caller[007218cc]: __scsi_execute+0xcc/0x1a0
[   17.601574] Caller[00725f40]: scsi_probe_and_add_lun+0x1e0/0xec0
[   17.601635] Caller[00726e98]: __scsi_scan_target+0xb8/0x680
[   17.601696] Caller[0072757c]: scsi_scan_target+0x11c/0x140
[   17.601758] Caller[0072e9b8]: sas_rphy_add+0x138/0x1c0
[   17.601819] Caller[00743b64]: mptsas_add_end_device+0xc4/0x100
[   17.601882] Caller[00746964]: mptsas_scan_sas_topology+0x164/0x300
[   17.601943] Caller[00749094]: mptsas_probe+0x2d4/0x440
[   17.602004] Caller[006bf948]: pci_device_probe+0xc8/0x160
[   17.602066] Caller[0070dab0]: really_probe+0x1b0/0x2e0
[   17.602126] Caller[0070de10]: driver_probe_device+0x50/0x100
[   17.602186] Caller[0070e0a8]: device_driver_attach+0x48/0x60
[   17.602245] Caller[0070e140]: __driver_attach+0x80/0xe0
[   17.602302] Caller[0070c484]: bus_for_each_dev+0x44/0x80
[   17.602360]