Re: Bad DMA from Marvell 9230

2014-05-30 Thread Roger Heflin
Pretty much any SMART commands... I was running something that collected all of the SMART stats once per hour per disk, and this made it crash about once per week. If you were pushing the disks hard, it appeared to make it even more likely to crash under the SMART commands. Removing the commands took things up to

Re: Bad DMA from Marvell 9230

2014-05-30 Thread Benjamin Herrenschmidt
On Fri, 2014-05-30 at 09:13 -0500, Roger Heflin wrote:
> Do enough smartcmds and the entire board (all 4 ports) locked up and
> required a reboot, I quit doing smartcmds and stability went way up,
> but it was still not 100% stable.

Any chance you can give me an example of "enough smartcmds" ? IE a

Re: Bad DMA from Marvell 9230

2014-05-30 Thread Benjamin Herrenschmidt
On Fri, 2014-05-30 at 09:58 -0400, Jérôme Carretero wrote:
> Weird (I hadn't seen that you reported the 9235 working...), I have
> IOMMU problems with a 9235...
>
> What system are you running it on (when you say "power box", is it a
> beefy x86 computer or literally a PowerPC)?
> For me, AMD 990FX

Re: Bad DMA from Marvell 9230

2014-05-30 Thread Jérôme Carretero
On Fri, 30 May 2014 09:13:43 -0500 Roger Heflin wrote:
> I had a 9230...
> [...]
> Supplier support "claimed" it to be a Linux AHCI bug as the "claim"
> that their board correctly supports AHCI, even though all other AHCI
> boards work right in this exact same use case in the exact same

Re: Bad DMA from Marvell 9230

2014-05-30 Thread Roger Heflin
I had a 9230... On older kernels it worked "ok" so long as you did not do any SMART commands; I removed it and went to something that works. Marvell appears to be hit and miss, with some cards/chips working right and some not... Do enough smartcmds and the entire board (all 4 ports) locked up and

Re: Bad DMA from Marvell 9230

2014-05-30 Thread Jérôme Carretero
On Fri, 30 May 2014 20:37:58 +1000 Benjamin Herrenschmidt wrote:
> We've switched to a 9235 instead which seems to work fine.

Weird (I hadn't seen that you reported the 9235 working...), I have IOMMU problems with a 9235...

What system are you running it on (when you say "power box", is it a

Re: Bad DMA from Marvell 9230

2014-05-30 Thread Benjamin Herrenschmidt
On Fri, 2014-05-30 at 03:06 -0400, Jérôme Carretero wrote:
> On Thu, 27 Mar 2014 17:57:37 +1100
> Benjamin Herrenschmidt wrote:
>
> > I've been trying a 9230 on a power box here (a 9235 on the same
> > machine works fine) and it blows up with an IOMMU violation early
> > during init.
>
> Hi,

Re: Bad DMA from Marvell 9230

2014-05-30 Thread Jérôme Carretero
On Thu, 27 Mar 2014 17:57:37 +1100 Benjamin Herrenschmidt wrote:
> I've been trying a 9230 on a power box here (a 9235 on the same
> machine works fine) and it blows up with an IOMMU violation early
> during init.

Hi,

That's https://bugzilla.kernel.org/show_bug.cgi?id=42679 if you haven't

Re: Bad DMA from Marvell 9230

2014-04-04 Thread Robert Hancock
On 27/03/14 09:19 AM, Tejun Heo wrote:
> On Thu, Mar 27, 2014 at 05:57:37PM +1100, Benjamin Herrenschmidt wrote:
> > I've contacted Marvell, but I was wondering if anybody here had already
> > experienced something similar or has an idea of what else the chip
> > might be doing wrong so we can try to find a

Re: Bad DMA from Marvell 9230

2014-03-27 Thread Tejun Heo
On Thu, Mar 27, 2014 at 05:57:37PM +1100, Benjamin Herrenschmidt wrote:
> I've contacted Marvell, but I was wondering if anybody here had already
> experienced something similar or has an idea of what else the chip
> might be doing wrong so we can try to find a workaround ?

No idea. First time to

Bad DMA from Marvell 9230

2014-03-27 Thread Benjamin Herrenschmidt
Hi Folks ! Does that ring any bell ? I've been trying a 9230 on a power box here (a 9235 on the same machine works fine) and it blows up with an IOMMU violation early during init. From what I can tell the scenario is: - So we still haven't issued any command per-se, all our DMA command buffers
