Re: hpsa driver bug crack kernel down!

2014-04-16 Thread j...@8bytes.org
On Wed, Apr 16, 2014 at 01:58:44PM +, Woodhouse, David wrote: > On Wed, 2014-04-16 at 15:37 +0200, j...@8bytes.org wrote: > > What is the state of these fixes? I plan to send out a pull-request > > before easter and hoped to include these fixes as well. > > I'm travelling and was going to do s

Re: hpsa driver bug crack kernel down!

2014-04-16 Thread Woodhouse, David
On Wed, 2014-04-16 at 15:37 +0200, j...@8bytes.org wrote: > Hey David, > > On Mon, Apr 14, 2014 at 05:03:51PM +, Woodhouse, David wrote: > > Jiang, if you can then let me have a copy with a signed-off-by I'll > > shepherd it upstream along with your other patch which is already in my > > iommu

Re: hpsa driver bug crack kernel down!

2014-04-16 Thread j...@8bytes.org
Hey David, On Mon, Apr 14, 2014 at 05:03:51PM +, Woodhouse, David wrote: > Jiang, if you can then let me have a copy with a signed-off-by I'll > shepherd it upstream along with your other patch which is already in my > iommu-2.6.git tree. What is the state of these fixes? I plan to send out a

Re: hpsa driver bug crack kernel down!

2014-04-14 Thread Davidlohr Bueso
On Mon, 2014-04-14 at 16:57 +0800, Jiang Liu wrote: > Hi all, > I guess I found the root cause. It's a bug in matching > device scope, variable 'level' should be decreased when walking up PCI > topology. > Could you please help to test following patch? > Thanks! > Gerry Worked like a c

Re: hpsa driver bug crack kernel down!

2014-04-14 Thread Woodhouse, David
On Mon, 2014-04-14 at 09:47 -0700, Davidlohr Bueso wrote: > On Mon, 2014-04-14 at 09:44 -0700, Davidlohr Bueso wrote: > > On Tue, 2014-04-15 at 00:19 +0800, Jiang Liu wrote: > > > Hi Davidlohr, > > > Thanks for providing the DMAR table. According to the DMAR > > > table, one bug in the iommu driv

Re: hpsa driver bug crack kernel down!

2014-04-14 Thread Davidlohr Bueso
On Mon, 2014-04-14 at 09:44 -0700, Davidlohr Bueso wrote: > On Tue, 2014-04-15 at 00:19 +0800, Jiang Liu wrote: > > Hi Davidlohr, > > Thanks for providing the DMAR table. According to the DMAR > > table, one bug in the iommu driver fails to handle this entry: > > [1D2h 0466 1] Device Sco

Re: hpsa driver bug crack kernel down!

2014-04-14 Thread Davidlohr Bueso
On Tue, 2014-04-15 at 00:19 +0800, Jiang Liu wrote: > Hi Davidlohr, > Thanks for providing the DMAR table. According to the DMAR > table, one bug in the iommu driver fails to handle this entry: > [1D2h 0466 1] Device Scope Entry Type : 01 > [1D3h 0467 1] Entry Length

Re: hpsa driver bug crack kernel down!

2014-04-14 Thread Jiang Liu
Hi Davidlohr, Thanks for providing the DMAR table. According to the DMAR table, one bug in the iommu driver fails to handle this entry: [1D2h 0466 1] Device Scope Entry Type : 01 [1D3h 0467 1] Entry Length : 0A [1D4h 0468 2] Reserved : [1D

Re: hpsa driver bug crack kernel down!

2014-04-14 Thread Davidlohr Bueso
Sorry for the delay, I've been having to take turns for this box. On Fri, 2014-04-11 at 09:18 +, Woodhouse, David wrote: > On Thu, 2014-04-10 at 09:19 -0700, Davidlohr Bueso wrote: > > Attaching a dmesg from one of the kernels that boots. It doesn't appear > > to have much of the related infor

Re: hpsa driver bug crack kernel down!

2014-04-14 Thread Jiang Liu
Hi all, I guess I found the root cause. It's a bug in matching device scope, variable 'level' should be decreased when walking up PCI topology. Could you please help to test following patch? Thanks! Gerry diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c index f445c10..1f830

Re: hpsa driver bug crack kernel down!

2014-04-14 Thread Jiang Liu
Hi Davidlohr, Thanks for the information! According to lspci output, device :02:00.2 is HP ILO controller, device :03:00.0 is RAID controller. Both ILO and RAID controllers need to access reserved memory range [0x7f61e000 - 0x7f61] in physical mode. According to

Re: hpsa driver bug crack kernel down!

2014-04-11 Thread Woodhouse, David
On Thu, 2014-04-10 at 09:19 -0700, Davidlohr Bueso wrote: > Attaching a dmesg from one of the kernels that boots. It doesn't appear > to have much of the related information... is there any debug config > option I can enable that might give you more data? I'd like the contents of /sys/firmware/acp

Re: hpsa driver bug crack kernel down!

2014-04-11 Thread David Woodhouse
On Thu, 2014-04-10 at 17:17 -0600, Shuah Khan wrote: > This smells very much like the problem that was solved couple of years > ago for SI domain. It is likely that path is broken with the DMAR > device scope array change. Please take a look to see if the following > no longer occurs. Looks like BI

Re: hpsa driver bug crack kernel down!

2014-04-10 Thread Baoquan He
On 04/10/14 at 04:34pm, Jiang Liu wrote: > Hi Baoquan, > Could you please help to give output of "lspci -"? > Is device "hpsa :03:00.0" a legacy PCI device(non-PCIe)? > It may have relationship with IOMMU driver. > Thanks! > Gerry Well, the machine bug was reported on is a AMD machin

Re: hpsa driver bug crack kernel down!

2014-04-10 Thread Baoquan He
On 04/10/14 at 04:34pm, Jiang Liu wrote: > Hi Baoquan, > Could you please help to give output of "lspci -"? > Is device "hpsa :03:00.0" a legacy PCI device(non-PCIe)? > It may have relationship with IOMMU driver. > Thanks! > Gerry Hi, I just saw your mail now. Do you still need the

Re: hpsa driver bug crack kernel down!

2014-04-10 Thread Shuah Khan
On Thu, Apr 10, 2014 at 2:45 PM, wrote: >> > 3f583bc21977 BAD ("Merge tag 'iommu-updates-v3.15'") >> >> Yes, specifically (finally done bisecting): >> >> commit 2e45528930388658603ea24d49cf52867b928d3e >> Author: Jiang Liu >> Date: Wed Feb 19 14:07:36 2014 +0800 >> >> iommu/vt-d: Unify the

Re: hpsa driver bug crack kernel down!

2014-04-10 Thread scameron
On Wed, Apr 09, 2014 at 11:32:37PM -0700, Davidlohr Bueso wrote: > On Wed, 2014-04-09 at 22:03 -0600, Bjorn Helgaas wrote: > > [+cc Joerg, iommu list] > > > > On Wed, Apr 9, 2014 at 6:19 PM, Davidlohr Bueso wrote: > > > On Wed, 2014-04-09 at 16:50 -0700, James Bottomley wrote: > > >> On Wed, 2014

Re: hpsa driver bug crack kernel down!

2014-04-10 Thread Woodhouse, David
On Thu, 2014-04-10 at 09:19 -0700, Davidlohr Bueso wrote: > > > > > >> > > > > dmar: DMAR:[DMA Read] Request device [02:00.0] fault addr > > > > > >> > > > > 7f61e000 > > > > That "Present bit in context entry is clear" fault means that we have > > not set up *any* mappings for this PCI device… o

Re: hpsa driver bug crack kernel down!

2014-04-10 Thread Bjorn Helgaas
[+cc Steve and iss_storagedev, remove "storagedev" which bounced (apparent typo)] On Thu, Apr 10, 2014 at 9:43 AM, Bjorn Helgaas wrote: > On Tue, Apr 8, 2014 at 8:39 PM, Baoquan He wrote: >> Hi, >> >> The kernel is 3.14.0+ which is pulled just now. >> >> >> [ 18.402695] systemd[1]: Set hostnam

Re: hpsa driver bug crack kernel down!

2014-04-10 Thread Davidlohr Bueso
On Thu, 2014-04-10 at 16:34 +0800, Jiang Liu wrote: > Hi Baoquan, > Could you please help to give output of "lspci -"? Attached. > Is device "hpsa :03:00.0" a legacy PCI device(non-PCIe)? > It may have relationship with IOMMU driver. I honestly don't know. PCI is way out of my area

Re: hpsa driver bug crack kernel down!

2014-04-10 Thread Bjorn Helgaas
On Tue, Apr 8, 2014 at 8:39 PM, Baoquan He wrote: > Hi, > > The kernel is 3.14.0+ which is pulled just now. > > > [ 18.402695] systemd[1]: Set hostname to > . > [ 18.408456] random: systemd urandom read with 70 bits of entropy > available > [ 18md[1]: Expecting device > dev-mapper-rhel_hp\x2

Re: hpsa driver bug crack kernel down!

2014-04-10 Thread Woodhouse, David
On Thu, 2014-04-10 at 09:14 -0600, Bjorn Helgaas wrote: > > Thus, my first guess would be that we are quite happily setting up the > > requested DMA maps on the *wrong* IOMMU, and then taking faults when the > > device actually tries to do DMA. > > > I like the "wrong IOMMU (or no IOMMU at all)" th

Re: hpsa driver bug crack kernel down!

2014-04-10 Thread Linda Knippers
On 4/10/2014 11:14 AM, Bjorn Helgaas wrote: > On Thu, Apr 10, 2014 at 2:46 AM, Woodhouse, David > wrote: > >>> DMAR:[fault reason 02] Present bit in context entry is clear >>> dmar: DRHD: handling fault status reg 602 >>> dmar: DMAR:[DMA Read] Request device [02:00.0] faul

Re: hpsa driver bug crack kernel down!

2014-04-10 Thread Bjorn Helgaas
On Thu, Apr 10, 2014 at 2:46 AM, Woodhouse, David wrote: >> > > >> > > > > DMAR:[fault reason 02] Present bit in context entry is clear >> > > >> > > > > dmar: DRHD: handling fault status reg 602 >> > > >> > > > > dmar: DMAR:[DMA Read] Request device [02:00.0] fault addr >> > > >> > > > > 7f61e0

Re: hpsa driver bug crack kernel down!

2014-04-10 Thread Woodhouse, David
On Thu, 2014-04-10 at 09:15 +0200, Joerg Roedel wrote: > [+ David, VT-d maintainer ] > > Jiang, David, can you please have a look into this issue? > > > > >> > > > > DMAR:[fault reason 02] Present bit in context entry is clear > > > >> > > > > dmar: DRHD: handling fault status reg 602 > > > >> >

Re: hpsa driver bug crack kernel down!

2014-04-10 Thread Jiang Liu
Hi Baoquan, Could you please help to give output of "lspci -"? Is device "hpsa :03:00.0" a legacy PCI device(non-PCIe)? It may have relationship with IOMMU driver. Thanks! Gerry On 2014/4/10 12:03, Bjorn Helgaas wrote: > [+cc Joerg, iommu list] > > On Wed, Apr 9, 2014 at 6:19 PM,

Re: hpsa driver bug crack kernel down!

2014-04-10 Thread Joerg Roedel
[+ David, VT-d maintainer ] Jiang, David, can you please have a look into this issue? Thanks, Joerg On Wed, Apr 09, 2014 at 11:32:37PM -0700, Davidlohr Bueso wrote: > On Wed, 2014-04-09 at 22:03 -0600, Bjorn Helgaas wrote: > > [+cc Joerg, iommu list] > > > > On Wed, Apr 9, 2014 at 6:19

Re: hpsa driver bug crack kernel down!

2014-04-09 Thread Davidlohr Bueso
On Wed, 2014-04-09 at 22:03 -0600, Bjorn Helgaas wrote: > [+cc Joerg, iommu list] > > On Wed, Apr 9, 2014 at 6:19 PM, Davidlohr Bueso wrote: > > On Wed, 2014-04-09 at 16:50 -0700, James Bottomley wrote: > >> On Wed, 2014-04-09 at 16:40 -0700, Davidlohr Bueso wrote: > >> > On Wed, 2014-04-09 at 16

Re: hpsa driver bug crack kernel down!

2014-04-09 Thread Bjorn Helgaas
[+cc Joerg, iommu list] On Wed, Apr 9, 2014 at 6:19 PM, Davidlohr Bueso wrote: > On Wed, 2014-04-09 at 16:50 -0700, James Bottomley wrote: >> On Wed, 2014-04-09 at 16:40 -0700, Davidlohr Bueso wrote: >> > On Wed, 2014-04-09 at 16:10 -0700, James Bottomley wrote: >> > > On Wed, 2014-04-09 at 16:08

Re: hpsa driver bug crack kernel down!

2014-04-09 Thread Davidlohr Bueso
On Wed, 2014-04-09 at 16:50 -0700, James Bottomley wrote: > On Wed, 2014-04-09 at 16:40 -0700, Davidlohr Bueso wrote: > > On Wed, 2014-04-09 at 16:10 -0700, James Bottomley wrote: > > > On Wed, 2014-04-09 at 16:08 -0700, James Bottomley wrote: > > > > [+linux-scsi] > > > > On Wed, 2014-04-09 at 15:

Re: hpsa driver bug crack kernel down!

2014-04-09 Thread James Bottomley
On Wed, 2014-04-09 at 16:40 -0700, Davidlohr Bueso wrote: > On Wed, 2014-04-09 at 16:10 -0700, James Bottomley wrote: > > On Wed, 2014-04-09 at 16:08 -0700, James Bottomley wrote: > > > [+linux-scsi] > > > On Wed, 2014-04-09 at 15:49 -0700, Davidlohr Bueso wrote: > > > > On Wed, 2014-04-09 at 10:39

Re: hpsa driver bug crack kernel down!

2014-04-09 Thread Davidlohr Bueso
On Wed, 2014-04-09 at 16:10 -0700, James Bottomley wrote: > On Wed, 2014-04-09 at 16:08 -0700, James Bottomley wrote: > > [+linux-scsi] > > On Wed, 2014-04-09 at 15:49 -0700, Davidlohr Bueso wrote: > > > On Wed, 2014-04-09 at 10:39 +0800, Baoquan He wrote: > > > > Hi, > > > > > > > > The kernel is

Re: hpsa driver bug crack kernel down!

2014-04-09 Thread James Bottomley
On Wed, 2014-04-09 at 16:08 -0700, James Bottomley wrote: > [+linux-scsi] > On Wed, 2014-04-09 at 15:49 -0700, Davidlohr Bueso wrote: > > On Wed, 2014-04-09 at 10:39 +0800, Baoquan He wrote: > > > Hi, > > > > > > The kernel is 3.14.0+ which is pulled just now. > > > > Cc'ing more people. > > >

Re: hpsa driver bug crack kernel down!

2014-04-09 Thread James Bottomley
[+linux-scsi] On Wed, 2014-04-09 at 15:49 -0700, Davidlohr Bueso wrote: > On Wed, 2014-04-09 at 10:39 +0800, Baoquan He wrote: > > Hi, > > > > The kernel is 3.14.0+ which is pulled just now. > > Cc'ing more people. > > While the hpsa driver appears to be involved in some way, I'm sure if > this

Re: hpsa driver bug crack kernel down!

2014-04-09 Thread Davidlohr Bueso
On Wed, 2014-04-09 at 10:39 +0800, Baoquan He wrote: > Hi, > > The kernel is 3.14.0+ which is pulled just now. Cc'ing more people. While the hpsa driver appears to be involved in some way, I'm sure if this is a related issue, but as of today's pull I'm getting another problem that causes my DL9

hpsa driver bug crack kernel down!

2014-04-08 Thread Baoquan He
Hi, The kernel is 3.14.0+ which is pulled just now. [ 18.402695] systemd[1]: Set hostname to . [ 18.408456] random: systemd urandom read with 70 bits of entropy available [ 18md[1]: Expecting device dev-mapper-rhel_hp\x2d\x2dsl4545g7\x2d\x2d01\x2droot.device... Expecting device d