Re: [PATCH 1/1] ia64/pci: set mmio decoding on for some host bridge
Hi Bjorn,
Thanks for your suggestions. I will try to find more information.

ZhenHua

On 07/11/2013 12:12 AM, Bjorn Helgaas wrote:
> On Wed, Jul 10, 2013 at 12:23 AM, ZhenHua wrote:
> > Hi Bjorn,
> > On the system that this bug happens, an MCA event is generated while kernel crashed:
> > Transaction Address: memory write to address 0x0ae041428 (LMMIO - SBL Blade 1 SFW DDR Memory)
> >
> > I guess there is some module trying to visit the address 0x0ae041428 right after this line is run:
> >     pci_write_config_word(dev, PCI_COMMAND,
> >         orig_cmd & ~(PCI_COMMAND_MEMORY | PCI_COMMAND_IO));
>
> Well, you need to figure out what is accessing 0x0ae041428 and why.
> Presumably that address belongs to some device below the 40:01.0 root
> port, and knowing which device that is would be a good clue, but you
> didn't include that in your lspci.
>
> I'm trying to give you hints about how *you* can figure out what's
> going on here. Obviously I don't have the system and I'm not
> proposing a change, so that's about all I can do.
>
> > The output of lspci -vvv follows.
> > 40:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 22) (prog-if 00 [Normal decode])
> >     Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
> >     Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> >     Latency: 0, Cache Line Size: 64 bytes
> >     Bus: primary=40, secondary=41, subordinate=41, sec-latency=0
> >     I/O behind bridge: f000-0fff
> >     Memory behind bridge: ae00-af8f
> >     Prefetchable memory behind bridge: fff0-000f
> >     Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
> >     BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
> >         PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
> >     Capabilities: [40] Subsystem: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1
> >     Capabilities: [60] Message Signalled Interrupts: Mask+ 64bit- Count=1/2 Enable+
> >         Address: fee0 Data: 4046
> >         Masking: 0002 Pending:
> >     Capabilities: [90] Express (v2) Root Port (Slot-), MSI 00
> >         DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
> >             ExtTag+ RBE+ FLReset-
> >         DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
> >             RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
> >             MaxPayload 128 bytes, MaxReadReq 128 bytes
> >         DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
> >         LnkCap: Port #0, Speed 5GT/s, Width x2, ASPM L0s L1, Latency L0 <512ns, L1 <64us
> >             ClockPM- Suprise+ LLActRep+ BwNot+
> >         LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
> >             ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> >         LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
> >         RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible-
> >         RootCap: CRSVisible-
> >         RootSta: PME ReqID , PMEStatus- PMEPending-
> >         DevCap2: Completion Timeout: Range BCD, TimeoutDis+ ARIFwd+
> >         DevCtl2: Completion Timeout: 260ms to 900ms, TimeoutDis- ARIFwd-
> >         LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -3.5dB
> >             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> >             Compliance De-emphasis: -6dB
> >         LnkSta2: Current De-emphasis Level: -3.5dB
> >     Capabilities: [e0] Power Management version 3
> >         Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
> >         Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> >     Capabilities: [100] Advanced Error Reporting
> >         UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> >         UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> >         UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq+ ACSViol-
> >         CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> >         CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> >         AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
> >     Capabilities: [150] Access Control Services
> >         ACSCap: SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd+ EgressCtrl- DirectTrans-
> >         ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
> >     Capabilities: [160] Vendor Specific Information <?>
> >     Kernel driver in use: pcieport
> >     Kernel modules: shpchp
> >
> > On 07/10/2013 12:49 AM, Bjorn Helgaas wrote:
> > > On Mon, Jul 8, 2013 at 11:42 PM, Li, Zhen-Hua wrote:
> > > > On some IA64 platforms with intel PCI bridge, for example, HP BL890c i2
> > > > with Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
> > > > when kernel tries to disable the mmio decoding on the PCI bridge devices,
> > > > kernel may crash.
> > > >
> > > > And in the comment of
Re: [PATCH 1/1] ia64/pci: set mmio decoding on for some host bridge
On Wed, Jul 10, 2013 at 12:23 AM, ZhenHua wrote:
> Hi Bjorn,
> On the system that this bug happens, an MCA event is generated while kernel crashed:
> Transaction Address: memory write to address 0x0ae041428 (LMMIO - SBL Blade 1 SFW DDR Memory)
>
> I guess there is some module trying to visit the address 0x0ae041428 right after this line is run:
>     pci_write_config_word(dev, PCI_COMMAND,
>         orig_cmd & ~(PCI_COMMAND_MEMORY | PCI_COMMAND_IO));

Well, you need to figure out what is accessing 0x0ae041428 and why.
Presumably that address belongs to some device below the 40:01.0 root
port, and knowing which device that is would be a good clue, but you
didn't include that in your lspci.

I'm trying to give you hints about how *you* can figure out what's
going on here. Obviously I don't have the system and I'm not
proposing a change, so that's about all I can do.

> The output of lspci -vvv follows.
> 40:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 22) (prog-if 00 [Normal decode])
>     Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
>     Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>     Latency: 0, Cache Line Size: 64 bytes
>     Bus: primary=40, secondary=41, subordinate=41, sec-latency=0
>     I/O behind bridge: f000-0fff
>     Memory behind bridge: ae00-af8f
>     Prefetchable memory behind bridge: fff0-000f
>     Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
>     BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
>         PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
>     Capabilities: [40] Subsystem: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1
>     Capabilities: [60] Message Signalled Interrupts: Mask+ 64bit- Count=1/2 Enable+
>         Address: fee0 Data: 4046
>         Masking: 0002 Pending:
>     Capabilities: [90] Express (v2) Root Port (Slot-), MSI 00
>         DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
>             ExtTag+ RBE+ FLReset-
>         DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
>             RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
>             MaxPayload 128 bytes, MaxReadReq 128 bytes
>         DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
>         LnkCap: Port #0, Speed 5GT/s, Width x2, ASPM L0s L1, Latency L0 <512ns, L1 <64us
>             ClockPM- Suprise+ LLActRep+ BwNot+
>         LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
>             ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>         LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
>         RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible-
>         RootCap: CRSVisible-
>         RootSta: PME ReqID , PMEStatus- PMEPending-
>         DevCap2: Completion Timeout: Range BCD, TimeoutDis+ ARIFwd+
>         DevCtl2: Completion Timeout: 260ms to 900ms, TimeoutDis- ARIFwd-
>         LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -3.5dB
>             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>             Compliance De-emphasis: -6dB
>         LnkSta2: Current De-emphasis Level: -3.5dB
>     Capabilities: [e0] Power Management version 3
>         Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>         Status: D0 PME-Enable- DSel=0 DScale=0 PME-
>     Capabilities: [100] Advanced Error Reporting
>         UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>         UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>         UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq+ ACSViol-
>         CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
>         CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>         AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
>     Capabilities: [150] Access Control Services
>         ACSCap: SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd+ EgressCtrl- DirectTrans-
>         ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
>     Capabilities: [160] Vendor Specific Information <?>
>     Kernel driver in use: pcieport
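
Bjorn's hint that 0x0ae041428 presumably belongs to a device below the 40:01.0 root port can be checked against the bridge's memory window. The following is a minimal user-space sketch, not kernel code. It assumes the bridge's Memory Base and Memory Limit registers (config offsets 0x20 and 0x22) hold 0xae00 and 0xaf80; those two values are hypothetical, chosen only to be consistent with the truncated "ae00-af8f" in the lspci output, and would have to be read on the affected machine. The names `mem_window`, `decode_mem_window` and `window_contains` are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Memory window of a PCI-to-PCI bridge. Bits 15:4 of the Memory Base
 * and Memory Limit registers hold address bits 31:20, so the window
 * is aligned and sized in 1 MiB units.
 */
struct mem_window {
    uint32_t start; /* first address forwarded to the secondary bus */
    uint32_t end;   /* last address forwarded (inclusive) */
};

/* Turn the two raw 16-bit register values into an address range. */
static struct mem_window decode_mem_window(uint16_t base_reg, uint16_t limit_reg)
{
    struct mem_window w;

    w.start = ((uint32_t)(base_reg & 0xfff0u)) << 16;
    /* The low 20 bits of the limit are implicitly all ones. */
    w.end = (((uint32_t)(limit_reg & 0xfff0u)) << 16) | 0x000fffffu;
    return w;
}

/* True if addr would be claimed by the bridge and forwarded downstream. */
static bool window_contains(struct mem_window w, uint64_t addr)
{
    /* A base above the limit means the window is disabled. */
    if (w.start > w.end)
        return false;
    return addr >= w.start && addr <= w.end;
}
```

Under the assumed register values the window decodes to 0xae000000-0xaf8fffff, which would contain the address reported in the MCA. That only narrows the search to bus 41; identifying the device still needs the lspci output for whatever sits on the secondary bus.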
Re: [PATCH 1/1] ia64/pci: set mmio decoding on for some host bridge
Hi Bjorn,
On the system that this bug happens, an MCA event is generated while kernel crashed:
Transaction Address: memory write to address 0x0ae041428 (LMMIO - SBL Blade 1 SFW DDR Memory)

I guess there is some module trying to visit the address 0x0ae041428 right after this line is run:
    pci_write_config_word(dev, PCI_COMMAND,
        orig_cmd & ~(PCI_COMMAND_MEMORY | PCI_COMMAND_IO));

The output of lspci -vvv follows.
40:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 22) (prog-if 00 [Normal decode])
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Bus: primary=40, secondary=41, subordinate=41, sec-latency=0
    I/O behind bridge: f000-0fff
    Memory behind bridge: ae00-af8f
    Prefetchable memory behind bridge: fff0-000f
    Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
    BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
        PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
    Capabilities: [40] Subsystem: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1
    Capabilities: [60] Message Signalled Interrupts: Mask+ 64bit- Count=1/2 Enable+
        Address: fee0 Data: 4046
        Masking: 0002 Pending:
    Capabilities: [90] Express (v2) Root Port (Slot-), MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
            ExtTag+ RBE+ FLReset-
        DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
            MaxPayload 128 bytes, MaxReadReq 128 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #0, Speed 5GT/s, Width x2, ASPM L0s L1, Latency L0 <512ns, L1 <64us
            ClockPM- Suprise+ LLActRep+ BwNot+
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
        RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible-
        RootCap: CRSVisible-
        RootSta: PME ReqID , PMEStatus- PMEPending-
        DevCap2: Completion Timeout: Range BCD, TimeoutDis+ ARIFwd+
        DevCtl2: Completion Timeout: 260ms to 900ms, TimeoutDis- ARIFwd-
        LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -3.5dB
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -3.5dB
    Capabilities: [e0] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [100] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq+ ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
    Capabilities: [150] Access Control Services
        ACSCap: SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd+ EgressCtrl- DirectTrans-
        ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
    Capabilities: [160] Vendor Specific Information <?>
    Kernel driver in use: pcieport
    Kernel modules: shpchp

Thanks
ZhenHua

On 07/10/2013 12:49 AM, Bjorn Helgaas wrote:
> On Mon, Jul 8, 2013 at 11:42 PM, Li, Zhen-Hua wrote:
> > On some IA64 platforms with intel PCI bridge, for example, HP BL890c i2
> > with Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
> > when kernel tries to disable the mmio decoding on the PCI bridge devices,
> > kernel may crash.
> >
> > And in the comment of function quirk_mmio_always_on, it also says:
> > "But doing so (disable the mmio decoding) may cause problems on host bridge
> > and perhaps other key system devices"
> >
> > So, for this PCI bridge, dev->mmio_always_on
Re: [PATCH 1/1] ia64/pci: set mmio decoding on for some host bridge
On Mon, Jul 8, 2013 at 11:42 PM, Li, Zhen-Hua wrote:
> On some IA64 platforms with intel PCI bridge, for example, HP BL890c i2
> with Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
> when kernel tries to disable the mmio decoding on the PCI bridge devices,
> kernel may crash.
>
> And in the comment of function quirk_mmio_always_on, it also says:
> "But doing so (disable the mmio decoding) may cause problems on host bridge
> and perhaps other key system devices"
>
> So, for this PCI bridge, dev->mmio_always_on bit should be set to 1.
>
> To avoid affecting the use of quirk_mmio_always_on, a new function is created.
>
> Signed-off-by: Li, Zhen-Hua
> ---
>  drivers/pci/quirks.c    |   17 +++++++++++++++++
>  include/linux/pci_ids.h |    1 +
>  2 files changed, 18 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index e85d230..665af3e 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -44,6 +44,23 @@ static void quirk_mmio_always_on(struct pci_dev *dev)
>  DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_ANY_ID, PCI_ANY_ID,
>                                 PCI_CLASS_BRIDGE_HOST, 8, quirk_mmio_always_on);
>
> +#ifdef CONFIG_IA64
> +/*
> + * On some IA64 platforms, for some intel PCI bridge devices, for example,
> + * the Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
> + * disable the mmio decoding on this device may cause system crash.
> + * So dev->mmio_always_on bit should be set to 1.
> + */
> +static void quirk_mmio_on_intel_pcibridge(struct pci_dev *dev)
> +{
> +       dev->mmio_always_on = 1;
> +}
> +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL,
> +                               PCI_DEVICE_ID_INTEL_5520_5550_X58,
> +                               PCI_CLASS_BRIDGE_PCI,
> +                               8, quirk_mmio_on_intel_pcibridge);
> +#endif
> +
>  /* The Mellanox Tavor device gives false positive parity errors
>   * Mark this device with a broken_parity_status, to allow
>   * PCI scanning code to "skip" this now blacklisted device.
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index 3bed2e8..d8c60b7 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -2742,6 +2742,7 @@
>  #define PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH2_RANK_REV2 0x2db2
>  #define PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH2_TC_REV2   0x2db3
>  #define PCI_DEVICE_ID_INTEL_82855PM_HB                 0x3340
> +#define PCI_DEVICE_ID_INTEL_5520_5550_X58              0x3408
>  #define PCI_DEVICE_ID_INTEL_IOAT_TBG4                  0x3429
>  #define PCI_DEVICE_ID_INTEL_IOAT_TBG5                  0x342a
>  #define PCI_DEVICE_ID_INTEL_IOAT_TBG6                  0x342b
> --
> 1.7.10.4
>

You need to figure out what the problem is, not just avoid it. It's
very unlikely that the problem is something unique to ia64. In fact,
I think it's very doubtful that the problem is even something unique
to the 5520 root ports. My guess is there's something special about
the system you're testing.

Evidently you have traffic going to a device behind the root port at
the same time as we're trying to read the root port's BARs. Linux
should not generate traffic like that while we're enumerating the root
port.

Does the problem happen on a root port with an iLO behind it? Can you
collect "lspci -vvv" output and identify the root port where the
problem occurs?

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
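
To make concrete what "reading the root port's BARs" with decoding switched off involves, here is a small user-space model of the usual BAR sizing sequence. It is only a sketch under stated assumptions: `struct fake_dev`, `write_command`, `write_bar0` and `size_bar0` are hypothetical stand-ins for real config-space accesses, the device has a single 32-bit memory BAR, and the `mmio_always_on` field merely mirrors the flag the patch sets. The two command-register bit values are the standard PCI ones.

```c
#include <stdint.h>

#define PCI_COMMAND_IO     0x1u /* I/O space decode enable */
#define PCI_COMMAND_MEMORY 0x2u /* memory space decode enable */

/* Hypothetical in-memory model of one function's config space. */
struct fake_dev {
    uint16_t command;    /* PCI_COMMAND register */
    uint32_t bar0;       /* one 32-bit memory BAR */
    uint32_t bar0_size;  /* power-of-two size the BAR decodes */
    int mmio_always_on;  /* mirrors dev->mmio_always_on */
    int decode_off_seen; /* set once memory decoding has been disabled */
};

static void write_command(struct fake_dev *d, uint16_t val)
{
    d->command = val;
    if (!(val & PCI_COMMAND_MEMORY))
        d->decode_off_seen = 1; /* MMIO to this device is not claimed now */
}

static void write_bar0(struct fake_dev *d, uint32_t val)
{
    /* Address bits below the BAR size are hard-wired to zero. */
    d->bar0 = val & ~(d->bar0_size - 1u);
}

/* Size BAR0: write all ones, read back, restore the original value. */
static uint32_t size_bar0(struct fake_dev *d)
{
    uint16_t orig_cmd = d->command;
    uint32_t orig_bar = d->bar0;
    uint32_t probe;

    /* The BAR briefly holds a bogus address, so decoding is turned off. */
    if (!d->mmio_always_on)
        write_command(d, (uint16_t)(orig_cmd &
                      ~(PCI_COMMAND_MEMORY | PCI_COMMAND_IO)));

    write_bar0(d, 0xffffffffu);
    probe = d->bar0;
    write_bar0(d, orig_bar);

    if (!d->mmio_always_on)
        write_command(d, orig_cmd);

    /* Drop the low flag bits, then size = two's complement of the mask. */
    return ~(probe & ~0xfu) + 1u;
}
```

In this model the window between the two `write_command` calls is exactly where a memory write to a device behind the bridge would go unclaimed, which is the situation Bjorn describes. Setting `mmio_always_on` skips that window but leaves open the question of who is generating the traffic.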
Re: [PATCH 1/1] ia64/pci: set mmio decoding on for some host bridge
Hi Bjorn,
I tested on an X86 system with the same chipset, and this bug does not happen.
I am not sure why, but if it is needed, I will try to find out why.

Thanks
ZhenHua

On 07/09/2013 01:42 PM, Li, Zhen-Hua wrote:
> On some IA64 platforms with intel PCI bridge, for example, HP BL890c i2
> with Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
> when kernel tries to disable the mmio decoding on the PCI bridge devices,
> kernel may crash.
>
> And in the comment of function quirk_mmio_always_on, it also says:
> "But doing so (disable the mmio decoding) may cause problems on host bridge
> and perhaps other key system devices"
>
> So, for this PCI bridge, dev->mmio_always_on bit should be set to 1.
>
> To avoid affecting the use of quirk_mmio_always_on, a new function is created.
>
> Signed-off-by: Li, Zhen-Hua
> ---
>  drivers/pci/quirks.c    |   17 +++++++++++++++++
>  include/linux/pci_ids.h |    1 +
>  2 files changed, 18 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index e85d230..665af3e 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -44,6 +44,23 @@ static void quirk_mmio_always_on(struct pci_dev *dev)
>  DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_ANY_ID, PCI_ANY_ID,
>                                 PCI_CLASS_BRIDGE_HOST, 8, quirk_mmio_always_on);
>
> +#ifdef CONFIG_IA64
> +/*
> + * On some IA64 platforms, for some intel PCI bridge devices, for example,
> + * the Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
> + * disable the mmio decoding on this device may cause system crash.
> + * So dev->mmio_always_on bit should be set to 1.
> + */
> +static void quirk_mmio_on_intel_pcibridge(struct pci_dev *dev)
> +{
> +       dev->mmio_always_on = 1;
> +}
> +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL,
> +                               PCI_DEVICE_ID_INTEL_5520_5550_X58,
> +                               PCI_CLASS_BRIDGE_PCI,
> +                               8, quirk_mmio_on_intel_pcibridge);
> +#endif
> +
>  /* The Mellanox Tavor device gives false positive parity errors
>   * Mark this device with a broken_parity_status, to allow
>   * PCI scanning code to "skip" this now blacklisted device.
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index 3bed2e8..d8c60b7 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -2742,6 +2742,7 @@
>  #define PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH2_RANK_REV2 0x2db2
>  #define PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH2_TC_REV2   0x2db3
>  #define PCI_DEVICE_ID_INTEL_82855PM_HB                 0x3340
> +#define PCI_DEVICE_ID_INTEL_5520_5550_X58              0x3408
>  #define PCI_DEVICE_ID_INTEL_IOAT_TBG4                  0x3429
>  #define PCI_DEVICE_ID_INTEL_IOAT_TBG5                  0x342a
>  #define PCI_DEVICE_ID_INTEL_IOAT_TBG6                  0x342b
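
The patch adds a second fixup entry instead of reusing quirk_mmio_always_on, and the class argument explains why. Below is a simplified user-space model of how a class-based fixup entry is compared with a device. It is a sketch under assumptions: `fixup_entry`, `dev_ids` and `fixup_matches` are hypothetical names, `ANY_ID` stands in for PCI_ANY_ID, and the real kernel tables use narrower integer types. The 24-bit class value is base class, subclass and prog-if, so shifting right by 8 leaves the 16-bit class code (0x0600 for a host bridge, 0x0604 for a PCI-to-PCI bridge).

```c
#include <stdbool.h>
#include <stdint.h>

#define ANY_ID 0xffffffffu /* stand-in for PCI_ANY_ID */

/* Simplified model of one entry declared by a class-based fixup macro. */
struct fixup_entry {
    uint32_t vendor;
    uint32_t device;
    uint32_t class_code;      /* compared after shifting the device class */
    unsigned int class_shift; /* 8 drops the prog-if byte */
};

/* The identity fields of a device that the comparison looks at. */
struct dev_ids {
    uint16_t vendor;
    uint16_t device;
    uint32_t class_code; /* 24 bits: base class, subclass, prog-if */
};

/* True if the fixup entry applies to the device. */
static bool fixup_matches(const struct fixup_entry *f, const struct dev_ids *d)
{
    bool class_ok = f->class_code == ANY_ID ||
                    f->class_code == (d->class_code >> f->class_shift);
    bool vendor_ok = f->vendor == ANY_ID || f->vendor == d->vendor;
    bool device_ok = f->device == ANY_ID || f->device == d->device;

    return class_ok && vendor_ok && device_ok;
}
```

With a root port modelled as vendor 0x8086, device 0x3408 (the ID the patch defines) and class 0x060400, the existing host-bridge entry (any vendor, any device, class 0x0600) does not match, while the entry proposed in the patch (class 0x0604) does. So despite the subject line, the device affected here is a PCI-to-PCI bridge and not a host bridge.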
Re: [PATCH 1/1] ia64/pci: set mmio decoding on for some host bridge
Hi Bjorn,

I tested on an X86 system with the same chipset, and this bug does not happen. I am not sure why, but if it is needed, I will try to find out why.

Thanks
ZhenHua

On 07/09/2013 01:42 PM, Li, Zhen-Hua wrote:
> On some IA64 platforms with intel PCI bridge, for example, HP BL890c i2
> with Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
> when kernel tries to disable the mmio decoding on the PCI bridge devices,
> kernel may crash.
>
> And in the comment of function quirk_mmio_always_on, it also says:
> "But doing so (disable the mmio decoding) may cause problems on host bridge
> and perhaps other key system devices"
>
> So, for this PCI bridge, dev->mmio_always_on bit should be set to 1.
>
> To avoid affecting the use of quirk_mmio_always_on, a new function is created.
>
> Signed-off-by: Li, Zhen-Hua <zhen-h...@hp.com>
> ---
>  drivers/pci/quirks.c    | 17 +++++++++++++++++
>  include/linux/pci_ids.h |  1 +
>  2 files changed, 18 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index e85d230..665af3e 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -44,6 +44,23 @@ static void quirk_mmio_always_on(struct pci_dev *dev)
>  DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_ANY_ID, PCI_ANY_ID,
>  				PCI_CLASS_BRIDGE_HOST, 8, quirk_mmio_always_on);
>
> +#ifdef CONFIG_IA64
> +/*
> + * On some IA64 platforms, for some intel PCI bridge devices, for example,
> + * the Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
> + * disable the mmio decoding on this device may cause system crash.
> + * So dev->mmio_always_on bit should be set to 1.
> + */
> +static void quirk_mmio_on_intel_pcibridge(struct pci_dev *dev)
> +{
> +	dev->mmio_always_on = 1;
> +}
> +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL,
> +				PCI_DEVICE_ID_INTEL_5520_5550_X58,
> +				PCI_CLASS_BRIDGE_PCI,
> +				8, quirk_mmio_on_intel_pcibridge);
> +#endif
> +
>  /* The Mellanox Tavor device gives false positive parity errors
>   * Mark this device with a broken_parity_status, to allow
>   * PCI scanning code to "skip" this now blacklisted device.
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index 3bed2e8..d8c60b7 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -2742,6 +2742,7 @@
>  #define PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH2_RANK_REV2	0x2db2
>  #define PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH2_TC_REV2	0x2db3
>  #define PCI_DEVICE_ID_INTEL_82855PM_HB	0x3340
> +#define PCI_DEVICE_ID_INTEL_5520_5550_X58	0x3408
>  #define PCI_DEVICE_ID_INTEL_IOAT_TBG4	0x3429
>  #define PCI_DEVICE_ID_INTEL_IOAT_TBG5	0x342a
>  #define PCI_DEVICE_ID_INTEL_IOAT_TBG6	0x342b

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/1] ia64/pci: set mmio decoding on for some host bridge
On Mon, Jul 8, 2013 at 11:42 PM, Li, Zhen-Hua <zhen-h...@hp.com> wrote:
> On some IA64 platforms with intel PCI bridge, for example, HP BL890c i2
> with Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
> when kernel tries to disable the mmio decoding on the PCI bridge devices,
> kernel may crash.
>
> And in the comment of function quirk_mmio_always_on, it also says:
> "But doing so (disable the mmio decoding) may cause problems on host bridge
> and perhaps other key system devices"
>
> So, for this PCI bridge, dev->mmio_always_on bit should be set to 1.
>
> To avoid affecting the use of quirk_mmio_always_on, a new function is created.
>
> Signed-off-by: Li, Zhen-Hua <zhen-h...@hp.com>
> ---
>  drivers/pci/quirks.c    | 17 +++++++++++++++++
>  include/linux/pci_ids.h |  1 +
>  2 files changed, 18 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index e85d230..665af3e 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -44,6 +44,23 @@ static void quirk_mmio_always_on(struct pci_dev *dev)
>  DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_ANY_ID, PCI_ANY_ID,
>  				PCI_CLASS_BRIDGE_HOST, 8, quirk_mmio_always_on);
>
> +#ifdef CONFIG_IA64
> +/*
> + * On some IA64 platforms, for some intel PCI bridge devices, for example,
> + * the Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
> + * disable the mmio decoding on this device may cause system crash.
> + * So dev->mmio_always_on bit should be set to 1.
> + */
> +static void quirk_mmio_on_intel_pcibridge(struct pci_dev *dev)
> +{
> +	dev->mmio_always_on = 1;
> +}
> +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL,
> +				PCI_DEVICE_ID_INTEL_5520_5550_X58,
> +				PCI_CLASS_BRIDGE_PCI,
> +				8, quirk_mmio_on_intel_pcibridge);
> +#endif
> +
>  /* The Mellanox Tavor device gives false positive parity errors
>   * Mark this device with a broken_parity_status, to allow
>   * PCI scanning code to "skip" this now blacklisted device.
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index 3bed2e8..d8c60b7 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -2742,6 +2742,7 @@
>  #define PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH2_RANK_REV2	0x2db2
>  #define PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH2_TC_REV2	0x2db3
>  #define PCI_DEVICE_ID_INTEL_82855PM_HB	0x3340
> +#define PCI_DEVICE_ID_INTEL_5520_5550_X58	0x3408
>  #define PCI_DEVICE_ID_INTEL_IOAT_TBG4	0x3429
>  #define PCI_DEVICE_ID_INTEL_IOAT_TBG5	0x342a
>  #define PCI_DEVICE_ID_INTEL_IOAT_TBG6	0x342b
> --
> 1.7.10.4

You need to figure out what the problem is, not just avoid it.

It's very unlikely that the problem is something unique to ia64. In fact, I think it's very doubtful that the problem is even something unique to the 5520 root ports. My guess is there's something special about the system you're testing.

Evidently you have traffic going to a device behind the root port at the same time as we're trying to read the root port's BARs. Linux should not generate traffic like that while we're enumerating the root port. Does the problem happen on a root port with an iLO behind it? Can you collect lspci -vvv output and identify the root port where the problem occurs?

Bjorn
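For readers following the thread, the step under discussion can be modeled outside the kernel. The sketch below is a minimal userspace illustration in C, not the kernel's code: `sizing_command()` and `bar_size()` are hypothetical helper names, and the only values taken as given are the two PCI command-register bits (I/O decode 0x1, memory decode 0x2). It shows that the config write quoted in this thread clears both decode bits while a BAR is sized, unless the mmio_always_on flag is set.

```c
#include <stdint.h>

/* Bit values from the PCI command register definition. */
#define PCI_COMMAND_IO     0x1  /* respond to I/O space accesses */
#define PCI_COMMAND_MEMORY 0x2  /* respond to memory space accesses */

/*
 * Hypothetical helper: the PCI_COMMAND value in effect while a BAR is sized.
 * With mmio_always_on set, decoding is left untouched; otherwise both
 * decode bits are cleared, mirroring the pci_write_config_word() call
 * quoted in this thread.
 */
static uint16_t sizing_command(uint16_t orig_cmd, int mmio_always_on)
{
	if (mmio_always_on)
		return orig_cmd;
	return orig_cmd & (uint16_t)~(PCI_COMMAND_MEMORY | PCI_COMMAND_IO);
}

/*
 * Hypothetical helper: size of a 32-bit memory BAR, computed from the
 * value read back after writing all ones to it. The low four bits are
 * type/prefetch flags and are masked off first.
 */
static uint32_t bar_size(uint32_t readback)
{
	uint32_t mask = readback & ~0xfu;

	return mask ? ~mask + 1 : 0;
}
```

For a bridge, the memory decode bit also gates forwarding of memory transactions to devices behind it, so any access to such a device during this window is not claimed. That is consistent with Bjorn's point that the open question is which agent generates that traffic during enumeration.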
Re: [PATCH 1/1] ia64/pci: set mmio decoding on for some host bridge
Hi Bjorn,

Thank you for reviewing this patch. I have created a new one and sent it out. And your questions are answered in that new email.

Regards
ZhenHua

On 07/09/2013 04:35 AM, Bjorn Helgaas wrote:
> On Sun, Jul 7, 2013 at 6:16 PM, Li, Zhen-Hua <zhen-h...@hp.com> wrote:
> > On some IA64 platforms with intel PCI bridge, for example, HP BL890c i2
> > with Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
> > when kernel tries to disable the mmio decoding on the PCI bridge devices,
> > kernel may crash.
> >
> > And in the comment of function quirk_mmio_always_on, it also says:
> > "But doing so (disable the mmio decoding) may cause problems on host bridge
> > and perhaps other key system devices"
> >
> > So, for these PCI bridges, dev->mmio_always_on bit should be set to 1.
> >
> > Signed-off-by: Li, Zhen-Hua <zhen-h...@hp.com>
> > ---
> >  drivers/pci/quirks.c | 15 +++++++++++++++
> >  1 file changed, 15 insertions(+)
> >
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index e85d230..24b8024 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -44,6 +44,21 @@ static void quirk_mmio_always_on(struct pci_dev *dev)
> >  DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_ANY_ID, PCI_ANY_ID,
> >  				PCI_CLASS_BRIDGE_HOST, 8, quirk_mmio_always_on);
> >
> > +#ifdef CONFIG_IA64
> > +/*
> > + * On some IA64 platforms, for some intel PCI bridge devices, for example,
> > + * the Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
> > + * disable the mmio decoding on these devices may cause system crash.
> > + * So dev->mmio_always_on bit should be set to 1.
> > + */
> > +static void quirk_mmio_on_intel_pcibridge(struct pci_dev *dev)
> > +{
> > +	dev->mmio_always_on = 1;
> > +}
> > +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_ANY_ID, PCI_ANY_ID,
> > +				PCI_CLASS_BRIDGE_PCI, 8, quirk_mmio_on_intel_pcibridge);
> > +#endif
>
> The changelog and comment suggest an issue specific to Intel
> 5520/5500/X58, but the patch sets mmio_always_on for *all* PCI
> bridges.
>
> It claims to be specific to ia64 (and is only compiled there), but the
> chipset is also used for x86. You need to explain why the problem
> only affects ia64.
>
> Bjorn
[PATCH 1/1] ia64/pci: set mmio decoding on for some host bridge
On some IA64 platforms with intel PCI bridge, for example, HP BL890c i2
with Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
when kernel tries to disable the mmio decoding on the PCI bridge devices,
kernel may crash.

And in the comment of function quirk_mmio_always_on, it also says:
"But doing so (disable the mmio decoding) may cause problems on host bridge
and perhaps other key system devices"

So, for this PCI bridge, dev->mmio_always_on bit should be set to 1.

To avoid affecting the use of quirk_mmio_always_on, a new function is created.

Signed-off-by: Li, Zhen-Hua <zhen-h...@hp.com>
---
 drivers/pci/quirks.c    | 17 +++++++++++++++++
 include/linux/pci_ids.h |  1 +
 2 files changed, 18 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index e85d230..665af3e 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -44,6 +44,23 @@ static void quirk_mmio_always_on(struct pci_dev *dev)
 DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_ANY_ID, PCI_ANY_ID,
 				PCI_CLASS_BRIDGE_HOST, 8, quirk_mmio_always_on);
 
+#ifdef CONFIG_IA64
+/*
+ * On some IA64 platforms, for some intel PCI bridge devices, for example,
+ * the Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
+ * disable the mmio decoding on this device may cause system crash.
+ * So dev->mmio_always_on bit should be set to 1.
+ */
+static void quirk_mmio_on_intel_pcibridge(struct pci_dev *dev)
+{
+	dev->mmio_always_on = 1;
+}
+DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL,
+				PCI_DEVICE_ID_INTEL_5520_5550_X58,
+				PCI_CLASS_BRIDGE_PCI,
+				8, quirk_mmio_on_intel_pcibridge);
+#endif
+
 /* The Mellanox Tavor device gives false positive parity errors
  * Mark this device with a broken_parity_status, to allow
  * PCI scanning code to "skip" this now blacklisted device.
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 3bed2e8..d8c60b7 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2742,6 +2742,7 @@
 #define PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH2_RANK_REV2	0x2db2
 #define PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH2_TC_REV2	0x2db3
 #define PCI_DEVICE_ID_INTEL_82855PM_HB	0x3340
+#define PCI_DEVICE_ID_INTEL_5520_5550_X58	0x3408
 #define PCI_DEVICE_ID_INTEL_IOAT_TBG4	0x3429
 #define PCI_DEVICE_ID_INTEL_IOAT_TBG5	0x342a
 #define PCI_DEVICE_ID_INTEL_IOAT_TBG6	0x342b
--
1.7.10.4
Re: [PATCH 1/1] ia64/pci: set mmio decoding on for some host bridge
On Sun, Jul 7, 2013 at 6:16 PM, Li, Zhen-Hua <zhen-h...@hp.com> wrote:
> On some IA64 platforms with intel PCI bridge, for example, HP BL890c i2
> with Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
> when kernel tries to disable the mmio decoding on the PCI bridge devices,
> kernel may crash.
>
> And in the comment of function quirk_mmio_always_on, it also says:
> "But doing so (disable the mmio decoding) may cause problems on host bridge
> and perhaps other key system devices"
>
> So, for these PCI bridges, dev->mmio_always_on bit should be set to 1.
>
> Signed-off-by: Li, Zhen-Hua <zhen-h...@hp.com>
> ---
>  drivers/pci/quirks.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index e85d230..24b8024 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -44,6 +44,21 @@ static void quirk_mmio_always_on(struct pci_dev *dev)
>  DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_ANY_ID, PCI_ANY_ID,
>  				PCI_CLASS_BRIDGE_HOST, 8, quirk_mmio_always_on);
>
> +#ifdef CONFIG_IA64
> +/*
> + * On some IA64 platforms, for some intel PCI bridge devices, for example,
> + * the Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
> + * disable the mmio decoding on these devices may cause system crash.
> + * So dev->mmio_always_on bit should be set to 1.
> + */
> +static void quirk_mmio_on_intel_pcibridge(struct pci_dev *dev)
> +{
> +	dev->mmio_always_on = 1;
> +}
> +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_ANY_ID, PCI_ANY_ID,
> +				PCI_CLASS_BRIDGE_PCI, 8, quirk_mmio_on_intel_pcibridge);
> +#endif

The changelog and comment suggest an issue specific to Intel
5520/5500/X58, but the patch sets mmio_always_on for *all* PCI
bridges.

It claims to be specific to ia64 (and is only compiled there), but the
chipset is also used for x86. You need to explain why the problem
only affects ia64.

Bjorn
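Bjorn's first objection can be checked with a small model of how an early class fixup selects devices. The following is a hedged userspace sketch in C, not kernel code: `struct model_dev`, `struct model_fixup` and `fixup_matches()` are hypothetical names, the matching rule (each field matches on equality or on the PCI_ANY_ID wildcard, with the device class shifted right before comparison) is a simplification, and the 0x1234:0x5678 bridge in the usage note is made up for illustration.

```c
#include <stdint.h>

#define PCI_ANY_ID            0xffffffffu
#define PCI_VENDOR_ID_INTEL   0x8086
#define PCI_CLASS_BRIDGE_HOST 0x0600
#define PCI_CLASS_BRIDGE_PCI  0x0604

/* Hypothetical, simplified stand-in for a PCI device record. */
struct model_dev {
	uint16_t vendor;
	uint16_t device;
	uint32_t class_code;  /* 24 bits: base class, subclass, prog-if */
};

/* Hypothetical, simplified stand-in for a registered fixup. */
struct model_fixup {
	uint32_t vendor;
	uint32_t device;
	uint32_t class_code;
	unsigned int class_shift;
};

/* Returns 1 if the fixup applies to the device; PCI_ANY_ID is a wildcard. */
static int fixup_matches(const struct model_fixup *f, const struct model_dev *d)
{
	return (f->class_code == PCI_ANY_ID ||
		f->class_code == (d->class_code >> f->class_shift)) &&
	       (f->vendor == PCI_ANY_ID || f->vendor == d->vendor) &&
	       (f->device == PCI_ANY_ID || f->device == d->device);
}

/* v1 registration: any vendor, any device, class PCI-to-PCI bridge. */
static const struct model_fixup quirk_v1 = {
	PCI_ANY_ID, PCI_ANY_ID, PCI_CLASS_BRIDGE_PCI, 8
};

/* v2 registration: Intel device 0x3408 only, same class. */
static const struct model_fixup quirk_v2 = {
	PCI_VENDOR_ID_INTEL, 0x3408, PCI_CLASS_BRIDGE_PCI, 8
};
```

Under this model, `quirk_v1` matches a made-up third-party bridge such as 0x1234:0x5678 with class 0x060400, while `quirk_v2` matches only the Intel 0x3408 root port. Neither version addresses the second objection, since both remain inside `#ifdef CONFIG_IA64`.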
[PATCH 1/1] ia64/pci: set mmio decoding on for some host bridge
On some IA64 platforms with intel PCI bridge, for example, HP BL890c i2
with Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
when kernel tries to disable the mmio decoding on the PCI bridge devices,
kernel may crash.

And in the comment of function quirk_mmio_always_on, it also says:
"But doing so (disable the mmio decoding) may cause problems on host bridge
and perhaps other key system devices"

So, for these PCI bridges, dev->mmio_always_on bit should be set to 1.

Signed-off-by: Li, Zhen-Hua <zhen-h...@hp.com>
---
 drivers/pci/quirks.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index e85d230..24b8024 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -44,6 +44,21 @@ static void quirk_mmio_always_on(struct pci_dev *dev)
 DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_ANY_ID, PCI_ANY_ID,
 				PCI_CLASS_BRIDGE_HOST, 8, quirk_mmio_always_on);
 
+#ifdef CONFIG_IA64
+/*
+ * On some IA64 platforms, for some intel PCI bridge devices, for example,
+ * the Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port,
+ * disable the mmio decoding on these devices may cause system crash.
+ * So dev->mmio_always_on bit should be set to 1.
+ */
+static void quirk_mmio_on_intel_pcibridge(struct pci_dev *dev)
+{
+	dev->mmio_always_on = 1;
+}
+DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_ANY_ID, PCI_ANY_ID,
+				PCI_CLASS_BRIDGE_PCI, 8, quirk_mmio_on_intel_pcibridge);
+#endif
+
 /* The Mellanox Tavor device gives false positive parity errors
  * Mark this device with a broken_parity_status, to allow
  * PCI scanning code to "skip" this now blacklisted device.
--
1.7.10.4