Re: [PATCH v11 0/5] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
On 2017/8/15 22:03, Eric Dumazet wrote: > On Tue, 2017-08-15 at 06:58 -0700, Eric Dumazet wrote: >> On Mon, 2017-08-14 at 22:15 -0700, David Miller wrote: >>> From: Ding Tianhong>>> Date: Tue, 15 Aug 2017 11:23:22 +0800 >>> Some devices have problems with Transaction Layer Packets with the Relaxed Ordering Attribute set. This patch set adds a new PCIe Device Flag, PCI_DEV_FLAGS_NO_RELAXED_ORDERING, a set of PCI Quirks to catch some known devices with Relaxed Ordering issues, and a use of this new flag by the cxgb4 driver to avoid using Relaxed Ordering with problematic Root Complex Ports. >>> ... >>> >>> Series applied, thanks. >> >> I got a NULL deref in pci_find_pcie_root_port() >> > > This was : > > [4.241029] BUG: unable to handle kernel NULL pointer dereference at > 0050 > [4.247001] IP: pci_find_pcie_root_port+0x62/0x80 > [4.253011] PGD 0 > [4.253011] P4D 0 > [4.253011] > [4.258013] Oops: [#1] SMP DEBUG_PAGEALLOC > [4.262015] Modules linked in: > [4.265005] CPU: 31 PID: 1 Comm: swapper/0 Not tainted 4.13.0-dbx-DEV #316 > [4.271002] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016 > [4.279002] task: a2ee38cfa040 task.stack: a51ec0004000 > [4.285001] RIP: 0010:pci_find_pcie_root_port+0x62/0x80 > [4.290012] RSP: :a51ec0007ab8 EFLAGS: 00010246 > [4.295003] RAX: RBX: a2ee36bae000 RCX: > 0006 > [4.303002] RDX: 081c RSI: a2ee38cfa8c8 RDI: > a2ee36bae000 > [4.310013] RBP: a51ec0007b58 R08: 0001 R09: > > [4.317001] R10: R11: R12: > a51ec0007ad0 > [4.324005] R13: a2ee36bae098 R14: 0002 R15: > a2ee37204818 > [4.331002] FS: () GS:a2ee3fcc() > knlGS: > [4.339002] CS: 0010 DS: ES: CR0: 80050033 > [4.345001] CR2: 0050 CR3: 00401000f000 CR4: > 001406e0 > [4.351002] Call Trace: > [4.354012] ? pci_configure_device+0x19f/0x570 > [4.359002] ? pci_conf1_read+0xb8/0xf0 > [4.363002] ? raw_pci_read+0x23/0x40 > [4.366011] ? pci_read+0x2c/0x30 > [4.370014] ? pci_read_config_word+0x67/0x70 > [4.374012] pci_device_add+0x28/0x230 > [4.378012] ? pci_vpd_f0_read+0x50/0x80 > [4.382014] pci_scan_single_device+0x96/0xc0 > [4.386012] pci_scan_slot+0x79/0xf0 > [4.389001] pci_scan_child_bus+0x31/0x180 > [4.394014] acpi_pci_root_create+0x1c6/0x240 > [4.398013] pci_acpi_scan_root+0x15f/0x1b0 > [4.402012] acpi_pci_root_add+0x2e6/0x400 > [4.406012] ? acpi_evaluate_integer+0x37/0x60 > [4.411002] acpi_bus_attach+0xdf/0x200 > [4.415002] acpi_bus_attach+0x6a/0x200 > [4.418014] acpi_bus_attach+0x6a/0x200 > [4.422013] acpi_bus_scan+0x38/0x70 > [4.426011] acpi_scan_init+0x10c/0x271 > [4.429001] acpi_init+0x2fa/0x348 > [4.433004] ? acpi_sleep_proc_init+0x2d/0x2d > [4.437001] do_one_initcall+0x43/0x169 > [4.441001] kernel_init_freeable+0x1d0/0x258 > [4.445003] ? rest_init+0xe0/0xe0 > [4.449001] kernel_init+0xe/0x150 > [4.451002] ret_from_fork+0x27/0x40 > [4.457004] Code: 85 d2 74 27 80 7a 4a 00 74 21 48 89 d0 48 89 c2 f6 80 1b > 09 00 00 10 74 07 48 8b 90 a0 0a 00 00 48 8b 52 10 48 83 7a 10 00 75 d0 <0f> > b7 50 50 5d 81 e2 f0 00 00 00 83 fa 40 ba 00 00 00 00 48 0f > [4.474012] RIP: pci_find_pcie_root_port+0x62/0x80 RSP: a51ec0007ab8 > [4.481004] CR2: 0050 > [4.484001] ---[ end trace 6f9be6a057581199 ]--- > [4.488001] Kernel panic - not syncing: Fatal exception > [4.494013] Rebooting in 10 seconds.. > [4.494013] ACPI MEMORY or I/O RESET_REG. > >> >> This local hack seems to fix the issue. >> >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >> index >> af0cc3456dc1b48b1325c06c5edd2ca8cc22a640..cfd8eb5a3d0ba8347d44952ffab28d9c761044d3 >> 100644 >> --- a/drivers/pci/pci.c >> +++ b/drivers/pci/pci.c >> @@ -522,7 +522,7 @@ struct pci_dev *pci_find_pcie_root_port(struct pci_dev >> *dev) >> bridge = pci_upstream_bridge(bridge); >> } >> >> - if (pci_pcie_type(highest_pcie_bridge) != PCI_EXP_TYPE_ROOT_PORT) >> + if (highest_pcie_bridge && pci_pcie_type(highest_pcie_bridge) != >> PCI_EXP_TYPE_ROOT_PORT) >> return NULL; >> >> return highest_pcie_bridge; > It is very strange that I could not reproduce this problem on my server which is Xeon 2690v3, but it is really a obviously issue when the dev could not find a upstream bridge in the pci_find_pcie_root_port(), so the better way is just like your did in this patch. Thanks. Regards Tianhong > > > . >
Re: [PATCH v11 0/5] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
On 2017/8/15 22:03, Eric Dumazet wrote: > On Tue, 2017-08-15 at 06:58 -0700, Eric Dumazet wrote: >> On Mon, 2017-08-14 at 22:15 -0700, David Miller wrote: >>> From: Ding Tianhong >>> Date: Tue, 15 Aug 2017 11:23:22 +0800 >>> Some devices have problems with Transaction Layer Packets with the Relaxed Ordering Attribute set. This patch set adds a new PCIe Device Flag, PCI_DEV_FLAGS_NO_RELAXED_ORDERING, a set of PCI Quirks to catch some known devices with Relaxed Ordering issues, and a use of this new flag by the cxgb4 driver to avoid using Relaxed Ordering with problematic Root Complex Ports. >>> ... >>> >>> Series applied, thanks. >> >> I got a NULL deref in pci_find_pcie_root_port() >> > > This was : > > [4.241029] BUG: unable to handle kernel NULL pointer dereference at > 0050 > [4.247001] IP: pci_find_pcie_root_port+0x62/0x80 > [4.253011] PGD 0 > [4.253011] P4D 0 > [4.253011] > [4.258013] Oops: [#1] SMP DEBUG_PAGEALLOC > [4.262015] Modules linked in: > [4.265005] CPU: 31 PID: 1 Comm: swapper/0 Not tainted 4.13.0-dbx-DEV #316 > [4.271002] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016 > [4.279002] task: a2ee38cfa040 task.stack: a51ec0004000 > [4.285001] RIP: 0010:pci_find_pcie_root_port+0x62/0x80 > [4.290012] RSP: :a51ec0007ab8 EFLAGS: 00010246 > [4.295003] RAX: RBX: a2ee36bae000 RCX: > 0006 > [4.303002] RDX: 081c RSI: a2ee38cfa8c8 RDI: > a2ee36bae000 > [4.310013] RBP: a51ec0007b58 R08: 0001 R09: > > [4.317001] R10: R11: R12: > a51ec0007ad0 > [4.324005] R13: a2ee36bae098 R14: 0002 R15: > a2ee37204818 > [4.331002] FS: () GS:a2ee3fcc() > knlGS: > [4.339002] CS: 0010 DS: ES: CR0: 80050033 > [4.345001] CR2: 0050 CR3: 00401000f000 CR4: > 001406e0 > [4.351002] Call Trace: > [4.354012] ? pci_configure_device+0x19f/0x570 > [4.359002] ? pci_conf1_read+0xb8/0xf0 > [4.363002] ? raw_pci_read+0x23/0x40 > [4.366011] ? pci_read+0x2c/0x30 > [4.370014] ? pci_read_config_word+0x67/0x70 > [4.374012] pci_device_add+0x28/0x230 > [4.378012] ? pci_vpd_f0_read+0x50/0x80 > [4.382014] pci_scan_single_device+0x96/0xc0 > [4.386012] pci_scan_slot+0x79/0xf0 > [4.389001] pci_scan_child_bus+0x31/0x180 > [4.394014] acpi_pci_root_create+0x1c6/0x240 > [4.398013] pci_acpi_scan_root+0x15f/0x1b0 > [4.402012] acpi_pci_root_add+0x2e6/0x400 > [4.406012] ? acpi_evaluate_integer+0x37/0x60 > [4.411002] acpi_bus_attach+0xdf/0x200 > [4.415002] acpi_bus_attach+0x6a/0x200 > [4.418014] acpi_bus_attach+0x6a/0x200 > [4.422013] acpi_bus_scan+0x38/0x70 > [4.426011] acpi_scan_init+0x10c/0x271 > [4.429001] acpi_init+0x2fa/0x348 > [4.433004] ? acpi_sleep_proc_init+0x2d/0x2d > [4.437001] do_one_initcall+0x43/0x169 > [4.441001] kernel_init_freeable+0x1d0/0x258 > [4.445003] ? rest_init+0xe0/0xe0 > [4.449001] kernel_init+0xe/0x150 > [4.451002] ret_from_fork+0x27/0x40 > [4.457004] Code: 85 d2 74 27 80 7a 4a 00 74 21 48 89 d0 48 89 c2 f6 80 1b > 09 00 00 10 74 07 48 8b 90 a0 0a 00 00 48 8b 52 10 48 83 7a 10 00 75 d0 <0f> > b7 50 50 5d 81 e2 f0 00 00 00 83 fa 40 ba 00 00 00 00 48 0f > [4.474012] RIP: pci_find_pcie_root_port+0x62/0x80 RSP: a51ec0007ab8 > [4.481004] CR2: 0050 > [4.484001] ---[ end trace 6f9be6a057581199 ]--- > [4.488001] Kernel panic - not syncing: Fatal exception > [4.494013] Rebooting in 10 seconds.. > [4.494013] ACPI MEMORY or I/O RESET_REG. > >> >> This local hack seems to fix the issue. >> >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >> index >> af0cc3456dc1b48b1325c06c5edd2ca8cc22a640..cfd8eb5a3d0ba8347d44952ffab28d9c761044d3 >> 100644 >> --- a/drivers/pci/pci.c >> +++ b/drivers/pci/pci.c >> @@ -522,7 +522,7 @@ struct pci_dev *pci_find_pcie_root_port(struct pci_dev >> *dev) >> bridge = pci_upstream_bridge(bridge); >> } >> >> - if (pci_pcie_type(highest_pcie_bridge) != PCI_EXP_TYPE_ROOT_PORT) >> + if (highest_pcie_bridge && pci_pcie_type(highest_pcie_bridge) != >> PCI_EXP_TYPE_ROOT_PORT) >> return NULL; >> >> return highest_pcie_bridge; > It is very strange that I could not reproduce this problem on my server which is Xeon 2690v3, but it is really a obviously issue when the dev could not find a upstream bridge in the pci_find_pcie_root_port(), so the better way is just like your did in this patch. Thanks. Regards Tianhong > > > . >
Re: [PATCH v11 0/5] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
On Tue, 2017-08-15 at 06:58 -0700, Eric Dumazet wrote: > On Mon, 2017-08-14 at 22:15 -0700, David Miller wrote: > > From: Ding Tianhong> > Date: Tue, 15 Aug 2017 11:23:22 +0800 > > > > > Some devices have problems with Transaction Layer Packets with the Relaxed > > > Ordering Attribute set. This patch set adds a new PCIe Device Flag, > > > PCI_DEV_FLAGS_NO_RELAXED_ORDERING, a set of PCI Quirks to catch some known > > > devices with Relaxed Ordering issues, and a use of this new flag by the > > > cxgb4 driver to avoid using Relaxed Ordering with problematic Root Complex > > > Ports. > > ... > > > > Series applied, thanks. > > I got a NULL deref in pci_find_pcie_root_port() > This was : [4.241029] BUG: unable to handle kernel NULL pointer dereference at 0050 [4.247001] IP: pci_find_pcie_root_port+0x62/0x80 [4.253011] PGD 0 [4.253011] P4D 0 [4.253011] [4.258013] Oops: [#1] SMP DEBUG_PAGEALLOC [4.262015] Modules linked in: [4.265005] CPU: 31 PID: 1 Comm: swapper/0 Not tainted 4.13.0-dbx-DEV #316 [4.271002] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016 [4.279002] task: a2ee38cfa040 task.stack: a51ec0004000 [4.285001] RIP: 0010:pci_find_pcie_root_port+0x62/0x80 [4.290012] RSP: :a51ec0007ab8 EFLAGS: 00010246 [4.295003] RAX: RBX: a2ee36bae000 RCX: 0006 [4.303002] RDX: 081c RSI: a2ee38cfa8c8 RDI: a2ee36bae000 [4.310013] RBP: a51ec0007b58 R08: 0001 R09: [4.317001] R10: R11: R12: a51ec0007ad0 [4.324005] R13: a2ee36bae098 R14: 0002 R15: a2ee37204818 [4.331002] FS: () GS:a2ee3fcc() knlGS: [4.339002] CS: 0010 DS: ES: CR0: 80050033 [4.345001] CR2: 0050 CR3: 00401000f000 CR4: 001406e0 [4.351002] Call Trace: [4.354012] ? pci_configure_device+0x19f/0x570 [4.359002] ? pci_conf1_read+0xb8/0xf0 [4.363002] ? raw_pci_read+0x23/0x40 [4.366011] ? pci_read+0x2c/0x30 [4.370014] ? pci_read_config_word+0x67/0x70 [4.374012] pci_device_add+0x28/0x230 [4.378012] ? pci_vpd_f0_read+0x50/0x80 [4.382014] pci_scan_single_device+0x96/0xc0 [4.386012] pci_scan_slot+0x79/0xf0 [4.389001] pci_scan_child_bus+0x31/0x180 [4.394014] acpi_pci_root_create+0x1c6/0x240 [4.398013] pci_acpi_scan_root+0x15f/0x1b0 [4.402012] acpi_pci_root_add+0x2e6/0x400 [4.406012] ? acpi_evaluate_integer+0x37/0x60 [4.411002] acpi_bus_attach+0xdf/0x200 [4.415002] acpi_bus_attach+0x6a/0x200 [4.418014] acpi_bus_attach+0x6a/0x200 [4.422013] acpi_bus_scan+0x38/0x70 [4.426011] acpi_scan_init+0x10c/0x271 [4.429001] acpi_init+0x2fa/0x348 [4.433004] ? acpi_sleep_proc_init+0x2d/0x2d [4.437001] do_one_initcall+0x43/0x169 [4.441001] kernel_init_freeable+0x1d0/0x258 [4.445003] ? rest_init+0xe0/0xe0 [4.449001] kernel_init+0xe/0x150 [4.451002] ret_from_fork+0x27/0x40 [4.457004] Code: 85 d2 74 27 80 7a 4a 00 74 21 48 89 d0 48 89 c2 f6 80 1b 09 00 00 10 74 07 48 8b 90 a0 0a 00 00 48 8b 52 10 48 83 7a 10 00 75 d0 <0f> b7 50 50 5d 81 e2 f0 00 00 00 83 fa 40 ba 00 00 00 00 48 0f [4.474012] RIP: pci_find_pcie_root_port+0x62/0x80 RSP: a51ec0007ab8 [4.481004] CR2: 0050 [4.484001] ---[ end trace 6f9be6a057581199 ]--- [4.488001] Kernel panic - not syncing: Fatal exception [4.494013] Rebooting in 10 seconds.. [4.494013] ACPI MEMORY or I/O RESET_REG. > > This local hack seems to fix the issue. > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index > af0cc3456dc1b48b1325c06c5edd2ca8cc22a640..cfd8eb5a3d0ba8347d44952ffab28d9c761044d3 > 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -522,7 +522,7 @@ struct pci_dev *pci_find_pcie_root_port(struct pci_dev > *dev) > bridge = pci_upstream_bridge(bridge); > } > > - if (pci_pcie_type(highest_pcie_bridge) != PCI_EXP_TYPE_ROOT_PORT) > + if (highest_pcie_bridge && pci_pcie_type(highest_pcie_bridge) != > PCI_EXP_TYPE_ROOT_PORT) > return NULL; > > return highest_pcie_bridge;
Re: [PATCH v11 0/5] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
On Tue, 2017-08-15 at 06:58 -0700, Eric Dumazet wrote: > On Mon, 2017-08-14 at 22:15 -0700, David Miller wrote: > > From: Ding Tianhong > > Date: Tue, 15 Aug 2017 11:23:22 +0800 > > > > > Some devices have problems with Transaction Layer Packets with the Relaxed > > > Ordering Attribute set. This patch set adds a new PCIe Device Flag, > > > PCI_DEV_FLAGS_NO_RELAXED_ORDERING, a set of PCI Quirks to catch some known > > > devices with Relaxed Ordering issues, and a use of this new flag by the > > > cxgb4 driver to avoid using Relaxed Ordering with problematic Root Complex > > > Ports. > > ... > > > > Series applied, thanks. > > I got a NULL deref in pci_find_pcie_root_port() > This was : [4.241029] BUG: unable to handle kernel NULL pointer dereference at 0050 [4.247001] IP: pci_find_pcie_root_port+0x62/0x80 [4.253011] PGD 0 [4.253011] P4D 0 [4.253011] [4.258013] Oops: [#1] SMP DEBUG_PAGEALLOC [4.262015] Modules linked in: [4.265005] CPU: 31 PID: 1 Comm: swapper/0 Not tainted 4.13.0-dbx-DEV #316 [4.271002] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016 [4.279002] task: a2ee38cfa040 task.stack: a51ec0004000 [4.285001] RIP: 0010:pci_find_pcie_root_port+0x62/0x80 [4.290012] RSP: :a51ec0007ab8 EFLAGS: 00010246 [4.295003] RAX: RBX: a2ee36bae000 RCX: 0006 [4.303002] RDX: 081c RSI: a2ee38cfa8c8 RDI: a2ee36bae000 [4.310013] RBP: a51ec0007b58 R08: 0001 R09: [4.317001] R10: R11: R12: a51ec0007ad0 [4.324005] R13: a2ee36bae098 R14: 0002 R15: a2ee37204818 [4.331002] FS: () GS:a2ee3fcc() knlGS: [4.339002] CS: 0010 DS: ES: CR0: 80050033 [4.345001] CR2: 0050 CR3: 00401000f000 CR4: 001406e0 [4.351002] Call Trace: [4.354012] ? pci_configure_device+0x19f/0x570 [4.359002] ? pci_conf1_read+0xb8/0xf0 [4.363002] ? raw_pci_read+0x23/0x40 [4.366011] ? pci_read+0x2c/0x30 [4.370014] ? pci_read_config_word+0x67/0x70 [4.374012] pci_device_add+0x28/0x230 [4.378012] ? pci_vpd_f0_read+0x50/0x80 [4.382014] pci_scan_single_device+0x96/0xc0 [4.386012] pci_scan_slot+0x79/0xf0 [4.389001] pci_scan_child_bus+0x31/0x180 [4.394014] acpi_pci_root_create+0x1c6/0x240 [4.398013] pci_acpi_scan_root+0x15f/0x1b0 [4.402012] acpi_pci_root_add+0x2e6/0x400 [4.406012] ? acpi_evaluate_integer+0x37/0x60 [4.411002] acpi_bus_attach+0xdf/0x200 [4.415002] acpi_bus_attach+0x6a/0x200 [4.418014] acpi_bus_attach+0x6a/0x200 [4.422013] acpi_bus_scan+0x38/0x70 [4.426011] acpi_scan_init+0x10c/0x271 [4.429001] acpi_init+0x2fa/0x348 [4.433004] ? acpi_sleep_proc_init+0x2d/0x2d [4.437001] do_one_initcall+0x43/0x169 [4.441001] kernel_init_freeable+0x1d0/0x258 [4.445003] ? rest_init+0xe0/0xe0 [4.449001] kernel_init+0xe/0x150 [4.451002] ret_from_fork+0x27/0x40 [4.457004] Code: 85 d2 74 27 80 7a 4a 00 74 21 48 89 d0 48 89 c2 f6 80 1b 09 00 00 10 74 07 48 8b 90 a0 0a 00 00 48 8b 52 10 48 83 7a 10 00 75 d0 <0f> b7 50 50 5d 81 e2 f0 00 00 00 83 fa 40 ba 00 00 00 00 48 0f [4.474012] RIP: pci_find_pcie_root_port+0x62/0x80 RSP: a51ec0007ab8 [4.481004] CR2: 0050 [4.484001] ---[ end trace 6f9be6a057581199 ]--- [4.488001] Kernel panic - not syncing: Fatal exception [4.494013] Rebooting in 10 seconds.. [4.494013] ACPI MEMORY or I/O RESET_REG. > > This local hack seems to fix the issue. > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index > af0cc3456dc1b48b1325c06c5edd2ca8cc22a640..cfd8eb5a3d0ba8347d44952ffab28d9c761044d3 > 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -522,7 +522,7 @@ struct pci_dev *pci_find_pcie_root_port(struct pci_dev > *dev) > bridge = pci_upstream_bridge(bridge); > } > > - if (pci_pcie_type(highest_pcie_bridge) != PCI_EXP_TYPE_ROOT_PORT) > + if (highest_pcie_bridge && pci_pcie_type(highest_pcie_bridge) != > PCI_EXP_TYPE_ROOT_PORT) > return NULL; > > return highest_pcie_bridge;
Re: [PATCH v11 0/5] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
On Mon, 2017-08-14 at 22:15 -0700, David Miller wrote: > From: Ding Tianhong> Date: Tue, 15 Aug 2017 11:23:22 +0800 > > > Some devices have problems with Transaction Layer Packets with the Relaxed > > Ordering Attribute set. This patch set adds a new PCIe Device Flag, > > PCI_DEV_FLAGS_NO_RELAXED_ORDERING, a set of PCI Quirks to catch some known > > devices with Relaxed Ordering issues, and a use of this new flag by the > > cxgb4 driver to avoid using Relaxed Ordering with problematic Root Complex > > Ports. > ... > > Series applied, thanks. I got a NULL deref in pci_find_pcie_root_port() Was it expected ? This local hack seems to fix the issue. diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index af0cc3456dc1b48b1325c06c5edd2ca8cc22a640..cfd8eb5a3d0ba8347d44952ffab28d9c761044d3 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -522,7 +522,7 @@ struct pci_dev *pci_find_pcie_root_port(struct pci_dev *dev) bridge = pci_upstream_bridge(bridge); } - if (pci_pcie_type(highest_pcie_bridge) != PCI_EXP_TYPE_ROOT_PORT) + if (highest_pcie_bridge && pci_pcie_type(highest_pcie_bridge) != PCI_EXP_TYPE_ROOT_PORT) return NULL; return highest_pcie_bridge;
Re: [PATCH v11 0/5] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
On Mon, 2017-08-14 at 22:15 -0700, David Miller wrote: > From: Ding Tianhong > Date: Tue, 15 Aug 2017 11:23:22 +0800 > > > Some devices have problems with Transaction Layer Packets with the Relaxed > > Ordering Attribute set. This patch set adds a new PCIe Device Flag, > > PCI_DEV_FLAGS_NO_RELAXED_ORDERING, a set of PCI Quirks to catch some known > > devices with Relaxed Ordering issues, and a use of this new flag by the > > cxgb4 driver to avoid using Relaxed Ordering with problematic Root Complex > > Ports. > ... > > Series applied, thanks. I got a NULL deref in pci_find_pcie_root_port() Was it expected ? This local hack seems to fix the issue. diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index af0cc3456dc1b48b1325c06c5edd2ca8cc22a640..cfd8eb5a3d0ba8347d44952ffab28d9c761044d3 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -522,7 +522,7 @@ struct pci_dev *pci_find_pcie_root_port(struct pci_dev *dev) bridge = pci_upstream_bridge(bridge); } - if (pci_pcie_type(highest_pcie_bridge) != PCI_EXP_TYPE_ROOT_PORT) + if (highest_pcie_bridge && pci_pcie_type(highest_pcie_bridge) != PCI_EXP_TYPE_ROOT_PORT) return NULL; return highest_pcie_bridge;
Re: [PATCH v11 0/5] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
From: Ding TianhongDate: Tue, 15 Aug 2017 11:23:22 +0800 > Some devices have problems with Transaction Layer Packets with the Relaxed > Ordering Attribute set. This patch set adds a new PCIe Device Flag, > PCI_DEV_FLAGS_NO_RELAXED_ORDERING, a set of PCI Quirks to catch some known > devices with Relaxed Ordering issues, and a use of this new flag by the > cxgb4 driver to avoid using Relaxed Ordering with problematic Root Complex > Ports. ... Series applied, thanks.
Re: [PATCH v11 0/5] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
From: Ding Tianhong Date: Tue, 15 Aug 2017 11:23:22 +0800 > Some devices have problems with Transaction Layer Packets with the Relaxed > Ordering Attribute set. This patch set adds a new PCIe Device Flag, > PCI_DEV_FLAGS_NO_RELAXED_ORDERING, a set of PCI Quirks to catch some known > devices with Relaxed Ordering issues, and a use of this new flag by the > cxgb4 driver to avoid using Relaxed Ordering with problematic Root Complex > Ports. ... Series applied, thanks.
[PATCH v11 0/5] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
Some devices have problems with Transaction Layer Packets with the Relaxed Ordering Attribute set. This patch set adds a new PCIe Device Flag, PCI_DEV_FLAGS_NO_RELAXED_ORDERING, a set of PCI Quirks to catch some known devices with Relaxed Ordering issues, and a use of this new flag by the cxgb4 driver to avoid using Relaxed Ordering with problematic Root Complex Ports. It's been years since I've submitted kernel.org patches, I appolgise for the almost certain submission errors. v2: Alexander point out that the v1 was only a part of the whole solution, some platform which has some issues could use the new flag to indicate that it is not safe to enable relaxed ordering attribute, then we need to clear the relaxed ordering enable bits in the PCI configuration when initializing the device. So add a new second patch to modify the PCI initialization code to clear the relaxed ordering enable bit in the event that the root complex doesn't want relaxed ordering enabled. The third patch was base on the v1's second patch and only be changed to query the relaxed ordering enable bit in the PCI configuration space to allow the Chelsio NIC to send TLPs with the relaxed ordering attributes set. This version didn't plan to drop the defines for Intel Drivers to use the new checking way to enable relaxed ordering because it is not the hardest part of the moment, we could fix it in next patchset when this patches reach the goal. v3: Redesigned the logic for pci_configure_relaxed_ordering when configuration, If a PCIe device didn't enable the relaxed ordering attribute default, we should not do anything in the PCIe configuration, otherwise we should check if any of the devices above us do not support relaxed ordering by the PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag, then base on the result if we get a return that indicate that the relaxed ordering is not supported we should update our device to disable relaxed ordering in configuration space. If the device above us doesn't exist or isn't the PCIe device, we shouldn't do anything and skip updating relaxed ordering because we are probably running in a guest. v4: Rename the functions pcie_get_relaxed_ordering and pcie_disable_relaxed_ordering according John's suggestion, and modify the description, use the true/false as the return value. We shouldn't enable relaxed ordering attribute by the setting in the root complex configuration space for PCIe device, so fix it for cxgb4. Fix some format issues. v5: Removed the unnecessary code for some function which only return the bool value, and add the check for VF device. Make this patch set base on 4.12-rc5. v6: Fix the logic error in the need to enable the relaxed ordering attribute for cxgb4. v7: The cxgb4 drivers will enable the PCIe Capability Device Control[Relaxed Ordering Enable] in PCI Probe() routine, this will break our current solution for some platform which has problematic when enable the relaxed ordering attribute. According to the latest recommendations, remove the enable_pcie_relaxed_ordering(), although it could not cover the Peer-to-Peer scene, but we agree to leave this problem until we really trigger it. Make this patch set base on 4.12 release version. v8: Change the second patch title and description to make it more reasonable, add the acked-by from Alex and Ashok. Add a new patch to enable the Relaxed Ordering Attribute for cxgb4vf driver. Make this patch set base on 4.13-rc2. v9: The document (https://software.intel.com/sites/default/files/managed/9e/ bc/64-ia-32-architectures-optimization-manual.pdf) indicate that the Xeon processors based on Broadwell/Haswell microarchitecture has the problem with Relaxed Ordering Attribute enabled, so add the whole list Device ID from Intel to the patch. v10: Significant rework based on Bjorn's feedback, reorganize the first 2 patches, now the Intel and AMD erratum soc has been divided to the different patches, rename the pcie_relaxed_ordering_supported() to pcie_relaxed_ordering_enabled(), and no need to check every intervening switch except the root ports, update some commits. v11: We shouldn't let the Intel engineer to acked the AMD's erratum patch, fix the funny mistake. Casey Leedom (2): net/cxgb4: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag net/cxgb4vf: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag Ding Tianhong (3): PCI: Disable PCIe Relaxed Ordering if unsupported PCI: Disable Relaxed Ordering for some Intel processors PCI: Disable Relaxed Ordering Attributes for AMD A1100 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 1 + drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c| 23 -- drivers/net/ethernet/chelsio/cxgb4/sge.c | 5 +- drivers/net/ethernet/chelsio/cxgb4vf/adapter.h | 1 +
[PATCH v11 0/5] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
Some devices have problems with Transaction Layer Packets with the Relaxed Ordering Attribute set. This patch set adds a new PCIe Device Flag, PCI_DEV_FLAGS_NO_RELAXED_ORDERING, a set of PCI Quirks to catch some known devices with Relaxed Ordering issues, and a use of this new flag by the cxgb4 driver to avoid using Relaxed Ordering with problematic Root Complex Ports. It's been years since I've submitted kernel.org patches, I appolgise for the almost certain submission errors. v2: Alexander point out that the v1 was only a part of the whole solution, some platform which has some issues could use the new flag to indicate that it is not safe to enable relaxed ordering attribute, then we need to clear the relaxed ordering enable bits in the PCI configuration when initializing the device. So add a new second patch to modify the PCI initialization code to clear the relaxed ordering enable bit in the event that the root complex doesn't want relaxed ordering enabled. The third patch was base on the v1's second patch and only be changed to query the relaxed ordering enable bit in the PCI configuration space to allow the Chelsio NIC to send TLPs with the relaxed ordering attributes set. This version didn't plan to drop the defines for Intel Drivers to use the new checking way to enable relaxed ordering because it is not the hardest part of the moment, we could fix it in next patchset when this patches reach the goal. v3: Redesigned the logic for pci_configure_relaxed_ordering when configuration, If a PCIe device didn't enable the relaxed ordering attribute default, we should not do anything in the PCIe configuration, otherwise we should check if any of the devices above us do not support relaxed ordering by the PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag, then base on the result if we get a return that indicate that the relaxed ordering is not supported we should update our device to disable relaxed ordering in configuration space. If the device above us doesn't exist or isn't the PCIe device, we shouldn't do anything and skip updating relaxed ordering because we are probably running in a guest. v4: Rename the functions pcie_get_relaxed_ordering and pcie_disable_relaxed_ordering according John's suggestion, and modify the description, use the true/false as the return value. We shouldn't enable relaxed ordering attribute by the setting in the root complex configuration space for PCIe device, so fix it for cxgb4. Fix some format issues. v5: Removed the unnecessary code for some function which only return the bool value, and add the check for VF device. Make this patch set base on 4.12-rc5. v6: Fix the logic error in the need to enable the relaxed ordering attribute for cxgb4. v7: The cxgb4 drivers will enable the PCIe Capability Device Control[Relaxed Ordering Enable] in PCI Probe() routine, this will break our current solution for some platform which has problematic when enable the relaxed ordering attribute. According to the latest recommendations, remove the enable_pcie_relaxed_ordering(), although it could not cover the Peer-to-Peer scene, but we agree to leave this problem until we really trigger it. Make this patch set base on 4.12 release version. v8: Change the second patch title and description to make it more reasonable, add the acked-by from Alex and Ashok. Add a new patch to enable the Relaxed Ordering Attribute for cxgb4vf driver. Make this patch set base on 4.13-rc2. v9: The document (https://software.intel.com/sites/default/files/managed/9e/ bc/64-ia-32-architectures-optimization-manual.pdf) indicate that the Xeon processors based on Broadwell/Haswell microarchitecture has the problem with Relaxed Ordering Attribute enabled, so add the whole list Device ID from Intel to the patch. v10: Significant rework based on Bjorn's feedback, reorganize the first 2 patches, now the Intel and AMD erratum soc has been divided to the different patches, rename the pcie_relaxed_ordering_supported() to pcie_relaxed_ordering_enabled(), and no need to check every intervening switch except the root ports, update some commits. v11: We shouldn't let the Intel engineer to acked the AMD's erratum patch, fix the funny mistake. Casey Leedom (2): net/cxgb4: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag net/cxgb4vf: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag Ding Tianhong (3): PCI: Disable PCIe Relaxed Ordering if unsupported PCI: Disable Relaxed Ordering for some Intel processors PCI: Disable Relaxed Ordering Attributes for AMD A1100 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 1 + drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c| 23 -- drivers/net/ethernet/chelsio/cxgb4/sge.c | 5 +- drivers/net/ethernet/chelsio/cxgb4vf/adapter.h | 1 +