Re: [PATCH v7 1/2] PCI/ERR: Call pci_bus_reset() before calling ->slot_reset() callback

2020-11-26 Thread Ethan Zhao
On Wed, Nov 25, 2020 at 2:47 AM Guilherme G. Piccoli wrote: > > Hi Kuppuswamy Sathyanarayanan (and all involved here), thanks for the > patch! I'd like to ask what is the status of this patchset - I just > "parachuted" in the issue, and by tracking the linux-pci ML, I found > this V7 (and all

Re: [PATCH v11 4/5] PCI/portdrv: Remove redundant pci_aer_available() check in DPC enable logic

2020-11-02 Thread Ethan Zhao
n one of the possible is if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DPC) && (pcie_ports_dpc_native)) services |= PCIE_PORT_SERVICE_DPC; after your patch ? nothing about AER ? Thanks, Ethan On Thu, Oct 29, 2020 at 1:14 AM Kuppuswamy, Sathyanarayanan wrote: > > > > On 10/27/20

Re: [PATCH] AER: aer_root_reset() non-native handling

2020-11-01 Thread Ethan Zhao
On Sat, Oct 31, 2020 at 6:36 AM Sean V Kelley wrote: > > If an OS has not been granted AER control via _OSC, then > the OS should not make changes to PCI_ERR_ROOT_COMMAND and > PCI_ERR_ROOT_STATUS related registers. Per section 4.5.1 of > the System Firmware Intermediary (SFI) _OSC and DPC

Re: [PATCH v11 1/5] PCI: Conditionally initialize host bridge native_* members

2020-10-28 Thread Ethan Zhao
On Tue, Oct 27, 2020 at 10:00 PM Kuppuswamy Sathyanarayanan wrote: > > If CONFIG_PCIEPORTBUS is not enabled in kernel then initialing > struct pci_host_bridge PCIe specific native_* members to "1" is > incorrect. So protect the PCIe specific member initialization > with CONFIG_PCIEPORTBUS. > >

Re: [PATCH v10 5/5] PCI/DPC: Move AER/DPC dependency checks out of DPC driver

2020-10-28 Thread Ethan Zhao
On Tue, Oct 27, 2020 at 11:12 AM Kuppuswamy Sathyanarayanan wrote: > > Currently, AER and DPC Capabilities dependency checks is > distributed between DPC and portdrv service drivers. So move > them out of DPC driver. > > Also, since services & PCIE_PORT_SERVICE_AER check already > ensures AER

Re: [PATCH v11 4/5] PCI/portdrv: Remove redundant pci_aer_available() check in DPC enable logic

2020-10-28 Thread Ethan Zhao
On Tue, Oct 27, 2020 at 10:00 PM Kuppuswamy Sathyanarayanan wrote: > > In DPC service enable logic, check for > services & PCIE_PORT_SERVICE_AER implies pci_aer_available() How about PCIE_PORT_SERVICE_AER is not configured, but pcie_aer_disable == 0 ? > is true. So there is no need to explicitly

Re: [PATCH V2 1/2] PCI/AER: Add pcie_is_ecrc_enabled() API

2020-10-28 Thread Ethan Zhao
On Wed, Oct 28, 2020 at 9:48 AM Vidya Sagar wrote: > > Adds pcie_is_ecrc_enabled() API to let other sub-systems (like DesignWare) > to query if ECRC policy is enabled and perform any configuration > required in those respective sub-systems. > > Signed-off-by: Vidya Sagar > --- > V2: > * None

Re: [PATCH v2] pciehp: Add check for DL_ACTIVE bit in pciehp_check_link_status()

2020-10-20 Thread Ethan Zhao
On Tue, Oct 20, 2020 at 2:33 PM Sanjay R Mehta wrote: > > From: Sanjay R Mehta > > if DL_ACTIVE bit is set it means that there is no need to check > PCI_EXP_LNKSTA_LT bit, as DL_ACTIVE would have set only if the link > is already trained. Hence adding a check which takes care of this > scenario.
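
The check described in this snippet can be sketched in plain C. This is only an illustration, not the patch's code: the helper name link_training_pending() is hypothetical, "DL_ACTIVE" is assumed to mean the Data Link Layer Link Active bit (PCI_EXP_LNKSTA_DLLLA), and the bit values are those found in the kernel's pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* PCIe Link Status register bits (values as in the kernel's pci_regs.h). */
#define PCI_EXP_LNKSTA_LT    0x0800  /* Link Training in progress */
#define PCI_EXP_LNKSTA_DLLLA 0x2000  /* Data Link Layer Link Active */

/*
 * Hypothetical helper: report whether link training should still be
 * treated as pending, given a raw Link Status register value.
 */
static bool link_training_pending(uint16_t lnksta)
{
	/* Link already active: per the snippet, the LT bit need not be checked. */
	if (lnksta & PCI_EXP_LNKSTA_DLLLA)
		return false;

	/* Link not active: fall back to the Link Training bit. */
	return (lnksta & PCI_EXP_LNKSTA_LT) != 0;
}
```

With this ordering, a status value that has both bits set is treated as a trained link, which is the behavior the snippet argues for.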

Re: [PATCH v9 12/15] PCI/RCEC: Add RCiEP's linked RCEC to AER/ERR

2020-10-19 Thread Ethan Zhao
On Sat, Oct 17, 2020 at 6:29 AM Bjorn Helgaas wrote: > > [+cc Christoph, Ethan, Sinan, Keith; sorry should have cc'd you to > begin with since you're looking at this code too. Particularly > interested in your thoughts about whether we should be touching > PCI_ERR_ROOT_COMMAND and

Re: [PATCH v6 2/2] PCI/ERR: Split the fatal and non-fatal error recovery handling

2020-10-15 Thread Ethan Zhao
On Thu, Oct 15, 2020 at 1:53 PM Kuppuswamy, Sathyanarayanan wrote: > > > > On 10/14/20 10:05 PM, Ethan Zhao wrote: > > On Thu, Oct 15, 2020 at 11:04 AM Kuppuswamy, Sathyanarayanan > > wrote: > >> > >> > >> > >> On 10/14/20 6:58

Re: [PATCH v6 2/2] PCI/ERR: Split the fatal and non-fatal error recovery handling

2020-10-15 Thread Ethan Zhao
On Wed, Oct 14, 2020 at 5:00 PM Kuppuswamy Sathyanarayanan wrote: > > Commit bdb5ac85777d ("PCI/ERR: Handle fatal error recovery") > merged fatal and non-fatal error recovery paths, and also made > recovery code depend on hotplug handler for "remove affected > device + rescan" support. But this

Re: [PATCH v6 2/2] PCI/ERR: Split the fatal and non-fatal error recovery handling

2020-10-14 Thread Ethan Zhao
On Thu, Oct 15, 2020 at 11:04 AM Kuppuswamy, Sathyanarayanan wrote: > > > > On 10/14/20 6:58 PM, Ethan Zhao wrote: > > On Thu, Oct 15, 2020 at 1:06 AM Kuppuswamy, Sathyanarayanan > > wrote: > >> > >> > >> > >> On 10/14/20 8:07

Re: [PATCH v6 2/2] PCI/ERR: Split the fatal and non-fatal error recovery handling

2020-10-14 Thread Ethan Zhao
On Thu, Oct 15, 2020 at 1:06 AM Kuppuswamy, Sathyanarayanan wrote: > > > > On 10/14/20 8:07 AM, Ethan Zhao wrote: > > On Wed, Oct 14, 2020 at 5:00 PM Kuppuswamy Sathyanarayanan > > wrote: > >> > >> Commit bdb5ac85777d ("PCI/ERR: Handle fatal error r

Re: [PATCH v6 2/2] PCI/ERR: Split the fatal and non-fatal error recovery handling

2020-10-14 Thread Ethan Zhao
On Wed, Oct 14, 2020 at 5:00 PM Kuppuswamy Sathyanarayanan wrote: > > Commit bdb5ac85777d ("PCI/ERR: Handle fatal error recovery") > merged fatal and non-fatal error recovery paths, and also made > recovery code depend on hotplug handler for "remove affected > device + rescan" support. But this

Re: [PATCH v4 1/2] PCI/ERR: Call pci_bus_reset() before calling ->slot_reset() callback

2020-10-14 Thread Ethan Zhao
Please fix the building issue. drivers/pci/pcie/err.c:144:25: error: static declaration of ‘pcie_do_fatal_recovery’ follows non-static declaration static pci_ers_result_t pcie_do_fatal_recovery(struct pci_dev *dev, ^~ In file included from

Re: [PATCH v4 2/2] PCI/ERR: Split the fatal and non-fatal error recovery handling

2020-10-14 Thread Ethan Zhao
On Mon, Oct 12, 2020 at 1:10 PM wrote: > > From: Kuppuswamy Sathyanarayanan > > Commit bdb5ac85777d ("PCI/ERR: Handle fatal error recovery") > merged fatal and non-fatal error recovery paths, and also made > recovery code depend on hotplug handler for "remove affected > device + rescan" support.

Re: [PATCH v8 2/6] PCI/DPC: define a function to check and wait till port finish DPC handling

2020-10-08 Thread Ethan Zhao
On Thu, Oct 8, 2020 at 2:16 AM Kuppuswamy, Sathyanarayanan wrote: > > > On 10/7/20 4:31 AM, Ethan Zhao wrote: > > Once root port DPC capability is enabled and triggered, at the beginning > > of DPC is triggered, the DPC status bits are set by hardware and then > > sen

Re: [PATCH v8 2/6] PCI/DPC: define a function to check and wait till port finish DPC handling

2020-10-07 Thread Ethan Zhao
On Thu, Oct 8, 2020 at 2:16 AM Kuppuswamy, Sathyanarayanan wrote: > > > On 10/7/20 4:31 AM, Ethan Zhao wrote: > > Once root port DPC capability is enabled and triggered, at the beginning > > of DPC is triggered, the DPC status bits are set by hardware and then > > sen

Re: [PATCH v8 1/6] PCI/ERR: get device before call device driver to avoid NULL pointer dereference

2020-10-07 Thread Ethan Zhao
On Thu, Oct 8, 2020 at 1:24 AM Kuppuswamy, Sathyanarayanan wrote: > > > On 10/7/20 4:31 AM, Ethan Zhao wrote: > > During DPC error injection test we found there is race condition between > > pciehp and DPC driver, NULL pointer dereference caused panic as following > >

[PATCH v8 6/6] PCI/ERR: don't mix io state not changed and no driver together

2020-10-07 Thread Ethan Zhao
When we see 'can't recover (no error_detected callback)' on the console, the reason may be that the I/O state was not changed by calling pci_dev_set_io_state(), which is confusing. Fix it. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- Changes: v2: no change. v3: no change. v4

[PATCH v8 0/6] Fix DPC hotplug race and enhance error handling

2020-10-07 Thread Ethan Zhao
of pci_dev_set_io_state(). per Ashok's request, add more description to this cover-letter part. Thanks, Ethan Ethan Zhao (6): PCI/ERR: get device before call device driver to avoid NULL pointer dereference PCI/DPC: define a function to check and wait till port finish DPC handling PCI

[PATCH v8 5/6] PCI/ERR: only return true when dev io state is really changed

2020-10-07 Thread Ethan Zhao
igned-off-by: Ethan Zhao --- Changes: v2: revise description and code according to suggestion from Andy. v3: change code to simpler. v4: no change. v5: no change. v6: no change. v7: changed based on Bjorn's code and truth table. v8: according to Bjorn's suggestion, rebase on another simplification

[PATCH v8 4/6] PCI/ERR: simplify function pci_dev_set_io_state() with if

2020-10-07 Thread Ethan Zhao
No function change. Signed-off-by: Ethan Zhao --- Changes: v8: based on Bjorn's code and truth table, simplify the logic of function pci_dev_set_io_state(), no function change. drivers/pci/pci.h | 54 --- 1 file changed, 23 insertions(+), 31

[PATCH v8 1/6] PCI/ERR: get device before call device driver to avoid NULL pointer dereference

2020-10-07 Thread Ethan Zhao
t(). So does pci_dev_get() before using the device instance to avoid NULL pointer dereference. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- Changes: v2: revise doc according to Andy's suggestion. v3: no change. v4: no change. v5: no change. v6: moved to [1

[PATCH v8 2/6] PCI/DPC: define a function to check and wait till port finish DPC handling

2020-10-07 Thread Ethan Zhao
atus and wait till the hardware and software completed the procedure. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- changes: v2: align ICS code name to public doc. v3: no change. v4: response to Christoph's (Christoph Hellwig ) tip, move pci_wait_port_ou

[PATCH v8 3/6] PCI: pciehp: check and wait port status out of DPC before handling DLLSC and PDC

2020-10-07 Thread Ethan Zhao
ion. Brute DPC error injection script: for i in {0..100} do setpci -s 64:02.0 0x196.w=000a setpci -s 65:00.0 0x04.w=0544 mount /dev/nvme0n1p1 /root/nvme sleep 1 done Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- Changes: v2: revise do

Re: [PATCH v7 4/5] PCI: only return true when dev io state is really changed

2020-10-07 Thread Ethan Zhao
Bjorn, On Sun, Oct 4, 2020 at 12:44 AM Bjorn Helgaas wrote: > > On Sat, Oct 03, 2020 at 03:55:13AM -0400, Ethan Zhao wrote: > > When uncorrectable error happens, AER driver and DPC driver interrupt > > handlers likely call > > > >pcie_do_re

Re: [PATCH v7 3/5] PCI: pciehp: check and wait port status out of DPC before handling DLLSC and PDC

2020-10-07 Thread Ethan Zhao
Lukas, On Mon, Oct 5, 2020 at 3:13 AM Lukas Wunner wrote: > > On Sat, Oct 03, 2020 at 03:55:12AM -0400, Ethan Zhao wrote: > > When root port has DPC capability and it is enabled, then triggered by > > errors, DPC DLLSC and PDC etc interrupts will be sent to DPC driver, pciehp

Re: [PATCH v7 0/5] Fix DPC hotplug race and enhance error handling

2020-10-07 Thread Ethan Zhao
Raj, On Sun, Oct 4, 2020 at 12:57 PM Raj, Ashok wrote: > > Hi Ethan > > On Sat, Oct 03, 2020 at 03:55:09AM -0400, Ethan Zhao wrote: > > Hi,folks, > > > > This simple patch set fixed some serious security issues found when DPC > > error injection and NVMe SSD

[PATCH v7 5/5] PCI/ERR: don't mix io state not changed and no driver together

2020-10-03 Thread Ethan Zhao
When we see 'can't recover (no error_detected callback)' on the console, the reason may be that the I/O state was not changed by calling pci_dev_set_io_state(), which is confusing. Fix it. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- Changes: v2: no change. v3: no change. v4

[PATCH v7 4/5] PCI: only return true when dev io state is really changed

2020-10-03 Thread Ethan Zhao
igned-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang Reviewed-by: Alexandru Gagniuc Reviewed-by: Andy Shevchenko --- Changes: v2: revise description and code according to suggestion from Andy. v3: change code to simpler. v4: no change. v5: no change. v6: no change. v7: chan

[PATCH v7 1/5] PCI/ERR: get device before call device driver to avoid NULL pointer dereference

2020-10-03 Thread Ethan Zhao
t(). So does pci_dev_get() before using the device instance to avoid NULL pointer dereference. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- Changes: v2: revise doc according to Andy's suggestion. v3: no change. v4: no change. v5: no change. v6: moved to [1

[PATCH v7 3/5] PCI: pciehp: check and wait port status out of DPC before handling DLLSC and PDC

2020-10-03 Thread Ethan Zhao
ion. Brute DPC error injection script: for i in {0..100} do setpci -s 64:02.0 0x196.w=000a setpci -s 65:00.0 0x04.w=0544 mount /dev/nvme0n1p1 /root/nvme sleep 1 done Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- Changes: v2: revise do

[PATCH v7 2/5] PCI/DPC: define a function to check and wait till port finish DPC handling

2020-10-03 Thread Ethan Zhao
atus and wait till the hardware and software completed the procedure. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- changes: v2:align ICS code name to public doc. v3: no change. v4: response to Christoph's (Christoph Hellwig ) tip, move pci_wait_port_outdpc() to

[PATCH v7 0/5] Fix DPC hotplug race and enhance error handling

2020-10-03 Thread Ethan Zhao
code and truth table. change the patch[5/5] about the debug output information. Thanks, Ethan Ethan Zhao (5): PCI/ERR: get device before call device driver to avoid NULL pointer dereference PCI/DPC: define a function to check and wait till port finish DPC handling PCI: pciehp

Re: [PATCH v6 4/5] PCI: only return true when dev io state is really changed

2020-10-03 Thread Ethan Zhao
Bjorn, On Sat, Oct 3, 2020 at 1:29 AM Bjorn Helgaas wrote: > > [+cc Sinan] > > On Wed, Sep 30, 2020 at 03:05:36AM -0400, Ethan Zhao wrote: > > When uncorrectable error happens, AER driver and DPC driver interrupt > > handlers likely call > > > >pcie

Re: [PATCH v6 4/5] PCI: only return true when dev io state is really changed

2020-10-02 Thread Ethan Zhao
Sinan, On Sat, Oct 3, 2020 at 12:08 AM Sinan Kaya wrote: > > On 9/30/2020 3:05 AM, Ethan Zhao wrote: > > When uncorrectable error happens, AER driver and DPC driver interrupt > > handlers likely call > > > >pcie_do_recovery() > >->pci_walk_

[PATCH v6 3/5] PCI: pciehp: check and wait port status out of DPC before handling DLLSC and PDC

2020-09-30 Thread Ethan Zhao
ion. Brute DPC error injection script: for i in {0..100} do setpci -s 64:02.0 0x196.w=000a setpci -s 65:00.0 0x04.w=0544 mount /dev/nvme0n1p1 /root/nvme sleep 1 done Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- Changes: v2: revise do

[PATCH v6 0/5] Fix DPC hotplug race and enhance error handling

2020-09-30 Thread Ethan Zhao
driver and its declaration to pci.h. (tip from Christoph Hellwig ). v5: fix building issue reported by l...@intel.com with some config. v6: move patch[3/5] as the first patch according to Lukas's suggestion. and rewrite the comment part of patch[3/5]. Ethan Zhao (5): PCI/ERR: get device

[PATCH v6 5/5] PCI/ERR: don't mix io state not changed and no driver together

2020-09-30 Thread Ethan Zhao
When we see 'can't recover (no error_detected callback)' on the console, the reason may be that the I/O state was not changed by calling pci_dev_set_io_state(), which is confusing. Fix it. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- Changes: v2: no change. v3: no change. v4

[PATCH v6 2/5] PCI/DPC: define a function to check and wait till port finish DPC handling

2020-09-30 Thread Ethan Zhao
atus and wait till the hardware and software completed the procedure. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- changes: v2:align ICS code name to public doc. v3: no change. v4: response to Christoph's (Christoph Hellwig ) tip, move pci_wait_port_outdpc() to

[PATCH v6 4/5] PCI: only return true when dev io state is really changed

2020-09-30 Thread Ethan Zhao
state is pci_channel_io_frozen, that will cause AER or DPC handler re-enter the error detecting and recovery procedure one after another. The result is the recovery flow mixed between AER and DPC. So simplify the pci_dev_set_io_state() function to only return true when dev->error_state is changed. Signed-off-
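
The idea in this snippet, a setter that reports success only when the stored state really changes, can be modeled in a few lines of userspace C. This is a rough sketch under stated assumptions, not the kernel's pci_dev_set_io_state(): the enum, struct fake_dev, and set_io_state() are hypothetical stand-ins, and treating the permanent-failure state as terminal is an assumption of this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's channel state values. */
enum io_state {
	IO_NORMAL,
	IO_FROZEN,
	IO_PERM_FAILURE,
};

/* Hypothetical stand-in for the part of struct pci_dev that matters here. */
struct fake_dev {
	enum io_state error_state;
};

/*
 * Return true only when dev->error_state actually changes.
 * A caller (for example an AER or DPC handler) can use the return value
 * to decide whether it is the one that should start recovery.
 */
static bool set_io_state(struct fake_dev *dev, enum io_state new_state)
{
	assert(dev != NULL);

	/* Same state requested again: nothing changed, report false. */
	if (dev->error_state == new_state)
		return false;

	/* Assumption of this model: permanent failure is never left. */
	if (dev->error_state == IO_PERM_FAILURE)
		return false;

	dev->error_state = new_state;
	return true;
}
```

In this model, the second of two handlers that both try to freeze the same device gets false back and can skip starting a second recovery pass, which is the mixing of AER and DPC flows the snippet describes.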

[PATCH v6 1/5] PCI/ERR: get device before call device driver to avoid NULL pointer dereference

2020-09-30 Thread Ethan Zhao
ev_put(). So does pci_dev_get() before using the device instance to avoid NULL pointer dereference. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- v2: revise doc according to Andy's suggestion. v3: no change. v4: no change. v5: no change. v6: moved to [1

Re: [PATCH 2/5 V2] PCI: pciehp: check and wait port status out of DPC before handling DLLSC and PDC

2020-09-29 Thread Ethan Zhao
On Tue, Sep 29, 2020 at 6:08 PM Lukas Wunner wrote: > > On Tue, Sep 29, 2020 at 05:46:41PM +0800, Ethan Zhao wrote: > > On Tue, Sep 29, 2020 at 4:29 PM Lukas Wunner wrote: > > > On Sun, Sep 27, 2020 at 11:27:46AM -0400, Sinan Kaya wrote: > > > > On 9/2

Re: [PATCH 3/5] PCI/ERR: get device before call device driver to avoid null pointer reference

2020-09-29 Thread Ethan Zhao
On Tue, Sep 29, 2020 at 6:48 PM Andy Shevchenko wrote: > > On Tue, Sep 29, 2020 at 05:38:00PM +0800, Ethan Zhao wrote: > > On Tue, Sep 29, 2020 at 4:51 PM Andy Shevchenko > > wrote: > > > On Tue, Sep 29, 2020 at 10:35:14AM +0800, Ethan Zhao wrote: > > > >

Re: [PATCH 2/5 V2] PCI: pciehp: check and wait port status out of DPC before handling DLLSC and PDC

2020-09-29 Thread Ethan Zhao
On Tue, Sep 29, 2020 at 4:29 PM Lukas Wunner wrote: > > On Sun, Sep 27, 2020 at 11:27:46AM -0400, Sinan Kaya wrote: > > On 9/26/2020 11:28 PM, Ethan Zhao wrote: > > > --- a/drivers/pci/hotplug/pciehp_hpc.c > > > +++ b/drivers/pci/hotplug/pciehp_hpc.c > > >

Re: [PATCH 3/5] PCI/ERR: get device before call device driver to avoid null pointer reference

2020-09-29 Thread Ethan Zhao
Andy, On Tue, Sep 29, 2020 at 4:51 PM Andy Shevchenko wrote: > > On Tue, Sep 29, 2020 at 10:35:14AM +0800, Ethan Zhao wrote: > > Preferred style, there will be cleared comment in v6. > > Avoid top postings. > > > On Sat, Sep 26, 2020 at 12:42 AM Andy Shevchenko >

Re: [PATCH 2/5 V2] PCI: pciehp: check and wait port status out of DPC before handling DLLSC and PDC

2020-09-28 Thread Ethan Zhao
On Tue, Sep 29, 2020 at 12:44 AM Sinan Kaya wrote: > > On 9/28/2020 7:10 AM, Sinan Kaya wrote: > > On 9/27/2020 10:01 PM, Zhao, Haifeng wrote: > >> Sinan, > >>I explained the reason why locks don't protect this case in the patch > >> description part. > >> Write side and read side hold

Re: [PATCH 3/5] PCI/ERR: get device before call device driver to avoid null pointer reference

2020-09-28 Thread Ethan Zhao
Preferred style, there will be cleared comment in v6. Thanks, Ethan On Sat, Sep 26, 2020 at 12:42 AM Andy Shevchenko wrote: > > On Thu, Sep 24, 2020 at 10:34:21PM -0400, Ethan Zhao wrote: > > During DPC error injection test we found there is race condition between > > pci

Re: [PATCH 1/5 V2] PCI: define a function to check and wait till port finish DPC handling

2020-09-28 Thread Ethan Zhao
Fixed this concern by moving the function to DPC driver and its declaration to pci.h. see v5 Thanks, Ethan On Sun, Sep 27, 2020 at 2:27 PM Christoph Hellwig wrote: > > > +#ifdef CONFIG_PCIE_DPC > > +static inline bool pci_wait_port_outdpc(struct pci_dev *pdev) > > +{ > > + u16 cap =

Re: [PATCH 2/5 V2] PCI: pciehp: check and wait port status out of DPC before handling DLLSC and PDC

2020-09-28 Thread Ethan Zhao
On Tue, Sep 29, 2020 at 12:45 AM Kuppuswamy, Sathyanarayanan wrote: > > > On 9/28/20 9:43 AM, Sinan Kaya wrote: > > On 9/28/2020 7:10 AM, Sinan Kaya wrote: > >> On 9/27/2020 10:01 PM, Zhao, Haifeng wrote: > >>> Sinan, > >>> I explained the reason why locks don't protect this case in the patch

Re: [PATCH 3/5 V55555] PCI/ERR: get device before call device driver to avoid NULL pointer reference

2020-09-28 Thread Ethan Zhao
On Mon, Sep 28, 2020 at 4:46 PM Andy Shevchenko wrote: > > On Mon, Sep 28, 2020 at 7:13 AM Ethan Zhao wrote: > > Same comments as per v4. > Also you have an issue in versioning here. Use -v parameter to `git > format-patch`, it will do it for you nicely. Aha, git has go

Re: [PATCH 2/5 V5] PCI: pciehp: check and wait port status out of DPC before handling DLLSC and PDC

2020-09-28 Thread Ethan Zhao
On Mon, Sep 28, 2020 at 4:45 PM Andy Shevchenko wrote: > > On Mon, Sep 28, 2020 at 7:10 AM Ethan Zhao wrote: > > We didn't settle on the v4, why v5? We could fix it with v6, v5 is used to fix other things. > > > When root port has DPC capability and it is enabled, then tr

Re: [PATCH v3 1/1] PCI/ERR: Fix reset logic in pcie_do_recovery() call

2020-09-28 Thread Ethan Zhao
Sathyanarayanan, On Mon, Sep 28, 2020 at 10:44 AM Kuppuswamy, Sathyanarayanan wrote: > > Hi, > > On 9/25/20 11:30 AM, Sinan Kaya wrote: > > On 9/25/2020 2:16 PM, Kuppuswamy, Sathyanarayanan wrote: > >>> > >>> If this is a too involved change, DPC driver should restore state > >>> when hotplug is

[PATCH 5/5 V5] PCI/ERR: don't mix io state not changed and no driver together

2020-09-27 Thread Ethan Zhao
When we see 'can't recover (no error_detected callback)' on the console, the reason may be that the I/O state was not changed by calling pci_dev_set_io_state(), which is confusing. Fix it. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- Changes: V2: no change. V3: no change. V4

[PATCH 2/5 V5] PCI: pciehp: check and wait port status out of DPC before handling DLLSC and PDC

2020-09-27 Thread Ethan Zhao
ery action first, then DPC driver handling the call-back from device drivers, clear the DPC status, at the end, pciehp handle the DLLSC and PDC etc. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- Changes: V2: revise doc according to Andy's suggestion. V3: no change. V4:

[PATCH 1/5 V5] PCI: define a function to check and wait till port finish DPC handling

2020-09-27 Thread Ethan Zhao
atus and wait till the hardware and software completed the procedure. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- changes: V2:align ICS code name to public doc. V3: no change. V4: response to Christoph's (Christoph Hellwig ) tip, move pci_wait_port_ou

[PATCH 0/5 V5] Fix DPC hotplug race and enhance error handling

2020-09-27 Thread Ethan Zhao
declaration to pci.h. (tip from Christoph Hellwig ). V5: fix building issue reported by l...@intel.com with some config. Thanks, Ethan Ethan Zhao (5): PCI: define a function to check and wait till port finish DPC handling PCI: pciehp: check and wait port status out of DPC before handling

[PATCH 4/5 V4] PCI: only return true when dev io state is really changed

2020-09-27 Thread Ethan Zhao
state is pci_channel_io_frozen, that will cause AER or DPC handler re-enter the error detecting and recovery procedure one after another. The result is the recovery flow mixed between AER and DPC. So simplify the pci_dev_set_io_state() function to only return true when dev->error_state is changed. Signed-off-

[PATCH 3/5 V55555] PCI/ERR: get device before call device driver to avoid NULL pointer reference

2020-09-27 Thread Ethan Zhao
ion script: for i in {0..100} do setpci -s 64:02.0 0x196.w=000a setpci -s 65:00.0 0x04.w=0544 mount /dev/nvme0n1p1 /root/nvme sleep 1 done Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- Changes: V2: revise doc according to Andy's sugge

[PATCH 5/5 V4] PCI/ERR: don't mix io state not changed and no driver together

2020-09-27 Thread Ethan Zhao
When we see 'can't recover (no error_detected callback)' on the console, the reason may be that the I/O state was not changed by calling pci_dev_set_io_state(), which is confusing. Fix it. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- Changes: V2: no change. V3: no change. V4

[PATCH 3/5 V4] PCI/ERR: get device before call device driver to avoid NULL pointer reference

2020-09-27 Thread Ethan Zhao
ion script: for i in {0..100} do setpci -s 64:02.0 0x196.w=000a setpci -s 65:00.0 0x04.w=0544 mount /dev/nvme0n1p1 /root/nvme sleep 1 done Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang Reviewed-by: Andy Shevchenko --- Changes: V2: revise doc

[PATCH 4/5 V4] PCI: only return true when dev io state is really changed

2020-09-27 Thread Ethan Zhao
state is pci_channel_io_frozen, that will cause AER or DPC handler re-enter the error detecting and recovery procedure one after another. The result is the recovery flow mixed between AER and DPC. So simplify the pci_dev_set_io_state() function to only return true when dev->error_state is changed. Signed-off-

[PATCH 2/5 V4] PCI: pciehp: check and wait port status out of DPC before handling DLLSC and PDC

2020-09-27 Thread Ethan Zhao
ery action first, then DPC driver handling the call-back from device drivers, clear the DPC status, at the end, pciehp handle the DLLSC and PDC etc. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang Reviewed-by: Andy Shevchenko --- Changes: V2: revise doc according to Andy's

[PATCH 0/5 V4] Fix DPC hotplug race and enhance error handling

2020-09-27 Thread Ethan Zhao
declaration to pci.h. (tip from Christoph Hellwig ). Thanks, Ethan Ethan Zhao (5): PCI: define a function to check and wait till port finish DPC handling PCI: pciehp: check and wait port status out of DPC before handling DLLSC and PDC PCI/ERR: get device before call device driver to avoid

[PATCH 1/5 V4] PCI: define a function to check and wait till port finish DPC handling

2020-09-27 Thread Ethan Zhao
atus and wait till the hardware and software completed the procedure. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang Reviewed-by: Andy Shevchenko Reviewed-by: Christoph Hellwig --- changes: V2:align ICS code name to public doc. V3: no change. V4: response to Christo

[PATCH 5/5 V3] PCI/ERR: don't mix io state not changed and no driver together

2020-09-26 Thread Ethan Zhao
When we see 'can't recover (no error_detected callback)' on the console, the reason may be that the I/O state was not changed by calling pci_dev_set_io_state(), which is confusing. Fix it. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- Changes: V2: no change. V3: no change

[PATCH 3/5 V3] PCI/ERR: get device before call device driver to avoid NULL pointer reference

2020-09-26 Thread Ethan Zhao
ion script: for i in {0..100} do setpci -s 64:02.0 0x196.w=000a setpci -s 65:00.0 0x04.w=0544 mount /dev/nvme0n1p1 /root/nvme sleep 1 done Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang Reviewed-by: Andy Shevchenko --- Changes: V2: revise doc

[PATCH 4/5 V3] PCI: only return true when dev io state is really changed

2020-09-26 Thread Ethan Zhao
state is pci_channel_io_frozen, that will cause AER or DPC handler re-enter the error detecting and recovery procedure one after another. The result is the recovery flow mixed between AER and DPC. So simplify the pci_dev_set_io_state() function to only return true when dev->error_state is changed. Signed-off-

[PATCH 0/5 V3] Fix DPC hotplug race and enhance error handling

2020-09-26 Thread Ethan Zhao
done Other details see every commits description part. This patch set could be applied to stable 5.9-rc6 directly. Help to review and test. V2: changed according to review by Andy Shevchenko. V3: changed patch 4/5 to simpler coding. Thanks, Ethan Ethan Zhao (5): PCI: define a function

[PATCH 2/5 V3] PCI: pciehp: check and wait port status out of DPC before handling DLLSC and PDC

2020-09-26 Thread Ethan Zhao
ery action first, then DPC driver handling the call-back from device drivers, clear the DPC status, at the end, pciehp handle the DLLSC and PDC etc. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang Reviewed-by: Andy Shevchenko --- Changes: V2: revise doc according to Andy's

[PATCH 1/5 V3] PCI: define a function to check and wait till port finish DPC handling

2020-09-26 Thread Ethan Zhao
atus and wait till the hardware and software completed the procedure. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang Reviewed-by: Andy Shevchenko --- changes: V2:align ICS code name to public doc. V3: no change. include/linux/pci.h | 31 ++

[PATCH 2/5 V2] PCI: pciehp: check and wait port status out of DPC before handling DLLSC and PDC

2020-09-26 Thread Ethan Zhao
ery action first, then DPC driver handling the call-back from device drivers, clear the DPC status, at the end, pciehp handle the DLLSC and PDC etc. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang Reviewed-by: Andy Shevchenko --- Changes: V2: revise doc according to Andy's

[PATCH 3/5 V2] PCI/ERR: get device before call device driver to avoid NULL pointer reference

2020-09-26 Thread Ethan Zhao
ion script: for i in {0..100} do setpci -s 64:02.0 0x196.w=000a setpci -s 65:00.0 0x04.w=0544 mount /dev/nvme0n1p1 /root/nvme sleep 1 done Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang Reviewed-by: Andy Shevchenko --- Changes: V2: revise doc

[PATCH 5/5 V2] PCI/ERR: don't mix io state not changed and no driver together

2020-09-26 Thread Ethan Zhao
When we see 'can't recover (no error_detected callback)' on the console, the reason may be that the I/O state was not changed by calling pci_dev_set_io_state(), which is confusing. Fix it. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- Changes: V2: no change. drivers/pci/pcie

[PATCH 4/5 V2] PCI: only return true when dev io state is really changed

2020-09-26 Thread Ethan Zhao
state is pci_channel_io_frozen, that will cause AER or DPC handler re-enter the error detecting and recovery procedure one after another. The result is the recovery flow mixed between AER and DPC. So simplify the pci_dev_set_io_state() function to only return true when dev->error_state is changed. Signed-off-

[PATCH 1/5 V2] PCI: define a function to check and wait till port finish DPC handling

2020-09-26 Thread Ethan Zhao
atus and wait till the hardware and software completed the procedure. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang Reviewed-by: Andy Shevchenko --- changes: V2:align ICS code name to public doc. include/linux/pci.h | 31 +++ 1 file changed

[PATCH 0/5 V2] Fix DPC hotplug race and enhance error handling

2020-09-26 Thread Ethan Zhao
done Other details see every commits description part. This patch set could be applied to stable 5.9-rc6 directly. Help to review and test. V2: changed according to review by Andy Shevchenko. Thanks, Ethan Ethan Zhao (5): PCI: define a function to check and wait till port finish DPC handling

[PATCH 5/5] PCI/ERR: don't mix io state not changed and no driver together

2020-09-24 Thread Ethan Zhao
When we see 'can't recover (no error_detected callback)' on the console, the reason may be that the I/O state was not changed by calling pci_dev_set_io_state(), which is confusing. Fix it. Signed-off-by: Ethan Zhao Tested-by: Wen Jin Tested-by: Shanshan Zhang --- drivers/pci/pcie/err.c | 6 -- 1 file

[PATCH 4/5] PCI: only return true when dev io state is really changed

2020-09-24 Thread Ethan Zhao
frozen, that will cause AER or DPC handler re-enter the error detecting and recovery procedure one after another. The result is the recovery flow mixed between AER and DPC. So simplify the pci_dev_set_io_state() function to only return true when dev->error_state is changed. Signed-off-by: Ethan Zha

[PATCH 0/5] Fix DPC hotplug race and enhance error hanlding

2020-09-24 Thread Ethan Zhao
directly. Help to review and test. Thanks, Ethan Ethan Zhao (5): PCI: define a function to check and wait till port finish DPC handling PCI: pciehp: check and wait port status out of DPC before handling DLLSC and PDC PCI/ERR: get device before call device driver to avoid null pointer

[PATCH] Revert "block: revert back to synchronous request_queue removal"

2020-09-08 Thread Ethan Zhao
From: Ethan Zhao 'commit e8c7d14ac6c3 ("block: revert back to synchronous request_queue removal")' introduced panic issue to NVMe hotplug as following(hit after just 2 times NVMe SSD hotplug under stable 5.9-RC2): BUG: sleeping function called from invalid context at block/ge

Re: [PATCH 04/27] Restrict /dev/mem and /dev/kmem when the kernel is locked down

2017-10-24 Thread Ethan Zhao
David, May I ask a question here -- Is it intentionally enabling the read-only mode, so userspace tools like dmidecode could work with kernel_is_locked_down ? while it was impossible to work with the attached patch applied. Is it a security policy change with secure boot ? Thanks, Ethan On

[tip:sched/core] sched/sysctl: Check user input value of sysctl_sched_time_avg

2017-09-29 Thread tip-bot for Ethan Zhao
Commit-ID: 5ccba44ba118a500050076b0344632459779 Gitweb: https://git.kernel.org/tip/5ccba44ba118a500050076b0344632459779 Author: Ethan Zhao <ethan.z...@oracle.com> AuthorDate: Mon, 4 Sep 2017 13:59:34 +0800 Committer: Ingo Molnar <mi...@kernel.org> CommitDate: Fri, 29

[tip:sched/core] sched/sysctl: Check user input value of sysctl_sched_time_avg

2017-09-29 Thread tip-bot for Ethan Zhao
Commit-ID: 5ccba44ba118a500050076b0344632459779 Gitweb: https://git.kernel.org/tip/5ccba44ba118a500050076b0344632459779 Author: Ethan Zhao AuthorDate: Mon, 4 Sep 2017 13:59:34 +0800 Committer: Ingo Molnar CommitDate: Fri, 29 Sep 2017 13:20:13 +0200 sched/sysctl: Check user

Re: [PATCH v2] sched: check user input value of sysctl_sched_time_avg

2017-09-06 Thread Ethan Zhao
On 2017/9/7 3:50, Luis R. Rodriguez wrote: On Mon, Sep 04, 2017 at 03:54:23PM +0800, Ethan Zhao wrote: Peter, On 2017/9/4 15:49, Peter Zijlstra wrote: On Sat, Sep 02, 2017 at 02:57:32PM +0800, Ethan Zhao wrote: diff --git a/kernel/sysctl.c b/kernel/sysctl.c index 6648fbb..609bed2 100644

Re: [PATCH v2] sched: check user input value of sysctl_sched_time_avg

2017-09-04 Thread Ethan Zhao
Peter, On 2017/9/4 15:49, Peter Zijlstra wrote: On Sat, Sep 02, 2017 at 02:57:32PM +0800, Ethan Zhao wrote: diff --git a/kernel/sysctl.c b/kernel/sysctl.c index 6648fbb..609bed2 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -367,7 +367,7 @@ static int sysrq_sysctl_handler(struct

Re: [PATCH] sched: reset sysctl_sched_time_avg to default when

2017-09-04 Thread Ethan Zhao
Peter, On 2017/9/4 15:32, Peter Zijlstra wrote: On Sat, Sep 02, 2017 at 08:33:03AM +0800, Ethan Zhao wrote: Yep, that is the first place I considered to set the limit, but that would break KABI ? nah..   V4 sent, please ignore v2 & v3.   Thanks,   Ethan

[PATCH v4] sched: check user input value of sysctl_sched_time_avg

2017-09-04 Thread Ethan Zhao
-by: James Puthukattukaran <james.puthukattuka...@oracle.com> Signed-off-by: Ethan Zhao <ethan.z...@oracle.com> --- v2: Check it at user input side in sysctl table (peterz). v3: Use proc_dointvec_minmax(). v4: Fix a too long line in descripton part. kernel/sysctl.c | 3 ++- 1 fil

[PATCH v4] sched: check user input value of sysctl_sched_time_avg

2017-09-04 Thread Ethan Zhao
-by: James Puthukattukaran Signed-off-by: Ethan Zhao --- v2: Check it at user input side in sysctl table (peterz). v3: Use proc_dointvec_minmax(). v4: Fix a too long line in descripton part. kernel/sysctl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/sysctl.c b

[PATCH v3] sched: check user input value of sysctl_sched_time_avg

2017-09-04 Thread Ethan Zhao
-by: James Puthukattukaran <james.puthukattuka...@oracle.com> Signed-off-by: Ethan Zhao <ethan.z...@oracle.com> --- v2: Check it at user input side in sysctl table (peterz). v3: Use proc_dointvec_minmax(). kernel/sysctl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --

[PATCH v3] sched: check user input value of sysctl_sched_time_avg

2017-09-04 Thread Ethan Zhao
-by: James Puthukattukaran Signed-off-by: Ethan Zhao --- v2: Check it at user input side in sysctl table (peterz). v3: Use proc_dointvec_minmax(). kernel/sysctl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/sysctl.c b/kernel/sysctl.c index 6648fbb..423554a 100644

[PATCH v2] sched: check user input value of sysctl_sched_time_avg

2017-09-02 Thread Ethan Zhao
com> Signed-off-by: Ethan Zhao <ethan.z...@oracle.com> --- v2: check it in sysctl table (input side) as Peter suggested. Tested on stable 4.1, applied on stable 4.13-rc5 okay. include/linux/sched/sysctl.h | 3 +++ kernel/sched/fair.c | 26 ++ ker

[PATCH v2] sched: check user input value of sysctl_sched_time_avg

2017-09-02 Thread Ethan Zhao
= div_u64(avg, total); ... } Seems this issue could be reproduced on all I tried stable 4.1 - last kernel. To fix this issue, check user input value of sysctl_sched_time_avg, keep it unchanged when hit invalid input. Reported-by: James Puthukattukaran Signed-off-by: Ethan Zhao --- v2

Re: [PATCH] sched: reset sysctl_sched_time_avg to default when

2017-09-02 Thread Ethan Zhao
at 8:33 AM, Ethan Zhao <ethan.ker...@gmail.com> wrote: > Yep, that is the first place I considered to set the limit, but that would > break KABI ? > > Thanks, > Ethan > > On Fri, Sep 1, 2017 at 8:32 PM, Peter Zijlstra <pet...@infradead.org> wrote: >> On Fri, S

Re: [PATCH] sched: reset sysctl_sched_time_avg to default when

2017-09-02 Thread Ethan Zhao
at 8:33 AM, Ethan Zhao wrote: > Yep, that is the first place I considered to set the limit, but that would > break KABI ? > > Thanks, > Ethan > > On Fri, Sep 1, 2017 at 8:32 PM, Peter Zijlstra wrote: >> On Fri, Sep 01, 2017 at 07:31:54PM +0800, Ethan Zhao wr
