Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-27 Thread Leon Romanovsky
On Fri, Mar 26, 2021 at 08:20:07AM -0600, Alex Williamson wrote:
> On Fri, 26 Mar 2021 09:40:30 +0300
> Leon Romanovsky  wrote:

<...>

> > 
> > It supports by writing: echo "bus,pm" > reset_methods.
> > Regarding comma, IMHO it is easiest pattern for the parsing.
> > 
> > Anyway, The in-kernel implementation is not important to me.
> 
> Too bad, it should have been apparent from the sample code that it was
> using a comma separated list with re-ordering support.  Thanks,

Excellent, both of us think that "bus,pm" is the easiest way to
implement policy decision.

Thanks

> 
> Alex
> 


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-26 Thread Alex Williamson
On Fri, 26 Mar 2021 09:40:30 +0300
Leon Romanovsky  wrote:

> On Thu, Mar 25, 2021 at 11:53:24AM -0600, Alex Williamson wrote:
> > On Thu, 25 Mar 2021 18:09:58 +0200
> > Leon Romanovsky  wrote:
> >   
> > > On Thu, Mar 25, 2021 at 08:55:04AM -0600, Alex Williamson wrote:  
> > > > On Thu, 25 Mar 2021 10:37:54 +0200
> > > > Leon Romanovsky  wrote:
> > > > 
> > > > > On Wed, Mar 24, 2021 at 11:17:29AM -0600, Alex Williamson wrote:
> > > > > > On Wed, 24 Mar 2021 17:13:56 +0200
> > > > > > Leon Romanovsky  wrote:  
> > > > > 
> > > > > <...>
> > > > > 
> > > > > > > Yes, and real testing/debugging almost always requires kernel 
> > > > > > > rebuild.
> > > > > > > Everything else is waste of time.  
> > > > > > 
> > > > > > Sorry, this is nonsense.  Allowing users to debug issues without a 
> > > > > > full
> > > > > > kernel rebuild is a good thing.  
> > > > > 
> > > > > It is far from debug, this interface doesn't give you any answers why
> > > > > the reset didn't work, it just helps you to find the one that works.
> > > > > 
> > > > > Unless you believe that this information will be enough to understand
> > > > > the root cause, you will need to ask from the user to perform extra
> > > > > tests, maybe try some quirk. All of that requires from the users to
> > > > > rebuild their kernel.
> > > > > 
> > > > > So no, it is not debug.
> > > > 
> > > > It allows a user to experiment to determine (a) my device doesn't work
> > > > in a given scenario with the default configuration, but (b) if I change
> > > > the reset to this other thing it does work.  That is a step in
> > > > debugging.
> > > > 
> > > > It's absurd to think that a sysfs attribute could provide root cause,
> > > > but it might be enough for someone to further help that user.  It would
> > > > be a useful clue for a bug report.  Yes, reaching root cause might
> > > > involve building a kernel, but that doesn't invalidate that having a
> > > > step towards debugging in the base kernel might be a useful tool.
> > > 
> > > Let's agree to do not agree.
> > >   
> > > > 
> > > > > > > > > > For policy preference, I already described how I've 
> > > > > > > > > > configured QEMU to
> > > > > > > > > > prefer a bus reset rather than a PM reset due to lack of 
> > > > > > > > > > specification
> > > > > > > > > > regarding the scope of a PM "soft reset".  This interface 
> > > > > > > > > > would allow a
> > > > > > > > > > system policy to do that same thing.
> > > > > > > > > > 
> > > > > > > > > > I don't think anyone is suggesting this as a means to avoid 
> > > > > > > > > > quirks that
> > > > > > > > > > would resolve reset issues and create the best default 
> > > > > > > > > > general behavior.
> > > > > > > > > > This provides a mechanism to test various reset methods, 
> > > > > > > > > > and thereby
> > > > > > > > > > identify broken methods, and set a policy.  Sure, that 
> > > > > > > > > > policy might be
> > > > > > > > > > to avoid a broken reset in the interim before it gets 
> > > > > > > > > > quirked and
> > > > > > > > > > there's potential for abuse there, but I think the benefits 
> > > > > > > > > > outweigh
> > > > > > > > > > the risks.  
> > > > > > > > > 
> > > > > > > > > This interface is proposed as first class citizen in the 
> > > > > > > > > general sysfs
> > > > > > > > > layout. Of course, it will be seen as a way to bypass the 
> > > > > > > > > kernel.
> > > > > > > > > 
> > > > > > > > > At least, put it under CONFIG_EXPERT option, so no distro 
> > > > > > > > > will enable it
> > > > > > > > > by default.
> > > > > > > > 
> > > > > > > > Of course we're proposing it to be accessible, it should also 
> > > > > > > > require
> > > > > > > > admin privileges to modify, sysfs has lots of such things.  If 
> > > > > > > > it's
> > > > > > > > relegated to non-default accessibility, it won't be used for 
> > > > > > > > testing
> > > > > > > > and it won't be available for system policy and it's pointless. 
> > > > > > > >
> > > > > > > 
> > > > > > > We probably have difference in view of what testing is. I expect 
> > > > > > > from
> > > > > > > the users who experience issues with reset to do extra steps and 
> > > > > > > one of
> > > > > > > them is to require from them to compile their kernel.  
> > > > > > 
> > > > > > I would define the ability to generate a CI test that can pick a
> > > > > > device, unbind it from its driver, and iterate reset methods as a
> > > > > > worthwhile improvement in testing.  
> > > > > 
> > > > > Who is going to run this CI? At least all kernel CIs (external and
> > > > > internal to HW vendors) that I'm familiar are building kernel 
> > > > > themselves.
> > > > > 
> > > > > Distro kernel is too bloat to be really usable for CI.
> > > > 
> > > > At this point I'm suspicious you're trolling.  A distro kernel CI
> > > > certainly uses the kernel they intend to ship and support in their
> > > > 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-26 Thread Leon Romanovsky
On Fri, Mar 26, 2021 at 10:18:25AM +0100, Krzysztof Wilczyński wrote:
> Hello,
> 
> [...]
> 
> Aside of the sysfs interface, would this new functionality also require
> anything to be overridden at boot time via passing some command-line
> arguments?  Not sure how relevant such thing would be to device, but,
> whatnot reset, though.

This is per-device property and can't be universally correct like kernel
command-line arguments. I don't think that we need to add such functionality.

> 
> I am curious whether there would be a need for anything like that.

I prefer not.

> 
> Krzysztof


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-26 Thread Krzysztof Wilczyński
Hello,

[...]

Aside of the sysfs interface, would this new functionality also require
anything to be overridden at boot time via passing some command-line
arguments?  Not sure how relevant such thing would be to device, but,
whatnot reset, though.

I am curious whether there would be a need for anything like that.

Krzysztof


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-26 Thread Leon Romanovsky
On Thu, Mar 25, 2021 at 11:53:24AM -0600, Alex Williamson wrote:
> On Thu, 25 Mar 2021 18:09:58 +0200
> Leon Romanovsky  wrote:
> 
> > On Thu, Mar 25, 2021 at 08:55:04AM -0600, Alex Williamson wrote:
> > > On Thu, 25 Mar 2021 10:37:54 +0200
> > > Leon Romanovsky  wrote:
> > >   
> > > > On Wed, Mar 24, 2021 at 11:17:29AM -0600, Alex Williamson wrote:  
> > > > > On Wed, 24 Mar 2021 17:13:56 +0200
> > > > > Leon Romanovsky  wrote:
> > > > 
> > > > <...>
> > > >   
> > > > > > Yes, and real testing/debugging almost always requires kernel 
> > > > > > rebuild.
> > > > > > Everything else is waste of time.
> > > > > 
> > > > > Sorry, this is nonsense.  Allowing users to debug issues without a 
> > > > > full
> > > > > kernel rebuild is a good thing.
> > > > 
> > > > It is far from debug, this interface doesn't give you any answers why
> > > > the reset didn't work, it just helps you to find the one that works.
> > > > 
> > > > Unless you believe that this information will be enough to understand
> > > > the root cause, you will need to ask from the user to perform extra
> > > > tests, maybe try some quirk. All of that requires from the users to
> > > > rebuild their kernel.
> > > > 
> > > > So no, it is not debug.  
> > > 
> > > It allows a user to experiment to determine (a) my device doesn't work
> > > in a given scenario with the default configuration, but (b) if I change
> > > the reset to this other thing it does work.  That is a step in
> > > debugging.
> > > 
> > > It's absurd to think that a sysfs attribute could provide root cause,
> > > but it might be enough for someone to further help that user.  It would
> > > be a useful clue for a bug report.  Yes, reaching root cause might
> > > involve building a kernel, but that doesn't invalidate that having a
> > > step towards debugging in the base kernel might be a useful tool.  
> > 
> > Let's agree to do not agree.
> > 
> > >   
> > > > > > > > > For policy preference, I already described how I've 
> > > > > > > > > configured QEMU to
> > > > > > > > > prefer a bus reset rather than a PM reset due to lack of 
> > > > > > > > > specification
> > > > > > > > > regarding the scope of a PM "soft reset".  This interface 
> > > > > > > > > would allow a
> > > > > > > > > system policy to do that same thing.
> > > > > > > > > 
> > > > > > > > > I don't think anyone is suggesting this as a means to avoid 
> > > > > > > > > quirks that
> > > > > > > > > would resolve reset issues and create the best default 
> > > > > > > > > general behavior.
> > > > > > > > > This provides a mechanism to test various reset methods, and 
> > > > > > > > > thereby
> > > > > > > > > identify broken methods, and set a policy.  Sure, that policy 
> > > > > > > > > might be
> > > > > > > > > to avoid a broken reset in the interim before it gets quirked 
> > > > > > > > > and
> > > > > > > > > there's potential for abuse there, but I think the benefits 
> > > > > > > > > outweigh
> > > > > > > > > the risks.
> > > > > > > > 
> > > > > > > > This interface is proposed as first class citizen in the 
> > > > > > > > general sysfs
> > > > > > > > layout. Of course, it will be seen as a way to bypass the 
> > > > > > > > kernel.
> > > > > > > > 
> > > > > > > > At least, put it under CONFIG_EXPERT option, so no distro will 
> > > > > > > > enable it
> > > > > > > > by default.  
> > > > > > > 
> > > > > > > Of course we're proposing it to be accessible, it should also 
> > > > > > > require
> > > > > > > admin privileges to modify, sysfs has lots of such things.  If 
> > > > > > > it's
> > > > > > > relegated to non-default accessibility, it won't be used for 
> > > > > > > testing
> > > > > > > and it won't be available for system policy and it's pointless.   
> > > > > > >
> > > > > > 
> > > > > > We probably have difference in view of what testing is. I expect 
> > > > > > from
> > > > > > the users who experience issues with reset to do extra steps and 
> > > > > > one of
> > > > > > them is to require from them to compile their kernel.
> > > > > 
> > > > > I would define the ability to generate a CI test that can pick a
> > > > > device, unbind it from its driver, and iterate reset methods as a
> > > > > worthwhile improvement in testing.
> > > > 
> > > > Who is going to run this CI? At least all kernel CIs (external and
> > > > internal to HW vendors) that I'm familiar are building kernel 
> > > > themselves.
> > > > 
> > > > Distro kernel is too bloat to be really usable for CI.  
> > > 
> > > At this point I'm suspicious you're trolling.  A distro kernel CI
> > > certainly uses the kernel they intend to ship and support in their
> > > environment. You're concerned about a bloated kernel, but the proposal
> > > here adds 2-bytes per device to track reset methods and a trivial array
> > > in text memory, meanwhile you're proposing multiple per-device memory
> > > allocations to enhance the feature you think is too bloated for CI.  
> > 
> > I 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-25 Thread Alex Williamson
On Thu, 25 Mar 2021 18:09:58 +0200
Leon Romanovsky  wrote:

> On Thu, Mar 25, 2021 at 08:55:04AM -0600, Alex Williamson wrote:
> > On Thu, 25 Mar 2021 10:37:54 +0200
> > Leon Romanovsky  wrote:
> >   
> > > On Wed, Mar 24, 2021 at 11:17:29AM -0600, Alex Williamson wrote:  
> > > > On Wed, 24 Mar 2021 17:13:56 +0200
> > > > Leon Romanovsky  wrote:
> > > 
> > > <...>
> > >   
> > > > > Yes, and real testing/debugging almost always requires kernel rebuild.
> > > > > Everything else is waste of time.
> > > > 
> > > > Sorry, this is nonsense.  Allowing users to debug issues without a full
> > > > kernel rebuild is a good thing.
> > > 
> > > It is far from debug, this interface doesn't give you any answers why
> > > the reset didn't work, it just helps you to find the one that works.
> > > 
> > > Unless you believe that this information will be enough to understand
> > > the root cause, you will need to ask from the user to perform extra
> > > tests, maybe try some quirk. All of that requires from the users to
> > > rebuild their kernel.
> > > 
> > > So no, it is not debug.  
> > 
> > It allows a user to experiment to determine (a) my device doesn't work
> > in a given scenario with the default configuration, but (b) if I change
> > the reset to this other thing it does work.  That is a step in
> > debugging.
> > 
> > It's absurd to think that a sysfs attribute could provide root cause,
> > but it might be enough for someone to further help that user.  It would
> > be a useful clue for a bug report.  Yes, reaching root cause might
> > involve building a kernel, but that doesn't invalidate that having a
> > step towards debugging in the base kernel might be a useful tool.  
> 
> Let's agree to do not agree.
> 
> >   
> > > > > > > > For policy preference, I already described how I've configured 
> > > > > > > > QEMU to
> > > > > > > > prefer a bus reset rather than a PM reset due to lack of 
> > > > > > > > specification
> > > > > > > > regarding the scope of a PM "soft reset".  This interface would 
> > > > > > > > allow a
> > > > > > > > system policy to do that same thing.
> > > > > > > > 
> > > > > > > > I don't think anyone is suggesting this as a means to avoid 
> > > > > > > > quirks that
> > > > > > > > would resolve reset issues and create the best default general 
> > > > > > > > behavior.
> > > > > > > > This provides a mechanism to test various reset methods, and 
> > > > > > > > thereby
> > > > > > > > identify broken methods, and set a policy.  Sure, that policy 
> > > > > > > > might be
> > > > > > > > to avoid a broken reset in the interim before it gets quirked 
> > > > > > > > and
> > > > > > > > there's potential for abuse there, but I think the benefits 
> > > > > > > > outweigh
> > > > > > > > the risks.
> > > > > > > 
> > > > > > > This interface is proposed as first class citizen in the general 
> > > > > > > sysfs
> > > > > > > layout. Of course, it will be seen as a way to bypass the kernel.
> > > > > > > 
> > > > > > > At least, put it under CONFIG_EXPERT option, so no distro will 
> > > > > > > enable it
> > > > > > > by default.  
> > > > > > 
> > > > > > Of course we're proposing it to be accessible, it should also 
> > > > > > require
> > > > > > admin privileges to modify, sysfs has lots of such things.  If it's
> > > > > > relegated to non-default accessibility, it won't be used for testing
> > > > > > and it won't be available for system policy and it's pointless. 
> > > > > >  
> > > > > 
> > > > > We probably have difference in view of what testing is. I expect from
> > > > > the users who experience issues with reset to do extra steps and one 
> > > > > of
> > > > > them is to require from them to compile their kernel.
> > > > 
> > > > I would define the ability to generate a CI test that can pick a
> > > > device, unbind it from its driver, and iterate reset methods as a
> > > > worthwhile improvement in testing.
> > > 
> > > Who is going to run this CI? At least all kernel CIs (external and
> > > internal to HW vendors) that I'm familiar are building kernel themselves.
> > > 
> > > Distro kernel is too bloat to be really usable for CI.  
> > 
> > At this point I'm suspicious you're trolling.  A distro kernel CI
> > certainly uses the kernel they intend to ship and support in their
> > environment. You're concerned about a bloated kernel, but the proposal
> > here adds 2-bytes per device to track reset methods and a trivial array
> > in text memory, meanwhile you're proposing multiple per-device memory
> > allocations to enhance the feature you think is too bloated for CI.  
> 
> I don't know why you decided to focus on memory footprint which is not
> important at all during CI runs. The bloat is in Kconfig options that
> are not needed. Those extra options add significant overhead during
> builds and runs itself.
> 
> And not, I'm not trolling, but representing HW vendor that pushes its CI
> and developers environment to the limit, by 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-25 Thread Leon Romanovsky
On Thu, Mar 25, 2021 at 10:52:57PM +0530, Amey Narkhede wrote:
> On 21/03/25 06:09PM, Leon Romanovsky wrote:
> > On Thu, Mar 25, 2021 at 08:55:04AM -0600, Alex Williamson wrote:
> > > On Thu, 25 Mar 2021 10:37:54 +0200
> > > Leon Romanovsky  wrote:
> > >
> > > > On Wed, Mar 24, 2021 at 11:17:29AM -0600, Alex Williamson wrote:
> > > > > On Wed, 24 Mar 2021 17:13:56 +0200
> > > > > Leon Romanovsky  wrote:
> > > >
> > > > <...>
> > > >
> > > > > > Yes, and real testing/debugging almost always requires kernel 
> > > > > > rebuild.
> > > > > > Everything else is waste of time.
> > > > >
> > > > > Sorry, this is nonsense.  Allowing users to debug issues without a 
> > > > > full
> > > > > kernel rebuild is a good thing.
> > > >
> > > > It is far from debug, this interface doesn't give you any answers why
> > > > the reset didn't work, it just helps you to find the one that works.
> > > >
> > > > Unless you believe that this information will be enough to understand
> > > > the root cause, you will need to ask from the user to perform extra
> > > > tests, maybe try some quirk. All of that requires from the users to
> > > > rebuild their kernel.
> > > >
> > > > So no, it is not debug.
> > >
> > > It allows a user to experiment to determine (a) my device doesn't work
> > > in a given scenario with the default configuration, but (b) if I change
> > > the reset to this other thing it does work.  That is a step in
> > > debugging.
> > >
> > > It's absurd to think that a sysfs attribute could provide root cause,
> > > but it might be enough for someone to further help that user.  It would
> > > be a useful clue for a bug report.  Yes, reaching root cause might
> > > involve building a kernel, but that doesn't invalidate that having a
> > > step towards debugging in the base kernel might be a useful tool.
> >
> > Let's agree to do not agree.
> >
> > >
> > > > > > > > > For policy preference, I already described how I've 
> > > > > > > > > configured QEMU to
> > > > > > > > > prefer a bus reset rather than a PM reset due to lack of 
> > > > > > > > > specification
> > > > > > > > > regarding the scope of a PM "soft reset".  This interface 
> > > > > > > > > would allow a
> > > > > > > > > system policy to do that same thing.
> > > > > > > > >
> > > > > > > > > I don't think anyone is suggesting this as a means to avoid 
> > > > > > > > > quirks that
> > > > > > > > > would resolve reset issues and create the best default 
> > > > > > > > > general behavior.
> > > > > > > > > This provides a mechanism to test various reset methods, and 
> > > > > > > > > thereby
> > > > > > > > > identify broken methods, and set a policy.  Sure, that policy 
> > > > > > > > > might be
> > > > > > > > > to avoid a broken reset in the interim before it gets quirked 
> > > > > > > > > and
> > > > > > > > > there's potential for abuse there, but I think the benefits 
> > > > > > > > > outweigh
> > > > > > > > > the risks.
> > > > > > > >
> > > > > > > > This interface is proposed as first class citizen in the 
> > > > > > > > general sysfs
> > > > > > > > layout. Of course, it will be seen as a way to bypass the 
> > > > > > > > kernel.
> > > > > > > >
> > > > > > > > At least, put it under CONFIG_EXPERT option, so no distro will 
> > > > > > > > enable it
> > > > > > > > by default.
> > > > > > >
> > > > > > > Of course we're proposing it to be accessible, it should also 
> > > > > > > require
> > > > > > > admin privileges to modify, sysfs has lots of such things.  If 
> > > > > > > it's
> > > > > > > relegated to non-default accessibility, it won't be used for 
> > > > > > > testing
> > > > > > > and it won't be available for system policy and it's pointless.
> > > > > >
> > > > > > We probably have difference in view of what testing is. I expect 
> > > > > > from
> > > > > > the users who experience issues with reset to do extra steps and 
> > > > > > one of
> > > > > > them is to require from them to compile their kernel.
> > > > >
> > > > > I would define the ability to generate a CI test that can pick a
> > > > > device, unbind it from its driver, and iterate reset methods as a
> > > > > worthwhile improvement in testing.
> > > >
> > > > Who is going to run this CI? At least all kernel CIs (external and
> > > > internal to HW vendors) that I'm familiar are building kernel 
> > > > themselves.
> > > >
> > > > Distro kernel is too bloat to be really usable for CI.
> > >
> > > At this point I'm suspicious you're trolling.  A distro kernel CI
> > > certainly uses the kernel they intend to ship and support in their
> > > environment. You're concerned about a bloated kernel, but the proposal
> > > here adds 2-bytes per device to track reset methods and a trivial array
> > > in text memory, meanwhile you're proposing multiple per-device memory
> > > allocations to enhance the feature you think is too bloated for CI.
> >
> > I don't know why you decided to focus on memory footprint which is not
> > important at all during CI runs. The bloat is 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-25 Thread Amey Narkhede
On 21/03/25 06:09PM, Leon Romanovsky wrote:
> On Thu, Mar 25, 2021 at 08:55:04AM -0600, Alex Williamson wrote:
> > On Thu, 25 Mar 2021 10:37:54 +0200
> > Leon Romanovsky  wrote:
> >
> > > On Wed, Mar 24, 2021 at 11:17:29AM -0600, Alex Williamson wrote:
> > > > On Wed, 24 Mar 2021 17:13:56 +0200
> > > > Leon Romanovsky  wrote:
> > >
> > > <...>
> > >
> > > > > Yes, and real testing/debugging almost always requires kernel rebuild.
> > > > > Everything else is waste of time.
> > > >
> > > > Sorry, this is nonsense.  Allowing users to debug issues without a full
> > > > kernel rebuild is a good thing.
> > >
> > > It is far from debug, this interface doesn't give you any answers why
> > > the reset didn't work, it just helps you to find the one that works.
> > >
> > > Unless you believe that this information will be enough to understand
> > > the root cause, you will need to ask from the user to perform extra
> > > tests, maybe try some quirk. All of that requires from the users to
> > > rebuild their kernel.
> > >
> > > So no, it is not debug.
> >
> > It allows a user to experiment to determine (a) my device doesn't work
> > in a given scenario with the default configuration, but (b) if I change
> > the reset to this other thing it does work.  That is a step in
> > debugging.
> >
> > It's absurd to think that a sysfs attribute could provide root cause,
> > but it might be enough for someone to further help that user.  It would
> > be a useful clue for a bug report.  Yes, reaching root cause might
> > involve building a kernel, but that doesn't invalidate that having a
> > step towards debugging in the base kernel might be a useful tool.
>
> Let's agree to do not agree.
>
> >
> > > > > > > > For policy preference, I already described how I've configured 
> > > > > > > > QEMU to
> > > > > > > > prefer a bus reset rather than a PM reset due to lack of 
> > > > > > > > specification
> > > > > > > > regarding the scope of a PM "soft reset".  This interface would 
> > > > > > > > allow a
> > > > > > > > system policy to do that same thing.
> > > > > > > >
> > > > > > > > I don't think anyone is suggesting this as a means to avoid 
> > > > > > > > quirks that
> > > > > > > > would resolve reset issues and create the best default general 
> > > > > > > > behavior.
> > > > > > > > This provides a mechanism to test various reset methods, and 
> > > > > > > > thereby
> > > > > > > > identify broken methods, and set a policy.  Sure, that policy 
> > > > > > > > might be
> > > > > > > > to avoid a broken reset in the interim before it gets quirked 
> > > > > > > > and
> > > > > > > > there's potential for abuse there, but I think the benefits 
> > > > > > > > outweigh
> > > > > > > > the risks.
> > > > > > >
> > > > > > > This interface is proposed as first class citizen in the general 
> > > > > > > sysfs
> > > > > > > layout. Of course, it will be seen as a way to bypass the kernel.
> > > > > > >
> > > > > > > At least, put it under CONFIG_EXPERT option, so no distro will 
> > > > > > > enable it
> > > > > > > by default.
> > > > > >
> > > > > > Of course we're proposing it to be accessible, it should also 
> > > > > > require
> > > > > > admin privileges to modify, sysfs has lots of such things.  If it's
> > > > > > relegated to non-default accessibility, it won't be used for testing
> > > > > > and it won't be available for system policy and it's pointless.
> > > > >
> > > > > We probably have difference in view of what testing is. I expect from
> > > > > the users who experience issues with reset to do extra steps and one 
> > > > > of
> > > > > them is to require from them to compile their kernel.
> > > >
> > > > I would define the ability to generate a CI test that can pick a
> > > > device, unbind it from its driver, and iterate reset methods as a
> > > > worthwhile improvement in testing.
> > >
> > > Who is going to run this CI? At least all kernel CIs (external and
> > > internal to HW vendors) that I'm familiar are building kernel themselves.
> > >
> > > Distro kernel is too bloat to be really usable for CI.
> >
> > At this point I'm suspicious you're trolling.  A distro kernel CI
> > certainly uses the kernel they intend to ship and support in their
> > environment. You're concerned about a bloated kernel, but the proposal
> > here adds 2-bytes per device to track reset methods and a trivial array
> > in text memory, meanwhile you're proposing multiple per-device memory
> > allocations to enhance the feature you think is too bloated for CI.
>
> I don't know why you decided to focus on memory footprint which is not
> important at all during CI runs. The bloat is in Kconfig options that
> are not needed. Those extra options add significant overhead during
> builds and runs itself.
>
> And not, I'm not trolling, but representing HW vendor that pushes its CI
> and developers environment to the limit, by running full kernel builds with
> less than 30 seconds and boot-to-test with less than 6 seconds for full
> 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-25 Thread Leon Romanovsky
On Thu, Mar 25, 2021 at 09:56:37PM +0530, Amey Narkhede wrote:
> On 21/03/25 10:37AM, Leon Romanovsky wrote:
> > On Wed, Mar 24, 2021 at 11:17:29AM -0600, Alex Williamson wrote:
> > > On Wed, 24 Mar 2021 17:13:56 +0200
> > > Leon Romanovsky  wrote:

<...>

> > I expect that QEMU sets same reset policy for all devices at the same
> > time instead of trying per-device to guess which one works.
> >
> The current reset attribute does the same thing internally you described
> at the end.
> > And yes, you will be able to bypass kernel, but at least this interface
> > will be broader than initial one that serves only SO and workarounds.
> >
> What does it mean by "bypassing" kernel?
> I don't see any problem with SO and workaround if that is the only
> way an user can use their device. Why are you expecting every vendor to
> develop quirk? Also I don't see any point of using linked list to
> unnecessarily complicate a simple thing.

Please reread our conversation with Alex, it has answers to both of your
questions.

Thanks

> 
> Thanks,
> Amey


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-25 Thread Amey Narkhede
On 21/03/25 10:37AM, Leon Romanovsky wrote:
> On Wed, Mar 24, 2021 at 11:17:29AM -0600, Alex Williamson wrote:
> > On Wed, 24 Mar 2021 17:13:56 +0200
> > Leon Romanovsky  wrote:
>
> <...>
>
> > > Yes, and real testing/debugging almost always requires kernel rebuild.
> > > Everything else is waste of time.
> >
> > Sorry, this is nonsense.  Allowing users to debug issues without a full
> > kernel rebuild is a good thing.
>
> It is far from debug, this interface doesn't give you any answers why
> the reset didn't work, it just helps you to find the one that works.
>
> Unless you believe that this information will be enough to understand
> the root cause, you will need to ask from the user to perform extra
> tests, maybe try some quirk. All of that requires from the users to
> rebuild their kernel.
>
> So no, it is not debug.
>
> >
> > > > > > For policy preference, I already described how I've configured QEMU 
> > > > > > to
> > > > > > prefer a bus reset rather than a PM reset due to lack of 
> > > > > > specification
> > > > > > regarding the scope of a PM "soft reset".  This interface would 
> > > > > > allow a
> > > > > > system policy to do that same thing.
> > > > > >
> > > > > > I don't think anyone is suggesting this as a means to avoid quirks 
> > > > > > that
> > > > > > would resolve reset issues and create the best default general 
> > > > > > behavior.
> > > > > > This provides a mechanism to test various reset methods, and thereby
> > > > > > identify broken methods, and set a policy.  Sure, that policy might 
> > > > > > be
> > > > > > to avoid a broken reset in the interim before it gets quirked and
> > > > > > there's potential for abuse there, but I think the benefits outweigh
> > > > > > the risks.
> > > > >
> > > > > This interface is proposed as first class citizen in the general sysfs
> > > > > layout. Of course, it will be seen as a way to bypass the kernel.
> > > > >
> > > > > At least, put it under CONFIG_EXPERT option, so no distro will enable 
> > > > > it
> > > > > by default.
> > > >
> > > > Of course we're proposing it to be accessible, it should also require
> > > > admin privileges to modify, sysfs has lots of such things.  If it's
> > > > relegated to non-default accessibility, it won't be used for testing
> > > > and it won't be available for system policy and it's pointless.
> > >
> > > We probably have difference in view of what testing is. I expect from
> > > the users who experience issues with reset to do extra steps and one of
> > > them is to require from them to compile their kernel.
> >
> > I would define the ability to generate a CI test that can pick a
> > device, unbind it from its driver, and iterate reset methods as a
> > worthwhile improvement in testing.
>
> Who is going to run this CI? At least all kernel CIs (external and
> internal to HW vendors) that I'm familiar are building kernel themselves.
>
> Distro kernel is too bloat to be really usable for CI.
>
> >
> > > The root permissions doesn't protect from anything, SO lovers will use
> > > root without even thinking twice.
> >
> > Yes, with great power comes great responsibility.  Many admins ignore
> > this.  That's far beyond the scope of this series.
>
> <...>
>
> > > I'm trying to help you with your use case of providing reset policy
> > > mechanism, which can be without CONFIG_EXPERT. However if you want
> > > to continue path of having specific reset type only, please ensure
> > > that this is not taken to the "bypass kernel" direction.
> >
> > You've lost me, are you saying you'd be in favor of an interface that
> > allows an admin to specify an arbitrary list of reset methods because
> > that's somehow more in line with a policy choice than a userspace
> > workaround?  This seems like unnecessary bloat because (a) it allows
> > the same bypass mechanism, and (b) a given device is only going to use
> > a single method anyway, so the functionality is unnecessary.  Please
> > help me understand how this favors the policy use case.  Thanks,
>
> The policy decision is global logic that is easier to grasp. At some
> point of our discussion, you presented the case where PM reset is not
> defined well and you prefer to do bus reset (something like that).
>
> I expect that QEMU sets same reset policy for all devices at the same
> time instead of trying per-device to guess which one works.
>
The current reset attribute does the same thing internally you described
at the end.
> And yes, you will be able to bypass kernel, but at least this interface
> will be broader than initial one that serves only SO and workarounds.
>
What does it mean by "bypassing" kernel?
I don't see any problem with SO and workaround if that is the only
way an user can use their device. Why are you expecting every vendor to
develop quirk? Also I don't see any point of using linked list to
unnecessarily complicate a simple thing.

Thanks,
Amey


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-25 Thread Leon Romanovsky
On Thu, Mar 25, 2021 at 08:55:04AM -0600, Alex Williamson wrote:
> On Thu, 25 Mar 2021 10:37:54 +0200
> Leon Romanovsky  wrote:
> 
> > On Wed, Mar 24, 2021 at 11:17:29AM -0600, Alex Williamson wrote:
> > > On Wed, 24 Mar 2021 17:13:56 +0200
> > > Leon Romanovsky  wrote:  
> > 
> > <...>
> > 
> > > > Yes, and real testing/debugging almost always requires kernel rebuild.
> > > > Everything else is waste of time.  
> > > 
> > > Sorry, this is nonsense.  Allowing users to debug issues without a full
> > > kernel rebuild is a good thing.  
> > 
> > It is far from debug, this interface doesn't give you any answers why
> > the reset didn't work, it just helps you to find the one that works.
> > 
> > Unless you believe that this information will be enough to understand
> > the root cause, you will need to ask from the user to perform extra
> > tests, maybe try some quirk. All of that requires from the users to
> > rebuild their kernel.
> > 
> > So no, it is not debug.
> 
> It allows a user to experiment to determine (a) my device doesn't work
> in a given scenario with the default configuration, but (b) if I change
> the reset to this other thing it does work.  That is a step in
> debugging.
> 
> It's absurd to think that a sysfs attribute could provide root cause,
> but it might be enough for someone to further help that user.  It would
> be a useful clue for a bug report.  Yes, reaching root cause might
> involve building a kernel, but that doesn't invalidate that having a
> step towards debugging in the base kernel might be a useful tool.

Let's agree to do not agree.

> 
> > > > > > > For policy preference, I already described how I've configured 
> > > > > > > QEMU to
> > > > > > > prefer a bus reset rather than a PM reset due to lack of 
> > > > > > > specification
> > > > > > > regarding the scope of a PM "soft reset".  This interface would 
> > > > > > > allow a
> > > > > > > system policy to do that same thing.
> > > > > > > 
> > > > > > > I don't think anyone is suggesting this as a means to avoid 
> > > > > > > quirks that
> > > > > > > would resolve reset issues and create the best default general 
> > > > > > > behavior.
> > > > > > > This provides a mechanism to test various reset methods, and 
> > > > > > > thereby
> > > > > > > identify broken methods, and set a policy.  Sure, that policy 
> > > > > > > might be
> > > > > > > to avoid a broken reset in the interim before it gets quirked and
> > > > > > > there's potential for abuse there, but I think the benefits 
> > > > > > > outweigh
> > > > > > > the risks.  
> > > > > > 
> > > > > > This interface is proposed as first class citizen in the general 
> > > > > > sysfs
> > > > > > layout. Of course, it will be seen as a way to bypass the kernel.
> > > > > > 
> > > > > > At least, put it under CONFIG_EXPERT option, so no distro will 
> > > > > > enable it
> > > > > > by default.
> > > > > 
> > > > > Of course we're proposing it to be accessible, it should also require
> > > > > admin privileges to modify, sysfs has lots of such things.  If it's
> > > > > relegated to non-default accessibility, it won't be used for testing
> > > > > and it won't be available for system policy and it's pointless.
> > > > 
> > > > We probably have difference in view of what testing is. I expect from
> > > > the users who experience issues with reset to do extra steps and one of
> > > > them is to require from them to compile their kernel.  
> > > 
> > > I would define the ability to generate a CI test that can pick a
> > > device, unbind it from its driver, and iterate reset methods as a
> > > worthwhile improvement in testing.  
> > 
> > Who is going to run this CI? At least all kernel CIs (external and
> > internal to HW vendors) that I'm familiar are building kernel themselves.
> > 
> > Distro kernel is too bloat to be really usable for CI.
> 
> At this point I'm suspicious you're trolling.  A distro kernel CI
> certainly uses the kernel they intend to ship and support in their
> environment. You're concerned about a bloated kernel, but the proposal
> here adds 2-bytes per device to track reset methods and a trivial array
> in text memory, meanwhile you're proposing multiple per-device memory
> allocations to enhance the feature you think is too bloated for CI.

I don't know why you decided to focus on memory footprint which is not
important at all during CI runs. The bloat is in Kconfig options that
are not needed. Those extra options add significant overhead during
builds and runs itself.

And not, I'm not trolling, but representing HW vendor that pushes its CI
and developers environment to the limit, by running full kernel builds with
less than 30 seconds and boot-to-test with less than 6 seconds for full
Fedora VM.

> 
> > > > The root permissions doesn't protect from anything, SO lovers will use
> > > > root without even thinking twice.  
> > > 
> > > Yes, with great power comes great responsibility.  Many admins ignore
> > > this.  That's far 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-25 Thread Alex Williamson
On Thu, 25 Mar 2021 10:37:54 +0200
Leon Romanovsky  wrote:

> On Wed, Mar 24, 2021 at 11:17:29AM -0600, Alex Williamson wrote:
> > On Wed, 24 Mar 2021 17:13:56 +0200
> > Leon Romanovsky  wrote:  
> 
> <...>
> 
> > > Yes, and real testing/debugging almost always requires kernel rebuild.
> > > Everything else is waste of time.  
> > 
> > Sorry, this is nonsense.  Allowing users to debug issues without a full
> > kernel rebuild is a good thing.  
> 
> It is far from debug, this interface doesn't give you any answers why
> the reset didn't work, it just helps you to find the one that works.
> 
> Unless you believe that this information will be enough to understand
> the root cause, you will need to ask from the user to perform extra
> tests, maybe try some quirk. All of that requires from the users to
> rebuild their kernel.
> 
> So no, it is not debug.

It allows a user to experiment to determine (a) my device doesn't work
in a given scenario with the default configuration, but (b) if I change
the reset to this other thing it does work.  That is a step in
debugging.

It's absurd to think that a sysfs attribute could provide root cause,
but it might be enough for someone to further help that user.  It would
be a useful clue for a bug report.  Yes, reaching root cause might
involve building a kernel, but that doesn't invalidate that having a
step towards debugging in the base kernel might be a useful tool.

> > > > > > For policy preference, I already described how I've configured QEMU 
> > > > > > to
> > > > > > prefer a bus reset rather than a PM reset due to lack of 
> > > > > > specification
> > > > > > regarding the scope of a PM "soft reset".  This interface would 
> > > > > > allow a
> > > > > > system policy to do that same thing.
> > > > > > 
> > > > > > I don't think anyone is suggesting this as a means to avoid quirks 
> > > > > > that
> > > > > > would resolve reset issues and create the best default general 
> > > > > > behavior.
> > > > > > This provides a mechanism to test various reset methods, and thereby
> > > > > > identify broken methods, and set a policy.  Sure, that policy might 
> > > > > > be
> > > > > > to avoid a broken reset in the interim before it gets quirked and
> > > > > > there's potential for abuse there, but I think the benefits outweigh
> > > > > > the risks.  
> > > > > 
> > > > > This interface is proposed as first class citizen in the general sysfs
> > > > > layout. Of course, it will be seen as a way to bypass the kernel.
> > > > > 
> > > > > At least, put it under CONFIG_EXPERT option, so no distro will enable 
> > > > > it
> > > > > by default.
> > > > 
> > > > Of course we're proposing it to be accessible, it should also require
> > > > admin privileges to modify, sysfs has lots of such things.  If it's
> > > > relegated to non-default accessibility, it won't be used for testing
> > > > and it won't be available for system policy and it's pointless.
> > > 
> > > We probably have difference in view of what testing is. I expect from
> > > the users who experience issues with reset to do extra steps and one of
> > > them is to require from them to compile their kernel.  
> > 
> > I would define the ability to generate a CI test that can pick a
> > device, unbind it from its driver, and iterate reset methods as a
> > worthwhile improvement in testing.  
> 
> Who is going to run this CI? At least all kernel CIs (external and
> internal to HW vendors) that I'm familiar are building kernel themselves.
> 
> Distro kernel is too bloat to be really usable for CI.

At this point I'm suspicious you're trolling.  A distro kernel CI
certainly uses the kernel they intend to ship and support in their
environment. You're concerned about a bloated kernel, but the proposal
here adds 2-bytes per device to track reset methods and a trivial array
in text memory, meanwhile you're proposing multiple per-device memory
allocations to enhance the feature you think is too bloated for CI.

> > > The root permissions doesn't protect from anything, SO lovers will use
> > > root without even thinking twice.  
> > 
> > Yes, with great power comes great responsibility.  Many admins ignore
> > this.  That's far beyond the scope of this series.  
> 
> <...>
> 
> > > I'm trying to help you with your use case of providing reset policy
> > > mechanism, which can be without CONFIG_EXPERT. However if you want
> > > to continue path of having specific reset type only, please ensure
> > > that this is not taken to the "bypass kernel" direction.  
> > 
> > You've lost me, are you saying you'd be in favor of an interface that
> > allows an admin to specify an arbitrary list of reset methods because
> > that's somehow more in line with a policy choice than a userspace
> > workaround?  This seems like unnecessary bloat because (a) it allows
> > the same bypass mechanism, and (b) a given device is only going to use
> > a single method anyway, so the functionality is unnecessary.  Please
> > help me 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-25 Thread Leon Romanovsky
On Wed, Mar 24, 2021 at 11:17:29AM -0600, Alex Williamson wrote:
> On Wed, 24 Mar 2021 17:13:56 +0200
> Leon Romanovsky  wrote:

<...>

> > Yes, and real testing/debugging almost always requires kernel rebuild.
> > Everything else is waste of time.
> 
> Sorry, this is nonsense.  Allowing users to debug issues without a full
> kernel rebuild is a good thing.

It is far from debug, this interface doesn't give you any answers why
the reset didn't work, it just helps you to find the one that works.

Unless you believe that this information will be enough to understand
the root cause, you will need to ask from the user to perform extra
tests, maybe try some quirk. All of that requires from the users to
rebuild their kernel.

So no, it is not debug.

> 
> > > > > For policy preference, I already described how I've configured QEMU to
> > > > > prefer a bus reset rather than a PM reset due to lack of specification
> > > > > regarding the scope of a PM "soft reset".  This interface would allow 
> > > > > a
> > > > > system policy to do that same thing.
> > > > > 
> > > > > I don't think anyone is suggesting this as a means to avoid quirks 
> > > > > that
> > > > > would resolve reset issues and create the best default general 
> > > > > behavior.
> > > > > This provides a mechanism to test various reset methods, and thereby
> > > > > identify broken methods, and set a policy.  Sure, that policy might be
> > > > > to avoid a broken reset in the interim before it gets quirked and
> > > > > there's potential for abuse there, but I think the benefits outweigh
> > > > > the risks.
> > > > 
> > > > This interface is proposed as first class citizen in the general sysfs
> > > > layout. Of course, it will be seen as a way to bypass the kernel.
> > > > 
> > > > At least, put it under CONFIG_EXPERT option, so no distro will enable it
> > > > by default.  
> > > 
> > > Of course we're proposing it to be accessible, it should also require
> > > admin privileges to modify, sysfs has lots of such things.  If it's
> > > relegated to non-default accessibility, it won't be used for testing
> > > and it won't be available for system policy and it's pointless.  
> > 
> > We probably have difference in view of what testing is. I expect from
> > the users who experience issues with reset to do extra steps and one of
> > them is to require from them to compile their kernel.
> 
> I would define the ability to generate a CI test that can pick a
> device, unbind it from its driver, and iterate reset methods as a
> worthwhile improvement in testing.

Who is going to run this CI? At least all kernel CIs (external and
internal to HW vendors) that I'm familiar are building kernel themselves.

Distro kernel is too bloat to be really usable for CI.

> 
> > The root permissions doesn't protect from anything, SO lovers will use
> > root without even thinking twice.
> 
> Yes, with great power comes great responsibility.  Many admins ignore
> this.  That's far beyond the scope of this series.

<...>

> > I'm trying to help you with your use case of providing reset policy
> > mechanism, which can be without CONFIG_EXPERT. However if you want
> > to continue path of having specific reset type only, please ensure
> > that this is not taken to the "bypass kernel" direction.
> 
> You've lost me, are you saying you'd be in favor of an interface that
> allows an admin to specify an arbitrary list of reset methods because
> that's somehow more in line with a policy choice than a userspace
> workaround?  This seems like unnecessary bloat because (a) it allows
> the same bypass mechanism, and (b) a given device is only going to use
> a single method anyway, so the functionality is unnecessary.  Please
> help me understand how this favors the policy use case.  Thanks,

The policy decision is global logic that is easier to grasp. At some
point of our discussion, you presented the case where PM reset is not
defined well and you prefer to do bus reset (something like that).

I expect that QEMU sets same reset policy for all devices at the same
time instead of trying per-device to guess which one works.

And yes, you will be able to bypass kernel, but at least this interface
will be broader than initial one that serves only SO and workarounds.

Thanks

> 
> Alex
> 


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-24 Thread Alex Williamson
On Wed, 24 Mar 2021 17:13:56 +0200
Leon Romanovsky  wrote:

> On Wed, Mar 24, 2021 at 08:37:43AM -0600, Alex Williamson wrote:
> > On Wed, 24 Mar 2021 12:03:00 +0200
> > Leon Romanovsky  wrote:
> >   
> > > On Mon, Mar 22, 2021 at 11:10:03AM -0600, Alex Williamson wrote:  
> > > > On Sun, 21 Mar 2021 10:40:55 +0200
> > > > Leon Romanovsky  wrote:
> > > > 
> > > > > On Sat, Mar 20, 2021 at 08:59:42AM -0600, Alex Williamson wrote:
> > > > > > On Sat, 20 Mar 2021 11:10:08 +0200
> > > > > > Leon Romanovsky  wrote:  
> > > > > > > On Fri, Mar 19, 2021 at 10:23:13AM -0600, Alex Williamson wrote:  
> > > > > > >  
> > > > > > > > 
> > > > > > > > What if we taint the kernel or pci_warn() for cases where 
> > > > > > > > either all
> > > > > > > > the reset methods are disabled, ie. 'echo none > reset_method', 
> > > > > > > > or any
> > > > > > > > time a device specific method is disabled?
> > > > > > > 
> > > > > > > What does it mean "none"? Does it mean nothing supported? If yes, 
> > > > > > > I think that
> > > > > > > pci_warn() will be enough. At least for me, taint is usable 
> > > > > > > during debug stages,
> > > > > > > probably if device doesn't crash no one will look to see 
> > > > > > > /proc/sys/kernel/tainted.  
> > > > > > 
> > > > > > "none" as implemented in this patch, clearing the enabled function
> > > > > > reset methods.  
> > > > > 
> > > > > It is far from intuitive, the empty string will be easier to 
> > > > > understand,
> > > > > because "none" means no reset at all.
> > > > 
> > > > "No reset at all" is what "none" achieves, the
> > > > pci_dev.reset_methods_enabled bitmap is cleared.  We can use an empty
> > > > string, but I think we want a way to clear all enabled resets and a way
> > > > to return it to the default.  I could see arguments for an empty string
> > > > serving either purpose, so this version proposed explicitly using
> > > > "none" and "default", as included in the ABI update.
> > > 
> > > I will stick with "default" only and leave "none" for something else.  
> > 
> > Are you suggesting writing "default" restores the unmodified behavior
> > and writing an empty string clears all enabled reset methods?
> >
> > > > > > > > I'd almost go so far as to prevent disabling a device specific 
> > > > > > > > reset
> > > > > > > > altogether, but for example should a device specific reset that 
> > > > > > > > fixes
> > > > > > > > an aspect of FLR behavior prevent using a bus reset?  I'd 
> > > > > > > > prefer in that
> > > > > > > > case if direct FLR were disabled via a device flag introduced 
> > > > > > > > with the
> > > > > > > > quirk and the remaining resets can still be selected by 
> > > > > > > > preference.
> > > > > > > 
> > > > > > > I don't know enough to discuss the PCI details, but you raised 
> > > > > > > good point.
> > > > > > > This sysfs is user visible API that is presented as is from 
> > > > > > > device point
> > > > > > > of view. It can be easily run into problems if PCI/core doesn't 
> > > > > > > work with
> > > > > > > user's choice.
> > > > > > >   
> > > > > > > > 
> > > > > > > > Theoretically all the other reset methods work and are 
> > > > > > > > available, it's
> > > > > > > > only a policy decision which to use, right?
> > > > > > > 
> > > > > > > But this patch was presented as a way to overcome situations where
> > > > > > > supported != working and user magically knows which reset type to 
> > > > > > > set.  
> > > > > > 
> > > > > > It's not magic, the new sysfs attributes expose which resets are
> > > > > > enabled and the order that they're used, the user can simply select 
> > > > > > the
> > > > > > next one.  Being able to bypass a broken reset method is a helpful 
> > > > > > side
> > > > > > effect of getting to select a preferred reset method.  
> > > > > 
> > > > > Magic in a sense that user has no idea what those resets mean, the
> > > > > expectation is that he will blindly iterate till something works.
> > > > 
> > > > Which ought to actually be a safe thing to do.  We should have quirks to
> > > > exclude resets that are known broken but still probe as present and I'd
> > > > be perfectly fine if we issue a warning if the user disables all resets
> > > > for a given device.
> > > >  
> > > > > > > If you want to take this patch to be policy decision tool,
> > > > > > > it will need to accept "reset_type1,reset_type2,..." sort of 
> > > > > > > input,
> > > > > > > so fallback will work natively.  
> > > > > > 
> > > > > > I don't see that as a requirement.  We have fall-through support in 
> > > > > > the
> > > > > > kernel, but for a given device we're really only ever going to make 
> > > > > > use
> > > > > > of one of those methods.  If a user knows enough about a device to 
> > > > > > have
> > > > > > a preference, I think it can be singular.  That also significantly
> > > > > > simplifies the interface and supporting 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-24 Thread Leon Romanovsky
On Wed, Mar 24, 2021 at 08:37:43AM -0600, Alex Williamson wrote:
> On Wed, 24 Mar 2021 12:03:00 +0200
> Leon Romanovsky  wrote:
> 
> > On Mon, Mar 22, 2021 at 11:10:03AM -0600, Alex Williamson wrote:
> > > On Sun, 21 Mar 2021 10:40:55 +0200
> > > Leon Romanovsky  wrote:
> > >   
> > > > On Sat, Mar 20, 2021 at 08:59:42AM -0600, Alex Williamson wrote:  
> > > > > On Sat, 20 Mar 2021 11:10:08 +0200
> > > > > Leon Romanovsky  wrote:
> > > > > > On Fri, Mar 19, 2021 at 10:23:13AM -0600, Alex Williamson wrote:
> > > > > >  
> > > > > > > 
> > > > > > > What if we taint the kernel or pci_warn() for cases where either 
> > > > > > > all
> > > > > > > the reset methods are disabled, ie. 'echo none > reset_method', 
> > > > > > > or any
> > > > > > > time a device specific method is disabled?  
> > > > > > 
> > > > > > What does it mean "none"? Does it mean nothing supported? If yes, I 
> > > > > > think that
> > > > > > pci_warn() will be enough. At least for me, taint is usable during 
> > > > > > debug stages,
> > > > > > probably if device doesn't crash no one will look to see 
> > > > > > /proc/sys/kernel/tainted.
> > > > > 
> > > > > "none" as implemented in this patch, clearing the enabled function
> > > > > reset methods.
> > > > 
> > > > It is far from intuitive, the empty string will be easier to understand,
> > > > because "none" means no reset at all.  
> > > 
> > > "No reset at all" is what "none" achieves, the
> > > pci_dev.reset_methods_enabled bitmap is cleared.  We can use an empty
> > > string, but I think we want a way to clear all enabled resets and a way
> > > to return it to the default.  I could see arguments for an empty string
> > > serving either purpose, so this version proposed explicitly using
> > > "none" and "default", as included in the ABI update.  
> > 
> > I will stick with "default" only and leave "none" for something else.
> 
> Are you suggesting writing "default" restores the unmodified behavior
> and writing an empty string clears all enabled reset methods?
>  
> > > > > > > I'd almost go so far as to prevent disabling a device specific 
> > > > > > > reset
> > > > > > > altogether, but for example should a device specific reset that 
> > > > > > > fixes
> > > > > > > an aspect of FLR behavior prevent using a bus reset?  I'd prefer 
> > > > > > > in that
> > > > > > > case if direct FLR were disabled via a device flag introduced 
> > > > > > > with the
> > > > > > > quirk and the remaining resets can still be selected by 
> > > > > > > preference.  
> > > > > > 
> > > > > > I don't know enough to discuss the PCI details, but you raised good 
> > > > > > point.
> > > > > > This sysfs is user visible API that is presented as is from device 
> > > > > > point
> > > > > > of view. It can be easily run into problems if PCI/core doesn't 
> > > > > > work with
> > > > > > user's choice.
> > > > > > 
> > > > > > > 
> > > > > > > Theoretically all the other reset methods work and are available, 
> > > > > > > it's
> > > > > > > only a policy decision which to use, right?  
> > > > > > 
> > > > > > But this patch was presented as a way to overcome situations where
> > > > > > supported != working and user magically knows which reset type to 
> > > > > > set.
> > > > > 
> > > > > It's not magic, the new sysfs attributes expose which resets are
> > > > > enabled and the order that they're used, the user can simply select 
> > > > > the
> > > > > next one.  Being able to bypass a broken reset method is a helpful 
> > > > > side
> > > > > effect of getting to select a preferred reset method.
> > > > 
> > > > Magic in a sense that user has no idea what those resets mean, the
> > > > expectation is that he will blindly iterate till something works.  
> > > 
> > > Which ought to actually be a safe thing to do.  We should have quirks to
> > > exclude resets that are known broken but still probe as present and I'd
> > > be perfectly fine if we issue a warning if the user disables all resets
> > > for a given device.
> > >
> > > > > > If you want to take this patch to be policy decision tool,
> > > > > > it will need to accept "reset_type1,reset_type2,..." sort of input,
> > > > > > so fallback will work natively.
> > > > > 
> > > > > I don't see that as a requirement.  We have fall-through support in 
> > > > > the
> > > > > kernel, but for a given device we're really only ever going to make 
> > > > > use
> > > > > of one of those methods.  If a user knows enough about a device to 
> > > > > have
> > > > > a preference, I think it can be singular.  That also significantly
> > > > > simplifies the interface and supporting code.  Thanks,
> > > > 
> > > > I'm struggling to get requirements from this thread. You talked about
> > > > policy decision to overtake fallback mechanism, Amey wanted to avoid
> > > > quirks.
> > > > 
> > > > Do you have an example of such devices or we are talking about
> > > > theoretical case?  
> > > 
> > > Look at any 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-24 Thread Alex Williamson
On Wed, 24 Mar 2021 12:03:00 +0200
Leon Romanovsky  wrote:

> On Mon, Mar 22, 2021 at 11:10:03AM -0600, Alex Williamson wrote:
> > On Sun, 21 Mar 2021 10:40:55 +0200
> > Leon Romanovsky  wrote:
> >   
> > > On Sat, Mar 20, 2021 at 08:59:42AM -0600, Alex Williamson wrote:  
> > > > On Sat, 20 Mar 2021 11:10:08 +0200
> > > > Leon Romanovsky  wrote:
> > > > > On Fri, Mar 19, 2021 at 10:23:13AM -0600, Alex Williamson wrote: 
> > > > > > 
> > > > > > What if we taint the kernel or pci_warn() for cases where either all
> > > > > > the reset methods are disabled, ie. 'echo none > reset_method', or 
> > > > > > any
> > > > > > time a device specific method is disabled?  
> > > > > 
> > > > > What does it mean "none"? Does it mean nothing supported? If yes, I 
> > > > > think that
> > > > > pci_warn() will be enough. At least for me, taint is usable during 
> > > > > debug stages,
> > > > > probably if device doesn't crash no one will look to see 
> > > > > /proc/sys/kernel/tainted.
> > > > 
> > > > "none" as implemented in this patch, clearing the enabled function
> > > > reset methods.
> > > 
> > > It is far from intuitive, the empty string will be easier to understand,
> > > because "none" means no reset at all.  
> > 
> > "No reset at all" is what "none" achieves, the
> > pci_dev.reset_methods_enabled bitmap is cleared.  We can use an empty
> > string, but I think we want a way to clear all enabled resets and a way
> > to return it to the default.  I could see arguments for an empty string
> > serving either purpose, so this version proposed explicitly using
> > "none" and "default", as included in the ABI update.  
> 
> I will stick with "default" only and leave "none" for something else.

Are you suggesting writing "default" restores the unmodified behavior
and writing an empty string clears all enabled reset methods?
 
> > > > > > I'd almost go so far as to prevent disabling a device specific reset
> > > > > > altogether, but for example should a device specific reset that 
> > > > > > fixes
> > > > > > an aspect of FLR behavior prevent using a bus reset?  I'd prefer in 
> > > > > > that
> > > > > > case if direct FLR were disabled via a device flag introduced with 
> > > > > > the
> > > > > > quirk and the remaining resets can still be selected by preference. 
> > > > > >  
> > > > > 
> > > > > I don't know enough to discuss the PCI details, but you raised good 
> > > > > point.
> > > > > This sysfs is user visible API that is presented as is from device 
> > > > > point
> > > > > of view. It can be easily run into problems if PCI/core doesn't work 
> > > > > with
> > > > > user's choice.
> > > > > 
> > > > > > 
> > > > > > Theoretically all the other reset methods work and are available, 
> > > > > > it's
> > > > > > only a policy decision which to use, right?  
> > > > > 
> > > > > But this patch was presented as a way to overcome situations where
> > > > > supported != working and user magically knows which reset type to 
> > > > > set.
> > > > 
> > > > It's not magic, the new sysfs attributes expose which resets are
> > > > enabled and the order that they're used, the user can simply select the
> > > > next one.  Being able to bypass a broken reset method is a helpful side
> > > > effect of getting to select a preferred reset method.
> > > 
> > > Magic in a sense that user has no idea what those resets mean, the
> > > expectation is that he will blindly iterate till something works.  
> > 
> > Which ought to actually be a safe thing to do.  We should have quirks to
> > exclude resets that are known broken but still probe as present and I'd
> > be perfectly fine if we issue a warning if the user disables all resets
> > for a given device.
> >
> > > > > If you want to take this patch to be policy decision tool,
> > > > > it will need to accept "reset_type1,reset_type2,..." sort of input,
> > > > > so fallback will work natively.
> > > > 
> > > > I don't see that as a requirement.  We have fall-through support in the
> > > > kernel, but for a given device we're really only ever going to make use
> > > > of one of those methods.  If a user knows enough about a device to have
> > > > a preference, I think it can be singular.  That also significantly
> > > > simplifies the interface and supporting code.  Thanks,
> > > 
> > > I'm struggling to get requirements from this thread. You talked about
> > > policy decision to overtake fallback mechanism, Amey wanted to avoid
> > > quirks.
> > > 
> > > Do you have an example of such devices or we are talking about
> > > theoretical case?  
> > 
> > Look at any device that already has a reset quirk and the process it
> > took to get there.  Those are more than just theoretical cases.  
> 
> So let's fix the process. The long standing kernel policy is that kernel
> bugs (and missing quirk can be seen as such bug) should be fixed in the
> kernel and not workaround by the users.

I don't see an actual proposal here to 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-24 Thread Leon Romanovsky
On Mon, Mar 22, 2021 at 11:10:03AM -0600, Alex Williamson wrote:
> On Sun, 21 Mar 2021 10:40:55 +0200
> Leon Romanovsky  wrote:
> 
> > On Sat, Mar 20, 2021 at 08:59:42AM -0600, Alex Williamson wrote:
> > > On Sat, 20 Mar 2021 11:10:08 +0200
> > > Leon Romanovsky  wrote:  
> > > > On Fri, Mar 19, 2021 at 10:23:13AM -0600, Alex Williamson wrote:   
> > > > > 
> > > > > What if we taint the kernel or pci_warn() for cases where either all
> > > > > the reset methods are disabled, ie. 'echo none > reset_method', or any
> > > > > time a device specific method is disabled?
> > > > 
> > > > What does it mean "none"? Does it mean nothing supported? If yes, I 
> > > > think that
> > > > pci_warn() will be enough. At least for me, taint is usable during 
> > > > debug stages,
> > > > probably if device doesn't crash no one will look to see 
> > > > /proc/sys/kernel/tainted.  
> > > 
> > > "none" as implemented in this patch, clearing the enabled function
> > > reset methods.  
> > 
> > It is far from intuitive, the empty string will be easier to understand,
> > because "none" means no reset at all.
> 
> "No reset at all" is what "none" achieves, the
> pci_dev.reset_methods_enabled bitmap is cleared.  We can use an empty
> string, but I think we want a way to clear all enabled resets and a way
> to return it to the default.  I could see arguments for an empty string
> serving either purpose, so this version proposed explicitly using
> "none" and "default", as included in the ABI update.

I will stick with "default" only and leave "none" for something else.

> 
> > > > > I'd almost go so far as to prevent disabling a device specific reset
> > > > > altogether, but for example should a device specific reset that fixes
> > > > > an aspect of FLR behavior prevent using a bus reset?  I'd prefer in 
> > > > > that
> > > > > case if direct FLR were disabled via a device flag introduced with the
> > > > > quirk and the remaining resets can still be selected by preference.   
> > > > >  
> > > > 
> > > > I don't know enough to discuss the PCI details, but you raised good 
> > > > point.
> > > > This sysfs is user visible API that is presented as is from device point
> > > > of view. It can be easily run into problems if PCI/core doesn't work 
> > > > with
> > > > user's choice.
> > > >   
> > > > > 
> > > > > Theoretically all the other reset methods work and are available, it's
> > > > > only a policy decision which to use, right?
> > > > 
> > > > But this patch was presented as a way to overcome situations where
> > > > supported != working and user magically knows which reset type to set.  
> > > 
> > > It's not magic, the new sysfs attributes expose which resets are
> > > enabled and the order that they're used, the user can simply select the
> > > next one.  Being able to bypass a broken reset method is a helpful side
> > > effect of getting to select a preferred reset method.  
> > 
> > Magic in a sense that user has no idea what those resets mean, the
> > expectation is that he will blindly iterate till something works.
> 
> Which ought to actually be a safe thing to do.  We should have quirks to
> exclude resets that are known broken but still probe as present and I'd
> be perfectly fine if we issue a warning if the user disables all resets
> for a given device.
>  
> > > > If you want to take this patch to be policy decision tool,
> > > > it will need to accept "reset_type1,reset_type2,..." sort of input,
> > > > so fallback will work natively.  
> > > 
> > > I don't see that as a requirement.  We have fall-through support in the
> > > kernel, but for a given device we're really only ever going to make use
> > > of one of those methods.  If a user knows enough about a device to have
> > > a preference, I think it can be singular.  That also significantly
> > > simplifies the interface and supporting code.  Thanks,  
> > 
> > I'm struggling to get requirements from this thread. You talked about
> > policy decision to overtake fallback mechanism, Amey wanted to avoid
> > quirks.
> > 
> > Do you have an example of such devices or we are talking about
> > theoretical case?
> 
> Look at any device that already has a reset quirk and the process it
> took to get there.  Those are more than just theoretical cases.

So let's fix the process. The long standing kernel policy is that kernel
bugs (and missing quirk can be seen as such bug) should be fixed in the
kernel and not workaround by the users.

> 
> For policy preference, I already described how I've configured QEMU to
> prefer a bus reset rather than a PM reset due to lack of specification
> regarding the scope of a PM "soft reset".  This interface would allow a
> system policy to do that same thing.
> 
> I don't think anyone is suggesting this as a means to avoid quirks that
> would resolve reset issues and create the best default general behavior.
> This provides a mechanism to test various reset methods, and thereby
> identify broken methods, and set a policy. 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-23 Thread Alex Williamson
On Tue, 23 Mar 2021 10:06:25 -0600
Alex Williamson  wrote:

> On Tue, 23 Mar 2021 21:02:21 +0530
> Amey Narkhede  wrote:
> 
> > On 21/03/23 08:44AM, Alex Williamson wrote:  
> > > On Tue, 23 Mar 2021 15:34:19 +0100
> > > Pali Rohár  wrote:
> > >
> > > > On Thursday 18 March 2021 20:01:55 Amey Narkhede wrote:
> > > > > On 21/03/17 09:13PM, Pali Rohár wrote:
> > > > > > On Wednesday 17 March 2021 14:00:20 Alex Williamson wrote:
> > > > > > > On Wed, 17 Mar 2021 20:40:24 +0100
> > > > > > > Pali Rohár  wrote:
> > > > > > >
> > > > > > > > On Wednesday 17 March 2021 13:32:45 Alex Williamson wrote:
> > > > > > > > > On Wed, 17 Mar 2021 20:24:24 +0100
> > > > > > > > > Pali Rohár  wrote:
> > > > > > > > >
> > > > > > > > > > On Wednesday 17 March 2021 13:15:36 Alex Williamson wrote:  
> > > > > > > > > >   
> > > > > > > > > > > On Wed, 17 Mar 2021 20:02:06 +0100
> > > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > > >
> > > > > > > > > > > > On Monday 15 March 2021 09:03:39 Alex Williamson wrote: 
> > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, 15 Mar 2021 15:52:38 +0100
> > > > > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On Monday 15 March 2021 08:34:09 Alex Williamson 
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede 
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > slot reset (pci_dev_reset_slot_function) and 
> > > > > > > > > > > > > > > > > secondary bus
> > > > > > > > > > > > > > > > > reset(pci_parent_bus_reset) which I think are 
> > > > > > > > > > > > > > > > > hot reset and
> > > > > > > > > > > > > > > > > warm reset respectively.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. 
> > > > > > > > > > > > > > > > Slot reset is just another
> > > > > > > > > > > > > > > > type of reset, which is currently implemented 
> > > > > > > > > > > > > > > > only for PCIe hot plug
> > > > > > > > > > > > > > > > bridges and for PowerPC PowerNV platform and it 
> > > > > > > > > > > > > > > > just call PCI secondary
> > > > > > > > > > > > > > > > bus reset with some other hook. PCIe Warm Reset 
> > > > > > > > > > > > > > > > does not have API in
> > > > > > > > > > > > > > > > kernel and therefore drivers do not export this 
> > > > > > > > > > > > > > > > type of reset via any
> > > > > > > > > > > > > > > > kernel function (yet).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Warm reset is beyond the scope of this series, 
> > > > > > > > > > > > > > > but could be implemented
> > > > > > > > > > > > > > > in a compatible way to fit within the 
> > > > > > > > > > > > > > > pci_reset_fn_methods[] array
> > > > > > > > > > > > > > > defined here.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Ok!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Note that with this series the resets available 
> > > > > > > > > > > > > > > through
> > > > > > > > > > > > > > > pci_reset_function() and the per device reset 
> > > > > > > > > > > > > > > attribute is sysfs remain
> > > > > > > > > > > > > > > exactly the same as they are currently.  The bus 
> > > > > > > > > > > > > > > and slot reset
> > > > > > > > > > > > > > > methods used here are limited to devices where 
> > > > > > > > > > > > > > > only a single function is
> > > > > > > > > > > > > > > affected by the reset, therefore it is not like 
> > > > > > > > > > > > > > > the patch you proposed
> > > > > > > > > > > > > > > which performed a reset irrespective of the 
> > > > > > > > > > > > > > > downstream devices.  This
> > > > > > > > > > > > > > > series only enables selection of the existing 
> > > > > > > > > > > > > > > methods.  Thanks,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Alex
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > But with this patch series, there is still an issue 
> > > > > > > > > > > > > > with PCI secondary
> > > > > > > > > > > > > > bus reset mechanism as exported sysfs attribute 
> > > > > > > > > > > > > > does not do that
> > > > > > > > > > > > > > remove-reset-rescan procedure. As discussed in 
> > > > > > > > > > > > > > other thread, this reset
> > > > > > > > > > > > > > let device in unconfigured / broken state.
> > > > > > > > > > > > >
> > > > > > > > > > > > > No, there's not:
> > > > > > > > > > > > >
> > > > > > > > > > > > > int pci_reset_function(struct pci_dev *dev)
> > > > > > > > > > > > > {
> > > > > > > > > > > > > int rc;
> > > > > > > > > > > > >
> > > > > > > > > > > > > if (!dev->reset_fn)
> > > > > > > > > > > > > return -ENOTTY;
> > > > > > > > > > > > >
> > 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-23 Thread Alex Williamson
On Tue, 23 Mar 2021 21:02:21 +0530
Amey Narkhede  wrote:

> On 21/03/23 08:44AM, Alex Williamson wrote:
> > On Tue, 23 Mar 2021 15:34:19 +0100
> > Pali Rohár  wrote:
> >  
> > > On Thursday 18 March 2021 20:01:55 Amey Narkhede wrote:  
> > > > On 21/03/17 09:13PM, Pali Rohár wrote:  
> > > > > On Wednesday 17 March 2021 14:00:20 Alex Williamson wrote:  
> > > > > > On Wed, 17 Mar 2021 20:40:24 +0100
> > > > > > Pali Rohár  wrote:
> > > > > >  
> > > > > > > On Wednesday 17 March 2021 13:32:45 Alex Williamson wrote:  
> > > > > > > > On Wed, 17 Mar 2021 20:24:24 +0100
> > > > > > > > Pali Rohár  wrote:
> > > > > > > >  
> > > > > > > > > On Wednesday 17 March 2021 13:15:36 Alex Williamson wrote:  
> > > > > > > > > > On Wed, 17 Mar 2021 20:02:06 +0100
> > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > >  
> > > > > > > > > > > On Monday 15 March 2021 09:03:39 Alex Williamson wrote:  
> > > > > > > > > > > > On Mon, 15 Mar 2021 15:52:38 +0100
> > > > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > > > >  
> > > > > > > > > > > > > On Monday 15 March 2021 08:34:09 Alex Williamson 
> > > > > > > > > > > > > wrote:  
> > > > > > > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > > > > > >  
> > > > > > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede 
> > > > > > > > > > > > > > > wrote:  
> > > > > > > > > > > > > > > > slot reset (pci_dev_reset_slot_function) and 
> > > > > > > > > > > > > > > > secondary bus
> > > > > > > > > > > > > > > > reset(pci_parent_bus_reset) which I think are 
> > > > > > > > > > > > > > > > hot reset and
> > > > > > > > > > > > > > > > warm reset respectively.  
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. 
> > > > > > > > > > > > > > > Slot reset is just another
> > > > > > > > > > > > > > > type of reset, which is currently implemented 
> > > > > > > > > > > > > > > only for PCIe hot plug
> > > > > > > > > > > > > > > bridges and for PowerPC PowerNV platform and it 
> > > > > > > > > > > > > > > just call PCI secondary
> > > > > > > > > > > > > > > bus reset with some other hook. PCIe Warm Reset 
> > > > > > > > > > > > > > > does not have API in
> > > > > > > > > > > > > > > kernel and therefore drivers do not export this 
> > > > > > > > > > > > > > > type of reset via any
> > > > > > > > > > > > > > > kernel function (yet).  
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Warm reset is beyond the scope of this series, but 
> > > > > > > > > > > > > > could be implemented
> > > > > > > > > > > > > > in a compatible way to fit within the 
> > > > > > > > > > > > > > pci_reset_fn_methods[] array
> > > > > > > > > > > > > > defined here.  
> > > > > > > > > > > > >
> > > > > > > > > > > > > Ok!
> > > > > > > > > > > > >  
> > > > > > > > > > > > > > Note that with this series the resets available 
> > > > > > > > > > > > > > through
> > > > > > > > > > > > > > pci_reset_function() and the per device reset 
> > > > > > > > > > > > > > attribute is sysfs remain
> > > > > > > > > > > > > > exactly the same as they are currently.  The bus 
> > > > > > > > > > > > > > and slot reset
> > > > > > > > > > > > > > methods used here are limited to devices where only 
> > > > > > > > > > > > > > a single function is
> > > > > > > > > > > > > > affected by the reset, therefore it is not like the 
> > > > > > > > > > > > > > patch you proposed
> > > > > > > > > > > > > > which performed a reset irrespective of the 
> > > > > > > > > > > > > > downstream devices.  This
> > > > > > > > > > > > > > series only enables selection of the existing 
> > > > > > > > > > > > > > methods.  Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Alex
> > > > > > > > > > > > > >  
> > > > > > > > > > > > >
> > > > > > > > > > > > > But with this patch series, there is still an issue 
> > > > > > > > > > > > > with PCI secondary
> > > > > > > > > > > > > bus reset mechanism as exported sysfs attribute does 
> > > > > > > > > > > > > not do that
> > > > > > > > > > > > > remove-reset-rescan procedure. As discussed in other 
> > > > > > > > > > > > > thread, this reset
> > > > > > > > > > > > > let device in unconfigured / broken state.  
> > > > > > > > > > > >
> > > > > > > > > > > > No, there's not:
> > > > > > > > > > > >
> > > > > > > > > > > > int pci_reset_function(struct pci_dev *dev)
> > > > > > > > > > > > {
> > > > > > > > > > > > int rc;
> > > > > > > > > > > >
> > > > > > > > > > > > if (!dev->reset_fn)
> > > > > > > > > > > > return -ENOTTY;
> > > > > > > > > > > >
> > > > > > > > > > > > pci_dev_lock(dev);  
> > > > > > > > > > > > >>> pci_dev_save_and_disable(dev);  
> > > > > > > > > > > >
> > > > > > > > > > > > rc = __pci_reset_function_locked(dev);
> > > > > > > > > > > >  
> > > > > > > > > > > > >>> pci_dev_restore(dev);  
> > > > > > > > > > > > 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-23 Thread Amey Narkhede
On 21/03/23 08:44AM, Alex Williamson wrote:
> On Tue, 23 Mar 2021 15:34:19 +0100
> Pali Rohár  wrote:
>
> > On Thursday 18 March 2021 20:01:55 Amey Narkhede wrote:
> > > On 21/03/17 09:13PM, Pali Rohár wrote:
> > > > On Wednesday 17 March 2021 14:00:20 Alex Williamson wrote:
> > > > > On Wed, 17 Mar 2021 20:40:24 +0100
> > > > > Pali Rohár  wrote:
> > > > >
> > > > > > On Wednesday 17 March 2021 13:32:45 Alex Williamson wrote:
> > > > > > > On Wed, 17 Mar 2021 20:24:24 +0100
> > > > > > > Pali Rohár  wrote:
> > > > > > >
> > > > > > > > On Wednesday 17 March 2021 13:15:36 Alex Williamson wrote:
> > > > > > > > > On Wed, 17 Mar 2021 20:02:06 +0100
> > > > > > > > > Pali Rohár  wrote:
> > > > > > > > >
> > > > > > > > > > On Monday 15 March 2021 09:03:39 Alex Williamson wrote:
> > > > > > > > > > > On Mon, 15 Mar 2021 15:52:38 +0100
> > > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > > >
> > > > > > > > > > > > On Monday 15 March 2021 08:34:09 Alex Williamson wrote:
> > > > > > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede 
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > slot reset (pci_dev_reset_slot_function) and 
> > > > > > > > > > > > > > > secondary bus
> > > > > > > > > > > > > > > reset(pci_parent_bus_reset) which I think are hot 
> > > > > > > > > > > > > > > reset and
> > > > > > > > > > > > > > > warm reset respectively.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot 
> > > > > > > > > > > > > > reset is just another
> > > > > > > > > > > > > > type of reset, which is currently implemented only 
> > > > > > > > > > > > > > for PCIe hot plug
> > > > > > > > > > > > > > bridges and for PowerPC PowerNV platform and it 
> > > > > > > > > > > > > > just call PCI secondary
> > > > > > > > > > > > > > bus reset with some other hook. PCIe Warm Reset 
> > > > > > > > > > > > > > does not have API in
> > > > > > > > > > > > > > kernel and therefore drivers do not export this 
> > > > > > > > > > > > > > type of reset via any
> > > > > > > > > > > > > > kernel function (yet).
> > > > > > > > > > > > >
> > > > > > > > > > > > > Warm reset is beyond the scope of this series, but 
> > > > > > > > > > > > > could be implemented
> > > > > > > > > > > > > in a compatible way to fit within the 
> > > > > > > > > > > > > pci_reset_fn_methods[] array
> > > > > > > > > > > > > defined here.
> > > > > > > > > > > >
> > > > > > > > > > > > Ok!
> > > > > > > > > > > >
> > > > > > > > > > > > > Note that with this series the resets available 
> > > > > > > > > > > > > through
> > > > > > > > > > > > > pci_reset_function() and the per device reset 
> > > > > > > > > > > > > attribute is sysfs remain
> > > > > > > > > > > > > exactly the same as they are currently.  The bus and 
> > > > > > > > > > > > > slot reset
> > > > > > > > > > > > > methods used here are limited to devices where only a 
> > > > > > > > > > > > > single function is
> > > > > > > > > > > > > affected by the reset, therefore it is not like the 
> > > > > > > > > > > > > patch you proposed
> > > > > > > > > > > > > which performed a reset irrespective of the 
> > > > > > > > > > > > > downstream devices.  This
> > > > > > > > > > > > > series only enables selection of the existing 
> > > > > > > > > > > > > methods.  Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Alex
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > But with this patch series, there is still an issue 
> > > > > > > > > > > > with PCI secondary
> > > > > > > > > > > > bus reset mechanism as exported sysfs attribute does 
> > > > > > > > > > > > not do that
> > > > > > > > > > > > remove-reset-rescan procedure. As discussed in other 
> > > > > > > > > > > > thread, this reset
> > > > > > > > > > > > let device in unconfigured / broken state.
> > > > > > > > > > >
> > > > > > > > > > > No, there's not:
> > > > > > > > > > >
> > > > > > > > > > > int pci_reset_function(struct pci_dev *dev)
> > > > > > > > > > > {
> > > > > > > > > > > int rc;
> > > > > > > > > > >
> > > > > > > > > > > if (!dev->reset_fn)
> > > > > > > > > > > return -ENOTTY;
> > > > > > > > > > >
> > > > > > > > > > > pci_dev_lock(dev);
> > > > > > > > > > > >>> pci_dev_save_and_disable(dev);
> > > > > > > > > > >
> > > > > > > > > > > rc = __pci_reset_function_locked(dev);
> > > > > > > > > > >
> > > > > > > > > > > >>> pci_dev_restore(dev);
> > > > > > > > > > > pci_dev_unlock(dev);
> > > > > > > > > > >
> > > > > > > > > > > return rc;
> > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > > The remove/re-scan was discussed primarily because your 
> > > > > > > > > > > patch performed
> > > > > > > > > > > a bus reset regardless of what devices were affected by 
> > > > > 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-23 Thread Alex Williamson
On Tue, 23 Mar 2021 15:34:19 +0100
Pali Rohár  wrote:

> On Thursday 18 March 2021 20:01:55 Amey Narkhede wrote:
> > On 21/03/17 09:13PM, Pali Rohár wrote:  
> > > On Wednesday 17 March 2021 14:00:20 Alex Williamson wrote:  
> > > > On Wed, 17 Mar 2021 20:40:24 +0100
> > > > Pali Rohár  wrote:
> > > >  
> > > > > On Wednesday 17 March 2021 13:32:45 Alex Williamson wrote:  
> > > > > > On Wed, 17 Mar 2021 20:24:24 +0100
> > > > > > Pali Rohár  wrote:
> > > > > >  
> > > > > > > On Wednesday 17 March 2021 13:15:36 Alex Williamson wrote:  
> > > > > > > > On Wed, 17 Mar 2021 20:02:06 +0100
> > > > > > > > Pali Rohár  wrote:
> > > > > > > >  
> > > > > > > > > On Monday 15 March 2021 09:03:39 Alex Williamson wrote:  
> > > > > > > > > > On Mon, 15 Mar 2021 15:52:38 +0100
> > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > >  
> > > > > > > > > > > On Monday 15 March 2021 08:34:09 Alex Williamson wrote:  
> > > > > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > > > >  
> > > > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote: 
> > > > > > > > > > > > >  
> > > > > > > > > > > > > > slot reset (pci_dev_reset_slot_function) and 
> > > > > > > > > > > > > > secondary bus
> > > > > > > > > > > > > > reset(pci_parent_bus_reset) which I think are hot 
> > > > > > > > > > > > > > reset and
> > > > > > > > > > > > > > warm reset respectively.  
> > > > > > > > > > > > >
> > > > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot 
> > > > > > > > > > > > > reset is just another
> > > > > > > > > > > > > type of reset, which is currently implemented only 
> > > > > > > > > > > > > for PCIe hot plug
> > > > > > > > > > > > > bridges and for PowerPC PowerNV platform and it just 
> > > > > > > > > > > > > call PCI secondary
> > > > > > > > > > > > > bus reset with some other hook. PCIe Warm Reset does 
> > > > > > > > > > > > > not have API in
> > > > > > > > > > > > > kernel and therefore drivers do not export this type 
> > > > > > > > > > > > > of reset via any
> > > > > > > > > > > > > kernel function (yet).  
> > > > > > > > > > > >
> > > > > > > > > > > > Warm reset is beyond the scope of this series, but 
> > > > > > > > > > > > could be implemented
> > > > > > > > > > > > in a compatible way to fit within the 
> > > > > > > > > > > > pci_reset_fn_methods[] array
> > > > > > > > > > > > defined here.  
> > > > > > > > > > >
> > > > > > > > > > > Ok!
> > > > > > > > > > >  
> > > > > > > > > > > > Note that with this series the resets available through
> > > > > > > > > > > > pci_reset_function() and the per device reset attribute 
> > > > > > > > > > > > is sysfs remain
> > > > > > > > > > > > exactly the same as they are currently.  The bus and 
> > > > > > > > > > > > slot reset
> > > > > > > > > > > > methods used here are limited to devices where only a 
> > > > > > > > > > > > single function is
> > > > > > > > > > > > affected by the reset, therefore it is not like the 
> > > > > > > > > > > > patch you proposed
> > > > > > > > > > > > which performed a reset irrespective of the downstream 
> > > > > > > > > > > > devices.  This
> > > > > > > > > > > > series only enables selection of the existing methods.  
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Alex
> > > > > > > > > > > >  
> > > > > > > > > > >
> > > > > > > > > > > But with this patch series, there is still an issue with 
> > > > > > > > > > > PCI secondary
> > > > > > > > > > > bus reset mechanism as exported sysfs attribute does not 
> > > > > > > > > > > do that
> > > > > > > > > > > remove-reset-rescan procedure. As discussed in other 
> > > > > > > > > > > thread, this reset
> > > > > > > > > > > let device in unconfigured / broken state.  
> > > > > > > > > >
> > > > > > > > > > No, there's not:
> > > > > > > > > >
> > > > > > > > > > int pci_reset_function(struct pci_dev *dev)
> > > > > > > > > > {
> > > > > > > > > > int rc;
> > > > > > > > > >
> > > > > > > > > > if (!dev->reset_fn)
> > > > > > > > > > return -ENOTTY;
> > > > > > > > > >
> > > > > > > > > > pci_dev_lock(dev);  
> > > > > > > > > > >>> pci_dev_save_and_disable(dev);  
> > > > > > > > > >
> > > > > > > > > > rc = __pci_reset_function_locked(dev);
> > > > > > > > > >  
> > > > > > > > > > >>> pci_dev_restore(dev);  
> > > > > > > > > > pci_dev_unlock(dev);
> > > > > > > > > >
> > > > > > > > > > return rc;
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > The remove/re-scan was discussed primarily because your 
> > > > > > > > > > patch performed
> > > > > > > > > > a bus reset regardless of what devices were affected by 
> > > > > > > > > > that reset and
> > > > > > > > > > it's difficult to manage the scope where multiple devices 
> > > > > > > > > > are affected.
> > > > > > > > > > Here, the bus and slot reset functions will fail unless the 
> > > > 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-23 Thread Pali Rohár
On Thursday 18 March 2021 20:01:55 Amey Narkhede wrote:
> On 21/03/17 09:13PM, Pali Rohár wrote:
> > On Wednesday 17 March 2021 14:00:20 Alex Williamson wrote:
> > > On Wed, 17 Mar 2021 20:40:24 +0100
> > > Pali Rohár  wrote:
> > >
> > > > On Wednesday 17 March 2021 13:32:45 Alex Williamson wrote:
> > > > > On Wed, 17 Mar 2021 20:24:24 +0100
> > > > > Pali Rohár  wrote:
> > > > >
> > > > > > On Wednesday 17 March 2021 13:15:36 Alex Williamson wrote:
> > > > > > > On Wed, 17 Mar 2021 20:02:06 +0100
> > > > > > > Pali Rohár  wrote:
> > > > > > >
> > > > > > > > On Monday 15 March 2021 09:03:39 Alex Williamson wrote:
> > > > > > > > > On Mon, 15 Mar 2021 15:52:38 +0100
> > > > > > > > > Pali Rohár  wrote:
> > > > > > > > >
> > > > > > > > > > On Monday 15 March 2021 08:34:09 Alex Williamson wrote:
> > > > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > > >
> > > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:
> > > > > > > > > > > > > slot reset (pci_dev_reset_slot_function) and 
> > > > > > > > > > > > > secondary bus
> > > > > > > > > > > > > reset(pci_parent_bus_reset) which I think are hot 
> > > > > > > > > > > > > reset and
> > > > > > > > > > > > > warm reset respectively.
> > > > > > > > > > > >
> > > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot 
> > > > > > > > > > > > reset is just another
> > > > > > > > > > > > type of reset, which is currently implemented only for 
> > > > > > > > > > > > PCIe hot plug
> > > > > > > > > > > > bridges and for PowerPC PowerNV platform and it just 
> > > > > > > > > > > > call PCI secondary
> > > > > > > > > > > > bus reset with some other hook. PCIe Warm Reset does 
> > > > > > > > > > > > not have API in
> > > > > > > > > > > > kernel and therefore drivers do not export this type of 
> > > > > > > > > > > > reset via any
> > > > > > > > > > > > kernel function (yet).
> > > > > > > > > > >
> > > > > > > > > > > Warm reset is beyond the scope of this series, but could 
> > > > > > > > > > > be implemented
> > > > > > > > > > > in a compatible way to fit within the 
> > > > > > > > > > > pci_reset_fn_methods[] array
> > > > > > > > > > > defined here.
> > > > > > > > > >
> > > > > > > > > > Ok!
> > > > > > > > > >
> > > > > > > > > > > Note that with this series the resets available through
> > > > > > > > > > > pci_reset_function() and the per device reset attribute 
> > > > > > > > > > > is sysfs remain
> > > > > > > > > > > exactly the same as they are currently.  The bus and slot 
> > > > > > > > > > > reset
> > > > > > > > > > > methods used here are limited to devices where only a 
> > > > > > > > > > > single function is
> > > > > > > > > > > affected by the reset, therefore it is not like the patch 
> > > > > > > > > > > you proposed
> > > > > > > > > > > which performed a reset irrespective of the downstream 
> > > > > > > > > > > devices.  This
> > > > > > > > > > > series only enables selection of the existing methods.  
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Alex
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > But with this patch series, there is still an issue with 
> > > > > > > > > > PCI secondary
> > > > > > > > > > bus reset mechanism as exported sysfs attribute does not do 
> > > > > > > > > > that
> > > > > > > > > > remove-reset-rescan procedure. As discussed in other 
> > > > > > > > > > thread, this reset
> > > > > > > > > > let device in unconfigured / broken state.
> > > > > > > > >
> > > > > > > > > No, there's not:
> > > > > > > > >
> > > > > > > > > int pci_reset_function(struct pci_dev *dev)
> > > > > > > > > {
> > > > > > > > > int rc;
> > > > > > > > >
> > > > > > > > > if (!dev->reset_fn)
> > > > > > > > > return -ENOTTY;
> > > > > > > > >
> > > > > > > > > pci_dev_lock(dev);
> > > > > > > > > >>> pci_dev_save_and_disable(dev);
> > > > > > > > >
> > > > > > > > > rc = __pci_reset_function_locked(dev);
> > > > > > > > >
> > > > > > > > > >>> pci_dev_restore(dev);
> > > > > > > > > pci_dev_unlock(dev);
> > > > > > > > >
> > > > > > > > > return rc;
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > The remove/re-scan was discussed primarily because your patch 
> > > > > > > > > performed
> > > > > > > > > a bus reset regardless of what devices were affected by that 
> > > > > > > > > reset and
> > > > > > > > > it's difficult to manage the scope where multiple devices are 
> > > > > > > > > affected.
> > > > > > > > > Here, the bus and slot reset functions will fail unless the 
> > > > > > > > > scope is
> > > > > > > > > limited to the single device triggering this reset.  Thanks,
> > > > > > > > >
> > > > > > > > > Alex
> > > > > > > > >
> > > > > > > >
> > > > > > > > I was thinking a bit more about it and I'm really sure how it 
> > > > > > > > would
> > > > > > > > behave with hotplugging PCIe bridge.
> 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-22 Thread Alex Williamson
On Sun, 21 Mar 2021 10:40:55 +0200
Leon Romanovsky  wrote:

> On Sat, Mar 20, 2021 at 08:59:42AM -0600, Alex Williamson wrote:
> > On Sat, 20 Mar 2021 11:10:08 +0200
> > Leon Romanovsky  wrote:  
> > > On Fri, Mar 19, 2021 at 10:23:13AM -0600, Alex Williamson wrote:   
> > > > 
> > > > What if we taint the kernel or pci_warn() for cases where either all
> > > > the reset methods are disabled, ie. 'echo none > reset_method', or any
> > > > time a device specific method is disabled?
> > > 
> > > What does it mean "none"? Does it mean nothing supported? If yes, I think 
> > > that
> > > pci_warn() will be enough. At least for me, taint is usable during debug 
> > > stages,
> > > probably if device doesn't crash no one will look to see 
> > > /proc/sys/kernel/tainted.  
> > 
> > "none" as implemented in this patch, clearing the enabled function
> > reset methods.  
> 
> It is far from intuitive, the empty string will be easier to understand,
> because "none" means no reset at all.

"No reset at all" is what "none" achieves, the
pci_dev.reset_methods_enabled bitmap is cleared.  We can use an empty
string, but I think we want a way to clear all enabled resets and a way
to return it to the default.  I could see arguments for an empty string
serving either purpose, so this version proposed explicitly using
"none" and "default", as included in the ABI update.

> > > > I'd almost go so far as to prevent disabling a device specific reset
> > > > altogether, but for example should a device specific reset that fixes
> > > > an aspect of FLR behavior prevent using a bus reset?  I'd prefer in that
> > > > case if direct FLR were disabled via a device flag introduced with the
> > > > quirk and the remaining resets can still be selected by preference.
> > > 
> > > I don't know enough to discuss the PCI details, but you raised good point.
> > > This sysfs is user visible API that is presented as is from device point
> > > of view. It can be easily run into problems if PCI/core doesn't work with
> > > user's choice.
> > >   
> > > > 
> > > > Theoretically all the other reset methods work and are available, it's
> > > > only a policy decision which to use, right?
> > > 
> > > But this patch was presented as a way to overcome situations where
> > > supported != working and user magically knows which reset type to set.  
> > 
> > It's not magic, the new sysfs attributes expose which resets are
> > enabled and the order that they're used, the user can simply select the
> > next one.  Being able to bypass a broken reset method is a helpful side
> > effect of getting to select a preferred reset method.  
> 
> Magic in a sense that user has no idea what those resets mean, the
> expectation is that he will blindly iterate till something works.

Which ought to actually be a safe thing to do.  We should have quirks to
exclude resets that are known broken but still probe as present and I'd
be perfectly fine if we issue a warning if the user disables all resets
for a given device.
 
> > > If you want to take this patch to be policy decision tool,
> > > it will need to accept "reset_type1,reset_type2,..." sort of input,
> > > so fallback will work natively.  
> > 
> > I don't see that as a requirement.  We have fall-through support in the
> > kernel, but for a given device we're really only ever going to make use
> > of one of those methods.  If a user knows enough about a device to have
> > a preference, I think it can be singular.  That also significantly
> > simplifies the interface and supporting code.  Thanks,  
> 
> I'm struggling to get requirements from this thread. You talked about
> policy decision to overtake fallback mechanism, Amey wanted to avoid
> quirks.
> 
> Do you have an example of such devices or we are talking about
> theoretical case?

Look at any device that already has a reset quirk and the process it
took to get there.  Those are more than just theoretical cases.

For policy preference, I already described how I've configured QEMU to
prefer a bus reset rather than a PM reset due to lack of specification
regarding the scope of a PM "soft reset".  This interface would allow a
system policy to do that same thing.

I don't think anyone is suggesting this as a means to avoid quirks that
would resolve reset issues and create the best default general behavior.
This provides a mechanism to test various reset methods, and thereby
identify broken methods, and set a policy.  Sure, that policy might be
to avoid a broken reset in the interim before it gets quirked and
there's potential for abuse there, but I think the benefits outweigh
the risks.

> And I don't see why simple line parser with loop iterator over strchr()
> suddenly becomes complicated code.

Setting multiple bits in a bitmap is easy.  How do you then go on to
allow the user to specify an ordering preference?  If you have an
algorithm you'd like to propose that allows the user to manage the
ordering when enabling multiple methods without 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-21 Thread Amey Narkhede
On 21/03/21 10:40AM, Leon Romanovsky wrote:
> On Sat, Mar 20, 2021 at 08:59:42AM -0600, Alex Williamson wrote:
> > On Sat, 20 Mar 2021 11:10:08 +0200
> > Leon Romanovsky  wrote:
> > > On Fri, Mar 19, 2021 at 10:23:13AM -0600, Alex Williamson wrote:
> > > >
> > > > What if we taint the kernel or pci_warn() for cases where either all
> > > > the reset methods are disabled, ie. 'echo none > reset_method', or any
> > > > time a device specific method is disabled?
> > >
> > > What does it mean "none"? Does it mean nothing supported? If yes, I think 
> > > that
> > > pci_warn() will be enough. At least for me, taint is usable during debug 
> > > stages,
> > > probably if device doesn't crash no one will look to see 
> > > /proc/sys/kernel/tainted.
> >
> > "none" as implemented in this patch, clearing the enabled function
> > reset methods.
>
> It is far from intuitive, the empty string will be easier to understand,
> because "none" means no reset at all.
>
> >
> > > > I'd almost go so far as to prevent disabling a device specific reset
> > > > altogether, but for example should a device specific reset that fixes
> > > > an aspect of FLR behavior prevent using a bus reset?  I'd prefer in that
> > > > case if direct FLR were disabled via a device flag introduced with the
> > > > quirk and the remaining resets can still be selected by preference.
> > >
> > > I don't know enough to discuss the PCI details, but you raised good point.
> > > This sysfs is user visible API that is presented as is from device point
> > > of view. It can be easily run into problems if PCI/core doesn't work with
> > > user's choice.
> > >
> > > >
> > > > Theoretically all the other reset methods work and are available, it's
> > > > only a policy decision which to use, right?
> > >
> > > But this patch was presented as a way to overcome situations where
> > > supported != working and user magically knows which reset type to set.
> >
> > It's not magic, the new sysfs attributes expose which resets are
> > enabled and the order that they're used, the user can simply select the
> > next one.  Being able to bypass a broken reset method is a helpful side
> > effect of getting to select a preferred reset method.
>
> Magic in a sense that user has no idea what those resets mean, the
> expectation is that he will blindly iterate till something works.
>
> >
> > > If you want to take this patch to be policy decision tool,
> > > it will need to accept "reset_type1,reset_type2,..." sort of input,
> > > so fallback will work natively.
> >
> > I don't see that as a requirement.  We have fall-through support in the
> > kernel, but for a given device we're really only ever going to make use
> > of one of those methods.  If a user knows enough about a device to have
> > a preference, I think it can be singular.  That also significantly
> > simplifies the interface and supporting code.  Thanks,
>
> I'm struggling to get requirements from this thread. You talked about
> policy decision to overtake fallback mechanism, Amey wanted to avoid
> quirks.
Just to clarify I don't want to avoid quirks. I just want device
to be usable even if it doesn't have quirk as the quirk for that
particular device may not be developed at all for different reasons
mentioned earlier.
[...]

Thanks,
Amey


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-21 Thread Leon Romanovsky
On Sat, Mar 20, 2021 at 08:59:42AM -0600, Alex Williamson wrote:
> On Sat, 20 Mar 2021 11:10:08 +0200
> Leon Romanovsky  wrote:
> > On Fri, Mar 19, 2021 at 10:23:13AM -0600, Alex Williamson wrote: 
> > > 
> > > What if we taint the kernel or pci_warn() for cases where either all
> > > the reset methods are disabled, ie. 'echo none > reset_method', or any
> > > time a device specific method is disabled?  
> > 
> > What does it mean "none"? Does it mean nothing supported? If yes, I think 
> > that
> > pci_warn() will be enough. At least for me, taint is usable during debug 
> > stages,
> > probably if device doesn't crash no one will look to see 
> > /proc/sys/kernel/tainted.
> 
> "none" as implemented in this patch, clearing the enabled function
> reset methods.

It is far from intuitive, the empty string will be easier to understand,
because "none" means no reset at all.

> 
> > > I'd almost go so far as to prevent disabling a device specific reset
> > > altogether, but for example should a device specific reset that fixes
> > > an aspect of FLR behavior prevent using a bus reset?  I'd prefer in that
> > > case if direct FLR were disabled via a device flag introduced with the
> > > quirk and the remaining resets can still be selected by preference.  
> > 
> > I don't know enough to discuss the PCI details, but you raised good point.
> > This sysfs is user visible API that is presented as is from device point
> > of view. It can be easily run into problems if PCI/core doesn't work with
> > user's choice.
> > 
> > > 
> > > Theoretically all the other reset methods work and are available, it's
> > > only a policy decision which to use, right?  
> > 
> > But this patch was presented as a way to overcome situations where
> > supported != working and user magically knows which reset type to set.
> 
> It's not magic, the new sysfs attributes expose which resets are
> enabled and the order that they're used, the user can simply select the
> next one.  Being able to bypass a broken reset method is a helpful side
> effect of getting to select a preferred reset method.

Magic in a sense that user has no idea what those resets mean, the
expectation is that he will blindly iterate till something works.

> 
> > If you want to take this patch to be policy decision tool,
> > it will need to accept "reset_type1,reset_type2,..." sort of input,
> > so fallback will work natively.
> 
> I don't see that as a requirement.  We have fall-through support in the
> kernel, but for a given device we're really only ever going to make use
> of one of those methods.  If a user knows enough about a device to have
> a preference, I think it can be singular.  That also significantly
> simplifies the interface and supporting code.  Thanks,

I'm struggling to get requirements from this thread. You talked about
policy decision to overtake fallback mechanism, Amey wanted to avoid
quirks.

Do you have an example of such devices or we are talking about
theoretical case?

And I don't see why simple line parser with loop iterator over strchr()
suddenly becomes complicated code.

Thanks

> 
> Alex
> 


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-20 Thread Alex Williamson
On Sat, 20 Mar 2021 11:10:08 +0200
Leon Romanovsky  wrote:
> On Fri, Mar 19, 2021 at 10:23:13AM -0600, Alex Williamson wrote: 
> > 
> > What if we taint the kernel or pci_warn() for cases where either all
> > the reset methods are disabled, ie. 'echo none > reset_method', or any
> > time a device specific method is disabled?  
> 
> What does it mean "none"? Does it mean nothing supported? If yes, I think that
> pci_warn() will be enough. At least for me, taint is usable during debug 
> stages,
> probably if device doesn't crash no one will look to see 
> /proc/sys/kernel/tainted.

"none" as implemented in this patch, clearing the enabled function
reset methods.

> > I'd almost go so far as to prevent disabling a device specific reset
> > altogether, but for example should a device specific reset that fixes
> > an aspect of FLR behavior prevent using a bus reset?  I'd prefer in that
> > case if direct FLR were disabled via a device flag introduced with the
> > quirk and the remaining resets can still be selected by preference.  
> 
> I don't know enough to discuss the PCI details, but you raised good point.
> This sysfs is user visible API that is presented as is from device point
> of view. It can be easily run into problems if PCI/core doesn't work with
> user's choice.
> 
> > 
> > Theoretically all the other reset methods work and are available, it's
> > only a policy decision which to use, right?  
> 
> But this patch was presented as a way to overcome situations where
> supported != working and user magically knows which reset type to set.

It's not magic, the new sysfs attributes expose which resets are
enabled and the order that they're used, the user can simply select the
next one.  Being able to bypass a broken reset method is a helpful side
effect of getting to select a preferred reset method.

> If you want to take this patch to be policy decision tool,
> it will need to accept "reset_type1,reset_type2,..." sort of input,
> so fallback will work natively.

I don't see that as a requirement.  We have fall-through support in the
kernel, but for a given device we're really only ever going to make use
of one of those methods.  If a user knows enough about a device to have
a preference, I think it can be singular.  That also significantly
simplifies the interface and supporting code.  Thanks,

Alex



Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-20 Thread Leon Romanovsky
On Fri, Mar 19, 2021 at 10:23:13AM -0600, Alex Williamson wrote:
> On Fri, 19 Mar 2021 14:59:47 +0200
> Leon Romanovsky  wrote:
> 
> > On Thu, Mar 18, 2021 at 07:34:56PM +0100, Enrico Weigelt, metux IT consult 
> > wrote:
> > > On 18.03.21 18:22, Leon Romanovsky wrote:
> > >   
> > > > Which email client do you use?
> > > > Your responses are grouped as one huge block without any chance to 
> > > > respond
> > > > to you on specific point or answer to your question.  
> > > 
> > > I'm reading this thread in Tbird, and threading / quoting all looks
> > > nice.  
> > 
> > I'm not talking about threading or quoting but about response itself.
> > See it here 
> > https://lore.kernel.org/lkml/20210318103935.2ec32...@omen.home.shazbot.org/
> > Alex's response is one big chunk without any separations to paragraphs.
> 
> I've never known paragraph breaks to be required to interject a reply.

Of course not, but as Bjorn said if you don't do paragraphs, we will
need manually break your message, fix ">" quotation marks and half
sentences.

I just wanted to be sure that this is not my mail client.

> 
> Back on topic...
> 
> > >   
> > > > I see your flow and understand your position, but will repeat my
> > > > position. We need to make sure that vendors will have incentive to
> > > > supply quirks.  
> 
> What if we taint the kernel or pci_warn() for cases where either all
> the reset methods are disabled, ie. 'echo none > reset_method', or any
> time a device specific method is disabled?

What does it mean "none"? Does it mean nothing supported? If yes, I think that
pci_warn() will be enough. At least for me, taint is usable during debug stages,
probably if device doesn't crash no one will look to see 
/proc/sys/kernel/tainted.

> 
> I'd almost go so far as to prevent disabling a device specific reset
> altogether, but for example should a device specific reset that fixes
> an aspect of FLR behavior prevent using a bus reset?  I'd prefer in that
> case if direct FLR were disabled via a device flag introduced with the
> quirk and the remaining resets can still be selected by preference.

I don't know enough to discuss the PCI details, but you raised good point.
This sysfs is user visible API that is presented as is from device point
of view. It can be easily run into problems if PCI/core doesn't work with
user's choice.

> 
> Theoretically all the other reset methods work and are available, it's
> only a policy decision which to use, right?

But this patch was presented as a way to overcome situations where
supported != working and user magically knows which reset type to set.

If you want to take this patch to be policy decision tool,
it will need to accept "reset_type1,reset_type2,..." sort of input,
so fallback will work natively.

I think that it will be much more robust and cleaner solution than it is now.
Something like that:
cat /sys//reset_policy
reset_type1,reset_type2,...,reset_typeX
echo "reset_type3,reset_type1" > /sys//reset_policy
cat /sys//reset_policy
reset_type3,reset_type1

Thanks


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-19 Thread Leon Romanovsky
On Fri, Mar 19, 2021 at 10:57:11AM -0500, Bjorn Helgaas wrote:
> On Fri, Mar 19, 2021 at 02:59:47PM +0200, Leon Romanovsky wrote:
> > On Thu, Mar 18, 2021 at 07:34:56PM +0100, Enrico Weigelt, metux IT consult 
> > wrote:
> > > On 18.03.21 18:22, Leon Romanovsky wrote:
> > > 
> > > > Which email client do you use?  Your responses are grouped as
> > > > one huge block without any chance to respond to you on specific
> > > > point or answer to your question.
> > > 
> > > I'm reading this thread in Tbird, and threading / quoting all
> > > looks nice.
> > 
> > I'm not talking about threading or quoting but about response
> > itself.  See it here
> > https://lore.kernel.org/lkml/20210318103935.2ec32...@omen.home.shazbot.org/
> > Alex's response is one big chunk without any separations to
> > paragraphs.
> 
> Don't make this harder than it needs to be.  I think it's totally
> acceptable to just split Alex's text where you need to respond.  For
> example, Alex wrote this:
> 
>   vfio-pci uses the internal kernel API, ie. the variants of
>   pci_reset_function(), which is the same interface used by the existing
>   sysfs reset mechanism.  This proposed configuration of the reset method
>   would affect any driver using that same core infrastructure and from my
>   perspective that's really the goal.  ...
> 
> If I wanted to respond to the first sentence, I would just do this:
> 
> aw> vfio-pci uses the internal kernel API, ie. the variants of
> aw> pci_reset_function(), which is the same interface used by the existing
> aw> sysfs reset mechanism.  
> 
> I would write my response to the above here.  The rest of the quote
> continues on below.  If the rest of Alex's message isn't relevant to
> my response, I would remove it completely.
> 
> aw> This proposed configuration of the reset method
> aw> would affect any driver using that same core infrastructure and from my
> aw> perspective that's really the goal.  ...
> 
> Bjorn

Thanks Bjorn, you presented me how to respond on such messages, however
I was more afraid if my setup needs some adjustments and it is only me
who sees it as one chunk.

Thanks



Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-19 Thread Alex Williamson
On Fri, 19 Mar 2021 14:59:47 +0200
Leon Romanovsky  wrote:

> On Thu, Mar 18, 2021 at 07:34:56PM +0100, Enrico Weigelt, metux IT consult 
> wrote:
> > On 18.03.21 18:22, Leon Romanovsky wrote:
> >   
> > > Which email client do you use?
> > > Your responses are grouped as one huge block without any chance to respond
> > > to you on specific point or answer to your question.  
> > 
> > I'm reading this thread in Tbird, and threading / quoting all looks
> > nice.  
> 
> I'm not talking about threading or quoting but about response itself.
> See it here 
> https://lore.kernel.org/lkml/20210318103935.2ec32...@omen.home.shazbot.org/
> Alex's response is one big chunk without any separations to paragraphs.

I've never known paragraph breaks to be required to interject a reply.

Back on topic...

> >   
> > > I see your flow and understand your position, but will repeat my
> > > position. We need to make sure that vendors will have incentive to
> > > supply quirks.  

What if we taint the kernel or pci_warn() for cases where either all
the reset methods are disabled, ie. 'echo none > reset_method', or any
time a device specific method is disabled?

I'd almost go so far as to prevent disabling a device specific reset
altogether, but for example should a device specific reset that fixes
an aspect of FLR behavior prevent using a bus reset?  I'd prefer in that
case if direct FLR were disabled via a device flag introduced with the
quirk and the remaining resets can still be selected by preference.

Theoretically all the other reset methods work and are available, it's
only a policy decision which to use, right?

If a device probes for a reset that's broken and distros start
including systemd scripts to apply a preference to avoid it, (a) that
enables them to work with existing kernels, and (b) indicates to us to
add the trivial quirk to flag that reset as broken.

The other side of the argument that this discourages quirks is that
this interface actually makes it significantly easier to report specific
reset methods as broken for a given device.

Thanks,
Alex



Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-19 Thread Bjorn Helgaas
On Fri, Mar 19, 2021 at 02:59:47PM +0200, Leon Romanovsky wrote:
> On Thu, Mar 18, 2021 at 07:34:56PM +0100, Enrico Weigelt, metux IT consult 
> wrote:
> > On 18.03.21 18:22, Leon Romanovsky wrote:
> > 
> > > Which email client do you use?  Your responses are grouped as
> > > one huge block without any chance to respond to you on specific
> > > point or answer to your question.
> > 
> > I'm reading this thread in Tbird, and threading / quoting all
> > looks nice.
> 
> I'm not talking about threading or quoting but about response
> itself.  See it here
> https://lore.kernel.org/lkml/20210318103935.2ec32...@omen.home.shazbot.org/
> Alex's response is one big chunk without any separations to
> paragraphs.

Don't make this harder than it needs to be.  I think it's totally
acceptable to just split Alex's text where you need to respond.  For
example, Alex wrote this:

  vfio-pci uses the internal kernel API, ie. the variants of
  pci_reset_function(), which is the same interface used by the existing
  sysfs reset mechanism.  This proposed configuration of the reset method
  would affect any driver using that same core infrastructure and from my
  perspective that's really the goal.  ...

If I wanted to respond to the first sentence, I would just do this:

aw> vfio-pci uses the internal kernel API, ie. the variants of
aw> pci_reset_function(), which is the same interface used by the existing
aw> sysfs reset mechanism.  

I would write my response to the above here.  The rest of the quote
continues on below.  If the rest of Alex's message isn't relevant to
my response, I would remove it completely.

aw> This proposed configuration of the reset method
aw> would affect any driver using that same core infrastructure and from my
aw> perspective that's really the goal.  ...

Bjorn


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-19 Thread Amey Narkhede
On 21/03/19 05:37PM, Leon Romanovsky wrote:
> On Fri, Mar 19, 2021 at 08:53:17PM +0530, Amey Narkhede wrote:
> > On 21/03/19 03:05PM, Leon Romanovsky wrote:
>
> <...>
>
> > > > > It was exactly the reason why I think that VM usecase presented by
> > > > > you is not viable.
> > > > >
> > > > Well I didn't present it as new use case. I just gave existing
> > > > usecase based on existing reset attribute. Nothing new here.
> > > > Nothing really changes wrt that use case.
> > >
> > > Of course it is new, please see Alex's response, he said that vfio uses
> > > in-kernel API and not sysfs.
> > >
> > Still it doesn't change in-kernel API either.
>
> Right, but the issue is with user space part of this proposal and not
> in-kernel API.
Userspace part just inhances existing reset attribute still no
significant changes there.
>
>
> <...>
>
> > > > As mentioned earlier not all vendors care about Linux and not
> > > > all of the population can afford to buy new HW just to run Linux.
> > >
> > > Sorry, but you are not consistent. At the beginning, we talked about new 
> > > HW
> > > that has bugs but don't have quirks yet. Here we are talking about old HW
> > > that still doesn't have quirks.
> > >
> > > Thanks
> > >
> > Does it really matter whether HW is old or new?
> > If old HW doesn't have quirks yet how can we expect
> > new one to have quirks? What if new HW is made by same vendors
> > who don't have any interest in Linux?
>
> It is pretty clear that this sysfs won't improve quirks situation but
> has all potential to reduce their amount even more.
>
> Let's stop this discussion here.
>
> Thanks
>
IMO it does improve usability of devices which I consider to be more
important than developing quirks which are just bandages in the end
not HW fix. There's no point in using Linux if
I can't use the device in the first place and expecting to wait
for some community member to develop quirk without vendor support
is simply unrealistic.
So let's stop this discussion here.

Thanks,
Amey


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-19 Thread Leon Romanovsky
On Fri, Mar 19, 2021 at 02:48:12PM +0100, Enrico Weigelt, metux IT consult 
wrote:
> On 19.03.21 13:59, Leon Romanovsky wrote:

<...>

> In any case, I still fail to see why giving operators an debug knob
> should make anything worse.

I see this patch as a workaround to stop and provide quirks for reset issues.
As a way forward, we can do this sysfs visible for DEBUG/EXPERT .config builds.
What do you think?

> 
> > > [ And often, even a combination of them isn't enough. Did you know that
> > >even Google doesn't get all specs necessary to replace away the ugly
> > >FSP blob ? (it's the same w/ AMD, but meanwhile I'm pissed enought to
> > >reverse engineer their AGESA blob). ]
> > 
> > I don't know about this specific Google case, but from my previous 
> > experience.
> > The reasons why vendor says no to Google are usually due to licensing and 
> > legal
> > issues and not open source vs. proprietary.
> 
> In short words: Google did (still does?) build their own mainboards and
> FW (IIRC that's where LinuxBoot came from), but even with their HUGE
> quantities (they buy cpus in quantities of truck loads) they still did
> not manage to get any specs for writing their own early init w/o the
> proprietary FSP.
> 
> The licensing / legal issues can either be:
> 
> a) we, the mightly Intel Corp., have been so extremly stupid for
>licensing some vital IP stuff (what exactly could that be, in exactly
>the prime domain of Intel ?) and signing such insane crontracts, that
>we're not allowed to tell anybody how to actually use our own
>products (yes: initializing the CPU and built-in interfaces belongs
>exactly into that category)
> b) we, the mighty Intel Corp., couldn't build something on our own, but
>just stolen IP (in our primary domain) and are scared that anybody
>could find out from just reading some early setup code.
> c) we, the mighty Intel Corp., rule the world and we give a phrack on
>what some tiny Customers like Google want from us.
> d) we, the mightly Intel Corp., did do what our name tells: INTEL,
>and we don't want anybody raise unpleasant questions.

I would say
 e) We, Intel, have fixes and optimization logic (patented or specific to 
different
 customers) that is applicable  to our HW and we can't open it to Google 
because it
 will be used against us, in procurement and development. See recent article 
about
 ex-Intel employee who used this information when placed bids in Microsoft.
 
https://www.usnews.com/news/best-states/oregon/articles/2021-02-08/intel-sues-engineer-who-went-to-microsoft-over-trade-secrets

> 
> 
> choose your poison :P
> 
> 
> --mtx
> 
> -- 
> ---
> Hinweis: unverschlüsselte E-Mails können leicht abgehört und manipuliert
> werden ! Für eine vertrauliche Kommunikation senden Sie bitte ihren
> GPG/PGP-Schlüssel zu.
> ---
> Enrico Weigelt, metux IT consult
> Free software and Linux embedded engineering
> i...@metux.net -- +49-151-27565287


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-19 Thread Leon Romanovsky
On Fri, Mar 19, 2021 at 08:53:17PM +0530, Amey Narkhede wrote:
> On 21/03/19 03:05PM, Leon Romanovsky wrote:

<...>

> > > > It was exactly the reason why I think that VM usecase presented by
> > > > you is not viable.
> > > >
> > > Well I didn't present it as new use case. I just gave existing
> > > usecase based on existing reset attribute. Nothing new here.
> > > Nothing really changes wrt that use case.
> >
> > Of course it is new, please see Alex's response, he said that vfio uses
> > in-kernel API and not sysfs.
> >
> Still it doesn't change in-kernel API either.

Right, but the issue is with user space part of this proposal and not
in-kernel API.


<...>

> > > As mentioned earlier not all vendors care about Linux and not
> > > all of the population can afford to buy new HW just to run Linux.
> >
> > Sorry, but you are not consistent. At the beginning, we talked about new HW
> > that has bugs but don't have quirks yet. Here we are talking about old HW
> > that still doesn't have quirks.
> >
> > Thanks
> >
> Does it really matter whether HW is old or new?
> If old HW doesn't have quirks yet how can we expect
> new one to have quirks? What if new HW is made by same vendors
> who don't have any interest in Linux?

It is pretty clear that this sysfs won't improve quirks situation but
has all potential to reduce their amount even more.

Let's stop this discussion here.

Thanks

> 
> Thanks,
> Amey


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-19 Thread Amey Narkhede
On 21/03/19 03:05PM, Leon Romanovsky wrote:
> On Thu, Mar 18, 2021 at 11:13:44PM +0530, Amey Narkhede wrote:
> > On 21/03/18 07:35PM, Leon Romanovsky wrote:
> > > On Thu, Mar 18, 2021 at 10:31:43PM +0530, Amey Narkhede wrote:
> > > > On 21/03/18 04:57PM, Leon Romanovsky wrote:
> > > > > On Thu, Mar 18, 2021 at 07:52:52PM +0530, Amey Narkhede wrote:
> > > > > > On 21/03/18 11:09AM, Leon Romanovsky wrote:
> > > > > > > On Wed, Mar 17, 2021 at 11:31:40AM -0600, Alex Williamson wrote:
> > > > > > > > On Wed, 17 Mar 2021 15:58:40 +0200
> > > > > > > > Leon Romanovsky  wrote:
> > >
> > > <...>
> > >
> > > > > > > I'm lost here, does vfio-pci use sysfs interface or internal to 
> > > > > > > the kernel API?
> > > > > > >
> > > > > > > If it is latter then we don't really need sysfs, if not, we still 
> > > > > > > need
> > > > > > > some sort of DB to create second policy, because "supported != 
> > > > > > > working".
> > > > > > > What am I missing?
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > Can you explain bit more about why supported != working?
> > > > >
> > > > > It is written in the commit message of this patch.
> > > > > https://lore.kernel.org/lkml/20210312173452.3855-1-ameynarkhed...@gmail.com/
> > > > > "This feature aims to allow greater control of a device for use cases
> > > > > as device assignment, where specific device or platform issues may
> > > > > interact poorly with a given reset method, and for which device 
> > > > > specific
> > > > > quirks have not been developed."
> > > > >
> > > > > You wrote it and also repeated it a couple of times during the 
> > > > > discussion.
> > > > >
> > > > > If device can understand that specific reset doesn't work, it won't
> > > > > perform it in first place.
> > > > >
> > > > > Thanks
> > > > Is it possible for device to understand whether or not specific reset
> > > > will work or not prior to performing reset and after it indicates
> > > > support for that reset method? Maybe theres problem with that particular
> > > > piece of hardware in that machine.
> > > > How can database be maintained if a particular machines have
> > > > particular piece of faulty HW?
> > >
> > > It was exactly the reason why I think that VM usecase presented by
> > > you is not viable.
> > >
> > Well I didn't present it as new use case. I just gave existing
> > usecase based on existing reset attribute. Nothing new here.
> > Nothing really changes wrt that use case.
>
> Of course it is new, please see Alex's response, he said that vfio uses
> in-kernel API and not sysfs.
>
Still it doesn't change in-kernel API either.
> > > > If for some reason reset doesn't work it will just give -ENOTTY.
> > > > This isn't any different from existing behavior.Actually it informs user
> > > > that the reset method didn't reset the device and user can use different
> > > > reset method instead of implicitly using different reset method.
> > > > If user doesn't explicitly set preferred reset method then
> > > > we go ahead with existing implicit fall through behavior which will try 
> > > > all
> > > > available reset methods until any one of them works.
> > > > If you have device that doesn't support reset at all then you have
> > > > option to completely disable it unlike existing reset attribute where
> > > > you cannot disable reset. So it gives greater control where you can
> > > > disable the reset altogether when quirk isn't developed yet.
> > >
> > > I explicitly asked to hear usecase, right now, I got an explanation from
> > > Alex for policy decision (which doesn't need sysfs) and from you about
> > > overcoming HW bugs with expectation that user will be guru of PCI reset
> > > methods.
> > >
> > > >
> > > > We can't expect to develop quirk for every device in existence.
> > >
> > > It doesn't give us an excuse do not try.
> > >
> > > > For example on my laptop elantech touchpad still doesn't work in 2021
> > > > with vanilla kernel, arch linux applies the patch which was reverted in
> > > > mainline kernel for some reason.
> > >
> > > I see it as a good example of cheap solution. Vendor won't fix your
> > > touchpad because distros provide workaround. The same will be with reset.
> > >
> > > Thanks
> > >
> > As mentioned earlier not all vendors care about Linux and not
> > all of the population can afford to buy new HW just to run Linux.
>
> Sorry, but you are not consistent. At the beginning, we talked about new HW
> that has bugs but don't have quirks yet. Here we are talking about old HW
> that still doesn't have quirks.
>
> Thanks
>
Does it really matter whether HW is old or new?
If old HW doesn't have quirks yet how can we expect
new one to have quirks? What if new HW is made by same vendors
who don't have any interest in Linux?

Thanks,
Amey


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-19 Thread Enrico Weigelt, metux IT consult

On 19.03.21 13:59, Leon Romanovsky wrote:


I really doubt we can influence that by any technical decision here in
the kernel.


There are subsystems that succeeded to do it, for example netdev, RDMA e.t.c.


I'd guess either hi-end / server or embedded products - already
mentioned that these are different fields. I've been talking about the
average consumer products.

OTOH, there're also very expensive vendors that are exceptionally bad,
eg. National instruments (who even are capable of breaking rpm so badly
with their proprietary packages that they open up 0day holes - i once
filed a report @FD on such a case).


IMHO, the expensive ones don't care either.

Does eg. Dell publish board schematics ? Do they even publish exact part
lists (exact chipsets) along with their brochures, so customers can
check wether their HW is supported, before buying and trying out ?


They do it because they are allowed to do it and not because they
explicitly want to annoyance their customers.


Yes, they're just ignorant. They can still do that, because buy their
pretty expensive cheap-hardware. And that's mostly driven by purchase
people inside the customer organisations, who just don't care how much
damage they do to their own employers, by dictating purchase of
expensive broken-by-design hardware. ... but that's nothing we here have
any influence on - except for dissuasion and purchase boycott ...

In any case, I still fail to see why giving operators an debug knob
should make anything worse.


[ And often, even a combination of them isn't enough. Did you know that
   even Google doesn't get all specs necessary to replace away the ugly
   FSP blob ? (it's the same w/ AMD, but meanwhile I'm pissed enought to
   reverse engineer their AGESA blob). ]


I don't know about this specific Google case, but from my previous experience.
The reasons why vendor says no to Google are usually due to licensing and legal
issues and not open source vs. proprietary.


In short words: Google did (still does?) build their own mainboards and
FW (IIRC that's where LinuxBoot came from), but even with their HUGE
quantities (they buy cpus in quantities of truck loads) they still did
not manage to get any specs for writing their own early init w/o the
proprietary FSP.

The licensing / legal issues can either be:

a) we, the mightly Intel Corp., have been so extremly stupid for
   licensing some vital IP stuff (what exactly could that be, in exactly
   the prime domain of Intel ?) and signing such insane crontracts, that
   we're not allowed to tell anybody how to actually use our own
   products (yes: initializing the CPU and built-in interfaces belongs
   exactly into that category)
b) we, the mighty Intel Corp., couldn't build something on our own, but
   just stolen IP (in our primary domain) and are scared that anybody
   could find out from just reading some early setup code.
c) we, the mighty Intel Corp., rule the world and we give a phrack on
   what some tiny Customers like Google want from us.
d) we, the mightly Intel Corp., did do what our name tells: INTEL,
   and we don't want anybody raise unpleasant questions.


choose your poison :P


--mtx

--
---
Hinweis: unverschlüsselte E-Mails können leicht abgehört und manipuliert
werden ! Für eine vertrauliche Kommunikation senden Sie bitte ihren
GPG/PGP-Schlüssel zu.
---
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
i...@metux.net -- +49-151-27565287


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-19 Thread Leon Romanovsky
On Thu, Mar 18, 2021 at 06:58:25PM +0100, Enrico Weigelt, metux IT consult 
wrote:
> On 18.03.21 18:35, Leon Romanovsky wrote:
> 
> > I see it as a good example of cheap solution. Vendor won't fix your
> > touchpad because distros provide workaround. The same will be with reset.
> 
> Usually, vendor won't fix it, anyways, regardless of any kernel
> workarounds.

It is not only vendors, but enthusiasts won't fix too, because their
distro works.

Thanks


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-19 Thread Leon Romanovsky
On Thu, Mar 18, 2021 at 11:13:44PM +0530, Amey Narkhede wrote:
> On 21/03/18 07:35PM, Leon Romanovsky wrote:
> > On Thu, Mar 18, 2021 at 10:31:43PM +0530, Amey Narkhede wrote:
> > > On 21/03/18 04:57PM, Leon Romanovsky wrote:
> > > > On Thu, Mar 18, 2021 at 07:52:52PM +0530, Amey Narkhede wrote:
> > > > > On 21/03/18 11:09AM, Leon Romanovsky wrote:
> > > > > > On Wed, Mar 17, 2021 at 11:31:40AM -0600, Alex Williamson wrote:
> > > > > > > On Wed, 17 Mar 2021 15:58:40 +0200
> > > > > > > Leon Romanovsky  wrote:
> >
> > <...>
> >
> > > > > > I'm lost here, does vfio-pci use sysfs interface or internal to the 
> > > > > > kernel API?
> > > > > >
> > > > > > If it is latter then we don't really need sysfs, if not, we still 
> > > > > > need
> > > > > > some sort of DB to create second policy, because "supported != 
> > > > > > working".
> > > > > > What am I missing?
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > Can you explain bit more about why supported != working?
> > > >
> > > > It is written in the commit message of this patch.
> > > > https://lore.kernel.org/lkml/20210312173452.3855-1-ameynarkhed...@gmail.com/
> > > > "This feature aims to allow greater control of a device for use cases
> > > > as device assignment, where specific device or platform issues may
> > > > interact poorly with a given reset method, and for which device specific
> > > > quirks have not been developed."
> > > >
> > > > You wrote it and also repeated it a couple of times during the 
> > > > discussion.
> > > >
> > > > If device can understand that specific reset doesn't work, it won't
> > > > perform it in first place.
> > > >
> > > > Thanks
> > > Is it possible for device to understand whether or not specific reset
> > > will work or not prior to performing reset and after it indicates
> > > support for that reset method? Maybe theres problem with that particular
> > > piece of hardware in that machine.
> > > How can database be maintained if a particular machines have
> > > particular piece of faulty HW?
> >
> > It was exactly the reason why I think that VM usecase presented by
> > you is not viable.
> >
> Well I didn't present it as new use case. I just gave existing
> usecase based on existing reset attribute. Nothing new here.
> Nothing really changes wrt that use case.

Of course it is new, please see Alex's response, he said that vfio uses
in-kernel API and not sysfs.

> > > If for some reason reset doesn't work it will just give -ENOTTY.
> > > This isn't any different from existing behavior.Actually it informs user
> > > that the reset method didn't reset the device and user can use different
> > > reset method instead of implicitly using different reset method.
> > > If user doesn't explicitly set preferred reset method then
> > > we go ahead with existing implicit fall through behavior which will try 
> > > all
> > > available reset methods until any one of them works.
> > > If you have device that doesn't support reset at all then you have
> > > option to completely disable it unlike existing reset attribute where
> > > you cannot disable reset. So it gives greater control where you can
> > > disable the reset altogether when quirk isn't developed yet.
> >
> > I explicitly asked to hear usecase, right now, I got an explanation from
> > Alex for policy decision (which doesn't need sysfs) and from you about
> > overcoming HW bugs with expectation that user will be guru of PCI reset
> > methods.
> >
> > >
> > > We can't expect to develop quirk for every device in existence.
> >
> > It doesn't give us an excuse do not try.
> >
> > > For example on my laptop elantech touchpad still doesn't work in 2021
> > > with vanilla kernel, arch linux applies the patch which was reverted in
> > > mainline kernel for some reason.
> >
> > I see it as a good example of cheap solution. Vendor won't fix your
> > touchpad because distros provide workaround. The same will be with reset.
> >
> > Thanks
> >
> As mentioned earlier not all vendors care about Linux and not
> all of the population can afford to buy new HW just to run Linux.

Sorry, but you are not consistent. At the beginning, we talked about new HW
that has bugs but don't have quirks yet. Here we are talking about old HW
that still doesn't have quirks.

Thanks

> 
> Thanks,
> Amey


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-19 Thread Leon Romanovsky
On Thu, Mar 18, 2021 at 07:34:56PM +0100, Enrico Weigelt, metux IT consult 
wrote:
> On 18.03.21 18:22, Leon Romanovsky wrote:
> 
> > Which email client do you use?
> > Your responses are grouped as one huge block without any chance to respond
> > to you on specific point or answer to your question.
> 
> I'm reading this thread in Tbird, and threading / quoting all looks
> nice.

I'm not talking about threading or quoting but about response itself.
See it here 
https://lore.kernel.org/lkml/20210318103935.2ec32...@omen.home.shazbot.org/
Alex's response is one big chunk without any separations to paragraphs.

> 
> > I see your flow and understand your position, but will repeat my
> > position. We need to make sure that vendors will have incentive to
> > supply quirks.
> 
> I really doubt we can influence that by any technical decision here in
> the kernel.

There are subsystems that succeeded to do it, for example netdev, RDMA e.t.c.

> 
> > And regarding vendors, see Amey response below about his touchpad troubles.
> > The cheap electronics vendors don't care about their users.
> 
> IMHO, the expensive ones don't care either.
> 
> Does eg. Dell publish board schematics ? Do they even publish exact part
> lists (exact chipsets) along with their brochures, so customers can
> check wether their HW is supported, before buying and trying out ?

They do it because they are allowed to do it and not because they
explicitly want to annoyance their customers. 

> 
> Doesn't seem so. I've personally seen a lot cases where some supposedly
> supported HW turned out to be some completely different and unsupported
> HW that's sold under exactly the same product ID. One of many reasons
> for not giving them a single penny anymore.
> 
> IMHO, there're only very few changes of convincing some HW vendor for
> doing a better job on driver side:
> 
> a) product is targeted for a niche that can't live without Linux
>(eg. embedded)
> b) it's really *dangerous* for your market share if anything doesn't
>work properly on Linux (eg. certan server machines)
> c) somebody *really* big (like Google) is gun-pointing at some supplier,
>who's got a lot to loose
> d) a *massive* worldwide shitstorm against the vendor
> 
> [ And often, even a combination of them isn't enough. Did you know that
>   even Google doesn't get all specs necessary to replace away the ugly
>   FSP blob ? (it's the same w/ AMD, but meanwhile I'm pissed enought to
>   reverse engineer their AGESA blob). ]

I don't know about this specific Google case, but from my previous experience.
The reasons why vendor says no to Google are usually due to licensing and legal
issues and not open source vs. proprietary.

> 
> You see, what we do here in the kernel has no practical influence on
> those hw vendors.

I see it differently, but it doesn't matter. This is too theoretical
discussion to my taste.

> 
> 
> --mtx
> 
> -- 
> ---
> Hinweis: unverschlüsselte E-Mails können leicht abgehört und manipuliert
> werden ! Für eine vertrauliche Kommunikation senden Sie bitte ihren
> GPG/PGP-Schlüssel zu.
> ---
> Enrico Weigelt, metux IT consult
> Free software and Linux embedded engineering
> i...@metux.net -- +49-151-27565287


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-18 Thread Enrico Weigelt, metux IT consult

On 18.03.21 18:22, Leon Romanovsky wrote:


Which email client do you use?
Your responses are grouped as one huge block without any chance to respond
to you on specific point or answer to your question.


I'm reading this thread in Tbird, and threading / quoting all looks
nice.


I see your flow and understand your position, but will repeat my
position. We need to make sure that vendors will have incentive to
supply quirks.


I really doubt we can influence that by any technical decision here in
the kernel.


And regarding vendors, see Amey response below about his touchpad troubles.
The cheap electronics vendors don't care about their users.


IMHO, the expensive ones don't care either.

Does eg. Dell publish board schematics ? Do they even publish exact part
lists (exact chipsets) along with their brochures, so customers can
check wether their HW is supported, before buying and trying out ?

Doesn't seem so. I've personally seen a lot cases where some supposedly
supported HW turned out to be some completely different and unsupported
HW that's sold under exactly the same product ID. One of many reasons
for not giving them a single penny anymore.

IMHO, there're only very few changes of convincing some HW vendor for
doing a better job on driver side:

a) product is targeted for a niche that can't live without Linux
   (eg. embedded)
b) it's really *dangerous* for your market share if anything doesn't
   work properly on Linux (eg. certan server machines)
c) somebody *really* big (like Google) is gun-pointing at some supplier,
   who's got a lot to loose
d) a *massive* worldwide shitstorm against the vendor

[ And often, even a combination of them isn't enough. Did you know that
  even Google doesn't get all specs necessary to replace away the ugly
  FSP blob ? (it's the same w/ AMD, but meanwhile I'm pissed enought to
  reverse engineer their AGESA blob). ]

You see, what we do here in the kernel has no practical influence on
those hw vendors.


--mtx

--
---
Hinweis: unverschlüsselte E-Mails können leicht abgehört und manipuliert
werden ! Für eine vertrauliche Kommunikation senden Sie bitte ihren
GPG/PGP-Schlüssel zu.
---
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
i...@metux.net -- +49-151-27565287


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-18 Thread Enrico Weigelt, metux IT consult

On 18.03.21 18:43, Amey Narkhede wrote:


Well I didn't present it as new use case. I just gave existing
usecase based on existing reset attribute. Nothing new here.
Nothing really changes wrt that use case.


As a board driver maintainer, I fully support your case. At least as a
development/debugging. And even if people out there play around and find
their own workarounds, these can give us maintainers valuable insights
and save us a lot of time.


As mentioned earlier not all vendors care about Linux and not
all of the population can afford to buy new HW just to run Linux.


At least in the x86 world (arm is *much* better here), even the
(supposedly) Linux-friendly ones often don't really care, especially if
the board isn't the newerst model anymore.

Unfortunately, what we do or don't do in the kernel has practically no
influence on board vendor decisions. The best we can practically achieve
at their side is slowing them down on smearing bullshit into FW and acpi
tables. Even getting some useful documentation from vendors is a really
rare thing.

ARM world with device tree, of course, is much better (except for closed
consumer devices like "smartphones" or acpi-poisoned arm64 boxes). At
least for profession embedded boards.


--mtx

--
---
Hinweis: unverschlüsselte E-Mails können leicht abgehört und manipuliert
werden ! Für eine vertrauliche Kommunikation senden Sie bitte ihren
GPG/PGP-Schlüssel zu.
---
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
i...@metux.net -- +49-151-27565287


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-18 Thread Enrico Weigelt, metux IT consult

On 18.03.21 18:35, Leon Romanovsky wrote:


I see it as a good example of cheap solution. Vendor won't fix your
touchpad because distros provide workaround. The same will be with reset.


Usually, vendor won't fix it, anyways, regardless of any kernel
workarounds.

Most Vendors are already completely overstrained w/ anything
software-related. A good reason why we should try to get rid firmware,
as much as we can.

It's really sad. A *decent* vendor would just provide a clean DT and
(actually matching!) schematics. But that's really hard to find, these
days :(


--mtx

--
---
Hinweis: unverschlüsselte E-Mails können leicht abgehört und manipuliert
werden ! Für eine vertrauliche Kommunikation senden Sie bitte ihren
GPG/PGP-Schlüssel zu.
---
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
i...@metux.net -- +49-151-27565287


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-18 Thread Enrico Weigelt, metux IT consult

On 15.03.21 00:55, Pali Rohár wrote:


Moreover for mPCIe form factor cards, boards can share one PERST# signal
with more PCIe cards and control this signal via GPIO. So asserting
PERST# GPIO can trigger Warm reset for more PCIe cards, not just one. It
depends on board or topology.


The pcengines apu* boards happen to be such candidates: they've got
three m.2 slots, but not all wired in the same way (depending on actual
model, not all have pcie wired). Reset lines are driven via gpio, and
some devices (I recall some lte basebands) sometimes need an explicit
reset in order to come up properly.

I have to check the schematics for the diffrent models, how exactly
these gpios are wired. (i've got reports that some production lines
don't have them wired at all - but couldn't confirm this on my own).

BTW: any idea how to inject board specific reset methods, after the
host brigde driver is already active ? In my case, apu boards, the
pci host bridge is probed via acpi and the apu board driver (which sets
up gpios, leds, keys, ...) comes much later.


--mtx

--
---
Hinweis: unverschlüsselte E-Mails können leicht abgehört und manipuliert
werden ! Für eine vertrauliche Kommunikation senden Sie bitte ihren
GPG/PGP-Schlüssel zu.
---
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
i...@metux.net -- +49-151-27565287


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-18 Thread Amey Narkhede
On 21/03/18 07:35PM, Leon Romanovsky wrote:
> On Thu, Mar 18, 2021 at 10:31:43PM +0530, Amey Narkhede wrote:
> > On 21/03/18 04:57PM, Leon Romanovsky wrote:
> > > On Thu, Mar 18, 2021 at 07:52:52PM +0530, Amey Narkhede wrote:
> > > > On 21/03/18 11:09AM, Leon Romanovsky wrote:
> > > > > On Wed, Mar 17, 2021 at 11:31:40AM -0600, Alex Williamson wrote:
> > > > > > On Wed, 17 Mar 2021 15:58:40 +0200
> > > > > > Leon Romanovsky  wrote:
>
> <...>
>
> > > > > I'm lost here, does vfio-pci use sysfs interface or internal to the 
> > > > > kernel API?
> > > > >
> > > > > If it is latter then we don't really need sysfs, if not, we still need
> > > > > some sort of DB to create second policy, because "supported != 
> > > > > working".
> > > > > What am I missing?
> > > > >
> > > > > Thanks
> > > > >
> > > > Can you explain bit more about why supported != working?
> > >
> > > It is written in the commit message of this patch.
> > > https://lore.kernel.org/lkml/20210312173452.3855-1-ameynarkhed...@gmail.com/
> > > "This feature aims to allow greater control of a device for use cases
> > > as device assignment, where specific device or platform issues may
> > > interact poorly with a given reset method, and for which device specific
> > > quirks have not been developed."
> > >
> > > You wrote it and also repeated it a couple of times during the discussion.
> > >
> > > If device can understand that specific reset doesn't work, it won't
> > > perform it in first place.
> > >
> > > Thanks
> > Is it possible for device to understand whether or not specific reset
> > will work or not prior to performing reset and after it indicates
> > support for that reset method? Maybe theres problem with that particular
> > piece of hardware in that machine.
> > How can database be maintained if a particular machines have
> > particular piece of faulty HW?
>
> It was exactly the reason why I think that VM usecase presented by
> you is not viable.
>
Well I didn't present it as new use case. I just gave existing
usecase based on existing reset attribute. Nothing new here.
Nothing really changes wrt that use case.
> > If for some reason reset doesn't work it will just give -ENOTTY.
> > This isn't any different from existing behavior.Actually it informs user
> > that the reset method didn't reset the device and user can use different
> > reset method instead of implicitly using different reset method.
> > If user doesn't explicitly set preferred reset method then
> > we go ahead with existing implicit fall through behavior which will try all
> > available reset methods until any one of them works.
> > If you have device that doesn't support reset at all then you have
> > option to completely disable it unlike existing reset attribute where
> > you cannot disable reset. So it gives greater control where you can
> > disable the reset altogether when quirk isn't developed yet.
>
> I explicitly asked to hear usecase, right now, I got an explanation from
> Alex for policy decision (which doesn't need sysfs) and from you about
> overcoming HW bugs with expectation that user will be guru of PCI reset
> methods.
>
> >
> > We can't expect to develop quirk for every device in existence.
>
> It doesn't give us an excuse do not try.
>
> > For example on my laptop elantech touchpad still doesn't work in 2021
> > with vanilla kernel, arch linux applies the patch which was reverted in
> > mainline kernel for some reason.
>
> I see it as a good example of cheap solution. Vendor won't fix your
> touchpad because distros provide workaround. The same will be with reset.
>
> Thanks
>
As mentioned earlier not all vendors care about Linux and not
all of the population can afford to buy new HW just to run Linux.

Thanks,
Amey


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-18 Thread Amey Narkhede
On 21/03/18 07:22PM, Leon Romanovsky wrote:
> On Thu, Mar 18, 2021 at 10:39:35AM -0600, Alex Williamson wrote:
> > On Thu, 18 Mar 2021 11:09:34 +0200
> > Leon Romanovsky  wrote:
>
> <...>
>
> > > I'm lost here, does vfio-pci use sysfs interface or internal to the 
> > > kernel API?
> > >
> > > If it is latter then we don't really need sysfs, if not, we still need
> > > some sort of DB to create second policy, because "supported != working".
> > > What am I missing?
> >
> > vfio-pci uses the internal kernel API, ie. the variants of
> > pci_reset_function(), which is the same interface used by the existing
> > sysfs reset mechanism.  This proposed configuration of the reset method
> > would affect any driver using that same core infrastructure and from my
> > perspective that's really the goal.  In the case where a supported
> > reset mechanism fails for a device, continuing to quirk those out for
> > the best default behavior makes sense, I'd be disappointed for a vendor
> > to not pursue improving the default behavior where it clearly makes
> > sense.  However, there's also a policy decision, the kernel imposes a
> > preferential ordering of reset mechanism.  Is that ordering the best
> > case for all users?  I've presented above a case where a userspace may
> > prefer a policy of preferring a bus reset to a PM reset.  So I think
> > the question is not only are there supported mechanisms that don't
> > work, where this interface allows userspace to more readily identify
> > and work around those sorts of issues, but it also enables user
> > preference and easier evaluation whether all of the supported reset
> > mechanisms work rather than just the first one we encounter in the
> > ordering we've decided to impose today.  Thanks,
>
>
[...]
> And regarding vendors, see Amey response below about his touchpad troubles.
> The cheap electronics vendors don't care about their users.
>
> Thanks
>
On the side note that vendor probably doesn't care about
Linux users because even that reverted patch was submitted
by community member.
Many vendors are satisfied with windows only drivers.
They don't have any reason to support Linux. That doesn't
mean we should also abandon those users.

Thanks,
Amey


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-18 Thread Leon Romanovsky
On Thu, Mar 18, 2021 at 10:31:43PM +0530, Amey Narkhede wrote:
> On 21/03/18 04:57PM, Leon Romanovsky wrote:
> > On Thu, Mar 18, 2021 at 07:52:52PM +0530, Amey Narkhede wrote:
> > > On 21/03/18 11:09AM, Leon Romanovsky wrote:
> > > > On Wed, Mar 17, 2021 at 11:31:40AM -0600, Alex Williamson wrote:
> > > > > On Wed, 17 Mar 2021 15:58:40 +0200
> > > > > Leon Romanovsky  wrote:

<...>

> > > > I'm lost here, does vfio-pci use sysfs interface or internal to the 
> > > > kernel API?
> > > >
> > > > If it is latter then we don't really need sysfs, if not, we still need
> > > > some sort of DB to create second policy, because "supported != working".
> > > > What am I missing?
> > > >
> > > > Thanks
> > > >
> > > Can you explain bit more about why supported != working?
> >
> > It is written in the commit message of this patch.
> > https://lore.kernel.org/lkml/20210312173452.3855-1-ameynarkhed...@gmail.com/
> > "This feature aims to allow greater control of a device for use cases
> > as device assignment, where specific device or platform issues may
> > interact poorly with a given reset method, and for which device specific
> > quirks have not been developed."
> >
> > You wrote it and also repeated it a couple of times during the discussion.
> >
> > If device can understand that specific reset doesn't work, it won't
> > perform it in first place.
> >
> > Thanks
> Is it possible for device to understand whether or not specific reset
> will work or not prior to performing reset and after it indicates
> support for that reset method? Maybe theres problem with that particular
> piece of hardware in that machine.
> How can database be maintained if a particular machines have
> particular piece of faulty HW?

It was exactly the reason why I think that VM usecase presented by
you is not viable.

> If for some reason reset doesn't work it will just give -ENOTTY.
> This isn't any different from existing behavior.Actually it informs user
> that the reset method didn't reset the device and user can use different
> reset method instead of implicitly using different reset method.
> If user doesn't explicitly set preferred reset method then
> we go ahead with existing implicit fall through behavior which will try all
> available reset methods until any one of them works.
> If you have device that doesn't support reset at all then you have
> option to completely disable it unlike existing reset attribute where
> you cannot disable reset. So it gives greater control where you can
> disable the reset altogether when quirk isn't developed yet.

I explicitly asked to hear usecase, right now, I got an explanation from
Alex for policy decision (which doesn't need sysfs) and from you about
overcoming HW bugs with expectation that user will be guru of PCI reset
methods.

>
> We can't expect to develop quirk for every device in existence.

It doesn't give us an excuse do not try.

> For example on my laptop elantech touchpad still doesn't work in 2021
> with vanilla kernel, arch linux applies the patch which was reverted in
> mainline kernel for some reason.

I see it as a good example of cheap solution. Vendor won't fix your
touchpad because distros provide workaround. The same will be with reset.

Thanks

>
> Thanks,
> Amey


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-18 Thread Leon Romanovsky
On Thu, Mar 18, 2021 at 10:39:35AM -0600, Alex Williamson wrote:
> On Thu, 18 Mar 2021 11:09:34 +0200
> Leon Romanovsky  wrote:

<...>

> > I'm lost here, does vfio-pci use sysfs interface or internal to the kernel 
> > API?
> >
> > If it is latter then we don't really need sysfs, if not, we still need
> > some sort of DB to create second policy, because "supported != working".
> > What am I missing?
>
> vfio-pci uses the internal kernel API, ie. the variants of
> pci_reset_function(), which is the same interface used by the existing
> sysfs reset mechanism.  This proposed configuration of the reset method
> would affect any driver using that same core infrastructure and from my
> perspective that's really the goal.  In the case where a supported
> reset mechanism fails for a device, continuing to quirk those out for
> the best default behavior makes sense, I'd be disappointed for a vendor
> to not pursue improving the default behavior where it clearly makes
> sense.  However, there's also a policy decision, the kernel imposes a
> preferential ordering of reset mechanism.  Is that ordering the best
> case for all users?  I've presented above a case where a userspace may
> prefer a policy of preferring a bus reset to a PM reset.  So I think
> the question is not only are there supported mechanisms that don't
> work, where this interface allows userspace to more readily identify
> and work around those sorts of issues, but it also enables user
> preference and easier evaluation whether all of the supported reset
> mechanisms work rather than just the first one we encounter in the
> ordering we've decided to impose today.  Thanks,

Alex,

Which email client do you use?
Your responses are grouped as one huge block without any chance to respond
to you on specific point or answer to your question.

I see your flow and understand your position, but will repeat my
position. We need to make sure that vendors will have incentive to
supply quirks.

And regarding vendors, see Amey response below about his touchpad troubles.
The cheap electronics vendors don't care about their users.

Thanks

>
> Alex
>


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-18 Thread Amey Narkhede
On 21/03/18 04:57PM, Leon Romanovsky wrote:
> On Thu, Mar 18, 2021 at 07:52:52PM +0530, Amey Narkhede wrote:
> > On 21/03/18 11:09AM, Leon Romanovsky wrote:
> > > On Wed, Mar 17, 2021 at 11:31:40AM -0600, Alex Williamson wrote:
> > > > On Wed, 17 Mar 2021 15:58:40 +0200
> > > > Leon Romanovsky  wrote:
> > > >
> > > > > On Wed, Mar 17, 2021 at 06:47:18PM +0530, Amey Narkhede wrote:
> > > > > > On 21/03/17 01:47PM, Leon Romanovsky wrote:
> > > > > > > On Wed, Mar 17, 2021 at 04:53:09PM +0530, Amey Narkhede wrote:
> > > > > > > > On 21/03/17 01:02PM, Leon Romanovsky wrote:
> > > > > > > > > On Wed, Mar 17, 2021 at 03:54:47PM +0530, Amey Narkhede wrote:
> > > > > > > > > > On 21/03/17 06:20AM, Leon Romanovsky wrote:
> > > > > > > > > > > On Mon, Mar 15, 2021 at 06:32:32PM +, Raphael Norwitz 
> > > > > > > > > > > wrote:
> > > > > > > > > > > > On Mon, Mar 15, 2021 at 10:29:50AM -0600, Alex 
> > > > > > > > > > > > Williamson wrote:
> > > > > > > > > > > > > On Mon, 15 Mar 2021 21:03:41 +0530
> > > > > > > > > > > > > Amey Narkhede  wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On 21/03/15 05:07PM, Leon Romanovsky wrote:
> > > > > > > > > > > > > > > On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex 
> > > > > > > > > > > > > > > Williamson wrote:
> > > > > > > > > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey 
> > > > > > > > > > > > > > > > > Narkhede wrote:
> > > > > > > > > > > > > > > > > > slot reset (pci_dev_reset_slot_function) 
> > > > > > > > > > > > > > > > > > and secondary bus
> > > > > > > > > > > > > > > > > > reset(pci_parent_bus_reset) which I think 
> > > > > > > > > > > > > > > > > > are hot reset and
> > > > > > > > > > > > > > > > > > warm reset respectively.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. 
> > > > > > > > > > > > > > > > > Slot reset is just another
> > > > > > > > > > > > > > > > > type of reset, which is currently implemented 
> > > > > > > > > > > > > > > > > only for PCIe hot plug
> > > > > > > > > > > > > > > > > bridges and for PowerPC PowerNV platform and 
> > > > > > > > > > > > > > > > > it just call PCI secondary
> > > > > > > > > > > > > > > > > bus reset with some other hook. PCIe Warm 
> > > > > > > > > > > > > > > > > Reset does not have API in
> > > > > > > > > > > > > > > > > kernel and therefore drivers do not export 
> > > > > > > > > > > > > > > > > this type of reset via any
> > > > > > > > > > > > > > > > > kernel function (yet).
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Warm reset is beyond the scope of this series, 
> > > > > > > > > > > > > > > > but could be implemented
> > > > > > > > > > > > > > > > in a compatible way to fit within the 
> > > > > > > > > > > > > > > > pci_reset_fn_methods[] array
> > > > > > > > > > > > > > > > defined here.  Note that with this series the 
> > > > > > > > > > > > > > > > resets available through
> > > > > > > > > > > > > > > > pci_reset_function() and the per device reset 
> > > > > > > > > > > > > > > > attribute is sysfs remain
> > > > > > > > > > > > > > > > exactly the same as they are currently.  The 
> > > > > > > > > > > > > > > > bus and slot reset
> > > > > > > > > > > > > > > > methods used here are limited to devices where 
> > > > > > > > > > > > > > > > only a single function is
> > > > > > > > > > > > > > > > affected by the reset, therefore it is not like 
> > > > > > > > > > > > > > > > the patch you proposed
> > > > > > > > > > > > > > > > which performed a reset irrespective of the 
> > > > > > > > > > > > > > > > downstream devices.  This
> > > > > > > > > > > > > > > > series only enables selection of the existing 
> > > > > > > > > > > > > > > > methods.  Thanks,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Alex,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I asked the patch author here [1], but didn't get 
> > > > > > > > > > > > > > > any response, maybe
> > > > > > > > > > > > > > > you can answer me. What is the use case scenario 
> > > > > > > > > > > > > > > for this functionality?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > [1] 
> > > > > > > > > > > > > > > https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal/
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > Sorry for not responding immediately. There were 
> > > > > > > > > > > > > > some buggy wifi cards
> > > > > > > > > > > > > > which needed FLR explicitly not sure if that 
> > > > > > > > > > > > > > behavior is fixed in
> > > > > > > > > > > > > > drivers. Also there is use a case at Nutanix but 
> > > > > > > > > > > > > > the engineer who
> > > > > > > > > > > > > > is involved is on PTO that is why I did not respond 
> > > > > > > > > > > > > > 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-18 Thread Alex Williamson
On Thu, 18 Mar 2021 11:09:34 +0200
Leon Romanovsky  wrote:

> On Wed, Mar 17, 2021 at 11:31:40AM -0600, Alex Williamson wrote:
> > On Wed, 17 Mar 2021 15:58:40 +0200
> > Leon Romanovsky  wrote:
> >  
> > > On Wed, Mar 17, 2021 at 06:47:18PM +0530, Amey Narkhede wrote:  
> > > > On 21/03/17 01:47PM, Leon Romanovsky wrote:  
> > > > > On Wed, Mar 17, 2021 at 04:53:09PM +0530, Amey Narkhede wrote:  
> > > > > > On 21/03/17 01:02PM, Leon Romanovsky wrote:  
> > > > > > > On Wed, Mar 17, 2021 at 03:54:47PM +0530, Amey Narkhede wrote:  
> > > > > > > > On 21/03/17 06:20AM, Leon Romanovsky wrote:  
> > > > > > > > > On Mon, Mar 15, 2021 at 06:32:32PM +, Raphael Norwitz 
> > > > > > > > > wrote:  
> > > > > > > > > > On Mon, Mar 15, 2021 at 10:29:50AM -0600, Alex Williamson 
> > > > > > > > > > wrote:  
> > > > > > > > > > > On Mon, 15 Mar 2021 21:03:41 +0530
> > > > > > > > > > > Amey Narkhede  wrote:
> > > > > > > > > > >  
> > > > > > > > > > > > On 21/03/15 05:07PM, Leon Romanovsky wrote:  
> > > > > > > > > > > > > On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex 
> > > > > > > > > > > > > Williamson wrote:  
> > > > > > > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > > > > > >  
> > > > > > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede 
> > > > > > > > > > > > > > > wrote:  
> > > > > > > > > > > > > > > > slot reset (pci_dev_reset_slot_function) and 
> > > > > > > > > > > > > > > > secondary bus
> > > > > > > > > > > > > > > > reset(pci_parent_bus_reset) which I think are 
> > > > > > > > > > > > > > > > hot reset and
> > > > > > > > > > > > > > > > warm reset respectively.  
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. 
> > > > > > > > > > > > > > > Slot reset is just another
> > > > > > > > > > > > > > > type of reset, which is currently implemented 
> > > > > > > > > > > > > > > only for PCIe hot plug
> > > > > > > > > > > > > > > bridges and for PowerPC PowerNV platform and it 
> > > > > > > > > > > > > > > just call PCI secondary
> > > > > > > > > > > > > > > bus reset with some other hook. PCIe Warm Reset 
> > > > > > > > > > > > > > > does not have API in
> > > > > > > > > > > > > > > kernel and therefore drivers do not export this 
> > > > > > > > > > > > > > > type of reset via any
> > > > > > > > > > > > > > > kernel function (yet).  
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Warm reset is beyond the scope of this series, but 
> > > > > > > > > > > > > > could be implemented
> > > > > > > > > > > > > > in a compatible way to fit within the 
> > > > > > > > > > > > > > pci_reset_fn_methods[] array
> > > > > > > > > > > > > > defined here.  Note that with this series the 
> > > > > > > > > > > > > > resets available through
> > > > > > > > > > > > > > pci_reset_function() and the per device reset 
> > > > > > > > > > > > > > attribute is sysfs remain
> > > > > > > > > > > > > > exactly the same as they are currently.  The bus 
> > > > > > > > > > > > > > and slot reset
> > > > > > > > > > > > > > methods used here are limited to devices where only 
> > > > > > > > > > > > > > a single function is
> > > > > > > > > > > > > > affected by the reset, therefore it is not like the 
> > > > > > > > > > > > > > patch you proposed
> > > > > > > > > > > > > > which performed a reset irrespective of the 
> > > > > > > > > > > > > > downstream devices.  This
> > > > > > > > > > > > > > series only enables selection of the existing 
> > > > > > > > > > > > > > methods.  Thanks,  
> > > > > > > > > > > > >
> > > > > > > > > > > > > Alex,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I asked the patch author here [1], but didn't get any 
> > > > > > > > > > > > > response, maybe
> > > > > > > > > > > > > you can answer me. What is the use case scenario for 
> > > > > > > > > > > > > this functionality?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks
> > > > > > > > > > > > >
> > > > > > > > > > > > > [1] 
> > > > > > > > > > > > > https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal/
> > > > > > > > > > > > >  
> > > > > > > > > > > > Sorry for not responding immediately. There were some 
> > > > > > > > > > > > buggy wifi cards
> > > > > > > > > > > > which needed FLR explicitly not sure if that behavior 
> > > > > > > > > > > > is fixed in
> > > > > > > > > > > > drivers. Also there is use a case at Nutanix but the 
> > > > > > > > > > > > engineer who
> > > > > > > > > > > > is involved is on PTO that is why I did not respond 
> > > > > > > > > > > > immediately as
> > > > > > > > > > > > I don't know the details yet.  
> > > > > > > > > > >
> > > > > > > > > > > And more generally, devices continue to have reset issues 
> > > > > > > > > > > and we
> > > > > > > > > > > impose a fixed priority in our ordering.  We can and 
> > > > > > > > > > > probably should
> > > > > > > > > > > continue to quirk devices when we find broken 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-18 Thread Leon Romanovsky
On Thu, Mar 18, 2021 at 07:52:52PM +0530, Amey Narkhede wrote:
> On 21/03/18 11:09AM, Leon Romanovsky wrote:
> > On Wed, Mar 17, 2021 at 11:31:40AM -0600, Alex Williamson wrote:
> > > On Wed, 17 Mar 2021 15:58:40 +0200
> > > Leon Romanovsky  wrote:
> > >
> > > > On Wed, Mar 17, 2021 at 06:47:18PM +0530, Amey Narkhede wrote:
> > > > > On 21/03/17 01:47PM, Leon Romanovsky wrote:
> > > > > > On Wed, Mar 17, 2021 at 04:53:09PM +0530, Amey Narkhede wrote:
> > > > > > > On 21/03/17 01:02PM, Leon Romanovsky wrote:
> > > > > > > > On Wed, Mar 17, 2021 at 03:54:47PM +0530, Amey Narkhede wrote:
> > > > > > > > > On 21/03/17 06:20AM, Leon Romanovsky wrote:
> > > > > > > > > > On Mon, Mar 15, 2021 at 06:32:32PM +, Raphael Norwitz 
> > > > > > > > > > wrote:
> > > > > > > > > > > On Mon, Mar 15, 2021 at 10:29:50AM -0600, Alex Williamson 
> > > > > > > > > > > wrote:
> > > > > > > > > > > > On Mon, 15 Mar 2021 21:03:41 +0530
> > > > > > > > > > > > Amey Narkhede  wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > On 21/03/15 05:07PM, Leon Romanovsky wrote:
> > > > > > > > > > > > > > On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex 
> > > > > > > > > > > > > > Williamson wrote:
> > > > > > > > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede 
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > slot reset (pci_dev_reset_slot_function) and 
> > > > > > > > > > > > > > > > > secondary bus
> > > > > > > > > > > > > > > > > reset(pci_parent_bus_reset) which I think are 
> > > > > > > > > > > > > > > > > hot reset and
> > > > > > > > > > > > > > > > > warm reset respectively.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. 
> > > > > > > > > > > > > > > > Slot reset is just another
> > > > > > > > > > > > > > > > type of reset, which is currently implemented 
> > > > > > > > > > > > > > > > only for PCIe hot plug
> > > > > > > > > > > > > > > > bridges and for PowerPC PowerNV platform and it 
> > > > > > > > > > > > > > > > just call PCI secondary
> > > > > > > > > > > > > > > > bus reset with some other hook. PCIe Warm Reset 
> > > > > > > > > > > > > > > > does not have API in
> > > > > > > > > > > > > > > > kernel and therefore drivers do not export this 
> > > > > > > > > > > > > > > > type of reset via any
> > > > > > > > > > > > > > > > kernel function (yet).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Warm reset is beyond the scope of this series, 
> > > > > > > > > > > > > > > but could be implemented
> > > > > > > > > > > > > > > in a compatible way to fit within the 
> > > > > > > > > > > > > > > pci_reset_fn_methods[] array
> > > > > > > > > > > > > > > defined here.  Note that with this series the 
> > > > > > > > > > > > > > > resets available through
> > > > > > > > > > > > > > > pci_reset_function() and the per device reset 
> > > > > > > > > > > > > > > attribute is sysfs remain
> > > > > > > > > > > > > > > exactly the same as they are currently.  The bus 
> > > > > > > > > > > > > > > and slot reset
> > > > > > > > > > > > > > > methods used here are limited to devices where 
> > > > > > > > > > > > > > > only a single function is
> > > > > > > > > > > > > > > affected by the reset, therefore it is not like 
> > > > > > > > > > > > > > > the patch you proposed
> > > > > > > > > > > > > > > which performed a reset irrespective of the 
> > > > > > > > > > > > > > > downstream devices.  This
> > > > > > > > > > > > > > > series only enables selection of the existing 
> > > > > > > > > > > > > > > methods.  Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Alex,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I asked the patch author here [1], but didn't get 
> > > > > > > > > > > > > > any response, maybe
> > > > > > > > > > > > > > you can answer me. What is the use case scenario 
> > > > > > > > > > > > > > for this functionality?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > [1] 
> > > > > > > > > > > > > > https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal/
> > > > > > > > > > > > > >
> > > > > > > > > > > > > Sorry for not responding immediately. There were some 
> > > > > > > > > > > > > buggy wifi cards
> > > > > > > > > > > > > which needed FLR explicitly not sure if that behavior 
> > > > > > > > > > > > > is fixed in
> > > > > > > > > > > > > drivers. Also there is use a case at Nutanix but the 
> > > > > > > > > > > > > engineer who
> > > > > > > > > > > > > is involved is on PTO that is why I did not respond 
> > > > > > > > > > > > > immediately as
> > > > > > > > > > > > > I don't know the details yet.
> > > > > > > > > > > >
> > > > > > > > > > > > And more generally, devices continue to have reset 
> > > > > > > > > > > > issues and we
> > 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-18 Thread Amey Narkhede
On 21/03/17 09:13PM, Pali Rohár wrote:
> On Wednesday 17 March 2021 14:00:20 Alex Williamson wrote:
> > On Wed, 17 Mar 2021 20:40:24 +0100
> > Pali Rohár  wrote:
> >
> > > On Wednesday 17 March 2021 13:32:45 Alex Williamson wrote:
> > > > On Wed, 17 Mar 2021 20:24:24 +0100
> > > > Pali Rohár  wrote:
> > > >
> > > > > On Wednesday 17 March 2021 13:15:36 Alex Williamson wrote:
> > > > > > On Wed, 17 Mar 2021 20:02:06 +0100
> > > > > > Pali Rohár  wrote:
> > > > > >
> > > > > > > On Monday 15 March 2021 09:03:39 Alex Williamson wrote:
> > > > > > > > On Mon, 15 Mar 2021 15:52:38 +0100
> > > > > > > > Pali Rohár  wrote:
> > > > > > > >
> > > > > > > > > On Monday 15 March 2021 08:34:09 Alex Williamson wrote:
> > > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > >
> > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:
> > > > > > > > > > > > slot reset (pci_dev_reset_slot_function) and secondary 
> > > > > > > > > > > > bus
> > > > > > > > > > > > reset(pci_parent_bus_reset) which I think are hot reset 
> > > > > > > > > > > > and
> > > > > > > > > > > > warm reset respectively.
> > > > > > > > > > >
> > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset 
> > > > > > > > > > > is just another
> > > > > > > > > > > type of reset, which is currently implemented only for 
> > > > > > > > > > > PCIe hot plug
> > > > > > > > > > > bridges and for PowerPC PowerNV platform and it just call 
> > > > > > > > > > > PCI secondary
> > > > > > > > > > > bus reset with some other hook. PCIe Warm Reset does not 
> > > > > > > > > > > have API in
> > > > > > > > > > > kernel and therefore drivers do not export this type of 
> > > > > > > > > > > reset via any
> > > > > > > > > > > kernel function (yet).
> > > > > > > > > >
> > > > > > > > > > Warm reset is beyond the scope of this series, but could be 
> > > > > > > > > > implemented
> > > > > > > > > > in a compatible way to fit within the 
> > > > > > > > > > pci_reset_fn_methods[] array
> > > > > > > > > > defined here.
> > > > > > > > >
> > > > > > > > > Ok!
> > > > > > > > >
> > > > > > > > > > Note that with this series the resets available through
> > > > > > > > > > pci_reset_function() and the per device reset attribute is 
> > > > > > > > > > sysfs remain
> > > > > > > > > > exactly the same as they are currently.  The bus and slot 
> > > > > > > > > > reset
> > > > > > > > > > methods used here are limited to devices where only a 
> > > > > > > > > > single function is
> > > > > > > > > > affected by the reset, therefore it is not like the patch 
> > > > > > > > > > you proposed
> > > > > > > > > > which performed a reset irrespective of the downstream 
> > > > > > > > > > devices.  This
> > > > > > > > > > series only enables selection of the existing methods.  
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Alex
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > But with this patch series, there is still an issue with PCI 
> > > > > > > > > secondary
> > > > > > > > > bus reset mechanism as exported sysfs attribute does not do 
> > > > > > > > > that
> > > > > > > > > remove-reset-rescan procedure. As discussed in other thread, 
> > > > > > > > > this reset
> > > > > > > > > let device in unconfigured / broken state.
> > > > > > > >
> > > > > > > > No, there's not:
> > > > > > > >
> > > > > > > > int pci_reset_function(struct pci_dev *dev)
> > > > > > > > {
> > > > > > > > int rc;
> > > > > > > >
> > > > > > > > if (!dev->reset_fn)
> > > > > > > > return -ENOTTY;
> > > > > > > >
> > > > > > > > pci_dev_lock(dev);
> > > > > > > > >>> pci_dev_save_and_disable(dev);
> > > > > > > >
> > > > > > > > rc = __pci_reset_function_locked(dev);
> > > > > > > >
> > > > > > > > >>> pci_dev_restore(dev);
> > > > > > > > pci_dev_unlock(dev);
> > > > > > > >
> > > > > > > > return rc;
> > > > > > > > }
> > > > > > > >
> > > > > > > > The remove/re-scan was discussed primarily because your patch 
> > > > > > > > performed
> > > > > > > > a bus reset regardless of what devices were affected by that 
> > > > > > > > reset and
> > > > > > > > it's difficult to manage the scope where multiple devices are 
> > > > > > > > affected.
> > > > > > > > Here, the bus and slot reset functions will fail unless the 
> > > > > > > > scope is
> > > > > > > > limited to the single device triggering this reset.  Thanks,
> > > > > > > >
> > > > > > > > Alex
> > > > > > > >
> > > > > > >
> > > > > > > I was thinking a bit more about it and I'm really sure how it 
> > > > > > > would
> > > > > > > behave with hotplugging PCIe bridge.
> > > > > > >
> > > > > > > On aardvark PCIe controller I have already tested that secondary 
> > > > > > > bus
> > > > > > > reset bit is triggering Hot Reset event and then also Link Down 
> > > > > > > event.
> > > > > > > These events are not handled by aardvark driver 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-18 Thread Amey Narkhede
On 21/03/18 11:09AM, Leon Romanovsky wrote:
> On Wed, Mar 17, 2021 at 11:31:40AM -0600, Alex Williamson wrote:
> > On Wed, 17 Mar 2021 15:58:40 +0200
> > Leon Romanovsky  wrote:
> >
> > > On Wed, Mar 17, 2021 at 06:47:18PM +0530, Amey Narkhede wrote:
> > > > On 21/03/17 01:47PM, Leon Romanovsky wrote:
> > > > > On Wed, Mar 17, 2021 at 04:53:09PM +0530, Amey Narkhede wrote:
> > > > > > On 21/03/17 01:02PM, Leon Romanovsky wrote:
> > > > > > > On Wed, Mar 17, 2021 at 03:54:47PM +0530, Amey Narkhede wrote:
> > > > > > > > On 21/03/17 06:20AM, Leon Romanovsky wrote:
> > > > > > > > > On Mon, Mar 15, 2021 at 06:32:32PM +, Raphael Norwitz 
> > > > > > > > > wrote:
> > > > > > > > > > On Mon, Mar 15, 2021 at 10:29:50AM -0600, Alex Williamson 
> > > > > > > > > > wrote:
> > > > > > > > > > > On Mon, 15 Mar 2021 21:03:41 +0530
> > > > > > > > > > > Amey Narkhede  wrote:
> > > > > > > > > > >
> > > > > > > > > > > > On 21/03/15 05:07PM, Leon Romanovsky wrote:
> > > > > > > > > > > > > On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex 
> > > > > > > > > > > > > Williamson wrote:
> > > > > > > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede 
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > slot reset (pci_dev_reset_slot_function) and 
> > > > > > > > > > > > > > > > secondary bus
> > > > > > > > > > > > > > > > reset(pci_parent_bus_reset) which I think are 
> > > > > > > > > > > > > > > > hot reset and
> > > > > > > > > > > > > > > > warm reset respectively.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. 
> > > > > > > > > > > > > > > Slot reset is just another
> > > > > > > > > > > > > > > type of reset, which is currently implemented 
> > > > > > > > > > > > > > > only for PCIe hot plug
> > > > > > > > > > > > > > > bridges and for PowerPC PowerNV platform and it 
> > > > > > > > > > > > > > > just call PCI secondary
> > > > > > > > > > > > > > > bus reset with some other hook. PCIe Warm Reset 
> > > > > > > > > > > > > > > does not have API in
> > > > > > > > > > > > > > > kernel and therefore drivers do not export this 
> > > > > > > > > > > > > > > type of reset via any
> > > > > > > > > > > > > > > kernel function (yet).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Warm reset is beyond the scope of this series, but 
> > > > > > > > > > > > > > could be implemented
> > > > > > > > > > > > > > in a compatible way to fit within the 
> > > > > > > > > > > > > > pci_reset_fn_methods[] array
> > > > > > > > > > > > > > defined here.  Note that with this series the 
> > > > > > > > > > > > > > resets available through
> > > > > > > > > > > > > > pci_reset_function() and the per device reset 
> > > > > > > > > > > > > > attribute is sysfs remain
> > > > > > > > > > > > > > exactly the same as they are currently.  The bus 
> > > > > > > > > > > > > > and slot reset
> > > > > > > > > > > > > > methods used here are limited to devices where only 
> > > > > > > > > > > > > > a single function is
> > > > > > > > > > > > > > affected by the reset, therefore it is not like the 
> > > > > > > > > > > > > > patch you proposed
> > > > > > > > > > > > > > which performed a reset irrespective of the 
> > > > > > > > > > > > > > downstream devices.  This
> > > > > > > > > > > > > > series only enables selection of the existing 
> > > > > > > > > > > > > > methods.  Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Alex,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I asked the patch author here [1], but didn't get any 
> > > > > > > > > > > > > response, maybe
> > > > > > > > > > > > > you can answer me. What is the use case scenario for 
> > > > > > > > > > > > > this functionality?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks
> > > > > > > > > > > > >
> > > > > > > > > > > > > [1] 
> > > > > > > > > > > > > https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal/
> > > > > > > > > > > > >
> > > > > > > > > > > > Sorry for not responding immediately. There were some 
> > > > > > > > > > > > buggy wifi cards
> > > > > > > > > > > > which needed FLR explicitly not sure if that behavior 
> > > > > > > > > > > > is fixed in
> > > > > > > > > > > > drivers. Also there is use a case at Nutanix but the 
> > > > > > > > > > > > engineer who
> > > > > > > > > > > > is involved is on PTO that is why I did not respond 
> > > > > > > > > > > > immediately as
> > > > > > > > > > > > I don't know the details yet.
> > > > > > > > > > >
> > > > > > > > > > > And more generally, devices continue to have reset issues 
> > > > > > > > > > > and we
> > > > > > > > > > > impose a fixed priority in our ordering.  We can and 
> > > > > > > > > > > probably should
> > > > > > > > > > > continue to quirk devices when we find broken resets so 
> > > > > > > > > > > that we have
> > > > > > 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-18 Thread Leon Romanovsky
On Wed, Mar 17, 2021 at 11:31:40AM -0600, Alex Williamson wrote:
> On Wed, 17 Mar 2021 15:58:40 +0200
> Leon Romanovsky  wrote:
>
> > On Wed, Mar 17, 2021 at 06:47:18PM +0530, Amey Narkhede wrote:
> > > On 21/03/17 01:47PM, Leon Romanovsky wrote:
> > > > On Wed, Mar 17, 2021 at 04:53:09PM +0530, Amey Narkhede wrote:
> > > > > On 21/03/17 01:02PM, Leon Romanovsky wrote:
> > > > > > On Wed, Mar 17, 2021 at 03:54:47PM +0530, Amey Narkhede wrote:
> > > > > > > On 21/03/17 06:20AM, Leon Romanovsky wrote:
> > > > > > > > On Mon, Mar 15, 2021 at 06:32:32PM +, Raphael Norwitz wrote:
> > > > > > > > > On Mon, Mar 15, 2021 at 10:29:50AM -0600, Alex Williamson 
> > > > > > > > > wrote:
> > > > > > > > > > On Mon, 15 Mar 2021 21:03:41 +0530
> > > > > > > > > > Amey Narkhede  wrote:
> > > > > > > > > >
> > > > > > > > > > > On 21/03/15 05:07PM, Leon Romanovsky wrote:
> > > > > > > > > > > > On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex 
> > > > > > > > > > > > Williamson wrote:
> > > > > > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede 
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > slot reset (pci_dev_reset_slot_function) and 
> > > > > > > > > > > > > > > secondary bus
> > > > > > > > > > > > > > > reset(pci_parent_bus_reset) which I think are hot 
> > > > > > > > > > > > > > > reset and
> > > > > > > > > > > > > > > warm reset respectively.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot 
> > > > > > > > > > > > > > reset is just another
> > > > > > > > > > > > > > type of reset, which is currently implemented only 
> > > > > > > > > > > > > > for PCIe hot plug
> > > > > > > > > > > > > > bridges and for PowerPC PowerNV platform and it 
> > > > > > > > > > > > > > just call PCI secondary
> > > > > > > > > > > > > > bus reset with some other hook. PCIe Warm Reset 
> > > > > > > > > > > > > > does not have API in
> > > > > > > > > > > > > > kernel and therefore drivers do not export this 
> > > > > > > > > > > > > > type of reset via any
> > > > > > > > > > > > > > kernel function (yet).
> > > > > > > > > > > > >
> > > > > > > > > > > > > Warm reset is beyond the scope of this series, but 
> > > > > > > > > > > > > could be implemented
> > > > > > > > > > > > > in a compatible way to fit within the 
> > > > > > > > > > > > > pci_reset_fn_methods[] array
> > > > > > > > > > > > > defined here.  Note that with this series the resets 
> > > > > > > > > > > > > available through
> > > > > > > > > > > > > pci_reset_function() and the per device reset 
> > > > > > > > > > > > > attribute is sysfs remain
> > > > > > > > > > > > > exactly the same as they are currently.  The bus and 
> > > > > > > > > > > > > slot reset
> > > > > > > > > > > > > methods used here are limited to devices where only a 
> > > > > > > > > > > > > single function is
> > > > > > > > > > > > > affected by the reset, therefore it is not like the 
> > > > > > > > > > > > > patch you proposed
> > > > > > > > > > > > > which performed a reset irrespective of the 
> > > > > > > > > > > > > downstream devices.  This
> > > > > > > > > > > > > series only enables selection of the existing 
> > > > > > > > > > > > > methods.  Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Alex,
> > > > > > > > > > > >
> > > > > > > > > > > > I asked the patch author here [1], but didn't get any 
> > > > > > > > > > > > response, maybe
> > > > > > > > > > > > you can answer me. What is the use case scenario for 
> > > > > > > > > > > > this functionality?
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks
> > > > > > > > > > > >
> > > > > > > > > > > > [1] 
> > > > > > > > > > > > https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal/
> > > > > > > > > > > >
> > > > > > > > > > > Sorry for not responding immediately. There were some 
> > > > > > > > > > > buggy wifi cards
> > > > > > > > > > > which needed FLR explicitly not sure if that behavior is 
> > > > > > > > > > > fixed in
> > > > > > > > > > > drivers. Also there is use a case at Nutanix but the 
> > > > > > > > > > > engineer who
> > > > > > > > > > > is involved is on PTO that is why I did not respond 
> > > > > > > > > > > immediately as
> > > > > > > > > > > I don't know the details yet.
> > > > > > > > > >
> > > > > > > > > > And more generally, devices continue to have reset issues 
> > > > > > > > > > and we
> > > > > > > > > > impose a fixed priority in our ordering.  We can and 
> > > > > > > > > > probably should
> > > > > > > > > > continue to quirk devices when we find broken resets so 
> > > > > > > > > > that we have
> > > > > > > > > > the best default behavior, but it's currently not easy for 
> > > > > > > > > > an end user
> > > > > > > > > > to experiment, ie. this reset works, that one doesn't.  We 
> > > > > > > > > > might also
> > > > > > > > > > have 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-17 Thread Pali Rohár
On Wednesday 17 March 2021 14:00:20 Alex Williamson wrote:
> On Wed, 17 Mar 2021 20:40:24 +0100
> Pali Rohár  wrote:
> 
> > On Wednesday 17 March 2021 13:32:45 Alex Williamson wrote:
> > > On Wed, 17 Mar 2021 20:24:24 +0100
> > > Pali Rohár  wrote:
> > >   
> > > > On Wednesday 17 March 2021 13:15:36 Alex Williamson wrote:  
> > > > > On Wed, 17 Mar 2021 20:02:06 +0100
> > > > > Pali Rohár  wrote:
> > > > > 
> > > > > > On Monday 15 March 2021 09:03:39 Alex Williamson wrote:
> > > > > > > On Mon, 15 Mar 2021 15:52:38 +0100
> > > > > > > Pali Rohár  wrote:
> > > > > > >   
> > > > > > > > On Monday 15 March 2021 08:34:09 Alex Williamson wrote:  
> > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > 
> > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:   
> > > > > > > > > >  
> > > > > > > > > > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > > > > > > > > > reset(pci_parent_bus_reset) which I think are hot reset 
> > > > > > > > > > > and
> > > > > > > > > > > warm reset respectively.  
> > > > > > > > > > 
> > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is 
> > > > > > > > > > just another
> > > > > > > > > > type of reset, which is currently implemented only for PCIe 
> > > > > > > > > > hot plug
> > > > > > > > > > bridges and for PowerPC PowerNV platform and it just call 
> > > > > > > > > > PCI secondary
> > > > > > > > > > bus reset with some other hook. PCIe Warm Reset does not 
> > > > > > > > > > have API in
> > > > > > > > > > kernel and therefore drivers do not export this type of 
> > > > > > > > > > reset via any
> > > > > > > > > > kernel function (yet).
> > > > > > > > > 
> > > > > > > > > Warm reset is beyond the scope of this series, but could be 
> > > > > > > > > implemented
> > > > > > > > > in a compatible way to fit within the pci_reset_fn_methods[] 
> > > > > > > > > array
> > > > > > > > > defined here.
> > > > > > > > 
> > > > > > > > Ok!
> > > > > > > >   
> > > > > > > > > Note that with this series the resets available through
> > > > > > > > > pci_reset_function() and the per device reset attribute is 
> > > > > > > > > sysfs remain
> > > > > > > > > exactly the same as they are currently.  The bus and slot 
> > > > > > > > > reset
> > > > > > > > > methods used here are limited to devices where only a single 
> > > > > > > > > function is
> > > > > > > > > affected by the reset, therefore it is not like the patch you 
> > > > > > > > > proposed
> > > > > > > > > which performed a reset irrespective of the downstream 
> > > > > > > > > devices.  This
> > > > > > > > > series only enables selection of the existing methods.  
> > > > > > > > > Thanks,
> > > > > > > > > 
> > > > > > > > > Alex
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > But with this patch series, there is still an issue with PCI 
> > > > > > > > secondary
> > > > > > > > bus reset mechanism as exported sysfs attribute does not do that
> > > > > > > > remove-reset-rescan procedure. As discussed in other thread, 
> > > > > > > > this reset
> > > > > > > > let device in unconfigured / broken state.  
> > > > > > > 
> > > > > > > No, there's not:
> > > > > > > 
> > > > > > > int pci_reset_function(struct pci_dev *dev)
> > > > > > > {
> > > > > > > int rc;
> > > > > > > 
> > > > > > > if (!dev->reset_fn)
> > > > > > > return -ENOTTY;
> > > > > > > 
> > > > > > > pci_dev_lock(dev);  
> > > > > > > >>> pci_dev_save_and_disable(dev);  
> > > > > > > 
> > > > > > > rc = __pci_reset_function_locked(dev);
> > > > > > >   
> > > > > > > >>> pci_dev_restore(dev);  
> > > > > > > pci_dev_unlock(dev);
> > > > > > > 
> > > > > > > return rc;
> > > > > > > }
> > > > > > > 
> > > > > > > The remove/re-scan was discussed primarily because your patch 
> > > > > > > performed
> > > > > > > a bus reset regardless of what devices were affected by that 
> > > > > > > reset and
> > > > > > > it's difficult to manage the scope where multiple devices are 
> > > > > > > affected.
> > > > > > > Here, the bus and slot reset functions will fail unless the scope 
> > > > > > > is
> > > > > > > limited to the single device triggering this reset.  Thanks,
> > > > > > > 
> > > > > > > Alex
> > > > > > >   
> > > > > > 
> > > > > > I was thinking a bit more about it and I'm really sure how it would
> > > > > > behave with hotplugging PCIe bridge.
> > > > > > 
> > > > > > On aardvark PCIe controller I have already tested that secondary bus
> > > > > > reset bit is triggering Hot Reset event and then also Link Down 
> > > > > > event.
> > > > > > These events are not handled by aardvark driver yet (needs to
> > > > > > implemented into kernel's emulated root bridge code).
> > > > > > 
> > > > > > But I'm not sure how it would behave on real HW PCIe hotplugging 
> 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-17 Thread Alex Williamson
On Wed, 17 Mar 2021 20:40:24 +0100
Pali Rohár  wrote:

> On Wednesday 17 March 2021 13:32:45 Alex Williamson wrote:
> > On Wed, 17 Mar 2021 20:24:24 +0100
> > Pali Rohár  wrote:
> >   
> > > On Wednesday 17 March 2021 13:15:36 Alex Williamson wrote:  
> > > > On Wed, 17 Mar 2021 20:02:06 +0100
> > > > Pali Rohár  wrote:
> > > > 
> > > > > On Monday 15 March 2021 09:03:39 Alex Williamson wrote:
> > > > > > On Mon, 15 Mar 2021 15:52:38 +0100
> > > > > > Pali Rohár  wrote:
> > > > > >   
> > > > > > > On Monday 15 March 2021 08:34:09 Alex Williamson wrote:  
> > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > Pali Rohár  wrote:
> > > > > > > > 
> > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:
> > > > > > > > > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > > > > > > > > reset(pci_parent_bus_reset) which I think are hot reset and
> > > > > > > > > > warm reset respectively.  
> > > > > > > > > 
> > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is 
> > > > > > > > > just another
> > > > > > > > > type of reset, which is currently implemented only for PCIe 
> > > > > > > > > hot plug
> > > > > > > > > bridges and for PowerPC PowerNV platform and it just call PCI 
> > > > > > > > > secondary
> > > > > > > > > bus reset with some other hook. PCIe Warm Reset does not have 
> > > > > > > > > API in
> > > > > > > > > kernel and therefore drivers do not export this type of reset 
> > > > > > > > > via any
> > > > > > > > > kernel function (yet).
> > > > > > > > 
> > > > > > > > Warm reset is beyond the scope of this series, but could be 
> > > > > > > > implemented
> > > > > > > > in a compatible way to fit within the pci_reset_fn_methods[] 
> > > > > > > > array
> > > > > > > > defined here.
> > > > > > > 
> > > > > > > Ok!
> > > > > > >   
> > > > > > > > Note that with this series the resets available through
> > > > > > > > pci_reset_function() and the per device reset attribute is 
> > > > > > > > sysfs remain
> > > > > > > > exactly the same as they are currently.  The bus and slot reset
> > > > > > > > methods used here are limited to devices where only a single 
> > > > > > > > function is
> > > > > > > > affected by the reset, therefore it is not like the patch you 
> > > > > > > > proposed
> > > > > > > > which performed a reset irrespective of the downstream devices. 
> > > > > > > >  This
> > > > > > > > series only enables selection of the existing methods.  Thanks,
> > > > > > > > 
> > > > > > > > Alex
> > > > > > > > 
> > > > > > > 
> > > > > > > But with this patch series, there is still an issue with PCI 
> > > > > > > secondary
> > > > > > > bus reset mechanism as exported sysfs attribute does not do that
> > > > > > > remove-reset-rescan procedure. As discussed in other thread, this 
> > > > > > > reset
> > > > > > > let device in unconfigured / broken state.  
> > > > > > 
> > > > > > No, there's not:
> > > > > > 
> > > > > > int pci_reset_function(struct pci_dev *dev)
> > > > > > {
> > > > > > int rc;
> > > > > > 
> > > > > > if (!dev->reset_fn)
> > > > > > return -ENOTTY;
> > > > > > 
> > > > > > pci_dev_lock(dev);  
> > > > > > >>> pci_dev_save_and_disable(dev);  
> > > > > > 
> > > > > > rc = __pci_reset_function_locked(dev);
> > > > > >   
> > > > > > >>> pci_dev_restore(dev);  
> > > > > > pci_dev_unlock(dev);
> > > > > > 
> > > > > > return rc;
> > > > > > }
> > > > > > 
> > > > > > The remove/re-scan was discussed primarily because your patch 
> > > > > > performed
> > > > > > a bus reset regardless of what devices were affected by that reset 
> > > > > > and
> > > > > > it's difficult to manage the scope where multiple devices are 
> > > > > > affected.
> > > > > > Here, the bus and slot reset functions will fail unless the scope is
> > > > > > limited to the single device triggering this reset.  Thanks,
> > > > > > 
> > > > > > Alex
> > > > > >   
> > > > > 
> > > > > I was thinking a bit more about it and I'm really sure how it would
> > > > > behave with hotplugging PCIe bridge.
> > > > > 
> > > > > On aardvark PCIe controller I have already tested that secondary bus
> > > > > reset bit is triggering Hot Reset event and then also Link Down event.
> > > > > These events are not handled by aardvark driver yet (needs to
> > > > > implemented into kernel's emulated root bridge code).
> > > > > 
> > > > > But I'm not sure how it would behave on real HW PCIe hotplugging 
> > > > > bridge.
> > > > > Kernel has already code which removes PCIe device if it changes 
> > > > > presence
> > > > > bit (and inform via interrupt). And Link Down event triggers this
> > > > > change.
> > > > 
> > > > This is the difference between slot and bus resets, the slot reset is
> > > > implemented by the hotplug controller and disables presence detection
> > > > 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-17 Thread Pali Rohár
On Wednesday 17 March 2021 13:32:45 Alex Williamson wrote:
> On Wed, 17 Mar 2021 20:24:24 +0100
> Pali Rohár  wrote:
> 
> > On Wednesday 17 March 2021 13:15:36 Alex Williamson wrote:
> > > On Wed, 17 Mar 2021 20:02:06 +0100
> > > Pali Rohár  wrote:
> > >   
> > > > On Monday 15 March 2021 09:03:39 Alex Williamson wrote:  
> > > > > On Mon, 15 Mar 2021 15:52:38 +0100
> > > > > Pali Rohár  wrote:
> > > > > 
> > > > > > On Monday 15 March 2021 08:34:09 Alex Williamson wrote:
> > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > Pali Rohár  wrote:
> > > > > > >   
> > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:  
> > > > > > > > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > > > > > > > reset(pci_parent_bus_reset) which I think are hot reset and
> > > > > > > > > warm reset respectively.
> > > > > > > > 
> > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is 
> > > > > > > > just another
> > > > > > > > type of reset, which is currently implemented only for PCIe hot 
> > > > > > > > plug
> > > > > > > > bridges and for PowerPC PowerNV platform and it just call PCI 
> > > > > > > > secondary
> > > > > > > > bus reset with some other hook. PCIe Warm Reset does not have 
> > > > > > > > API in
> > > > > > > > kernel and therefore drivers do not export this type of reset 
> > > > > > > > via any
> > > > > > > > kernel function (yet).  
> > > > > > > 
> > > > > > > Warm reset is beyond the scope of this series, but could be 
> > > > > > > implemented
> > > > > > > in a compatible way to fit within the pci_reset_fn_methods[] array
> > > > > > > defined here.  
> > > > > > 
> > > > > > Ok!
> > > > > > 
> > > > > > > Note that with this series the resets available through
> > > > > > > pci_reset_function() and the per device reset attribute is sysfs 
> > > > > > > remain
> > > > > > > exactly the same as they are currently.  The bus and slot reset
> > > > > > > methods used here are limited to devices where only a single 
> > > > > > > function is
> > > > > > > affected by the reset, therefore it is not like the patch you 
> > > > > > > proposed
> > > > > > > which performed a reset irrespective of the downstream devices.  
> > > > > > > This
> > > > > > > series only enables selection of the existing methods.  Thanks,
> > > > > > > 
> > > > > > > Alex
> > > > > > >   
> > > > > > 
> > > > > > But with this patch series, there is still an issue with PCI 
> > > > > > secondary
> > > > > > bus reset mechanism as exported sysfs attribute does not do that
> > > > > > remove-reset-rescan procedure. As discussed in other thread, this 
> > > > > > reset
> > > > > > let device in unconfigured / broken state.
> > > > > 
> > > > > No, there's not:
> > > > > 
> > > > > int pci_reset_function(struct pci_dev *dev)
> > > > > {
> > > > > int rc;
> > > > > 
> > > > > if (!dev->reset_fn)
> > > > > return -ENOTTY;
> > > > > 
> > > > > pci_dev_lock(dev);
> > > > > >>> pci_dev_save_and_disable(dev);
> > > > > 
> > > > > rc = __pci_reset_function_locked(dev);
> > > > > 
> > > > > >>> pci_dev_restore(dev);
> > > > > pci_dev_unlock(dev);
> > > > > 
> > > > > return rc;
> > > > > }
> > > > > 
> > > > > The remove/re-scan was discussed primarily because your patch 
> > > > > performed
> > > > > a bus reset regardless of what devices were affected by that reset and
> > > > > it's difficult to manage the scope where multiple devices are 
> > > > > affected.
> > > > > Here, the bus and slot reset functions will fail unless the scope is
> > > > > limited to the single device triggering this reset.  Thanks,
> > > > > 
> > > > > Alex
> > > > > 
> > > > 
> > > > I was thinking a bit more about it and I'm really sure how it would
> > > > behave with hotplugging PCIe bridge.
> > > > 
> > > > On aardvark PCIe controller I have already tested that secondary bus
> > > > reset bit is triggering Hot Reset event and then also Link Down event.
> > > > These events are not handled by aardvark driver yet (needs to
> > > > implemented into kernel's emulated root bridge code).
> > > > 
> > > > But I'm not sure how it would behave on real HW PCIe hotplugging bridge.
> > > > Kernel has already code which removes PCIe device if it changes presence
> > > > bit (and inform via interrupt). And Link Down event triggers this
> > > > change.  
> > > 
> > > This is the difference between slot and bus resets, the slot reset is
> > > implemented by the hotplug controller and disables presence detection
> > > around the bus reset.  Thanks,  
> > 
> > Yes, but I'm talking about bus reset, not about slot reset.
> > 
> > I mean: to use bus reset via sysfs on hardware which supports slots and
> > hotplugging.
> > 
> > And if I'm reading code correctly, this combination is allowed, right?
> > Via these new patches it is possible to disable slot reset and enable
> > bus 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-17 Thread Alex Williamson
On Wed, 17 Mar 2021 20:24:24 +0100
Pali Rohár  wrote:

> On Wednesday 17 March 2021 13:15:36 Alex Williamson wrote:
> > On Wed, 17 Mar 2021 20:02:06 +0100
> > Pali Rohár  wrote:
> >   
> > > On Monday 15 March 2021 09:03:39 Alex Williamson wrote:  
> > > > On Mon, 15 Mar 2021 15:52:38 +0100
> > > > Pali Rohár  wrote:
> > > > 
> > > > > On Monday 15 March 2021 08:34:09 Alex Williamson wrote:
> > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > Pali Rohár  wrote:
> > > > > >   
> > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:  
> > > > > > > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > > > > > > reset(pci_parent_bus_reset) which I think are hot reset and
> > > > > > > > warm reset respectively.
> > > > > > > 
> > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just 
> > > > > > > another
> > > > > > > type of reset, which is currently implemented only for PCIe hot 
> > > > > > > plug
> > > > > > > bridges and for PowerPC PowerNV platform and it just call PCI 
> > > > > > > secondary
> > > > > > > bus reset with some other hook. PCIe Warm Reset does not have API 
> > > > > > > in
> > > > > > > kernel and therefore drivers do not export this type of reset via 
> > > > > > > any
> > > > > > > kernel function (yet).  
> > > > > > 
> > > > > > Warm reset is beyond the scope of this series, but could be 
> > > > > > implemented
> > > > > > in a compatible way to fit within the pci_reset_fn_methods[] array
> > > > > > defined here.  
> > > > > 
> > > > > Ok!
> > > > > 
> > > > > > Note that with this series the resets available through
> > > > > > pci_reset_function() and the per device reset attribute is sysfs 
> > > > > > remain
> > > > > > exactly the same as they are currently.  The bus and slot reset
> > > > > > methods used here are limited to devices where only a single 
> > > > > > function is
> > > > > > affected by the reset, therefore it is not like the patch you 
> > > > > > proposed
> > > > > > which performed a reset irrespective of the downstream devices.  
> > > > > > This
> > > > > > series only enables selection of the existing methods.  Thanks,
> > > > > > 
> > > > > > Alex
> > > > > >   
> > > > > 
> > > > > But with this patch series, there is still an issue with PCI secondary
> > > > > bus reset mechanism as exported sysfs attribute does not do that
> > > > > remove-reset-rescan procedure. As discussed in other thread, this 
> > > > > reset
> > > > > let device in unconfigured / broken state.
> > > > 
> > > > No, there's not:
> > > > 
> > > > int pci_reset_function(struct pci_dev *dev)
> > > > {
> > > > int rc;
> > > > 
> > > > if (!dev->reset_fn)
> > > > return -ENOTTY;
> > > > 
> > > > pci_dev_lock(dev);
> > > > >>> pci_dev_save_and_disable(dev);
> > > > 
> > > > rc = __pci_reset_function_locked(dev);
> > > > 
> > > > >>> pci_dev_restore(dev);
> > > > pci_dev_unlock(dev);
> > > > 
> > > > return rc;
> > > > }
> > > > 
> > > > The remove/re-scan was discussed primarily because your patch performed
> > > > a bus reset regardless of what devices were affected by that reset and
> > > > it's difficult to manage the scope where multiple devices are affected.
> > > > Here, the bus and slot reset functions will fail unless the scope is
> > > > limited to the single device triggering this reset.  Thanks,
> > > > 
> > > > Alex
> > > > 
> > > 
> > > I was thinking a bit more about it and I'm really sure how it would
> > > behave with hotplugging PCIe bridge.
> > > 
> > > On aardvark PCIe controller I have already tested that secondary bus
> > > reset bit is triggering Hot Reset event and then also Link Down event.
> > > These events are not handled by aardvark driver yet (needs to
> > > implemented into kernel's emulated root bridge code).
> > > 
> > > But I'm not sure how it would behave on real HW PCIe hotplugging bridge.
> > > Kernel has already code which removes PCIe device if it changes presence
> > > bit (and inform via interrupt). And Link Down event triggers this
> > > change.  
> > 
> > This is the difference between slot and bus resets, the slot reset is
> > implemented by the hotplug controller and disables presence detection
> > around the bus reset.  Thanks,  
> 
> Yes, but I'm talking about bus reset, not about slot reset.
> 
> I mean: to use bus reset via sysfs on hardware which supports slots and
> hotplugging.
> 
> And if I'm reading code correctly, this combination is allowed, right?
> Via these new patches it is possible to disable slot reset and enable
> bus reset.

That's true, a slot reset is simply a bus reset wrapped around code
that prevents the device from getting ejected.  Maybe it would make
sense to combine the two as far as this interface is concerned, ie. a
single "bus" reset method that will always use slot reset when
available.  Thanks,

Alex



Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-17 Thread Pali Rohár
On Wednesday 17 March 2021 13:15:36 Alex Williamson wrote:
> On Wed, 17 Mar 2021 20:02:06 +0100
> Pali Rohár  wrote:
> 
> > On Monday 15 March 2021 09:03:39 Alex Williamson wrote:
> > > On Mon, 15 Mar 2021 15:52:38 +0100
> > > Pali Rohár  wrote:
> > >   
> > > > On Monday 15 March 2021 08:34:09 Alex Williamson wrote:  
> > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > Pali Rohár  wrote:
> > > > > 
> > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:
> > > > > > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > > > > > reset(pci_parent_bus_reset) which I think are hot reset and
> > > > > > > warm reset respectively.  
> > > > > > 
> > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just 
> > > > > > another
> > > > > > type of reset, which is currently implemented only for PCIe hot plug
> > > > > > bridges and for PowerPC PowerNV platform and it just call PCI 
> > > > > > secondary
> > > > > > bus reset with some other hook. PCIe Warm Reset does not have API in
> > > > > > kernel and therefore drivers do not export this type of reset via 
> > > > > > any
> > > > > > kernel function (yet).
> > > > > 
> > > > > Warm reset is beyond the scope of this series, but could be 
> > > > > implemented
> > > > > in a compatible way to fit within the pci_reset_fn_methods[] array
> > > > > defined here.
> > > > 
> > > > Ok!
> > > >   
> > > > > Note that with this series the resets available through
> > > > > pci_reset_function() and the per device reset attribute is sysfs 
> > > > > remain
> > > > > exactly the same as they are currently.  The bus and slot reset
> > > > > methods used here are limited to devices where only a single function 
> > > > > is
> > > > > affected by the reset, therefore it is not like the patch you proposed
> > > > > which performed a reset irrespective of the downstream devices.  This
> > > > > series only enables selection of the existing methods.  Thanks,
> > > > > 
> > > > > Alex
> > > > > 
> > > > 
> > > > But with this patch series, there is still an issue with PCI secondary
> > > > bus reset mechanism as exported sysfs attribute does not do that
> > > > remove-reset-rescan procedure. As discussed in other thread, this reset
> > > > let device in unconfigured / broken state.  
> > > 
> > > No, there's not:
> > > 
> > > int pci_reset_function(struct pci_dev *dev)
> > > {
> > > int rc;
> > > 
> > > if (!dev->reset_fn)
> > > return -ENOTTY;
> > > 
> > > pci_dev_lock(dev);  
> > > >>> pci_dev_save_and_disable(dev);  
> > > 
> > > rc = __pci_reset_function_locked(dev);
> > >   
> > > >>> pci_dev_restore(dev);  
> > > pci_dev_unlock(dev);
> > > 
> > > return rc;
> > > }
> > > 
> > > The remove/re-scan was discussed primarily because your patch performed
> > > a bus reset regardless of what devices were affected by that reset and
> > > it's difficult to manage the scope where multiple devices are affected.
> > > Here, the bus and slot reset functions will fail unless the scope is
> > > limited to the single device triggering this reset.  Thanks,
> > > 
> > > Alex
> > >   
> > 
> > I was thinking a bit more about it and I'm really sure how it would
> > behave with hotplugging PCIe bridge.
> > 
> > On aardvark PCIe controller I have already tested that secondary bus
> > reset bit is triggering Hot Reset event and then also Link Down event.
> > These events are not handled by aardvark driver yet (needs to
> > implemented into kernel's emulated root bridge code).
> > 
> > But I'm not sure how it would behave on real HW PCIe hotplugging bridge.
> > Kernel has already code which removes PCIe device if it changes presence
> > bit (and inform via interrupt). And Link Down event triggers this
> > change.
> 
> This is the difference between slot and bus resets, the slot reset is
> implemented by the hotplug controller and disables presence detection
> around the bus reset.  Thanks,

Yes, but I'm talking about bus reset, not about slot reset.

I mean: to use bus reset via sysfs on hardware which supports slots and
hotplugging.

And if I'm reading code correctly, this combination is allowed, right?
Via these new patches it is possible to disable slot reset and enable
bus reset.


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-17 Thread Alex Williamson
On Wed, 17 Mar 2021 20:02:06 +0100
Pali Rohár  wrote:

> On Monday 15 March 2021 09:03:39 Alex Williamson wrote:
> > On Mon, 15 Mar 2021 15:52:38 +0100
> > Pali Rohár  wrote:
> >   
> > > On Monday 15 March 2021 08:34:09 Alex Williamson wrote:  
> > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > Pali Rohár  wrote:
> > > > 
> > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:
> > > > > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > > > > reset(pci_parent_bus_reset) which I think are hot reset and
> > > > > > warm reset respectively.  
> > > > > 
> > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just 
> > > > > another
> > > > > type of reset, which is currently implemented only for PCIe hot plug
> > > > > bridges and for PowerPC PowerNV platform and it just call PCI 
> > > > > secondary
> > > > > bus reset with some other hook. PCIe Warm Reset does not have API in
> > > > > kernel and therefore drivers do not export this type of reset via any
> > > > > kernel function (yet).
> > > > 
> > > > Warm reset is beyond the scope of this series, but could be implemented
> > > > in a compatible way to fit within the pci_reset_fn_methods[] array
> > > > defined here.
> > > 
> > > Ok!
> > >   
> > > > Note that with this series the resets available through
> > > > pci_reset_function() and the per device reset attribute is sysfs remain
> > > > exactly the same as they are currently.  The bus and slot reset
> > > > methods used here are limited to devices where only a single function is
> > > > affected by the reset, therefore it is not like the patch you proposed
> > > > which performed a reset irrespective of the downstream devices.  This
> > > > series only enables selection of the existing methods.  Thanks,
> > > > 
> > > > Alex
> > > > 
> > > 
> > > But with this patch series, there is still an issue with PCI secondary
> > > bus reset mechanism as exported sysfs attribute does not do that
> > > remove-reset-rescan procedure. As discussed in other thread, this reset
> > > let device in unconfigured / broken state.  
> > 
> > No, there's not:
> > 
> > int pci_reset_function(struct pci_dev *dev)
> > {
> > int rc;
> > 
> > if (!dev->reset_fn)
> > return -ENOTTY;
> > 
> > pci_dev_lock(dev);  
> > >>> pci_dev_save_and_disable(dev);  
> > 
> > rc = __pci_reset_function_locked(dev);
> >   
> > >>> pci_dev_restore(dev);  
> > pci_dev_unlock(dev);
> > 
> > return rc;
> > }
> > 
> > The remove/re-scan was discussed primarily because your patch performed
> > a bus reset regardless of what devices were affected by that reset and
> > it's difficult to manage the scope where multiple devices are affected.
> > Here, the bus and slot reset functions will fail unless the scope is
> > limited to the single device triggering this reset.  Thanks,
> > 
> > Alex
> >   
> 
> I was thinking a bit more about it and I'm really sure how it would
> behave with hotplugging PCIe bridge.
> 
> On aardvark PCIe controller I have already tested that secondary bus
> reset bit is triggering Hot Reset event and then also Link Down event.
> These events are not handled by aardvark driver yet (needs to
> implemented into kernel's emulated root bridge code).
> 
> But I'm not sure how it would behave on real HW PCIe hotplugging bridge.
> Kernel has already code which removes PCIe device if it changes presence
> bit (and inform via interrupt). And Link Down event triggers this
> change.

This is the difference between slot and bus resets, the slot reset is
implemented by the hotplug controller and disables presence detection
around the bus reset.  Thanks,

Alex



Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-17 Thread Pali Rohár
On Monday 15 March 2021 09:03:39 Alex Williamson wrote:
> On Mon, 15 Mar 2021 15:52:38 +0100
> Pali Rohár  wrote:
> 
> > On Monday 15 March 2021 08:34:09 Alex Williamson wrote:
> > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > Pali Rohár  wrote:
> > >   
> > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:  
> > > > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > > > reset(pci_parent_bus_reset) which I think are hot reset and
> > > > > warm reset respectively.
> > > > 
> > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just another
> > > > type of reset, which is currently implemented only for PCIe hot plug
> > > > bridges and for PowerPC PowerNV platform and it just call PCI secondary
> > > > bus reset with some other hook. PCIe Warm Reset does not have API in
> > > > kernel and therefore drivers do not export this type of reset via any
> > > > kernel function (yet).  
> > > 
> > > Warm reset is beyond the scope of this series, but could be implemented
> > > in a compatible way to fit within the pci_reset_fn_methods[] array
> > > defined here.  
> > 
> > Ok!
> > 
> > > Note that with this series the resets available through
> > > pci_reset_function() and the per device reset attribute is sysfs remain
> > > exactly the same as they are currently.  The bus and slot reset
> > > methods used here are limited to devices where only a single function is
> > > affected by the reset, therefore it is not like the patch you proposed
> > > which performed a reset irrespective of the downstream devices.  This
> > > series only enables selection of the existing methods.  Thanks,
> > > 
> > > Alex
> > >   
> > 
> > But with this patch series, there is still an issue with PCI secondary
> > bus reset mechanism as exported sysfs attribute does not do that
> > remove-reset-rescan procedure. As discussed in other thread, this reset
> > let device in unconfigured / broken state.
> 
> No, there's not:
> 
> int pci_reset_function(struct pci_dev *dev)
> {
> int rc;
> 
> if (!dev->reset_fn)
> return -ENOTTY;
> 
> pci_dev_lock(dev);
> >>> pci_dev_save_and_disable(dev);
> 
> rc = __pci_reset_function_locked(dev);
> 
> >>> pci_dev_restore(dev);
> pci_dev_unlock(dev);
> 
> return rc;
> }
> 
> The remove/re-scan was discussed primarily because your patch performed
> a bus reset regardless of what devices were affected by that reset and
> it's difficult to manage the scope where multiple devices are affected.
> Here, the bus and slot reset functions will fail unless the scope is
> limited to the single device triggering this reset.  Thanks,
> 
> Alex
> 

I was thinking a bit more about it and I'm really sure how it would
behave with hotplugging PCIe bridge.

On aardvark PCIe controller I have already tested that secondary bus
reset bit is triggering Hot Reset event and then also Link Down event.
These events are not handled by aardvark driver yet (needs to
implemented into kernel's emulated root bridge code).

But I'm not sure how it would behave on real HW PCIe hotplugging bridge.
Kernel has already code which removes PCIe device if it changes presence
bit (and inform via interrupt). And Link Down event triggers this
change.

Can somebody test these changes on some PCIe hotplug controller what
secondary bus reset via sysfs would do? Because currently it is not
exported as reset method and there can be different race conditions and
maybe error (?) if hotplug code is going to remove device on which user
triggered bus reset via sysfs.

And in my opinion this can happen also in case when only one device is
on the bus, so it perfectly matches all conditions when sysfs can use
bus reset for one device.

I can try to implement hotplug code into aardvark driver and root bridge
emulator to test how this patch would happen. But it would take some
time...


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-17 Thread Alex Williamson
On Wed, 17 Mar 2021 15:58:40 +0200
Leon Romanovsky  wrote:

> On Wed, Mar 17, 2021 at 06:47:18PM +0530, Amey Narkhede wrote:
> > On 21/03/17 01:47PM, Leon Romanovsky wrote:  
> > > On Wed, Mar 17, 2021 at 04:53:09PM +0530, Amey Narkhede wrote:  
> > > > On 21/03/17 01:02PM, Leon Romanovsky wrote:  
> > > > > On Wed, Mar 17, 2021 at 03:54:47PM +0530, Amey Narkhede wrote:  
> > > > > > On 21/03/17 06:20AM, Leon Romanovsky wrote:  
> > > > > > > On Mon, Mar 15, 2021 at 06:32:32PM +, Raphael Norwitz wrote:  
> > > > > > > > On Mon, Mar 15, 2021 at 10:29:50AM -0600, Alex Williamson 
> > > > > > > > wrote:  
> > > > > > > > > On Mon, 15 Mar 2021 21:03:41 +0530
> > > > > > > > > Amey Narkhede  wrote:
> > > > > > > > >  
> > > > > > > > > > On 21/03/15 05:07PM, Leon Romanovsky wrote:  
> > > > > > > > > > > On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex Williamson 
> > > > > > > > > > > wrote:  
> > > > > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > > > >  
> > > > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote: 
> > > > > > > > > > > > >  
> > > > > > > > > > > > > > slot reset (pci_dev_reset_slot_function) and 
> > > > > > > > > > > > > > secondary bus
> > > > > > > > > > > > > > reset(pci_parent_bus_reset) which I think are hot 
> > > > > > > > > > > > > > reset and
> > > > > > > > > > > > > > warm reset respectively.  
> > > > > > > > > > > > >
> > > > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot 
> > > > > > > > > > > > > reset is just another
> > > > > > > > > > > > > type of reset, which is currently implemented only 
> > > > > > > > > > > > > for PCIe hot plug
> > > > > > > > > > > > > bridges and for PowerPC PowerNV platform and it just 
> > > > > > > > > > > > > call PCI secondary
> > > > > > > > > > > > > bus reset with some other hook. PCIe Warm Reset does 
> > > > > > > > > > > > > not have API in
> > > > > > > > > > > > > kernel and therefore drivers do not export this type 
> > > > > > > > > > > > > of reset via any
> > > > > > > > > > > > > kernel function (yet).  
> > > > > > > > > > > >
> > > > > > > > > > > > Warm reset is beyond the scope of this series, but 
> > > > > > > > > > > > could be implemented
> > > > > > > > > > > > in a compatible way to fit within the 
> > > > > > > > > > > > pci_reset_fn_methods[] array
> > > > > > > > > > > > defined here.  Note that with this series the resets 
> > > > > > > > > > > > available through
> > > > > > > > > > > > pci_reset_function() and the per device reset attribute 
> > > > > > > > > > > > is sysfs remain
> > > > > > > > > > > > exactly the same as they are currently.  The bus and 
> > > > > > > > > > > > slot reset
> > > > > > > > > > > > methods used here are limited to devices where only a 
> > > > > > > > > > > > single function is
> > > > > > > > > > > > affected by the reset, therefore it is not like the 
> > > > > > > > > > > > patch you proposed
> > > > > > > > > > > > which performed a reset irrespective of the downstream 
> > > > > > > > > > > > devices.  This
> > > > > > > > > > > > series only enables selection of the existing methods.  
> > > > > > > > > > > > Thanks,  
> > > > > > > > > > >
> > > > > > > > > > > Alex,
> > > > > > > > > > >
> > > > > > > > > > > I asked the patch author here [1], but didn't get any 
> > > > > > > > > > > response, maybe
> > > > > > > > > > > you can answer me. What is the use case scenario for this 
> > > > > > > > > > > functionality?
> > > > > > > > > > >
> > > > > > > > > > > Thanks
> > > > > > > > > > >
> > > > > > > > > > > [1] https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal/
> > > > > > > > > > >  
> > > > > > > > > > Sorry for not responding immediately. There were some buggy 
> > > > > > > > > > wifi cards
> > > > > > > > > > which needed FLR explicitly not sure if that behavior is 
> > > > > > > > > > fixed in
> > > > > > > > > > drivers. Also there is use a case at Nutanix but the 
> > > > > > > > > > engineer who
> > > > > > > > > > is involved is on PTO that is why I did not respond 
> > > > > > > > > > immediately as
> > > > > > > > > > I don't know the details yet.  
> > > > > > > > >
> > > > > > > > > And more generally, devices continue to have reset issues and 
> > > > > > > > > we
> > > > > > > > > impose a fixed priority in our ordering.  We can and probably 
> > > > > > > > > should
> > > > > > > > > continue to quirk devices when we find broken resets so that 
> > > > > > > > > we have
> > > > > > > > > the best default behavior, but it's currently not easy for an 
> > > > > > > > > end user
> > > > > > > > > to experiment, ie. this reset works, that one doesn't.  We 
> > > > > > > > > might also
> > > > > > > > > have platform issues where a given reset works better on a 
> > > > > > > > > certain
> > > > > > > > > platform.  Exposing a way to test these things might lead to 
> > > > > > > > > better
> > > > > > > > > quirks.  In the case I think Pali was 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-17 Thread Leon Romanovsky
On Wed, Mar 17, 2021 at 06:47:18PM +0530, Amey Narkhede wrote:
> On 21/03/17 01:47PM, Leon Romanovsky wrote:
> > On Wed, Mar 17, 2021 at 04:53:09PM +0530, Amey Narkhede wrote:
> > > On 21/03/17 01:02PM, Leon Romanovsky wrote:
> > > > On Wed, Mar 17, 2021 at 03:54:47PM +0530, Amey Narkhede wrote:
> > > > > On 21/03/17 06:20AM, Leon Romanovsky wrote:
> > > > > > On Mon, Mar 15, 2021 at 06:32:32PM +, Raphael Norwitz wrote:
> > > > > > > On Mon, Mar 15, 2021 at 10:29:50AM -0600, Alex Williamson wrote:
> > > > > > > > On Mon, 15 Mar 2021 21:03:41 +0530
> > > > > > > > Amey Narkhede  wrote:
> > > > > > > >
> > > > > > > > > On 21/03/15 05:07PM, Leon Romanovsky wrote:
> > > > > > > > > > On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex Williamson 
> > > > > > > > > > wrote:
> > > > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > > >
> > > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:
> > > > > > > > > > > > > slot reset (pci_dev_reset_slot_function) and 
> > > > > > > > > > > > > secondary bus
> > > > > > > > > > > > > reset(pci_parent_bus_reset) which I think are hot 
> > > > > > > > > > > > > reset and
> > > > > > > > > > > > > warm reset respectively.
> > > > > > > > > > > >
> > > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot 
> > > > > > > > > > > > reset is just another
> > > > > > > > > > > > type of reset, which is currently implemented only for 
> > > > > > > > > > > > PCIe hot plug
> > > > > > > > > > > > bridges and for PowerPC PowerNV platform and it just 
> > > > > > > > > > > > call PCI secondary
> > > > > > > > > > > > bus reset with some other hook. PCIe Warm Reset does 
> > > > > > > > > > > > not have API in
> > > > > > > > > > > > kernel and therefore drivers do not export this type of 
> > > > > > > > > > > > reset via any
> > > > > > > > > > > > kernel function (yet).
> > > > > > > > > > >
> > > > > > > > > > > Warm reset is beyond the scope of this series, but could 
> > > > > > > > > > > be implemented
> > > > > > > > > > > in a compatible way to fit within the 
> > > > > > > > > > > pci_reset_fn_methods[] array
> > > > > > > > > > > defined here.  Note that with this series the resets 
> > > > > > > > > > > available through
> > > > > > > > > > > pci_reset_function() and the per device reset attribute 
> > > > > > > > > > > is sysfs remain
> > > > > > > > > > > exactly the same as they are currently.  The bus and slot 
> > > > > > > > > > > reset
> > > > > > > > > > > methods used here are limited to devices where only a 
> > > > > > > > > > > single function is
> > > > > > > > > > > affected by the reset, therefore it is not like the patch 
> > > > > > > > > > > you proposed
> > > > > > > > > > > which performed a reset irrespective of the downstream 
> > > > > > > > > > > devices.  This
> > > > > > > > > > > series only enables selection of the existing methods.  
> > > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Alex,
> > > > > > > > > >
> > > > > > > > > > I asked the patch author here [1], but didn't get any 
> > > > > > > > > > response, maybe
> > > > > > > > > > you can answer me. What is the use case scenario for this 
> > > > > > > > > > functionality?
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > > [1] https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal/
> > > > > > > > > >
> > > > > > > > > Sorry for not responding immediately. There were some buggy 
> > > > > > > > > wifi cards
> > > > > > > > > which needed FLR explicitly not sure if that behavior is 
> > > > > > > > > fixed in
> > > > > > > > > drivers. Also there is use a case at Nutanix but the engineer 
> > > > > > > > > who
> > > > > > > > > is involved is on PTO that is why I did not respond 
> > > > > > > > > immediately as
> > > > > > > > > I don't know the details yet.
> > > > > > > >
> > > > > > > > And more generally, devices continue to have reset issues and we
> > > > > > > > impose a fixed priority in our ordering.  We can and probably 
> > > > > > > > should
> > > > > > > > continue to quirk devices when we find broken resets so that we 
> > > > > > > > have
> > > > > > > > the best default behavior, but it's currently not easy for an 
> > > > > > > > end user
> > > > > > > > to experiment, ie. this reset works, that one doesn't.  We 
> > > > > > > > might also
> > > > > > > > have platform issues where a given reset works better on a 
> > > > > > > > certain
> > > > > > > > platform.  Exposing a way to test these things might lead to 
> > > > > > > > better
> > > > > > > > quirks.  In the case I think Pali was looking for, they wanted a
> > > > > > > > mechanism to force a bus reset, if this was in reference to a 
> > > > > > > > single
> > > > > > > > function device, this could be accomplished by setting a 
> > > > > > > > priority for
> > > > > > > > that mechanism, which would translate to not only the sysfs 
> > > > > > > > reset
> > > > 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-17 Thread Amey Narkhede
On 21/03/17 01:47PM, Leon Romanovsky wrote:
> On Wed, Mar 17, 2021 at 04:53:09PM +0530, Amey Narkhede wrote:
> > On 21/03/17 01:02PM, Leon Romanovsky wrote:
> > > On Wed, Mar 17, 2021 at 03:54:47PM +0530, Amey Narkhede wrote:
> > > > On 21/03/17 06:20AM, Leon Romanovsky wrote:
> > > > > On Mon, Mar 15, 2021 at 06:32:32PM +, Raphael Norwitz wrote:
> > > > > > On Mon, Mar 15, 2021 at 10:29:50AM -0600, Alex Williamson wrote:
> > > > > > > On Mon, 15 Mar 2021 21:03:41 +0530
> > > > > > > Amey Narkhede  wrote:
> > > > > > >
> > > > > > > > On 21/03/15 05:07PM, Leon Romanovsky wrote:
> > > > > > > > > On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex Williamson 
> > > > > > > > > wrote:
> > > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > > > Pali Rohár  wrote:
> > > > > > > > > >
> > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:
> > > > > > > > > > > > slot reset (pci_dev_reset_slot_function) and secondary 
> > > > > > > > > > > > bus
> > > > > > > > > > > > reset(pci_parent_bus_reset) which I think are hot reset 
> > > > > > > > > > > > and
> > > > > > > > > > > > warm reset respectively.
> > > > > > > > > > >
> > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset 
> > > > > > > > > > > is just another
> > > > > > > > > > > type of reset, which is currently implemented only for 
> > > > > > > > > > > PCIe hot plug
> > > > > > > > > > > bridges and for PowerPC PowerNV platform and it just call 
> > > > > > > > > > > PCI secondary
> > > > > > > > > > > bus reset with some other hook. PCIe Warm Reset does not 
> > > > > > > > > > > have API in
> > > > > > > > > > > kernel and therefore drivers do not export this type of 
> > > > > > > > > > > reset via any
> > > > > > > > > > > kernel function (yet).
> > > > > > > > > >
> > > > > > > > > > Warm reset is beyond the scope of this series, but could be 
> > > > > > > > > > implemented
> > > > > > > > > > in a compatible way to fit within the 
> > > > > > > > > > pci_reset_fn_methods[] array
> > > > > > > > > > defined here.  Note that with this series the resets 
> > > > > > > > > > available through
> > > > > > > > > > pci_reset_function() and the per device reset attribute is 
> > > > > > > > > > sysfs remain
> > > > > > > > > > exactly the same as they are currently.  The bus and slot 
> > > > > > > > > > reset
> > > > > > > > > > methods used here are limited to devices where only a 
> > > > > > > > > > single function is
> > > > > > > > > > affected by the reset, therefore it is not like the patch 
> > > > > > > > > > you proposed
> > > > > > > > > > which performed a reset irrespective of the downstream 
> > > > > > > > > > devices.  This
> > > > > > > > > > series only enables selection of the existing methods.  
> > > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Alex,
> > > > > > > > >
> > > > > > > > > I asked the patch author here [1], but didn't get any 
> > > > > > > > > response, maybe
> > > > > > > > > you can answer me. What is the use case scenario for this 
> > > > > > > > > functionality?
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > [1] https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal/
> > > > > > > > >
> > > > > > > > Sorry for not responding immediately. There were some buggy 
> > > > > > > > wifi cards
> > > > > > > > which needed FLR explicitly not sure if that behavior is fixed 
> > > > > > > > in
> > > > > > > > drivers. Also there is use a case at Nutanix but the engineer 
> > > > > > > > who
> > > > > > > > is involved is on PTO that is why I did not respond immediately 
> > > > > > > > as
> > > > > > > > I don't know the details yet.
> > > > > > >
> > > > > > > And more generally, devices continue to have reset issues and we
> > > > > > > impose a fixed priority in our ordering.  We can and probably 
> > > > > > > should
> > > > > > > continue to quirk devices when we find broken resets so that we 
> > > > > > > have
> > > > > > > the best default behavior, but it's currently not easy for an end 
> > > > > > > user
> > > > > > > to experiment, ie. this reset works, that one doesn't.  We might 
> > > > > > > also
> > > > > > > have platform issues where a given reset works better on a certain
> > > > > > > platform.  Exposing a way to test these things might lead to 
> > > > > > > better
> > > > > > > quirks.  In the case I think Pali was looking for, they wanted a
> > > > > > > mechanism to force a bus reset, if this was in reference to a 
> > > > > > > single
> > > > > > > function device, this could be accomplished by setting a priority 
> > > > > > > for
> > > > > > > that mechanism, which would translate to not only the sysfs reset
> > > > > > > attribute, but also the reset mechanism used by vfio-pci.  Thanks,
> > > > > > >
> > > > > > > Alex
> > > > > > >
> > > > > >
> > > > > > To confirm from our end - we have seen many such instances where 
> > > > > > default
> > > > > > reset methods have not worked well on our 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-17 Thread Leon Romanovsky
On Wed, Mar 17, 2021 at 04:53:09PM +0530, Amey Narkhede wrote:
> On 21/03/17 01:02PM, Leon Romanovsky wrote:
> > On Wed, Mar 17, 2021 at 03:54:47PM +0530, Amey Narkhede wrote:
> > > On 21/03/17 06:20AM, Leon Romanovsky wrote:
> > > > On Mon, Mar 15, 2021 at 06:32:32PM +, Raphael Norwitz wrote:
> > > > > On Mon, Mar 15, 2021 at 10:29:50AM -0600, Alex Williamson wrote:
> > > > > > On Mon, 15 Mar 2021 21:03:41 +0530
> > > > > > Amey Narkhede  wrote:
> > > > > >
> > > > > > > On 21/03/15 05:07PM, Leon Romanovsky wrote:
> > > > > > > > On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex Williamson wrote:
> > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > > Pali Rohár  wrote:
> > > > > > > > >
> > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:
> > > > > > > > > > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > > > > > > > > > reset(pci_parent_bus_reset) which I think are hot reset 
> > > > > > > > > > > and
> > > > > > > > > > > warm reset respectively.
> > > > > > > > > >
> > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is 
> > > > > > > > > > just another
> > > > > > > > > > type of reset, which is currently implemented only for PCIe 
> > > > > > > > > > hot plug
> > > > > > > > > > bridges and for PowerPC PowerNV platform and it just call 
> > > > > > > > > > PCI secondary
> > > > > > > > > > bus reset with some other hook. PCIe Warm Reset does not 
> > > > > > > > > > have API in
> > > > > > > > > > kernel and therefore drivers do not export this type of 
> > > > > > > > > > reset via any
> > > > > > > > > > kernel function (yet).
> > > > > > > > >
> > > > > > > > > Warm reset is beyond the scope of this series, but could be 
> > > > > > > > > implemented
> > > > > > > > > in a compatible way to fit within the pci_reset_fn_methods[] 
> > > > > > > > > array
> > > > > > > > > defined here.  Note that with this series the resets 
> > > > > > > > > available through
> > > > > > > > > pci_reset_function() and the per device reset attribute is 
> > > > > > > > > sysfs remain
> > > > > > > > > exactly the same as they are currently.  The bus and slot 
> > > > > > > > > reset
> > > > > > > > > methods used here are limited to devices where only a single 
> > > > > > > > > function is
> > > > > > > > > affected by the reset, therefore it is not like the patch you 
> > > > > > > > > proposed
> > > > > > > > > which performed a reset irrespective of the downstream 
> > > > > > > > > devices.  This
> > > > > > > > > series only enables selection of the existing methods.  
> > > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Alex,
> > > > > > > >
> > > > > > > > I asked the patch author here [1], but didn't get any response, 
> > > > > > > > maybe
> > > > > > > > you can answer me. What is the use case scenario for this 
> > > > > > > > functionality?
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > [1] https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal/
> > > > > > > >
> > > > > > > Sorry for not responding immediately. There were some buggy wifi 
> > > > > > > cards
> > > > > > > which needed FLR explicitly not sure if that behavior is fixed in
> > > > > > > drivers. Also there is use a case at Nutanix but the engineer who
> > > > > > > is involved is on PTO that is why I did not respond immediately as
> > > > > > > I don't know the details yet.
> > > > > >
> > > > > > And more generally, devices continue to have reset issues and we
> > > > > > impose a fixed priority in our ordering.  We can and probably should
> > > > > > continue to quirk devices when we find broken resets so that we have
> > > > > > the best default behavior, but it's currently not easy for an end 
> > > > > > user
> > > > > > to experiment, ie. this reset works, that one doesn't.  We might 
> > > > > > also
> > > > > > have platform issues where a given reset works better on a certain
> > > > > > platform.  Exposing a way to test these things might lead to better
> > > > > > quirks.  In the case I think Pali was looking for, they wanted a
> > > > > > mechanism to force a bus reset, if this was in reference to a single
> > > > > > function device, this could be accomplished by setting a priority 
> > > > > > for
> > > > > > that mechanism, which would translate to not only the sysfs reset
> > > > > > attribute, but also the reset mechanism used by vfio-pci.  Thanks,
> > > > > >
> > > > > > Alex
> > > > > >
> > > > >
> > > > > To confirm from our end - we have seen many such instances where 
> > > > > default
> > > > > reset methods have not worked well on our platform. Debugging these
> > > > > issues is painful in practice, and this interface would make it far
> > > > > easier.
> > > > >
> > > > > Having an interface like this would also help us better communicate 
> > > > > the
> > > > > issues we find with upstream. Allowing others to more easily test our
> > > > > (or other entities') findings should give better visibility into

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-17 Thread Amey Narkhede
On 21/03/17 01:02PM, Leon Romanovsky wrote:
> On Wed, Mar 17, 2021 at 03:54:47PM +0530, Amey Narkhede wrote:
> > On 21/03/17 06:20AM, Leon Romanovsky wrote:
> > > On Mon, Mar 15, 2021 at 06:32:32PM +, Raphael Norwitz wrote:
> > > > On Mon, Mar 15, 2021 at 10:29:50AM -0600, Alex Williamson wrote:
> > > > > On Mon, 15 Mar 2021 21:03:41 +0530
> > > > > Amey Narkhede  wrote:
> > > > >
> > > > > > On 21/03/15 05:07PM, Leon Romanovsky wrote:
> > > > > > > On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex Williamson wrote:
> > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > > Pali Rohár  wrote:
> > > > > > > >
> > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:
> > > > > > > > > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > > > > > > > > reset(pci_parent_bus_reset) which I think are hot reset and
> > > > > > > > > > warm reset respectively.
> > > > > > > > >
> > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is 
> > > > > > > > > just another
> > > > > > > > > type of reset, which is currently implemented only for PCIe 
> > > > > > > > > hot plug
> > > > > > > > > bridges and for PowerPC PowerNV platform and it just call PCI 
> > > > > > > > > secondary
> > > > > > > > > bus reset with some other hook. PCIe Warm Reset does not have 
> > > > > > > > > API in
> > > > > > > > > kernel and therefore drivers do not export this type of reset 
> > > > > > > > > via any
> > > > > > > > > kernel function (yet).
> > > > > > > >
> > > > > > > > Warm reset is beyond the scope of this series, but could be 
> > > > > > > > implemented
> > > > > > > > in a compatible way to fit within the pci_reset_fn_methods[] 
> > > > > > > > array
> > > > > > > > defined here.  Note that with this series the resets available 
> > > > > > > > through
> > > > > > > > pci_reset_function() and the per device reset attribute is 
> > > > > > > > sysfs remain
> > > > > > > > exactly the same as they are currently.  The bus and slot reset
> > > > > > > > methods used here are limited to devices where only a single 
> > > > > > > > function is
> > > > > > > > affected by the reset, therefore it is not like the patch you 
> > > > > > > > proposed
> > > > > > > > which performed a reset irrespective of the downstream devices. 
> > > > > > > >  This
> > > > > > > > series only enables selection of the existing methods.  Thanks,
> > > > > > >
> > > > > > > Alex,
> > > > > > >
> > > > > > > I asked the patch author here [1], but didn't get any response, 
> > > > > > > maybe
> > > > > > > you can answer me. What is the use case scenario for this 
> > > > > > > functionality?
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > [1] https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal/
> > > > > > >
> > > > > > Sorry for not responding immediately. There were some buggy wifi 
> > > > > > cards
> > > > > > which needed FLR explicitly not sure if that behavior is fixed in
> > > > > > drivers. Also there is use a case at Nutanix but the engineer who
> > > > > > is involved is on PTO that is why I did not respond immediately as
> > > > > > I don't know the details yet.
> > > > >
> > > > > And more generally, devices continue to have reset issues and we
> > > > > impose a fixed priority in our ordering.  We can and probably should
> > > > > continue to quirk devices when we find broken resets so that we have
> > > > > the best default behavior, but it's currently not easy for an end user
> > > > > to experiment, ie. this reset works, that one doesn't.  We might also
> > > > > have platform issues where a given reset works better on a certain
> > > > > platform.  Exposing a way to test these things might lead to better
> > > > > quirks.  In the case I think Pali was looking for, they wanted a
> > > > > mechanism to force a bus reset, if this was in reference to a single
> > > > > function device, this could be accomplished by setting a priority for
> > > > > that mechanism, which would translate to not only the sysfs reset
> > > > > attribute, but also the reset mechanism used by vfio-pci.  Thanks,
> > > > >
> > > > > Alex
> > > > >
> > > >
> > > > To confirm from our end - we have seen many such instances where default
> > > > reset methods have not worked well on our platform. Debugging these
> > > > issues is painful in practice, and this interface would make it far
> > > > easier.
> > > >
> > > > Having an interface like this would also help us better communicate the
> > > > issues we find with upstream. Allowing others to more easily test our
> > > > (or other entities') findings should give better visibility into
> > > > which issues apply to the device in general and which are platform
> > > > specific. In disambiguating the former from the latter, we should be
> > > > able to better quirk devices for everyone, and in the latter cases, this
> > > > interface allows for a safer and more elegant solution than any of the
> > > > current alternatives.
> > >
> > > So to 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-17 Thread Leon Romanovsky
On Wed, Mar 17, 2021 at 03:54:47PM +0530, Amey Narkhede wrote:
> On 21/03/17 06:20AM, Leon Romanovsky wrote:
> > On Mon, Mar 15, 2021 at 06:32:32PM +, Raphael Norwitz wrote:
> > > On Mon, Mar 15, 2021 at 10:29:50AM -0600, Alex Williamson wrote:
> > > > On Mon, 15 Mar 2021 21:03:41 +0530
> > > > Amey Narkhede  wrote:
> > > >
> > > > > On 21/03/15 05:07PM, Leon Romanovsky wrote:
> > > > > > On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex Williamson wrote:
> > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > > Pali Rohár  wrote:
> > > > > > >
> > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:
> > > > > > > > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > > > > > > > reset(pci_parent_bus_reset) which I think are hot reset and
> > > > > > > > > warm reset respectively.
> > > > > > > >
> > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is 
> > > > > > > > just another
> > > > > > > > type of reset, which is currently implemented only for PCIe hot 
> > > > > > > > plug
> > > > > > > > bridges and for PowerPC PowerNV platform and it just call PCI 
> > > > > > > > secondary
> > > > > > > > bus reset with some other hook. PCIe Warm Reset does not have 
> > > > > > > > API in
> > > > > > > > kernel and therefore drivers do not export this type of reset 
> > > > > > > > via any
> > > > > > > > kernel function (yet).
> > > > > > >
> > > > > > > Warm reset is beyond the scope of this series, but could be 
> > > > > > > implemented
> > > > > > > in a compatible way to fit within the pci_reset_fn_methods[] array
> > > > > > > defined here.  Note that with this series the resets available 
> > > > > > > through
> > > > > > > pci_reset_function() and the per device reset attribute is sysfs 
> > > > > > > remain
> > > > > > > exactly the same as they are currently.  The bus and slot reset
> > > > > > > methods used here are limited to devices where only a single 
> > > > > > > function is
> > > > > > > affected by the reset, therefore it is not like the patch you 
> > > > > > > proposed
> > > > > > > which performed a reset irrespective of the downstream devices.  
> > > > > > > This
> > > > > > > series only enables selection of the existing methods.  Thanks,
> > > > > >
> > > > > > Alex,
> > > > > >
> > > > > > I asked the patch author here [1], but didn't get any response, 
> > > > > > maybe
> > > > > > you can answer me. What is the use case scenario for this 
> > > > > > functionality?
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > [1] https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal/
> > > > > >
> > > > > Sorry for not responding immediately. There were some buggy wifi cards
> > > > > which needed FLR explicitly not sure if that behavior is fixed in
> > > > > drivers. Also there is use a case at Nutanix but the engineer who
> > > > > is involved is on PTO that is why I did not respond immediately as
> > > > > I don't know the details yet.
> > > >
> > > > And more generally, devices continue to have reset issues and we
> > > > impose a fixed priority in our ordering.  We can and probably should
> > > > continue to quirk devices when we find broken resets so that we have
> > > > the best default behavior, but it's currently not easy for an end user
> > > > to experiment, ie. this reset works, that one doesn't.  We might also
> > > > have platform issues where a given reset works better on a certain
> > > > platform.  Exposing a way to test these things might lead to better
> > > > quirks.  In the case I think Pali was looking for, they wanted a
> > > > mechanism to force a bus reset, if this was in reference to a single
> > > > function device, this could be accomplished by setting a priority for
> > > > that mechanism, which would translate to not only the sysfs reset
> > > > attribute, but also the reset mechanism used by vfio-pci.  Thanks,
> > > >
> > > > Alex
> > > >
> > >
> > > To confirm from our end - we have seen many such instances where default
> > > reset methods have not worked well on our platform. Debugging these
> > > issues is painful in practice, and this interface would make it far
> > > easier.
> > >
> > > Having an interface like this would also help us better communicate the
> > > issues we find with upstream. Allowing others to more easily test our
> > > (or other entities') findings should give better visibility into
> > > which issues apply to the device in general and which are platform
> > > specific. In disambiguating the former from the latter, we should be
> > > able to better quirk devices for everyone, and in the latter cases, this
> > > interface allows for a safer and more elegant solution than any of the
> > > current alternatives.
> >
> > So to summarize, we are talking about test and debug interface to
> > overcome HW bugs, am I right?
> >
> > My personal experience shows that once the easy workaround exists
> > (and write to generally available sysfs is very simple), the vendors
> > and users desire 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-17 Thread Amey Narkhede
On 21/03/17 06:20AM, Leon Romanovsky wrote:
> On Mon, Mar 15, 2021 at 06:32:32PM +, Raphael Norwitz wrote:
> > On Mon, Mar 15, 2021 at 10:29:50AM -0600, Alex Williamson wrote:
> > > On Mon, 15 Mar 2021 21:03:41 +0530
> > > Amey Narkhede  wrote:
> > >
> > > > On 21/03/15 05:07PM, Leon Romanovsky wrote:
> > > > > On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex Williamson wrote:
> > > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > > Pali Rohár  wrote:
> > > > > >
> > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:
> > > > > > > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > > > > > > reset(pci_parent_bus_reset) which I think are hot reset and
> > > > > > > > warm reset respectively.
> > > > > > >
> > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just 
> > > > > > > another
> > > > > > > type of reset, which is currently implemented only for PCIe hot 
> > > > > > > plug
> > > > > > > bridges and for PowerPC PowerNV platform and it just call PCI 
> > > > > > > secondary
> > > > > > > bus reset with some other hook. PCIe Warm Reset does not have API 
> > > > > > > in
> > > > > > > kernel and therefore drivers do not export this type of reset via 
> > > > > > > any
> > > > > > > kernel function (yet).
> > > > > >
> > > > > > Warm reset is beyond the scope of this series, but could be 
> > > > > > implemented
> > > > > > in a compatible way to fit within the pci_reset_fn_methods[] array
> > > > > > defined here.  Note that with this series the resets available 
> > > > > > through
> > > > > > pci_reset_function() and the per device reset attribute is sysfs 
> > > > > > remain
> > > > > > exactly the same as they are currently.  The bus and slot reset
> > > > > > methods used here are limited to devices where only a single 
> > > > > > function is
> > > > > > affected by the reset, therefore it is not like the patch you 
> > > > > > proposed
> > > > > > which performed a reset irrespective of the downstream devices.  
> > > > > > This
> > > > > > series only enables selection of the existing methods.  Thanks,
> > > > >
> > > > > Alex,
> > > > >
> > > > > I asked the patch author here [1], but didn't get any response, maybe
> > > > > you can answer me. What is the use case scenario for this 
> > > > > functionality?
> > > > >
> > > > > Thanks
> > > > >
> > > > > [1] https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal/
> > > > >
> > > > Sorry for not responding immediately. There were some buggy wifi cards
> > > > which needed FLR explicitly not sure if that behavior is fixed in
> > > > drivers. Also there is use a case at Nutanix but the engineer who
> > > > is involved is on PTO that is why I did not respond immediately as
> > > > I don't know the details yet.
> > >
> > > And more generally, devices continue to have reset issues and we
> > > impose a fixed priority in our ordering.  We can and probably should
> > > continue to quirk devices when we find broken resets so that we have
> > > the best default behavior, but it's currently not easy for an end user
> > > to experiment, ie. this reset works, that one doesn't.  We might also
> > > have platform issues where a given reset works better on a certain
> > > platform.  Exposing a way to test these things might lead to better
> > > quirks.  In the case I think Pali was looking for, they wanted a
> > > mechanism to force a bus reset, if this was in reference to a single
> > > function device, this could be accomplished by setting a priority for
> > > that mechanism, which would translate to not only the sysfs reset
> > > attribute, but also the reset mechanism used by vfio-pci.  Thanks,
> > >
> > > Alex
> > >
> >
> > To confirm from our end - we have seen many such instances where default
> > reset methods have not worked well on our platform. Debugging these
> > issues is painful in practice, and this interface would make it far
> > easier.
> >
> > Having an interface like this would also help us better communicate the
> > issues we find with upstream. Allowing others to more easily test our
> > (or other entities') findings should give better visibility into
> > which issues apply to the device in general and which are platform
> > specific. In disambiguating the former from the latter, we should be
> > able to better quirk devices for everyone, and in the latter cases, this
> > interface allows for a safer and more elegant solution than any of the
> > current alternatives.
>
> So to summarize, we are talking about test and debug interface to
> overcome HW bugs, am I right?
>
> My personal experience shows that once the easy workaround exists
> (and write to generally available sysfs is very simple), the vendors
> and users desire for proper fix decreases drastically. IMHO, we will
> see increase of copy/paste in SO and blog posts, but reduce in quirks.
>
> My 2-cents.
>
I agree with your point but at least it gives the userspace ability
to use broken device until bug is fixed in upstream.
This 

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-16 Thread Leon Romanovsky
On Mon, Mar 15, 2021 at 06:32:32PM +, Raphael Norwitz wrote:
> On Mon, Mar 15, 2021 at 10:29:50AM -0600, Alex Williamson wrote:
> > On Mon, 15 Mar 2021 21:03:41 +0530
> > Amey Narkhede  wrote:
> >
> > > On 21/03/15 05:07PM, Leon Romanovsky wrote:
> > > > On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex Williamson wrote:
> > > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > > Pali Rohár  wrote:
> > > > >
> > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:
> > > > > > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > > > > > reset(pci_parent_bus_reset) which I think are hot reset and
> > > > > > > warm reset respectively.
> > > > > >
> > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just 
> > > > > > another
> > > > > > type of reset, which is currently implemented only for PCIe hot plug
> > > > > > bridges and for PowerPC PowerNV platform and it just call PCI 
> > > > > > secondary
> > > > > > bus reset with some other hook. PCIe Warm Reset does not have API in
> > > > > > kernel and therefore drivers do not export this type of reset via 
> > > > > > any
> > > > > > kernel function (yet).
> > > > >
> > > > > Warm reset is beyond the scope of this series, but could be 
> > > > > implemented
> > > > > in a compatible way to fit within the pci_reset_fn_methods[] array
> > > > > defined here.  Note that with this series the resets available through
> > > > > pci_reset_function() and the per device reset attribute is sysfs 
> > > > > remain
> > > > > exactly the same as they are currently.  The bus and slot reset
> > > > > methods used here are limited to devices where only a single function 
> > > > > is
> > > > > affected by the reset, therefore it is not like the patch you proposed
> > > > > which performed a reset irrespective of the downstream devices.  This
> > > > > series only enables selection of the existing methods.  Thanks,
> > > >
> > > > Alex,
> > > >
> > > > I asked the patch author here [1], but didn't get any response, maybe
> > > > you can answer me. What is the use case scenario for this functionality?
> > > >
> > > > Thanks
> > > >
> > > > [1] https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal/
> > > >
> > > Sorry for not responding immediately. There were some buggy wifi cards
> > > which needed FLR explicitly not sure if that behavior is fixed in
> > > drivers. Also there is use a case at Nutanix but the engineer who
> > > is involved is on PTO that is why I did not respond immediately as
> > > I don't know the details yet.
> >
> > And more generally, devices continue to have reset issues and we
> > impose a fixed priority in our ordering.  We can and probably should
> > continue to quirk devices when we find broken resets so that we have
> > the best default behavior, but it's currently not easy for an end user
> > to experiment, ie. this reset works, that one doesn't.  We might also
> > have platform issues where a given reset works better on a certain
> > platform.  Exposing a way to test these things might lead to better
> > quirks.  In the case I think Pali was looking for, they wanted a
> > mechanism to force a bus reset, if this was in reference to a single
> > function device, this could be accomplished by setting a priority for
> > that mechanism, which would translate to not only the sysfs reset
> > attribute, but also the reset mechanism used by vfio-pci.  Thanks,
> >
> > Alex
> >
>
> To confirm from our end - we have seen many such instances where default
> reset methods have not worked well on our platform. Debugging these
> issues is painful in practice, and this interface would make it far
> easier.
>
> Having an interface like this would also help us better communicate the
> issues we find with upstream. Allowing others to more easily test our
> (or other entities') findings should give better visibility into
> which issues apply to the device in general and which are platform
> specific. In disambiguating the former from the latter, we should be
> able to better quirk devices for everyone, and in the latter cases, this
> interface allows for a safer and more elegant solution than any of the
> current alternatives.

So to summarize, we are talking about test and debug interface to
overcome HW bugs, am I right?

My personal experience shows that once the easy workaround exists
(and write to generally available sysfs is very simple), the vendors
and users desire for proper fix decreases drastically. IMHO, we will
see increase of copy/paste in SO and blog posts, but reduce in quirks.

My 2-cents.

>
> CC Alay, Suresh, Shyam and Felipe in case they have anything to add.


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-15 Thread Raphael Norwitz
On Mon, Mar 15, 2021 at 10:29:50AM -0600, Alex Williamson wrote:
> On Mon, 15 Mar 2021 21:03:41 +0530
> Amey Narkhede  wrote:
> 
> > On 21/03/15 05:07PM, Leon Romanovsky wrote:
> > > On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex Williamson wrote:  
> > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > Pali Rohár  wrote:
> > > >  
> > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:  
> > > > > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > > > > reset(pci_parent_bus_reset) which I think are hot reset and
> > > > > > warm reset respectively.  
> > > > >
> > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just 
> > > > > another
> > > > > type of reset, which is currently implemented only for PCIe hot plug
> > > > > bridges and for PowerPC PowerNV platform and it just call PCI 
> > > > > secondary
> > > > > bus reset with some other hook. PCIe Warm Reset does not have API in
> > > > > kernel and therefore drivers do not export this type of reset via any
> > > > > kernel function (yet).  
> > > >
> > > > Warm reset is beyond the scope of this series, but could be implemented
> > > > in a compatible way to fit within the pci_reset_fn_methods[] array
> > > > defined here.  Note that with this series the resets available through
> > > > pci_reset_function() and the per device reset attribute is sysfs remain
> > > > exactly the same as they are currently.  The bus and slot reset
> > > > methods used here are limited to devices where only a single function is
> > > > affected by the reset, therefore it is not like the patch you proposed
> > > > which performed a reset irrespective of the downstream devices.  This
> > > > series only enables selection of the existing methods.  Thanks,  
> > >
> > > Alex,
> > >
> > > I asked the patch author here [1], but didn't get any response, maybe
> > > you can answer me. What is the use case scenario for this functionality?
> > >
> > > Thanks
> > >
> > > [1] https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal/ 
> > >  
> > Sorry for not responding immediately. There were some buggy wifi cards
> > which needed FLR explicitly not sure if that behavior is fixed in
> > drivers. Also there is use a case at Nutanix but the engineer who
> > is involved is on PTO that is why I did not respond immediately as
> > I don't know the details yet.
> 
> And more generally, devices continue to have reset issues and we
> impose a fixed priority in our ordering.  We can and probably should
> continue to quirk devices when we find broken resets so that we have
> the best default behavior, but it's currently not easy for an end user
> to experiment, ie. this reset works, that one doesn't.  We might also
> have platform issues where a given reset works better on a certain
> platform.  Exposing a way to test these things might lead to better
> quirks.  In the case I think Pali was looking for, they wanted a
> mechanism to force a bus reset, if this was in reference to a single
> function device, this could be accomplished by setting a priority for
> that mechanism, which would translate to not only the sysfs reset
> attribute, but also the reset mechanism used by vfio-pci.  Thanks,
> 
> Alex
>

To confirm from our end - we have seen many such instances where default
reset methods have not worked well on our platform. Debugging these
issues is painful in practice, and this interface would make it far
easier.

Having an interface like this would also help us better communicate the
issues we find with upstream. Allowing others to more easily test our
(or other entities') findings should give better visibility into
which issues apply to the device in general and which are platform
specific. In disambiguating the former from the latter, we should be
able to better quirk devices for everyone, and in the latter cases, this
interface allows for a safer and more elegant solution than any of the
current alternatives.

CC Alay, Suresh, Shyam and Felipe in case they have anything to add.


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-15 Thread Alex Williamson
On Mon, 15 Mar 2021 21:03:41 +0530
Amey Narkhede  wrote:

> On 21/03/15 05:07PM, Leon Romanovsky wrote:
> > On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex Williamson wrote:  
> > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > Pali Rohár  wrote:
> > >  
> > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:  
> > > > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > > > reset(pci_parent_bus_reset) which I think are hot reset and
> > > > > warm reset respectively.  
> > > >
> > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just another
> > > > type of reset, which is currently implemented only for PCIe hot plug
> > > > bridges and for PowerPC PowerNV platform and it just call PCI secondary
> > > > bus reset with some other hook. PCIe Warm Reset does not have API in
> > > > kernel and therefore drivers do not export this type of reset via any
> > > > kernel function (yet).  
> > >
> > > Warm reset is beyond the scope of this series, but could be implemented
> > > in a compatible way to fit within the pci_reset_fn_methods[] array
> > > defined here.  Note that with this series the resets available through
> > > pci_reset_function() and the per device reset attribute is sysfs remain
> > > exactly the same as they are currently.  The bus and slot reset
> > > methods used here are limited to devices where only a single function is
> > > affected by the reset, therefore it is not like the patch you proposed
> > > which performed a reset irrespective of the downstream devices.  This
> > > series only enables selection of the existing methods.  Thanks,  
> >
> > Alex,
> >
> > I asked the patch author here [1], but didn't get any response, maybe
> > you can answer me. What is the use case scenario for this functionality?
> >
> > Thanks
> >
> > [1] https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal
> >  
> Sorry for not responding immediately. There were some buggy wifi cards
> which needed FLR explicitly not sure if that behavior is fixed in
> drivers. Also there is use a case at Nutanix but the engineer who
> is involved is on PTO that is why I did not respond immediately as
> I don't know the details yet.

And more generally, devices continue to have reset issues and we
impose a fixed priority in our ordering.  We can and probably should
continue to quirk devices when we find broken resets so that we have
the best default behavior, but it's currently not easy for an end user
to experiment, ie. this reset works, that one doesn't.  We might also
have platform issues where a given reset works better on a certain
platform.  Exposing a way to test these things might lead to better
quirks.  In the case I think Pali was looking for, they wanted a
mechanism to force a bus reset, if this was in reference to a single
function device, this could be accomplished by setting a priority for
that mechanism, which would translate to not only the sysfs reset
attribute, but also the reset mechanism used by vfio-pci.  Thanks,

Alex



Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-15 Thread Amey Narkhede
On 21/03/15 05:07PM, Leon Romanovsky wrote:
> On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex Williamson wrote:
> > On Mon, 15 Mar 2021 14:52:26 +0100
> > Pali Rohár  wrote:
> >
> > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:
> > > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > > reset(pci_parent_bus_reset) which I think are hot reset and
> > > > warm reset respectively.
> > >
> > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just another
> > > type of reset, which is currently implemented only for PCIe hot plug
> > > bridges and for PowerPC PowerNV platform and it just call PCI secondary
> > > bus reset with some other hook. PCIe Warm Reset does not have API in
> > > kernel and therefore drivers do not export this type of reset via any
> > > kernel function (yet).
> >
> > Warm reset is beyond the scope of this series, but could be implemented
> > in a compatible way to fit within the pci_reset_fn_methods[] array
> > defined here.  Note that with this series the resets available through
> > pci_reset_function() and the per device reset attribute is sysfs remain
> > exactly the same as they are currently.  The bus and slot reset
> > methods used here are limited to devices where only a single function is
> > affected by the reset, therefore it is not like the patch you proposed
> > which performed a reset irrespective of the downstream devices.  This
> > series only enables selection of the existing methods.  Thanks,
>
> Alex,
>
> I asked the patch author here [1], but didn't get any response, maybe
> you can answer me. What is the use case scenario for this functionality?
>
> Thanks
>
> [1] https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal
>
Sorry for not responding immediately. There were some buggy wifi cards
which needed FLR explicitly not sure if that behavior is fixed in
drivers. Also there is use a case at Nutanix but the engineer who
is involved is on PTO that is why I did not respond immediately as
I don't know the details yet.

Thanks,
Amey


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-15 Thread Leon Romanovsky
On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex Williamson wrote:
> On Mon, 15 Mar 2021 14:52:26 +0100
> Pali Rohár  wrote:
>
> > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:
> > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > reset(pci_parent_bus_reset) which I think are hot reset and
> > > warm reset respectively.
> >
> > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just another
> > type of reset, which is currently implemented only for PCIe hot plug
> > bridges and for PowerPC PowerNV platform and it just call PCI secondary
> > bus reset with some other hook. PCIe Warm Reset does not have API in
> > kernel and therefore drivers do not export this type of reset via any
> > kernel function (yet).
>
> Warm reset is beyond the scope of this series, but could be implemented
> in a compatible way to fit within the pci_reset_fn_methods[] array
> defined here.  Note that with this series the resets available through
> pci_reset_function() and the per device reset attribute is sysfs remain
> exactly the same as they are currently.  The bus and slot reset
> methods used here are limited to devices where only a single function is
> affected by the reset, therefore it is not like the patch you proposed
> which performed a reset irrespective of the downstream devices.  This
> series only enables selection of the existing methods.  Thanks,

Alex,

I asked the patch author here [1], but didn't get any response, maybe
you can answer me. What is the use case scenario for this functionality?

Thanks

[1] https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal

>
> Alex
>


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-15 Thread Alex Williamson
On Mon, 15 Mar 2021 15:52:38 +0100
Pali Rohár  wrote:

> On Monday 15 March 2021 08:34:09 Alex Williamson wrote:
> > On Mon, 15 Mar 2021 14:52:26 +0100
> > Pali Rohár  wrote:
> >   
> > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:  
> > > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > > reset(pci_parent_bus_reset) which I think are hot reset and
> > > > warm reset respectively.
> > > 
> > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just another
> > > type of reset, which is currently implemented only for PCIe hot plug
> > > bridges and for PowerPC PowerNV platform and it just call PCI secondary
> > > bus reset with some other hook. PCIe Warm Reset does not have API in
> > > kernel and therefore drivers do not export this type of reset via any
> > > kernel function (yet).  
> > 
> > Warm reset is beyond the scope of this series, but could be implemented
> > in a compatible way to fit within the pci_reset_fn_methods[] array
> > defined here.  
> 
> Ok!
> 
> > Note that with this series the resets available through
> > pci_reset_function() and the per device reset attribute is sysfs remain
> > exactly the same as they are currently.  The bus and slot reset
> > methods used here are limited to devices where only a single function is
> > affected by the reset, therefore it is not like the patch you proposed
> > which performed a reset irrespective of the downstream devices.  This
> > series only enables selection of the existing methods.  Thanks,
> > 
> > Alex
> >   
> 
> But with this patch series, there is still an issue with PCI secondary
> bus reset mechanism as exported sysfs attribute does not do that
> remove-reset-rescan procedure. As discussed in other thread, this reset
> let device in unconfigured / broken state.

No, there's not:

int pci_reset_function(struct pci_dev *dev)
{
int rc;

if (!dev->reset_fn)
return -ENOTTY;

pci_dev_lock(dev);
>>> pci_dev_save_and_disable(dev);

rc = __pci_reset_function_locked(dev);

>>> pci_dev_restore(dev);
pci_dev_unlock(dev);

return rc;
}

The remove/re-scan was discussed primarily because your patch performed
a bus reset regardless of what devices were affected by that reset and
it's difficult to manage the scope where multiple devices are affected.
Here, the bus and slot reset functions will fail unless the scope is
limited to the single device triggering this reset.  Thanks,

Alex



Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-15 Thread Pali Rohár
On Monday 15 March 2021 08:34:09 Alex Williamson wrote:
> On Mon, 15 Mar 2021 14:52:26 +0100
> Pali Rohár  wrote:
> 
> > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:
> > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > reset(pci_parent_bus_reset) which I think are hot reset and
> > > warm reset respectively.  
> > 
> > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just another
> > type of reset, which is currently implemented only for PCIe hot plug
> > bridges and for PowerPC PowerNV platform and it just call PCI secondary
> > bus reset with some other hook. PCIe Warm Reset does not have API in
> > kernel and therefore drivers do not export this type of reset via any
> > kernel function (yet).
> 
> Warm reset is beyond the scope of this series, but could be implemented
> in a compatible way to fit within the pci_reset_fn_methods[] array
> defined here.

Ok!

> Note that with this series the resets available through
> pci_reset_function() and the per device reset attribute is sysfs remain
> exactly the same as they are currently.  The bus and slot reset
> methods used here are limited to devices where only a single function is
> affected by the reset, therefore it is not like the patch you proposed
> which performed a reset irrespective of the downstream devices.  This
> series only enables selection of the existing methods.  Thanks,
> 
> Alex
> 

But with this patch series, there is still an issue with PCI secondary
bus reset mechanism as exported sysfs attribute does not do that
remove-reset-rescan procedure. As discussed in other thread, this reset
let device in unconfigured / broken state.


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-15 Thread Alex Williamson
On Mon, 15 Mar 2021 14:52:26 +0100
Pali Rohár  wrote:

> On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:
> > slot reset (pci_dev_reset_slot_function) and secondary bus
> > reset(pci_parent_bus_reset) which I think are hot reset and
> > warm reset respectively.  
> 
> No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just another
> type of reset, which is currently implemented only for PCIe hot plug
> bridges and for PowerPC PowerNV platform and it just call PCI secondary
> bus reset with some other hook. PCIe Warm Reset does not have API in
> kernel and therefore drivers do not export this type of reset via any
> kernel function (yet).

Warm reset is beyond the scope of this series, but could be implemented
in a compatible way to fit within the pci_reset_fn_methods[] array
defined here.  Note that with this series the resets available through
pci_reset_function() and the per device reset attribute is sysfs remain
exactly the same as they are currently.  The bus and slot reset
methods used here are limited to devices where only a single function is
affected by the reset, therefore it is not like the patch you proposed
which performed a reset irrespective of the downstream devices.  This
series only enables selection of the existing methods.  Thanks,

Alex



Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-15 Thread Pali Rohár
On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:
> slot reset (pci_dev_reset_slot_function) and secondary bus
> reset(pci_parent_bus_reset) which I think are hot reset and
> warm reset respectively.

No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just another
type of reset, which is currently implemented only for PCIe hot plug
bridges and for PowerPC PowerNV platform and it just call PCI secondary
bus reset with some other hook. PCIe Warm Reset does not have API in
kernel and therefore drivers do not export this type of reset via any
kernel function (yet).


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-15 Thread Amey Narkhede
On 21/03/15 12:55AM, Pali Rohár wrote:
> On Friday 12 March 2021 23:04:52 ameynarkhed...@gmail.com wrote:
> > From: Amey Narkhede 
> >
> > Add reset_methods_enabled bitmap to struct pci_dev to
> > keep track of user preferred device reset mechanisms.
> > Add reset_method sysfs attribute to query and set
> > user preferred device reset mechanisms.
> >
> > Signed-off-by: Amey Narkhede 
> > ---
> > Reviewed-by: Alex Williamson 
> > Reviewed-by: Raphael Norwitz 
> >
> >  Documentation/ABI/testing/sysfs-bus-pci | 15 ++
> >  drivers/pci/pci-sysfs.c | 66 +++--
> >  drivers/pci/pci.c   |  3 +-
> >  include/linux/pci.h |  2 +
> >  4 files changed, 82 insertions(+), 4 deletions(-)
> >
> > diff --git a/Documentation/ABI/testing/sysfs-bus-pci 
> > b/Documentation/ABI/testing/sysfs-bus-pci
> > index 25c9c3977..ae53ecd2e 100644
> > --- a/Documentation/ABI/testing/sysfs-bus-pci
> > +++ b/Documentation/ABI/testing/sysfs-bus-pci
> > @@ -121,6 +121,21 @@ Description:
> > child buses, and re-discover devices removed earlier
> > from this part of the device tree.
> >
> > +What:  /sys/bus/pci/devices/.../reset_method
> > +Date:  March 2021
> > +Contact:   Amey Narkhede 
> > +Description:
> > +   Some devices allow an individual function to be reset
> > +   without affecting other functions in the same slot.
> > +   For devices that have this support, a file named reset_method
> > +   will be present in sysfs. Reading this file will give names
> > +   of the device supported reset methods. Currently used methods
> > +   are enclosed in brackets. Writing the name of any of the device
> > +   supported reset method to this file will set the reset method to
> > +   be used when resetting the device. Writing "none" to this file
> > +   will disable ability to reset the device and writing "default"
> > +   will return to the original value.
> > +
>
> Hello Amey!
>
> I think that this API does not work for PCIe Hot Reset (=PCI secondary
> bus reset) and PCIe Warm Reset.
>
> First reset method is bound to the bus, not device and therefore kernel
> does not have to see any registered device. So there would be no
> "reset_method" sysfs file, and also no "reset" sysfs file. But PCIe Hot
> Reset is in most cases needed when buggy card is not registered on bus,
> to trigger this reset. And with this API this is not possible.
>
> PCIe Warm Reset is done by PERST# signal. When signal is asserted then
> device is in reset state and therefore is not registered. So again
> kernel does not have to see registered device.
>
> Moreover for mPCIe form factor cards, boards can share one PERST# signal
> with more PCIe cards and control this signal via GPIO. So asserting
> PERST# GPIO can trigger Warm reset for more PCIe cards, not just one. It
> depends on board or topology.
>
> So... I do not think that current approach with "reset_method" sysfs
> entry bound to the PCI device does not work for PCI secondary bus reset
> and also cannot be used for implementing PCIe Warm Reset.
>
> I would rather suggest to re-design and prepare a new API which would
> work also with PCIe Hot Reset and PCIe Warm Reset.
>
> This "reset" sysfs file can work only with PCI Function Level Reset or
> some PM or device specific reset. But not with reset types which are
> more like slot or bus orientated.
>
The scope of this patch was to expose current reset methods
to the userspace. Also reset methods are available
for only those devices that allow an individual function to be reset
without affecting other functions in the same device.
So if those conditions are satisfied by the device then it can
use slot reset (pci_dev_reset_slot_function) and secondary bus
reset(pci_parent_bus_reset) which I think are hot reset and
warm reset respectively.

Thanks,
Amey
[...]


Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-14 Thread Pali Rohár
On Friday 12 March 2021 23:04:52 ameynarkhed...@gmail.com wrote:
> From: Amey Narkhede 
> 
> Add reset_methods_enabled bitmap to struct pci_dev to
> keep track of user preferred device reset mechanisms.
> Add reset_method sysfs attribute to query and set
> user preferred device reset mechanisms.
> 
> Signed-off-by: Amey Narkhede 
> ---
> Reviewed-by: Alex Williamson 
> Reviewed-by: Raphael Norwitz 
> 
>  Documentation/ABI/testing/sysfs-bus-pci | 15 ++
>  drivers/pci/pci-sysfs.c | 66 +++--
>  drivers/pci/pci.c   |  3 +-
>  include/linux/pci.h |  2 +
>  4 files changed, 82 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-pci 
> b/Documentation/ABI/testing/sysfs-bus-pci
> index 25c9c3977..ae53ecd2e 100644
> --- a/Documentation/ABI/testing/sysfs-bus-pci
> +++ b/Documentation/ABI/testing/sysfs-bus-pci
> @@ -121,6 +121,21 @@ Description:
>   child buses, and re-discover devices removed earlier
>   from this part of the device tree.
> 
> +What:/sys/bus/pci/devices/.../reset_method
> +Date:March 2021
> +Contact: Amey Narkhede 
> +Description:
> + Some devices allow an individual function to be reset
> + without affecting other functions in the same slot.
> + For devices that have this support, a file named reset_method
> + will be present in sysfs. Reading this file will give names
> + of the device supported reset methods. Currently used methods
> + are enclosed in brackets. Writing the name of any of the device
> + supported reset method to this file will set the reset method to
> + be used when resetting the device. Writing "none" to this file
> + will disable ability to reset the device and writing "default"
> + will return to the original value.
> +

Hello Amey!

I think that this API does not work for PCIe Hot Reset (=PCI secondary
bus reset) and PCIe Warm Reset.

First reset method is bound to the bus, not device and therefore kernel
does not have to see any registered device. So there would be no
"reset_method" sysfs file, and also no "reset" sysfs file. But PCIe Hot
Reset is in most cases needed when buggy card is not registered on bus,
to trigger this reset. And with this API this is not possible.

PCIe Warm Reset is done by PERST# signal. When signal is asserted then
device is in reset state and therefore is not registered. So again
kernel does not have to see registered device.

Moreover for mPCIe form factor cards, boards can share one PERST# signal
with more PCIe cards and control this signal via GPIO. So asserting
PERST# GPIO can trigger Warm reset for more PCIe cards, not just one. It
depends on board or topology.

So... I do not think that current approach with "reset_method" sysfs
entry bound to the PCI device does not work for PCI secondary bus reset
and also cannot be used for implementing PCIe Warm Reset.

I would rather suggest to re-design and prepare a new API which would
work also with PCIe Hot Reset and PCIe Warm Reset.

This "reset" sysfs file can work only with PCI Function Level Reset or
some PM or device specific reset. But not with reset types which are
more like slot or bus orientated.

>  What:/sys/bus/pci/devices/.../reset
>  Date:July 2009
>  Contact: Michael S. Tsirkin 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 78d2c130c..3cd06d1c0 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1304,6 +1304,59 @@ static const struct bin_attribute pcie_config_attr = {
>   .write = pci_write_config,
>  };
> 
> +static ssize_t reset_method_show(struct device *dev,
> +  struct device_attribute *attr,
> +  char *buf)
> +{
> + const struct pci_reset_fn_method *reset;
> + struct pci_dev *pdev = to_pci_dev(dev);
> + ssize_t len = 0;
> + int i;
> +
> + for (i = 0, reset = pci_reset_fn_methods; reset->reset_fn; i++, 
> reset++) {
> + if (pdev->reset_methods_enabled & (1 << i))
> + len += sysfs_emit_at(buf, len, "[%s] ", reset->name);
> + else if (pdev->reset_methods & (1 << i))
> + len += sysfs_emit_at(buf, len, "%s ", reset->name);
> + }
> +
> + return len;
> +}
> +
> +static ssize_t reset_method_store(struct device *dev,
> +   struct device_attribute *attr,
> +   const char *buf, size_t count)
> +{
> + const struct pci_reset_fn_method *reset = pci_reset_fn_methods;
> + struct pci_dev *pdev = to_pci_dev(dev);
> + u8 reset_mechanism;
> + int i = 0;
> +
> + /* Writing none disables reset */
> + if (sysfs_streq(buf, "none")) {
> + reset_mechanism = 0;
> + } else if 

[PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-12 Thread ameynarkhede03
From: Amey Narkhede 

Add reset_methods_enabled bitmap to struct pci_dev to
keep track of user preferred device reset mechanisms.
Add reset_method sysfs attribute to query and set
user preferred device reset mechanisms.

Signed-off-by: Amey Narkhede 
---
Reviewed-by: Alex Williamson 
Reviewed-by: Raphael Norwitz 

 Documentation/ABI/testing/sysfs-bus-pci | 15 ++
 drivers/pci/pci-sysfs.c | 66 +++--
 drivers/pci/pci.c   |  3 +-
 include/linux/pci.h |  2 +
 4 files changed, 82 insertions(+), 4 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci 
b/Documentation/ABI/testing/sysfs-bus-pci
index 25c9c3977..ae53ecd2e 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -121,6 +121,21 @@ Description:
child buses, and re-discover devices removed earlier
from this part of the device tree.

+What:  /sys/bus/pci/devices/.../reset_method
+Date:  March 2021
+Contact:   Amey Narkhede 
+Description:
+   Some devices allow an individual function to be reset
+   without affecting other functions in the same slot.
+   For devices that have this support, a file named reset_method
+   will be present in sysfs. Reading this file will give names
+   of the device supported reset methods. Currently used methods
+   are enclosed in brackets. Writing the name of any of the device
+   supported reset method to this file will set the reset method to
+   be used when resetting the device. Writing "none" to this file
+   will disable ability to reset the device and writing "default"
+   will return to the original value.
+
 What:  /sys/bus/pci/devices/.../reset
 Date:  July 2009
 Contact:   Michael S. Tsirkin 
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 78d2c130c..3cd06d1c0 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1304,6 +1304,59 @@ static const struct bin_attribute pcie_config_attr = {
.write = pci_write_config,
 };

+static ssize_t reset_method_show(struct device *dev,
+struct device_attribute *attr,
+char *buf)
+{
+   const struct pci_reset_fn_method *reset;
+   struct pci_dev *pdev = to_pci_dev(dev);
+   ssize_t len = 0;
+   int i;
+
+   for (i = 0, reset = pci_reset_fn_methods; reset->reset_fn; i++, 
reset++) {
+   if (pdev->reset_methods_enabled & (1 << i))
+   len += sysfs_emit_at(buf, len, "[%s] ", reset->name);
+   else if (pdev->reset_methods & (1 << i))
+   len += sysfs_emit_at(buf, len, "%s ", reset->name);
+   }
+
+   return len;
+}
+
+static ssize_t reset_method_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+   const struct pci_reset_fn_method *reset = pci_reset_fn_methods;
+   struct pci_dev *pdev = to_pci_dev(dev);
+   u8 reset_mechanism;
+   int i = 0;
+
+   /* Writing none disables reset */
+   if (sysfs_streq(buf, "none")) {
+   reset_mechanism = 0;
+   } else if (sysfs_streq(buf, "default")) {
+   /* Writing default returns to initial value */
+   reset_mechanism = pdev->reset_methods;
+   } else {
+   reset_mechanism = 0;
+   for (; reset->reset_fn; i++, reset++) {
+   if (sysfs_streq(buf, reset->name)) {
+   reset_mechanism = 1 << i;
+   break;
+   }
+   }
+   if (!reset_mechanism || !(pdev->reset_methods & 
reset_mechanism))
+   return -EINVAL;
+   }
+
+   pdev->reset_methods_enabled = reset_mechanism;
+
+   return count;
+}
+
+static DEVICE_ATTR_RW(reset_method);
+
 static ssize_t reset_store(struct device *dev, struct device_attribute *attr,
   const char *buf, size_t count)
 {
@@ -1337,11 +1390,16 @@ static int pci_create_capabilities_sysfs(struct pci_dev 
*dev)
if (dev->reset_methods) {
retval = device_create_file(>dev, _attr_reset);
if (retval)
-   goto error;
+   goto err_reset;
+   retval = device_create_file(>dev, _attr_reset_method);
+   if (retval)
+   goto err_method;
}
return 0;

-error:
+err_method:
+   device_remove_file(>dev, _attr_reset);
+err_reset:
pcie_vpd_remove_sysfs_dev_files(dev);
return retval;
 }
@@ -1417,8 +1475,10 @@ int __must_check pci_create_sysfs_dev_files(struct 
pci_dev *pdev)
 static void