Re: [RFC PATCH] PM / core: skip suspend next time if resume returns an error

Rafael J. Wysocki Tue, 02 Oct 2018 01:29:21 -0700

On Tue, Oct 2, 2018 at 10:05 AM Pavel Machek <[email protected]> wrote:
>
> Hi!
>
> > In general Linux doesn't behave super great if you get an error while
> > executing a device's resume handler.  Nothing will come along later
> > and try again to resume the device (and all devices that depend on
> > it), so pretty much you're left with a non-functioning device and
> > that's not good.
> >
> > However, even though you'll end up with a non-functioning device we
> > still don't consider resume failures to be fatal to the system.  We'll
> > keep chugging along and just hope that the device that failed to
> > resume wasn't too critical.  This establishes the precedent that we
> > should at least try our best not to fully bork the system after a
> > resume failure.
> >
> > I will argue that the best way to keep the system in the best shape is
> > to assume that if a resume callback failed that it did as close to
> > no-op as possible.  Because of this we should consider the device
> > still suspended and shouldn't try to suspend the device again next
> > time around.  Today that's not what happens.  AKA if you have a
> > device
>
> I don't think there are many guarantees when a device resume fails. It
> may have done nothing, and it may have resumed the device almost
> fully.
>
> I guess the best option would be to refuse system suspend after some
> device failed like that.
>
> That leaves user possibility to debug it...


I guess so.

Doing that in all cases is kind of risky IMO, because we haven't taken
the return values of the ->resume* callbacks into account so far
(except for printing the information, that is), so there may be
non-lethal cases when that happens and the $subject patch would make
them not work any more.

Thanks,
Rafael
