A couple of replies in one email:


Garrett D'Amore wrote:
> I requested that the team not create a new DDI_QUIESCE command.  The 
> expansion of detach(9e) and attach(9e) with new interfaces was, IMO, a 
> big mistake.  The biggest problem is that it is impossible to tell (at 
> least easily) if a driver supports the new interfaces or not.  By adding 
> a new devops entry point, it becomes nothing more than a matter of 
> routine inspection.   (The current design is one of the major limiting 
> factors in suspend-to-ram.  There is no predictability about whether 
> suspend will work or not -- the only way to know is to actually try it.)

I guess inspection is a good reason to create a new entry in dev_ops,
but it's still not a guarantee that quiesce(9e) will succeed.

You could require the driver to export a property if it supports
quiesce(9e) and get the same effect you are looking for without
expanding dev_ops if you wanted to.

There's nothing wrong with expanding dev_ops, but we don't have
to expand it if we don't want to.

Sherry Moore wrote:
> Hi David,
> 
>> Specifically, can open(9e) or any other entry point be called
>> while the driver is in quiesce(9e)? I would think we would want
>> to guard against that, and if not, the driver needs to know that.
>> If the intent is that nothing else will be called while the
>> driver is in quiesce(9e), please state that guarantee in the
>> man page.
> 
> Technically, yes; architecturally no.  Since we're basically at the
> same place as we would be when the reset() entry points are called, the
> same sort of assumption hold here also.
> 
> I will add the following to the man page:
> 
> "open(9E) and other entry points should not be called while the driver
> is in quiesce(9E)."

I think the man page needs to provide a guarantee that the driver
does not have to guard against open(9e) and close(9e) being called
once the framework calls quiesce(9e), so that has to be worded
appropriately. (Not a "should" for the implementation, but written
as a normative requirement that the driver can rely on.)

Something like:

   open(9e), close(9e), attach(9e), detach(9e) and other driver entry
   points shall not be called when quiesce(9e) is called.

I'm not sure if that's enough. What if open, close, etc are in
progress when the framework decides to call quiesce. Is that
possible? So maybe it has to be worded even stronger such that
the guarantee is that open, close, attach, detach and other driver
entry points will not be active or called when the framework calls
quiesce(9e). That still leaves the question of interrupt handlers,
timers, callbacks, etc that could be called when the driver is
in quiesce(9e) or might be active when the quiesce(9e) is called.

This needs a bit of work, IMO.


> 
>> Second, are there open instances of the driver when quiesce(9e)
>> is called, or does the framework guarantee to close all open
>> instances of the device first?
> 
> Yes, as long as the driver can guarantee that no more asynchronous
> writes occur to system memory as operation to the open instances.
> 
> I will add that to the man page as well.

Bear with me on this. A driver has scheduled the device to do
a bunch of io by handing off a list of requests to the device.
The driver probably uses completion interrupts or doorbells to
the application directly to signal completions or errors from
the device. Nothing has been done to tell the driver to stop
doing that when we call quiesce(9e), so the device may still be
writing to memory and sending interrupts. Will those interrupts
be taken when the driver is in quiesce(9e)? If so, that's why
the driver needs some sort of lock to guard against race conditions
with it's interrupt handler. (Also, timers, callbacks, etc. Basically
any interface that can asynchronously call into the driver while
any other entry point is active.)

There's always the possibility of a race inside the driver if any
interrupt can be taken when we start to call quiesce(9e).

Also, there's timer interrupts, soft interrupts, etc. Basically
any sort of async event that the driver setup before quiesce(9e)
was called.

This needs some work also.

>> Also, the phrase "guarantee on disk data integrety" is not
>> well-defined from a driver perspective. If I put my driver hat
>> on, I really have no idea what that requirement is. Do you expect
>> the driver writer to do something specific in that case, or just
>> reset the controller, dropping any scheduled or in-flight
>> transactions? How does it know if the disk data integrety requirement
>> can be met.
> 
> Fair enough.  I will remove the data integrity clause.  From Fast
> Reboot's perspective, it's preferred that the driver drop any scheduled
> or in-flight transactions.

Removing it is ok, but I'm not sure that's sufficient.
We really have to be very explicit about what we expect
the device to do and what it can expect in quiesce(9e).

> 
>> Finally, I'm not sure if you want to say anything about DMA
>> mappings, etc. I guess they can be thrown away because the
>> driver for any iommu in the path (yes, x86 has them too) will
>> end up being reset by its driver?
> 
> OK.  How about "asynchronous writes to system memory" instead?

Not sure. Basically, I think what you are trying to create
is an interface to the driver that gets the device into
a quiet reset state. But doing that in the driver may be
more complicated than you think if it has to worry about
taking interrupts, soft ints, timers, callbacks or any other async
event while it's in quiesce(9e).

> 
>> How will console output work during this process?
> 
> Console output will continue to work until the chain of drivers for the
> console are quiesced, which will be performed last.

What happens after that and before we're ready to do console
output again? What if the process fails? There's no way to
display a message on the console? What is the behavior of this
stuff if the process fails? I suppose the fallback could be to
do a normal reset, as long as we end up booting the same thing
that would have been booted had this process succeeded. Is that
part of the interface?

>> Also, which ddi services is the driver permitted to call and not
>> permitted to call as part of it's quiesce(9e) implementation?
> 
> All DDI interfaces can be called except those prevented by
> devtree_freeze().

So I can create timers, add interrupts, wait for locks, etc?

> 
>> I guess what you are doing is loading the new ramdisk or kernel
>> into memory and then calling quiesce(9e)? How would that work
>> in a panic situation? (Yes, I saw that's part of phase 2, but
>> it still needs to fit these interfaces.)
> 
> We will probably require drivers to implement quiesce() in a lock-free,
> no-memory-allocation, do-the-bare-minimum manner to poke only the few
> registers to stop all asynchronous writes to system memory and
> interrupt generate.  That's why the quisce() interface has an "arg"
> field...

The quiesce(9e) man page is one of the interfaces that PSARC is being
asked to approve with this case, right? The man page doesn't say that.

I think this project needs a bit of work before it can just
time out as a fast-track.

It should be possible for you to do this, but the interfaces,
especially quiesce(9e) and what the driver can do and is supposed
to do needs some more thought, IMO.

It seems like what you want to do is stop interrupt dispatch,
stop callback threads, timer callbacks, soft ints, etc, and then
just call into the driver to tell it to stop whatever it might be
doing and get the device into a quiet/reset state. You have to
guarantee in the interface all of the above conditions in order
to implement something like that and show the driver writer that
they don't need to use locks or worry about any other part of
the driver becoming active when it's in quiesce(9e). Once the
driver returns successfully from quiesce(9e), it basically
guarantees that the device will no longer access system memory
(we need more there.) I think we also have to guarantee that
once quiesce(9e) is called, we won't call into the driver in
any e.p. or callback, interrupt, timer, etc anymore.

-David

Reply via email to