Re: [PATCH 1/2] blk-mq: Export iterating all tagged requests

James Smart Tue, 04 Dec 2018 14:10:34 -0800



On 12/4/2018 1:21 PM, Keith Busch wrote:

On Tue, Dec 04, 2018 at 11:33:33AM -0800, James Smart wrote:

I disagree.  The cases I've run into are on the admin queue - where we are
sending io to initialize the controller when another error/reset occurs, and
the checks are required to identify/reject the "old" initialization
commands, with another state check allowing them to proceed on the "new"
initialization commands.  And there are also cases for ioctls and other
things that occur during the middle of those initialization steps that need
to be weeded out. The Admin queue has to be kept live to allow the
initialization commands on the new controller.

state checks are also needed for those namespace validation cases....

Once quiesced, the proposed iterator can handle the final termination
of the request, perform failover, or some other lld specific action
depending on your situation.

I don't believe they can remain frozen, definitely not for the admin queue.
-- james

Quiesced and frozen carry different semantics.

My understanding of the nvme-fc implementation is that it returns
BLK_STS_RESOURCE in the scenario you've described where the admin
command can't be executed at the moment. That just has the block layer
requeue it for resubmission 3 milliseconds later, which will
continue to return the same status code until you're really ready for
it.

BLK_STS_RESOURCE is correct - but for "normal" io, which comes from the
filesystem, etc. and are mostly on the io queues.

But if the io originated from other sources, like the core layer via
nvme_alloc_request() - used by lots of service routines for the
transport to initialize the controller, core routines to talk to the
controller, and ioctls from user space - then they are failed with a
PATHING ERROR status. The status doesn't mean much to these other
places which mainly care if they succeed or not, and if not, they fail
and unwind. It's pretty critical for these paths to get that error
status as many of those threads do synchronous io. And this is not just
for nvme-fc. Any transport initializing the controller and getting
halfway through it when an error occurs that kills the association will
depend on this behavior. PCI is a large exception as interaction with a
pci function is very different from sending packets over a network and
detecting network errors.

Io requests, on the io queues, that are flagged as multipath, also are
failed this way rather than requeued. We would need some iterator here
to classify the type of io (one valid to go down another path) and move
it to another path (but the transport doesn't want to know about other
paths).
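
To make the distinction above concrete, here is a rough standalone
sketch of the decision as described in this mail. It is not the
transport code: every name in it (sketch_status, sketch_origin,
sketch_request, sketch_nonready_status) is made up for illustration,
and it only models "requeue" versus "fail with a pathing error" for a
controller that isn't ready.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's definitions. */
enum sketch_status {
	SKETCH_STS_OK,		/* controller ready: dispatch normally */
	SKETCH_STS_RESOURCE,	/* models BLK_STS_RESOURCE: block layer requeues */
	SKETCH_STS_PATH_ERROR	/* models failing the io with a pathing error */
};

enum sketch_origin {
	SKETCH_ORIGIN_FILESYSTEM,	/* "normal" io, mostly on the io queues */
	SKETCH_ORIGIN_CORE_OR_IOCTL	/* io allocated via nvme_alloc_request() */
};

struct sketch_request {
	enum sketch_origin origin;
	bool multipath;		/* io flagged as multipath */
};

/*
 * Decide what happens to a request that arrives while the controller
 * may not be ready, following the three cases in the text above.
 */
enum sketch_status sketch_nonready_status(const struct sketch_request *rq,
					  bool ctrl_ready)
{
	assert(rq != NULL);

	if (ctrl_ready)
		return SKETCH_STS_OK;

	/* Normal, non-multipath io: let the block layer retry it later. */
	if (rq->origin == SKETCH_ORIGIN_FILESYSTEM && !rq->multipath)
		return SKETCH_STS_RESOURCE;

	/*
	 * Init commands, core routines, ioctls and multipath io: fail now
	 * so synchronous callers can unwind (or another path can be tried).
	 */
	return SKETCH_STS_PATH_ERROR;
}
```

The point of the sketch is only that the second and third cases need an
immediate error status; holding those requests back would leave the
synchronous callers waiting.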


What I'm proposing is that instead of using that return code, you may
have nvme-fc control when to dispatch those queued requests by utilizing
the blk-mq quiesce on/off states. Is there a reason that wouldn't work?


and quiesce on/off isn't sufficient to do this.

-- james
