Hi Keith,
I like this series a lot. One comment that is probably close
to the big discussion in the thread:
> switch (ret) {
> case BLK_EH_HANDLED:
> /*
> + * If the request is still in flight, the driver is requesting
> + * blk-mq complete it.
> */
> + if (blk_mq_rq_state(req) == MQ_RQ_IN_FLIGHT)
> + __blk_mq_complete_request(req);
> + break;
The state check here really irked me, and from the thread it seems like
I'm not the only one. At least for the NVMe case I think it is perfectly
safe, although I agree I'd rather audit what other drivers do carefully.
That being said I think BLK_EH_HANDLED seems like a fundamentally broken
idea, and I'd actually prefer to get rid of it over adding things like
the MQ_RQ_IN_FLIGHT check above.
E.g. if we look at the cases where nvme-pci returns it:
- if we did call nvme_dev_disable, we already canceled all requests,
so we might as well just return BLK_EH_NOT_HANDLED
- the poll for completion case already completed the command,
so we should return BLK_EH_NOT_HANDLED
So I think we need to fix up nvme and if needed any other driver
to return the right value and then assert that the request is
still in in-flight status for the BLK_EH_HANDLED case.
> @@ -124,16 +119,7 @@ static inline int blk_mq_rq_state(struct request *rq)
> static inline void blk_mq_rq_update_state(struct request *rq,
> enum mq_rq_state state)
> {
> + WRITE_ONCE(rq->state, state);
> }
I think this helper can go away now. But we should have a comment
near the state field documenting the concurrency implications.
> + u64 state;
This should probably be a mq_rq_state instead. Which means it needs
to be moved to blkdev.h, but that should be ok.