Re: [PATCH 2/3] block: Introduce blk_rq_completed()
On 05/26/14 17:27, James Bottomley wrote: On Mon, 2014-05-26 at 17:15 +0200, Bart Van Assche wrote: Make it possible to test the REQ_ATOM_COMPLETE bit from outside the block layer core. I don't see the value of patches 2,3 they're checking for an impossible condition ... why might it be possible? When reading the source code in scsi_error.c it's easy to overlook that scmd_eh_abort_handler(), scsi_abort_command() and scsi_eh_scmd_add() are all invoked for requests in which the REQ_ATOM_COMPLETE bit has been set. Although it is possible to mention this as a comment above these functions, such comments are not checked at runtime. It would require additional work from the reader to verify whether or not such source code comments are up to date. However, the condition inside a WARN_ON_ONCE() statement is checked every time the code is executed. Hence my preference for a WARN_ON_ONCE() statement instead of writing down somewhere that these three functions operate on requests in which the REQ_ATOM_COMPLETE bit has been set. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] block: Introduce blk_rq_completed()
On Tue, May 27, 2014 at 09:49:48AM +0200, Bart Van Assche wrote: I don't see the value of patches 2,3 they're checking for an impossible condition ... why might it be possible? When reading the source code in scsi_error.c it's easy to overlook that scmd_eh_abort_handler(), scsi_abort_command() and scsi_eh_scmd_add() are all invoked for requests in which the REQ_ATOM_COMPLETE bit has been set. Although it is possible to mention this as a comment above these functions, such comments are not checked at runtime. It would require additional work from the reader to verify whether or not such source code comments are up to date. However, the condition inside a WARN_ON_ONCE() statement is checked every time the code is executed. Hence my preference for a WARN_ON_ONCE() statement instead of writing down somewhere that these three functions operate on requests in which the REQ_ATOM_COMPLETE bit has been set. I really do like the REQ_ATOM_COMPLETE asserts - as experience tells these kinds of assumptions are best checked, otherwise they will unintentionally be violated. I'm less excited about the list walk I have to say, as the overhead is getting fairly large for a simple assertation. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] block: Introduce blk_rq_completed()
On Tue, 2014-05-27 at 09:49 +0200, Bart Van Assche wrote: On 05/26/14 17:27, James Bottomley wrote: On Mon, 2014-05-26 at 17:15 +0200, Bart Van Assche wrote: Make it possible to test the REQ_ATOM_COMPLETE bit from outside the block layer core. I don't see the value of patches 2,3 they're checking for an impossible condition ... why might it be possible? When reading the source code in scsi_error.c it's easy to overlook that scmd_eh_abort_handler(), scsi_abort_command() and scsi_eh_scmd_add() are all invoked for requests in which the REQ_ATOM_COMPLETE bit has been set. I really don't like this entanglement of state of block and SCSI. complete in block terms isn't the same as in SCSI terms. The REQ_ATOM_COMPLETE flag is fully internal to block and indicates that we've taken over processing the command and any completions into block get ignored. This is for the possible race between timeout and inbound command completion. If you start coding SCSI assertions in terms of it, you're entangling layers that should be separate. The assertion in SCSI terms is that abort and -done cannot race. James Although it is possible to mention this as a comment above these functions, such comments are not checked at runtime. It would require additional work from the reader to verify whether or not such source code comments are up to date. However, the condition inside a WARN_ON_ONCE() statement is checked every time the code is executed. Hence my preference for a WARN_ON_ONCE() statement instead of writing down somewhere that these three functions operate on requests in which the REQ_ATOM_COMPLETE bit has been set. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] block: Introduce blk_rq_completed()
On 05/27/14 10:23, James Bottomley wrote: On Tue, 2014-05-27 at 09:49 +0200, Bart Van Assche wrote: When reading the source code in scsi_error.c it's easy to overlook that scmd_eh_abort_handler(), scsi_abort_command() and scsi_eh_scmd_add() are all invoked for requests in which the REQ_ATOM_COMPLETE bit has been set. I really don't like this entanglement of state of block and SCSI. complete in block terms isn't the same as in SCSI terms. The REQ_ATOM_COMPLETE flag is fully internal to block and indicates that we've taken over processing the command and any completions into block get ignored. This is for the possible race between timeout and inbound command completion. If you start coding SCSI assertions in terms of it, you're entangling layers that should be separate. The assertion in SCSI terms is that abort and -done cannot race. Recent e-mail threads on the linux-scsi mailing list have shown that it is easily overlooked which error handler functions are called only for requests marked by the block layer as complete and hence cannot race with scsi_softirq_done(). So far my proposals for how to improve the documentation of how this race is avoided were rejected. Do you perhaps have a proposal of how this should be documented properly ? Thanks, Bart. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] block: Introduce blk_rq_completed()
On Tue, 2014-05-27 at 11:00 +0200, Bart Van Assche wrote: On 05/27/14 10:23, James Bottomley wrote: On Tue, 2014-05-27 at 09:49 +0200, Bart Van Assche wrote: When reading the source code in scsi_error.c it's easy to overlook that scmd_eh_abort_handler(), scsi_abort_command() and scsi_eh_scmd_add() are all invoked for requests in which the REQ_ATOM_COMPLETE bit has been set. I really don't like this entanglement of state of block and SCSI. complete in block terms isn't the same as in SCSI terms. The REQ_ATOM_COMPLETE flag is fully internal to block and indicates that we've taken over processing the command and any completions into block get ignored. This is for the possible race between timeout and inbound command completion. If you start coding SCSI assertions in terms of it, you're entangling layers that should be separate. The assertion in SCSI terms is that abort and -done cannot race. Recent e-mail threads on the linux-scsi mailing list have shown that it is easily overlooked which error handler functions are called only for requests marked by the block layer as complete and hence cannot race with scsi_softirq_done(). So far my proposals for how to improve the documentation of how this race is avoided were rejected. Do you perhaps have a proposal of how this should be documented properly ? Um, if that's your goal, then I don't see how adding a WARN_ON_ONCE(!blk_rq_completed(scmd-request)); makes it clear that timeout and the softirq cannot race. I suppose for me, it's just an obvious systems assertion since every command must be in a defined state for the state model, either a command has timed out or it's in normal completion. But since we inherit this race mediation from block, isn't that the place to document it if people are confused? I could also see us one day extending the TMF capability to abort any running command, which would make even an assertion of block timed out or completed invalid. James -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] block: Introduce blk_rq_completed()
Il 27/05/2014 12:21, James Bottomley ha scritto: I could also see us one day extending the TMF capability to abort any running command, which would make even an assertion of block timed out or completed invalid. Actually the assertion would remain valid, and that's exactly what Bart wants to document with this assertion. Completed from the point of view of the block layer really means do not run the softirq. If you wanted to abort any running command, you still would mark the request as completed for the block layer before issuing the TMF. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] block: Introduce blk_rq_completed()
On Tue, 2014-05-27 at 12:47 +0200, Paolo Bonzini wrote: Il 27/05/2014 12:21, James Bottomley ha scritto: I could also see us one day extending the TMF capability to abort any running command, which would make even an assertion of block timed out or completed invalid. Actually the assertion would remain valid, and that's exactly what Bart wants to document with this assertion. No, it wouldn't: if we abort a running command by definition the command hadn't timed out and might not be completed. This is required by TMF handling because now you have an abort racing with a completion. Either the command completes normally because it misses the abort or the abort gets to it and its returned status is set to TASK_ABORTED. That's the only way you can tell if the abort was successful or not. If you're thinking we would tell block to ignore returning commands before issuing the abort, we'd never be able to tell if the abort were successful, so we have to allow the race to collect the status. James -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] block: Introduce blk_rq_completed()
Il 27/05/2014 12:59, James Bottomley ha scritto: On Tue, 2014-05-27 at 12:47 +0200, Paolo Bonzini wrote: Il 27/05/2014 12:21, James Bottomley ha scritto: I could also see us one day extending the TMF capability to abort any running command, which would make even an assertion of block timed out or completed invalid. Actually the assertion would remain valid, and that's exactly what Bart wants to document with this assertion. No, it wouldn't: if we abort a running command by definition the command hadn't timed out and might not be completed. This is required by TMF handling because now you have an abort racing with a completion. Either the command completes normally because it misses the abort or the abort gets to it and its returned status is set to TASK_ABORTED. That's the only way you can tell if the abort was successful or not. If you're thinking we would tell block to ignore returning commands before issuing the abort, we'd never be able to tell if the abort were successful, so we have to allow the race to collect the status. You could use a different mechanism than a softirq to tell the abort were successful, for example by overriding scsi_done. But with respect to the block layer, the mechanics of avoiding the race and double-free would probably be the same. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] block: Introduce blk_rq_completed()
Il 27/05/2014 13:26, James Bottomley ha scritto: You could use a different mechanism than a softirq to tell the abort were successful, for example by overriding scsi_done. But with respect to the block layer, the mechanics of avoiding the race and double-free would probably be the same. I think there's some confusion about what the race and double free is: It only occurs with timeouts. In a timeout situation, the host had decided it's not waiting any longer for the target to respond and proceeds to error recovery. At any time between the host making this decision up to the point it kicks the target hard enough to clear all in-flight commands, the target may return the command. If we didn't have some ignore function on command completions while we're handling errors, this would lead to double completion. If we decided to allow arbitrary aborts of running commands, we would send a TMF in during the normal (i.e. un timed out) command period. Because there's no timeout involved, there's no double free problem. The race in this case is whether the abort catches the command or not and to mediate that race we need the normal status return. I'm not sure why no timeout implies no double free. There would still be a race between the interrupt handler and softirq on one side, and the abort handler on the other. The interrupt handler's call to cmd-scsi_done ends up triggering the softirq and thus freeing the command with scsi_put_command. The abort handler, as you mentioned, wants the status return so it needs the interrupt handler to run---but not the softirq. A simple way to avoid this could be to skip the softirq processing, by marking the request block-layer-complete, and do everything in the abort handler. The interrupt handler would still run and fill in the status. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] block: Introduce blk_rq_completed()
On Tue, 2014-05-27 at 13:52 +0200, Paolo Bonzini wrote: Il 27/05/2014 13:26, James Bottomley ha scritto: You could use a different mechanism than a softirq to tell the abort were successful, for example by overriding scsi_done. But with respect to the block layer, the mechanics of avoiding the race and double-free would probably be the same. I think there's some confusion about what the race and double free is: It only occurs with timeouts. In a timeout situation, the host had decided it's not waiting any longer for the target to respond and proceeds to error recovery. At any time between the host making this decision up to the point it kicks the target hard enough to clear all in-flight commands, the target may return the command. If we didn't have some ignore function on command completions while we're handling errors, this would lead to double completion. If we decided to allow arbitrary aborts of running commands, we would send a TMF in during the normal (i.e. un timed out) command period. Because there's no timeout involved, there's no double free problem. The race in this case is whether the abort catches the command or not and to mediate that race we need the normal status return. I'm not sure why no timeout implies no double free. There would still be a race between the interrupt handler and softirq on one side, and the abort handler on the other. You're assuming the current error handling after timeout methodology. If we allow arbitrary TMFs, that doesn't hold because we have to wait for the command completion. James -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/3] block: Introduce blk_rq_completed()
Make it possible to test the REQ_ATOM_COMPLETE bit from outside the block layer core. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Jens Axboe ax...@fb.com Cc: Hannes Reinecke h...@suse.de Cc: Paolo Bonzini pbonz...@redhat.com Cc: Christoph Hellwig h...@infradead.org Cc: Joe Lawrence joe.lawre...@stratus.com --- block/blk-softirq.c| 13 + include/linux/blkdev.h | 1 + 2 files changed, 14 insertions(+) diff --git a/block/blk-softirq.c b/block/blk-softirq.c index 53b1737..fc7d160 100644 --- a/block/blk-softirq.c +++ b/block/blk-softirq.c @@ -172,6 +172,19 @@ void blk_complete_request(struct request *req) } EXPORT_SYMBOL(blk_complete_request); +/** + * blk_rq_completed - whether or not a request has been completed + * + * Intended for debugging purposes only. Any completion handler code should go + * into the softirq_done_fn() and/or in the rq_timed_out_fn() request_queue + * callback functions. + */ +bool blk_rq_completed(struct request *rq) +{ + return test_bit(REQ_ATOM_COMPLETE, rq-atomic_flags); +} +EXPORT_SYMBOL(blk_rq_completed); + static __init int blk_softirq_init(void) { int i; diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 0d84981..a621bc5 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -952,6 +952,7 @@ extern void blk_complete_request(struct request *); extern void __blk_complete_request(struct request *); extern void blk_abort_request(struct request *); extern void blk_unprep_request(struct request *); +extern bool blk_rq_completed(struct request *); /* * Access functions for manipulating queue properties -- 1.8.4.5 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] block: Introduce blk_rq_completed()
On 05/26/2014 05:15 PM, Bart Van Assche wrote: Make it possible to test the REQ_ATOM_COMPLETE bit from outside the block layer core. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Jens Axboe ax...@fb.com Cc: Hannes Reinecke h...@suse.de Cc: Paolo Bonzini pbonz...@redhat.com Cc: Christoph Hellwig h...@infradead.org Cc: Joe Lawrence joe.lawre...@stratus.com Reviewed-by: Hannes Reinecke h...@suse.de Cheers, Hannes -- Dr. Hannes Reinecke zSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html