On Fri, 2017-03-17 at 18:55 +0000, Madhani, Himanshu wrote:
> On Mar 16, 2017, at 3:27 PM, Bart Van Assche <bart.vanass...@sandisk.com> 
> wrote:
> > On Thu, 2017-03-16 at 14:40 -0700, Himanshu Madhani wrote:
> > > +static int
> > > +scsi_poll(struct blk_mq_hw_ctx *hctx, unsigned int tag)
> > > +
> > > +{
> > > +        struct Scsi_Host *shost = hctx->driver_data;
> > > +        struct request *req;
> > > +        struct scsi_cmnd *cmd;
> > > +
> > > +        req = blk_mq_tag_to_rq(shost->tag_set.tags[hctx->queue_num], 
> > > tag);
> > > +        if (!req)
> > > +                return 0;
> > > +
> > > +        cmd = blk_mq_rq_to_pdu(req);
> > > + if (!cmd)
> > > +         return 0;
> > > +
> > > + if (shost->hostt->poll_queue)
> > > +         return(shost->hostt->poll_queue(shost, cmd));
> > > + else return 0;
> > > +}
> > 
> > What will happen if the request with tag 'tag' is completed after the block
> > layer decided to call this function and before this function had the chance
> > to call blk_mq_tag_to_rq()? Will this trigger a kernel crash? Also, please
> > format patches properly. This is what checkpatch reports for this patch:
> > 
> 
> Tag is passed into here by rq->tag
> 
> When tag goes to blk_mq_tag_to_rq  it just indexes to the rqs array using tag:
> 
> struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
> {
>        if (tag < tags->nr_tags) {
>                prefetch(tags->rqs[tag]);
>                return tags->rqs[tag];
>        }
> 
>        return NULL;
> 
> So if tag is invalid(too large), it returns NULL.  Every step it validates it 
> has a valid * before continuing.  
> 
> Worse case we poll for a completion that has already completed. 
> 
> We don’t think this will result into NULL pointer crash.

Hello Himanshu,

Have you noticed that the caller of scsi_poll(), namely blk_mq_poll(), does
not use any kind of mechanism to prevent that the 'tag' argument is freed and
reused while blk_mq_poll() is in progress? I think that a SCSI completion
can occur while blk_mq_poll() is in progress. What will happen if a SCSI
completion occurs concurrently with scsi_poll()? Will that trigger a
use-after-free of the scsi_cmnd structure 'cmd' points at? If so, what will
be the consequences?

Because of this I'm not sure we should proceed with adding blk_mq_poll()
support to the SCSI core. But if such support is added it's probably a much
better choice to change the type of the second argument of .poll_queue()
from struct scsi_cmnd * into int (the tag number). It is then the
responsibility of SCSI LLD's to avoid that their .poll_queue() implementation
triggers a use-after-free.

Bart.

Reply via email to