On Tue, Sep 19, 2017 at 03:18:45PM +, Bart Van Assche wrote:
> On Tue, 2017-09-19 at 11:07 -0400, Keith Busch wrote:
> > The problem is when blk-mq's timeout handler prevents the request from
> > completing, and doesn't leave any indication the driver requested to
> > complete it. Who is
On Tue, Sep 19, 2017 at 11:22:20PM +0800, Ming Lei wrote:
> On Tue, Sep 19, 2017 at 11:07 PM, Keith Busch wrote:
> > On Tue, Sep 19, 2017 at 12:16:31PM +0800, Ming Lei wrote:
> >> On Tue, Sep 19, 2017 at 7:08 AM, Keith Busch wrote:
> >> >
> >> >
On Tue, Sep 19, 2017 at 11:07 PM, Keith Busch wrote:
> On Tue, Sep 19, 2017 at 12:16:31PM +0800, Ming Lei wrote:
>> On Tue, Sep 19, 2017 at 7:08 AM, Keith Busch wrote:
>> >
>> > Indeed that prevents .complete from running concurrently with the
>> >
On Tue, 2017-09-19 at 11:07 -0400, Keith Busch wrote:
> The problem is when blk-mq's timeout handler prevents the request from
> completing, and doesn't leave any indication the driver requested to
> complete it. Who is responsible for completing that request now?
Hello Keith,
My understanding
On Mon, 2017-09-18 at 21:55 -0400, Keith Busch wrote:
> The only way to complete that request now is if the timeout
> handler returns BLK_EH_HANDLED, but the scsi-mq abort path returns
> BLK_EH_NOT_HANDLED on success (very few drivers actually return
> BLK_EH_HANDLED).
>
> After the timeout
On Tue, Sep 19, 2017 at 12:16:31PM +0800, Ming Lei wrote:
> On Tue, Sep 19, 2017 at 7:08 AM, Keith Busch wrote:
> >
> > Indeed that prevents .complete from running concurrently with the
> > timeout handler, but scsi_mq_done and nvme_handle_cqe are not .complete
> >
On Tue, Sep 19, 2017 at 7:08 AM, Keith Busch wrote:
> On Mon, Sep 18, 2017 at 10:53:12PM +, Bart Van Assche wrote:
>> On Mon, 2017-09-18 at 18:39 -0400, Keith Busch wrote:
>> > The nvme driver's use of blk_mq_reinit_tagset only happens during
>> > controller
On Mon, Sep 18, 2017 at 11:14:38PM +, Bart Van Assche wrote:
> On Mon, 2017-09-18 at 19:08 -0400, Keith Busch wrote:
> > On Mon, Sep 18, 2017 at 10:53:12PM +, Bart Van Assche wrote:
> > > Are you sure that scenario can happen? The blk-mq core calls
> > > test_and_set_bit() for the
On Mon, 2017-09-18 at 19:08 -0400, Keith Busch wrote:
> On Mon, Sep 18, 2017 at 10:53:12PM +, Bart Van Assche wrote:
> > Are you sure that scenario can happen? The blk-mq core calls
> > test_and_set_bit() for the REQ_ATOM_COMPLETE flag before any completion
> > or timeout handler is
> >
On Mon, Sep 18, 2017 at 10:53:12PM +, Bart Van Assche wrote:
> On Mon, 2017-09-18 at 18:39 -0400, Keith Busch wrote:
> > The nvme driver's use of blk_mq_reinit_tagset only happens during
> > controller initialisation, but I'm seeing lost commands well after that
> > during normal and stable
On Mon, 2017-09-18 at 18:39 -0400, Keith Busch wrote:
> The nvme driver's use of blk_mq_reinit_tagset only happens during
> controller initialisation, but I'm seeing lost commands well after that
> during normal and stable running.
>
> The timing is pretty narrow to hit, but I'm pretty sure this
On Mon, Sep 18, 2017 at 10:07:58PM +, Bart Van Assche wrote:
> On Mon, 2017-09-18 at 18:03 -0400, Keith Busch wrote:
> > I think we've always known it's possible to lose a request during timeout
> > handling, but just accepted that possibility. It seems to be causing
> > problems, though,
On Mon, 2017-09-18 at 18:03 -0400, Keith Busch wrote:
> I think we've always known it's possible to lose a request during timeout
> handling, but just accepted that possibility. It seems to be causing
> problems, though, leading to unnecessary error escalation and IO failures.
>
> The possibility
13 matches