Re: blk-mq 5-8 times slower for bmap-tools

2018-08-23 Thread Ricardo Ribalda Delgado
Hi Ben
On Wed, Aug 22, 2018 at 9:22 PM Ben Hutchings  wrote:
>
> On Mon, 2018-08-20 at 11:04 +0200, Ricardo Ribalda Delgado wrote:
> > Hello Ming
> > On Mon, Aug 20, 2018 at 10:30 AM Ming Lei  wrote:
> [...]
> > > One problem found from your iostat log is that looks there is ~30sec
> > > idle period between IO activities when blk-mq is enabled.
> >
> > During all the test the LED on the device was blinking.
> >
> > But a closer look to dmesg reveals a lot of this messages:
> >
> > [  196.929811] sd 6:0:0:0: [sdb] tag#3 data cmplt err -71 uas-tag 1
> > inflight: CMD
> > [  196.929822] sd 6:0:0:0: [sdb] tag#3 CDB: Write(10) 2a 00 00 04 00
> > 68 00 04 00 00
> > [  227.764379] sd 6:0:0:0: [sdb] tag#4 uas_eh_abort_handler 0 uas-tag
> > 2 inflight: CMD OUT
> > [  227.764389] sd 6:0:0:0: [sdb] tag#4 CDB: Write(10) 2a 00 00 04 04
> > 68 00 04 00 00
> > [  227.766555] sd 6:0:0:0: [sdb] tag#3 uas_eh_abort_handler 0 uas-tag
> > 1 inflight: CMD
> > [  227.766562] sd 6:0:0:0: [sdb] tag#3 CDB: Write(10) 2a 00 00 04 00
> > 68 00 04 00 00
> > [  227.784312] scsi host6: uas_eh_device_reset_handler start
> > [  227.913672] usb 2-2: reset SuperSpeed USB device number 3 using xhci_hcd
> > [  227.944842] scsi host6: uas_eh_device_reset_handler success
> > [  231.416133] sd 6:0:0:0: [sdb] tag#1 data cmplt err -71 uas-tag 10
> > inflight: CMD
> > [  231.416147] sd 6:0:0:0: [sdb] tag#1 CDB: Write(10) 2a 00 00 06 d5
> > e8 00 04 00 00
> >
> > And they take around 30 secons (227-196)
> >
> >
> > Maybe this is a hw issue? I will bring the reader home tonight and see
> > if I can replicate the bug with my notebook
> >
> > >
> > > Maybe it is related with timeout, given we had big change in v4.17 
> > > timeout code,
> > > and we also fixed one scsi_mq timeout related issue recently, and the 
> > > patch[1] has
> > > been merged to v4.18 release already.
> >
> > I tried with v4.18-rc4 (latest one packaged in debian experimental)
> > and after 3 runs, 2 were fine (27 sec), but the last one was over a
> > minute.
>
> Is it possible that this is fixed by "block: really disable runtime-pm
> for blk-mq"?
>
> That fix went into Debian's version 4.17.14-1, but is not yet in any
> 4.18-based package.

Just tried out
ricardo@neopili:~$ uname -a
Linux neopili 4.17.0-3-amd64 #1 SMP Debian 4.17.17-1 (2018-08-18)
x86_64 GNU/Linux

and after around 10 executions all the runtimes are similar to the
ones with non-blk-mq and I havent seen the "uas error" on dmesg

Thanks for your help.

I will keep an eye on the issue in case it triggers again

>
> Ben.
>
> --
> Ben Hutchings
> You can't have everything.  Where would you put it?
>


-- 
Ricardo Ribalda


Re: blk-mq 5-8 times slower for bmap-tools

2018-08-22 Thread Ming Lei
On Thu, Aug 23, 2018 at 01:51:05AM +0100, Ben Hutchings wrote:
> On Thu, 2018-08-23 at 06:02 +0800, Ming Lei wrote:
> > On Wed, Aug 22, 2018 at 08:22:00PM +0100, Ben Hutchings wrote:
> > > On Mon, 2018-08-20 at 11:04 +0200, Ricardo Ribalda Delgado wrote:
> > > > Hello Ming
> > > > On Mon, Aug 20, 2018 at 10:30 AM Ming Lei  wrote:
> > > 
> > > [...]
> > > > > One problem found from your iostat log is that looks there is ~30sec
> > > > > idle period between IO activities when blk-mq is enabled.
> > > > 
> > > > During all the test the LED on the device was blinking.
> > > > 
> > > > But a closer look to dmesg reveals a lot of this messages:
> > > > 
> > > > [  196.929811] sd 6:0:0:0: [sdb] tag#3 data cmplt err -71 uas-tag 1
> > > > inflight: CMD
> > > > [  196.929822] sd 6:0:0:0: [sdb] tag#3 CDB: Write(10) 2a 00 00 04 00
> > > > 68 00 04 00 00
> > > > [  227.764379] sd 6:0:0:0: [sdb] tag#4 uas_eh_abort_handler 0 uas-tag
> > > > 2 inflight: CMD OUT
> > > > [  227.764389] sd 6:0:0:0: [sdb] tag#4 CDB: Write(10) 2a 00 00 04 04
> > > > 68 00 04 00 00
> > > > [  227.766555] sd 6:0:0:0: [sdb] tag#3 uas_eh_abort_handler 0 uas-tag
> > > > 1 inflight: CMD
> > > > [  227.766562] sd 6:0:0:0: [sdb] tag#3 CDB: Write(10) 2a 00 00 04 00
> > > > 68 00 04 00 00
> > > > [  227.784312] scsi host6: uas_eh_device_reset_handler start
> > > > [  227.913672] usb 2-2: reset SuperSpeed USB device number 3 using 
> > > > xhci_hcd
> > > > [  227.944842] scsi host6: uas_eh_device_reset_handler success
> > > > [  231.416133] sd 6:0:0:0: [sdb] tag#1 data cmplt err -71 uas-tag 10
> > > > inflight: CMD
> > > > [  231.416147] sd 6:0:0:0: [sdb] tag#1 CDB: Write(10) 2a 00 00 06 d5
> > > > e8 00 04 00 00
> > > > 
> > > > And they take around 30 secons (227-196)
> > > > 
> > > > 
> > > > Maybe this is a hw issue? I will bring the reader home tonight and see
> > > > if I can replicate the bug with my notebook
> > > > 
> > > > > 
> > > > > Maybe it is related with timeout, given we had big change in v4.17 
> > > > > timeout code,
> > > > > and we also fixed one scsi_mq timeout related issue recently, and the 
> > > > > patch[1] has
> > > > > been merged to v4.18 release already.
> > > > 
> > > > I tried with v4.18-rc4 (latest one packaged in debian experimental)
> > > > and after 3 runs, 2 were fine (27 sec), but the last one was over a
> > > > minute.
> > > 
> > > Is it possible that this is fixed by "block: really disable runtime-pm
> > > for blk-mq"?
> > > 
> > 
> > As I mentioned, the similar issue can be triggered in both blk-mq and
> > non-blk-mq, so it shouldn't be related with runtime PM.
> > 
> > We need our UAS guys to take a look at this issue.
> 
> I saw that you found a problem with UAS in both modes, but Ricardo's
> problem seemed to be specific to blk-mq.
> 

Actually both are triggered when doing WRITE to UAS, and the error
message is same too.

So I think two problems may be same.

Thanks,
Ming


Re: blk-mq 5-8 times slower for bmap-tools

2018-08-22 Thread Ben Hutchings
On Thu, 2018-08-23 at 06:02 +0800, Ming Lei wrote:
> On Wed, Aug 22, 2018 at 08:22:00PM +0100, Ben Hutchings wrote:
> > On Mon, 2018-08-20 at 11:04 +0200, Ricardo Ribalda Delgado wrote:
> > > Hello Ming
> > > On Mon, Aug 20, 2018 at 10:30 AM Ming Lei  wrote:
> > 
> > [...]
> > > > One problem found from your iostat log is that looks there is ~30sec
> > > > idle period between IO activities when blk-mq is enabled.
> > > 
> > > During all the test the LED on the device was blinking.
> > > 
> > > But a closer look to dmesg reveals a lot of this messages:
> > > 
> > > [  196.929811] sd 6:0:0:0: [sdb] tag#3 data cmplt err -71 uas-tag 1
> > > inflight: CMD
> > > [  196.929822] sd 6:0:0:0: [sdb] tag#3 CDB: Write(10) 2a 00 00 04 00
> > > 68 00 04 00 00
> > > [  227.764379] sd 6:0:0:0: [sdb] tag#4 uas_eh_abort_handler 0 uas-tag
> > > 2 inflight: CMD OUT
> > > [  227.764389] sd 6:0:0:0: [sdb] tag#4 CDB: Write(10) 2a 00 00 04 04
> > > 68 00 04 00 00
> > > [  227.766555] sd 6:0:0:0: [sdb] tag#3 uas_eh_abort_handler 0 uas-tag
> > > 1 inflight: CMD
> > > [  227.766562] sd 6:0:0:0: [sdb] tag#3 CDB: Write(10) 2a 00 00 04 00
> > > 68 00 04 00 00
> > > [  227.784312] scsi host6: uas_eh_device_reset_handler start
> > > [  227.913672] usb 2-2: reset SuperSpeed USB device number 3 using 
> > > xhci_hcd
> > > [  227.944842] scsi host6: uas_eh_device_reset_handler success
> > > [  231.416133] sd 6:0:0:0: [sdb] tag#1 data cmplt err -71 uas-tag 10
> > > inflight: CMD
> > > [  231.416147] sd 6:0:0:0: [sdb] tag#1 CDB: Write(10) 2a 00 00 06 d5
> > > e8 00 04 00 00
> > > 
> > > And they take around 30 secons (227-196)
> > > 
> > > 
> > > Maybe this is a hw issue? I will bring the reader home tonight and see
> > > if I can replicate the bug with my notebook
> > > 
> > > > 
> > > > Maybe it is related with timeout, given we had big change in v4.17 
> > > > timeout code,
> > > > and we also fixed one scsi_mq timeout related issue recently, and the 
> > > > patch[1] has
> > > > been merged to v4.18 release already.
> > > 
> > > I tried with v4.18-rc4 (latest one packaged in debian experimental)
> > > and after 3 runs, 2 were fine (27 sec), but the last one was over a
> > > minute.
> > 
> > Is it possible that this is fixed by "block: really disable runtime-pm
> > for blk-mq"?
> > 
> 
> As I mentioned, the similar issue can be triggered in both blk-mq and
> non-blk-mq, so it shouldn't be related with runtime PM.
> 
> We need our UAS guys to take a look at this issue.

I saw that you found a problem with UAS in both modes, but Ricardo's
problem seemed to be specific to blk-mq.

Ben.

-- 
Ben Hutchings
Quantity is no substitute for quality, but it's the only one we've got.




signature.asc
Description: This is a digitally signed message part


Re: blk-mq 5-8 times slower for bmap-tools

2018-08-22 Thread Ming Lei
On Wed, Aug 22, 2018 at 08:22:00PM +0100, Ben Hutchings wrote:
> On Mon, 2018-08-20 at 11:04 +0200, Ricardo Ribalda Delgado wrote:
> > Hello Ming
> > On Mon, Aug 20, 2018 at 10:30 AM Ming Lei  wrote:
> [...]
> > > One problem found from your iostat log is that looks there is ~30sec
> > > idle period between IO activities when blk-mq is enabled.
> > 
> > During all the test the LED on the device was blinking.
> > 
> > But a closer look to dmesg reveals a lot of this messages:
> > 
> > [  196.929811] sd 6:0:0:0: [sdb] tag#3 data cmplt err -71 uas-tag 1
> > inflight: CMD
> > [  196.929822] sd 6:0:0:0: [sdb] tag#3 CDB: Write(10) 2a 00 00 04 00
> > 68 00 04 00 00
> > [  227.764379] sd 6:0:0:0: [sdb] tag#4 uas_eh_abort_handler 0 uas-tag
> > 2 inflight: CMD OUT
> > [  227.764389] sd 6:0:0:0: [sdb] tag#4 CDB: Write(10) 2a 00 00 04 04
> > 68 00 04 00 00
> > [  227.766555] sd 6:0:0:0: [sdb] tag#3 uas_eh_abort_handler 0 uas-tag
> > 1 inflight: CMD
> > [  227.766562] sd 6:0:0:0: [sdb] tag#3 CDB: Write(10) 2a 00 00 04 00
> > 68 00 04 00 00
> > [  227.784312] scsi host6: uas_eh_device_reset_handler start
> > [  227.913672] usb 2-2: reset SuperSpeed USB device number 3 using xhci_hcd
> > [  227.944842] scsi host6: uas_eh_device_reset_handler success
> > [  231.416133] sd 6:0:0:0: [sdb] tag#1 data cmplt err -71 uas-tag 10
> > inflight: CMD
> > [  231.416147] sd 6:0:0:0: [sdb] tag#1 CDB: Write(10) 2a 00 00 06 d5
> > e8 00 04 00 00
> > 
> > And they take around 30 secons (227-196)
> > 
> > 
> > Maybe this is a hw issue? I will bring the reader home tonight and see
> > if I can replicate the bug with my notebook
> > 
> > > 
> > > Maybe it is related with timeout, given we had big change in v4.17 
> > > timeout code,
> > > and we also fixed one scsi_mq timeout related issue recently, and the 
> > > patch[1] has
> > > been merged to v4.18 release already.
> > 
> > I tried with v4.18-rc4 (latest one packaged in debian experimental)
> > and after 3 runs, 2 were fine (27 sec), but the last one was over a
> > minute.
> 
> Is it possible that this is fixed by "block: really disable runtime-pm
> for blk-mq"?
> 

As I mentioned, the similar issue can be triggered in both blk-mq and
non-blk-mq, so it shouldn't be related with runtime PM.

We need our UAS guys to take a look at this issue.

Thanks,
Ming


Re: blk-mq 5-8 times slower for bmap-tools

2018-08-22 Thread Ben Hutchings
On Mon, 2018-08-20 at 11:04 +0200, Ricardo Ribalda Delgado wrote:
> Hello Ming
> On Mon, Aug 20, 2018 at 10:30 AM Ming Lei  wrote:
[...]
> > One problem found from your iostat log is that looks there is ~30sec
> > idle period between IO activities when blk-mq is enabled.
> 
> During all the test the LED on the device was blinking.
> 
> But a closer look to dmesg reveals a lot of this messages:
> 
> [  196.929811] sd 6:0:0:0: [sdb] tag#3 data cmplt err -71 uas-tag 1
> inflight: CMD
> [  196.929822] sd 6:0:0:0: [sdb] tag#3 CDB: Write(10) 2a 00 00 04 00
> 68 00 04 00 00
> [  227.764379] sd 6:0:0:0: [sdb] tag#4 uas_eh_abort_handler 0 uas-tag
> 2 inflight: CMD OUT
> [  227.764389] sd 6:0:0:0: [sdb] tag#4 CDB: Write(10) 2a 00 00 04 04
> 68 00 04 00 00
> [  227.766555] sd 6:0:0:0: [sdb] tag#3 uas_eh_abort_handler 0 uas-tag
> 1 inflight: CMD
> [  227.766562] sd 6:0:0:0: [sdb] tag#3 CDB: Write(10) 2a 00 00 04 00
> 68 00 04 00 00
> [  227.784312] scsi host6: uas_eh_device_reset_handler start
> [  227.913672] usb 2-2: reset SuperSpeed USB device number 3 using xhci_hcd
> [  227.944842] scsi host6: uas_eh_device_reset_handler success
> [  231.416133] sd 6:0:0:0: [sdb] tag#1 data cmplt err -71 uas-tag 10
> inflight: CMD
> [  231.416147] sd 6:0:0:0: [sdb] tag#1 CDB: Write(10) 2a 00 00 06 d5
> e8 00 04 00 00
> 
> And they take around 30 secons (227-196)
> 
> 
> Maybe this is a hw issue? I will bring the reader home tonight and see
> if I can replicate the bug with my notebook
> 
> > 
> > Maybe it is related with timeout, given we had big change in v4.17 timeout 
> > code,
> > and we also fixed one scsi_mq timeout related issue recently, and the 
> > patch[1] has
> > been merged to v4.18 release already.
> 
> I tried with v4.18-rc4 (latest one packaged in debian experimental)
> and after 3 runs, 2 were fine (27 sec), but the last one was over a
> minute.

Is it possible that this is fixed by "block: really disable runtime-pm
for blk-mq"?

That fix went into Debian's version 4.17.14-1, but is not yet in any
4.18-based package.

Ben.

-- 
Ben Hutchings
You can't have everything.  Where would you put it?



signature.asc
Description: This is a digitally signed message part


Re: blk-mq 5-8 times slower for bmap-tools

2018-08-20 Thread Ricardo Ribalda Delgado
Hi
On Mon, Aug 20, 2018 at 11:55 AM Ming Lei  wrote:
>
> On Mon, Aug 20, 2018 at 11:04:08AM +0200, Ricardo Ribalda Delgado wrote:
> > Hello Ming
> > On Mon, Aug 20, 2018 at 10:30 AM Ming Lei  wrote:
> > >
> > > On Mon, Aug 20, 2018 at 09:39:45AM +0200, Ricardo Ribalda Delgado wrote:
> > > > Some measurements:
> > > >
> > > > Please note that even when iostat shows 0.0 the LED on the device was
> > > > blinking as if there was some activity going on.
> > > >
> > > > Thanks!
> > > >
> > > > KERNEL:
> > > > Linux neopili 4.17.0-1-amd64 #1 SMP Debian 4.17.8-1 (2018-07-20)
> > > > x86_64 GNU/Linux
> > > >
> > > > DISK:
> > > > [247530.712686] usb 2-2: new SuperSpeed USB device number 3 using 
> > > > xhci_hcd
> > > > [247530.738906] usb 2-2: New USB device found, idVendor=174c,
> > > > idProduct=5136, bcdDevice= 1.00
> > > > [247530.738913] usb 2-2: New USB device strings: Mfr=2, Product=3,
> > > > SerialNumber=1
> > > > [247530.738918] usb 2-2: Product: 91686
> > > > [247530.738921] usb 2-2: Manufacturer: Delock
> > > > [247530.738925] usb 2-2: SerialNumber: 30003B45
> > > > [247530.750709] scsi host6: uas
> > > > [247530.751377] scsi 6:0:0:0: Direct-Access ASMT 2105
> > > >0PQ: 0 ANSI: 6
> > > > [247530.752174] sd 6:0:0:0: Attached scsi generic sg2 type 0
> > > > [247530.752616] sd 6:0:0:0: [sdb] 31277232 512-byte logical blocks:
> > > > (16.0 GB/14.9 GiB)
> > > > [247530.752740] sd 6:0:0:0: [sdb] Write Protect is off
> > > > [247530.752742] sd 6:0:0:0: [sdb] Mode Sense: 43 00 00 00
> > > > [247530.752905] sd 6:0:0:0: [sdb] Write cache: enabled, read cache:
> > > > enabled, doesn't support DPO or FUA
> > > > [247530.755912]  sdb: sdb1 sdb2
> > > > [247530.756832] sd 6:0:0:0: [sdb] Attached SCSI disk
> > > >
> > > >
> > > > TL/DR:
> > > > CFQ: 21.7
> > > > NOOP: 21.7
> > > > DEADLINE: 21.7
> > > > MQ-DEADLINE: 175.2
> > > > NONE: 236.4
> > > > KYBER: 174.9
> > >
> > > Thanks for your test!
> > >
> > > One problem found from your iostat log is that looks there is ~30sec
> > > idle period between IO activities when blk-mq is enabled.
> >
> > During all the test the LED on the device was blinking.
> >
> > But a closer look to dmesg reveals a lot of this messages:
> >
> > [  196.929811] sd 6:0:0:0: [sdb] tag#3 data cmplt err -71 uas-tag 1
> > inflight: CMD
>
> -71 is -EPROTO, which is returned from xhci driver, usually it means
> there is error from the USB device side.
>
> > [  196.929822] sd 6:0:0:0: [sdb] tag#3 CDB: Write(10) 2a 00 00 04 00
> > 68 00 04 00 00
> > [  227.764379] sd 6:0:0:0: [sdb] tag#4 uas_eh_abort_handler 0 uas-tag
> > 2 inflight: CMD OUT
> > [  227.764389] sd 6:0:0:0: [sdb] tag#4 CDB: Write(10) 2a 00 00 04 04
> > 68 00 04 00 00
> > [  227.766555] sd 6:0:0:0: [sdb] tag#3 uas_eh_abort_handler 0 uas-tag
> > 1 inflight: CMD
> > [  227.766562] sd 6:0:0:0: [sdb] tag#3 CDB: Write(10) 2a 00 00 04 00
> > 68 00 04 00 00
> > [  227.784312] scsi host6: uas_eh_device_reset_handler start
> > [  227.913672] usb 2-2: reset SuperSpeed USB device number 3 using xhci_hcd
> > [  227.944842] scsi host6: uas_eh_device_reset_handler success
> > [  231.416133] sd 6:0:0:0: [sdb] tag#1 data cmplt err -71 uas-tag 10
> > inflight: CMD
> > [  231.416147] sd 6:0:0:0: [sdb] tag#1 CDB: Write(10) 2a 00 00 06 d5
> > e8 00 04 00 00
> >
> > And they take around 30 secons (227-196)
> >
> >
> > Maybe this is a hw issue? I will bring the reader home tonight and see
> > if I can replicate the bug with my notebook
>
> It might be.
>
> Is there such message in case of non-blk-mq?

No, only with blk-mq

Thanks!
>
> thanks,
> Ming



-- 
Ricardo Ribalda


Re: blk-mq 5-8 times slower for bmap-tools

2018-08-20 Thread Ming Lei
On Mon, Aug 20, 2018 at 11:04:08AM +0200, Ricardo Ribalda Delgado wrote:
> Hello Ming
> On Mon, Aug 20, 2018 at 10:30 AM Ming Lei  wrote:
> >
> > On Mon, Aug 20, 2018 at 09:39:45AM +0200, Ricardo Ribalda Delgado wrote:
> > > Some measurements:
> > >
> > > Please note that even when iostat shows 0.0 the LED on the device was
> > > blinking as if there was some activity going on.
> > >
> > > Thanks!
> > >
> > > KERNEL:
> > > Linux neopili 4.17.0-1-amd64 #1 SMP Debian 4.17.8-1 (2018-07-20)
> > > x86_64 GNU/Linux
> > >
> > > DISK:
> > > [247530.712686] usb 2-2: new SuperSpeed USB device number 3 using xhci_hcd
> > > [247530.738906] usb 2-2: New USB device found, idVendor=174c,
> > > idProduct=5136, bcdDevice= 1.00
> > > [247530.738913] usb 2-2: New USB device strings: Mfr=2, Product=3,
> > > SerialNumber=1
> > > [247530.738918] usb 2-2: Product: 91686
> > > [247530.738921] usb 2-2: Manufacturer: Delock
> > > [247530.738925] usb 2-2: SerialNumber: 30003B45
> > > [247530.750709] scsi host6: uas
> > > [247530.751377] scsi 6:0:0:0: Direct-Access ASMT 2105
> > >0PQ: 0 ANSI: 6
> > > [247530.752174] sd 6:0:0:0: Attached scsi generic sg2 type 0
> > > [247530.752616] sd 6:0:0:0: [sdb] 31277232 512-byte logical blocks:
> > > (16.0 GB/14.9 GiB)
> > > [247530.752740] sd 6:0:0:0: [sdb] Write Protect is off
> > > [247530.752742] sd 6:0:0:0: [sdb] Mode Sense: 43 00 00 00
> > > [247530.752905] sd 6:0:0:0: [sdb] Write cache: enabled, read cache:
> > > enabled, doesn't support DPO or FUA
> > > [247530.755912]  sdb: sdb1 sdb2
> > > [247530.756832] sd 6:0:0:0: [sdb] Attached SCSI disk
> > >
> > >
> > > TL/DR:
> > > CFQ: 21.7
> > > NOOP: 21.7
> > > DEADLINE: 21.7
> > > MQ-DEADLINE: 175.2
> > > NONE: 236.4
> > > KYBER: 174.9
> >
> > Thanks for your test!
> >
> > One problem found from your iostat log is that looks there is ~30sec
> > idle period between IO activities when blk-mq is enabled.
> 
> During all the test the LED on the device was blinking.
> 
> But a closer look to dmesg reveals a lot of this messages:
> 
> [  196.929811] sd 6:0:0:0: [sdb] tag#3 data cmplt err -71 uas-tag 1
> inflight: CMD

-71 is -EPROTO, which is returned from xhci driver, usually it means
there is error from the USB device side.

> [  196.929822] sd 6:0:0:0: [sdb] tag#3 CDB: Write(10) 2a 00 00 04 00
> 68 00 04 00 00
> [  227.764379] sd 6:0:0:0: [sdb] tag#4 uas_eh_abort_handler 0 uas-tag
> 2 inflight: CMD OUT
> [  227.764389] sd 6:0:0:0: [sdb] tag#4 CDB: Write(10) 2a 00 00 04 04
> 68 00 04 00 00
> [  227.766555] sd 6:0:0:0: [sdb] tag#3 uas_eh_abort_handler 0 uas-tag
> 1 inflight: CMD
> [  227.766562] sd 6:0:0:0: [sdb] tag#3 CDB: Write(10) 2a 00 00 04 00
> 68 00 04 00 00
> [  227.784312] scsi host6: uas_eh_device_reset_handler start
> [  227.913672] usb 2-2: reset SuperSpeed USB device number 3 using xhci_hcd
> [  227.944842] scsi host6: uas_eh_device_reset_handler success
> [  231.416133] sd 6:0:0:0: [sdb] tag#1 data cmplt err -71 uas-tag 10
> inflight: CMD
> [  231.416147] sd 6:0:0:0: [sdb] tag#1 CDB: Write(10) 2a 00 00 06 d5
> e8 00 04 00 00
> 
> And they take around 30 secons (227-196)
> 
> 
> Maybe this is a hw issue? I will bring the reader home tonight and see
> if I can replicate the bug with my notebook

It might be.

Is there such message in case of non-blk-mq?

thanks,
Ming


Re: blk-mq 5-8 times slower for bmap-tools

2018-08-20 Thread Ricardo Ribalda Delgado
Hello Ming
On Mon, Aug 20, 2018 at 10:30 AM Ming Lei  wrote:
>
> On Mon, Aug 20, 2018 at 09:39:45AM +0200, Ricardo Ribalda Delgado wrote:
> > Some measurements:
> >
> > Please note that even when iostat shows 0.0 the LED on the device was
> > blinking as if there was some activity going on.
> >
> > Thanks!
> >
> > KERNEL:
> > Linux neopili 4.17.0-1-amd64 #1 SMP Debian 4.17.8-1 (2018-07-20)
> > x86_64 GNU/Linux
> >
> > DISK:
> > [247530.712686] usb 2-2: new SuperSpeed USB device number 3 using xhci_hcd
> > [247530.738906] usb 2-2: New USB device found, idVendor=174c,
> > idProduct=5136, bcdDevice= 1.00
> > [247530.738913] usb 2-2: New USB device strings: Mfr=2, Product=3,
> > SerialNumber=1
> > [247530.738918] usb 2-2: Product: 91686
> > [247530.738921] usb 2-2: Manufacturer: Delock
> > [247530.738925] usb 2-2: SerialNumber: 30003B45
> > [247530.750709] scsi host6: uas
> > [247530.751377] scsi 6:0:0:0: Direct-Access ASMT 2105
> >0PQ: 0 ANSI: 6
> > [247530.752174] sd 6:0:0:0: Attached scsi generic sg2 type 0
> > [247530.752616] sd 6:0:0:0: [sdb] 31277232 512-byte logical blocks:
> > (16.0 GB/14.9 GiB)
> > [247530.752740] sd 6:0:0:0: [sdb] Write Protect is off
> > [247530.752742] sd 6:0:0:0: [sdb] Mode Sense: 43 00 00 00
> > [247530.752905] sd 6:0:0:0: [sdb] Write cache: enabled, read cache:
> > enabled, doesn't support DPO or FUA
> > [247530.755912]  sdb: sdb1 sdb2
> > [247530.756832] sd 6:0:0:0: [sdb] Attached SCSI disk
> >
> >
> > TL/DR:
> > CFQ: 21.7
> > NOOP: 21.7
> > DEADLINE: 21.7
> > MQ-DEADLINE: 175.2
> > NONE: 236.4
> > KYBER: 174.9
>
> Thanks for your test!
>
> One problem found from your iostat log is that looks there is ~30sec
> idle period between IO activities when blk-mq is enabled.

During all the test the LED on the device was blinking.

But a closer look to dmesg reveals a lot of this messages:

[  196.929811] sd 6:0:0:0: [sdb] tag#3 data cmplt err -71 uas-tag 1
inflight: CMD
[  196.929822] sd 6:0:0:0: [sdb] tag#3 CDB: Write(10) 2a 00 00 04 00
68 00 04 00 00
[  227.764379] sd 6:0:0:0: [sdb] tag#4 uas_eh_abort_handler 0 uas-tag
2 inflight: CMD OUT
[  227.764389] sd 6:0:0:0: [sdb] tag#4 CDB: Write(10) 2a 00 00 04 04
68 00 04 00 00
[  227.766555] sd 6:0:0:0: [sdb] tag#3 uas_eh_abort_handler 0 uas-tag
1 inflight: CMD
[  227.766562] sd 6:0:0:0: [sdb] tag#3 CDB: Write(10) 2a 00 00 04 00
68 00 04 00 00
[  227.784312] scsi host6: uas_eh_device_reset_handler start
[  227.913672] usb 2-2: reset SuperSpeed USB device number 3 using xhci_hcd
[  227.944842] scsi host6: uas_eh_device_reset_handler success
[  231.416133] sd 6:0:0:0: [sdb] tag#1 data cmplt err -71 uas-tag 10
inflight: CMD
[  231.416147] sd 6:0:0:0: [sdb] tag#1 CDB: Write(10) 2a 00 00 06 d5
e8 00 04 00 00

And they take around 30 secons (227-196)


Maybe this is a hw issue? I will bring the reader home tonight and see
if I can replicate the bug with my notebook

>
> Maybe it is related with timeout, given we had big change in v4.17 timeout 
> code,
> and we also fixed one scsi_mq timeout related issue recently, and the 
> patch[1] has
> been merged to v4.18 release already.

I tried with v4.18-rc4 (latest one packaged in debian experimental)
and after 3 runs, 2 were fine (27 sec), but the last one was over a
minute.



Thanks!



>
> [1] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2d5ba0e2de24ec87636244a01d4e78d095cc1b20
>
> So could you test v4.18 and see if there is such issue?
>
> BTW, I have run the same test on usb-storage, and can't reproduce it.
> However you are using UAS, and I don't have it at hand, so can't
> reproduce it completely.
>
> Thanks,
> Ming



--
Ricardo Ribalda


Re: blk-mq 5-8 times slower for bmap-tools

2018-08-20 Thread Ming Lei
On Mon, Aug 20, 2018 at 09:39:45AM +0200, Ricardo Ribalda Delgado wrote:
> Some measurements:
> 
> Please note that even when iostat shows 0.0 the LED on the device was
> blinking as if there was some activity going on.
> 
> Thanks!
> 
> KERNEL:
> Linux neopili 4.17.0-1-amd64 #1 SMP Debian 4.17.8-1 (2018-07-20)
> x86_64 GNU/Linux
> 
> DISK:
> [247530.712686] usb 2-2: new SuperSpeed USB device number 3 using xhci_hcd
> [247530.738906] usb 2-2: New USB device found, idVendor=174c,
> idProduct=5136, bcdDevice= 1.00
> [247530.738913] usb 2-2: New USB device strings: Mfr=2, Product=3,
> SerialNumber=1
> [247530.738918] usb 2-2: Product: 91686
> [247530.738921] usb 2-2: Manufacturer: Delock
> [247530.738925] usb 2-2: SerialNumber: 30003B45
> [247530.750709] scsi host6: uas
> [247530.751377] scsi 6:0:0:0: Direct-Access ASMT 2105
>0PQ: 0 ANSI: 6
> [247530.752174] sd 6:0:0:0: Attached scsi generic sg2 type 0
> [247530.752616] sd 6:0:0:0: [sdb] 31277232 512-byte logical blocks:
> (16.0 GB/14.9 GiB)
> [247530.752740] sd 6:0:0:0: [sdb] Write Protect is off
> [247530.752742] sd 6:0:0:0: [sdb] Mode Sense: 43 00 00 00
> [247530.752905] sd 6:0:0:0: [sdb] Write cache: enabled, read cache:
> enabled, doesn't support DPO or FUA
> [247530.755912]  sdb: sdb1 sdb2
> [247530.756832] sd 6:0:0:0: [sdb] Attached SCSI disk
> 
> 
> TL/DR:
> CFQ: 21.7
> NOOP: 21.7
> DEADLINE: 21.7
> MQ-DEADLINE: 175.2
> NONE: 236.4
> KYBER: 174.9

Thanks for your test!

One problem found from your iostat log is that looks there is ~30sec
idle period between IO activities when blk-mq is enabled.

Maybe it is related with timeout, given we had big change in v4.17 timeout code,
and we also fixed one scsi_mq timeout related issue recently, and the patch[1] 
has
been merged to v4.18 release already.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2d5ba0e2de24ec87636244a01d4e78d095cc1b20

So could you test v4.18 and see if there is such issue?

BTW, I have run the same test on usb-storage, and can't reproduce it.
However you are using UAS, and I don't have it at hand, so can't
reproduce it completely.

Thanks,
Ming


Re: blk-mq 5-8 times slower for bmap-tools

2018-08-17 Thread Ming Lei
On Sat, Aug 18, 2018 at 1:32 AM Ricardo Ribalda Delgado
 wrote:
>
> Hello
>
> Coming from: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=906328
>
> bmaptools is kind of a "smart dd" tool, that lets you write images
> very fast and securely. Since the last Debian Kernel update it has
> become 5-8 times slower.
>
> After some debugging, we have figured out that the reason for that
> slowness is the Multi-Queue Block IO Queueing Mechanism.
>
> Debian maintainer has pointed out that in the near future the single
> queue will be deprecated.
>
> My question is if we can get a similar perfomance for bmaptools with
> blk-mq and how?

It suppose to be not worse than non-blk-mq, and maybe there is bug
somewhere.

Could you use 'iostat -dx $DISK_PATH 1' to collect io stat when running
bmaptools in both blk-mq and non-blk-mq?

BTW, don't change the default io scheduler as none, which shouldn't
work well for slow disk, such as non-SSD.

Thanks,
Ming Lei


Re: blk-mq 5-8 times slower for bmap-tools

2018-08-17 Thread Jens Axboe
On 8/17/18 11:49 AM, Ricardo Ribalda Delgado wrote:
> Hi Jens
> On Fri, Aug 17, 2018 at 7:41 PM Jens Axboe  wrote:
>>
>> On 8/17/18 11:39 AM, Ricardo Ribalda Delgado wrote:
>>> Hello Paolo
>>> On Fri, Aug 17, 2018 at 7:35 PM Paolo Valente  
>>> wrote:



> Il giorno 17 ago 2018, alle ore 19:31, Ricardo Ribalda Delgado 
>  ha scritto:
>
> Hello
>
> Coming from: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=906328
>
> bmaptools is kind of a "smart dd" tool, that lets you write images
> very fast and securely. Since the last Debian Kernel update it has
> become 5-8 times slower.
>
> After some debugging, we have figured out that the reason for that
> slowness is the Multi-Queue Block IO Queueing Mechanism.
>
> Debian maintainer has pointed out that in the near future the single
> queue will be deprecated.
>
> My question is if we can get a similar perfomance for bmaptools with
> blk-mq and how?
>

 Have you also checked what happens after switching to a different I/O
 scheduler for the involved drive (among none, mq-deadline, bfq and
 kyber)?
>>>
>>> I have only tried mq-deadline and none (because they are enabled by
>>> default on Debian). Both produce results in the same range: 5-8 times
>>> slower.
>>>
>>> I could easily enable kyber:
>>> cat /boot/config-4.17.0-1-amd64  | grep CONFIG_MQ_IOSCHED
>>> CONFIG_MQ_IOSCHED_DEADLINE=y
>>> CONFIG_MQ_IOSCHED_KYBER=m
>>>
>>> But I left the card reader on the office, so any test would have to
>>> wait until monday sorry :(
>>
>> Can someone describe what bmaptools does? IOW, how is it different than
>> dd? Does it use multiple threads for both reads and writes?
> 
> https://github.com/intel/bmap-tools/blob/master/bmaptools/BmapCopy.py
> 
> I am not an author, just a user. But from the code it looks like
> 
> 1) sets the io scheduler to noop
> 2) Copy the file in chuncks of block_size (4096 by default)
> 3) fflush
> 4) restore io scheduler

That seems like a very odd tool... Why would you switch to noop for
a file-to-dev copy?!

In any case, I ran it on a USB stick here.

With blk-mq: bmaptool: info: copying time: 9.5s, copying speed 27.0 MiB/sec
Without blk-mq: bmaptool: info: copying time: 9.5s, copying speed 26.8 MiB/sec

which seems identical to me, maybe a slight edge to blk-mq.

For both cases, I used --nobmap since the tool complained:

bmaptool: ERROR: bmap file not found, please, use --nobmap option to flash 
without bmap.

I used a 256MB file full of random data. I then ran it with a 1G file, and
the results were:

With blk-mq: bmaptool: info: copying time: 36.8s, copying speed 27.8 MiB/sec
Without blk-mq: bmaptool: info: copying time: 38.6s, copying speed 26.5 MiB/sec

which also doesn't seem slower, quite the contrary.

-- 
Jens Axboe



Re: blk-mq 5-8 times slower for bmap-tools

2018-08-17 Thread Ricardo Ribalda Delgado
Hi Paolo

On Fri, Aug 17, 2018 at 7:42 PM Paolo Valente  wrote:

>
> Try bfq too, as it is probably the one containing also the most old
> way heuristics.  If bfq fails too, we will at least know that the issue
> is out of the schedulers. Or in every scheduler! :)

I believe that for this specific setup: A disk reader with a cfast via
USB3, the best approach might be to have a single queue and respect
the tool sequence. But definitely the io layer is out of my expertise
;)

Monday morning will performe the tests.

Thanks again and best regards!


-- 
Ricardo Ribalda


Re: blk-mq 5-8 times slower for bmap-tools

2018-08-17 Thread Ricardo Ribalda Delgado
Hi Jens
On Fri, Aug 17, 2018 at 7:41 PM Jens Axboe  wrote:
>
> On 8/17/18 11:39 AM, Ricardo Ribalda Delgado wrote:
> > Hello Paolo
> > On Fri, Aug 17, 2018 at 7:35 PM Paolo Valente  
> > wrote:
> >>
> >>
> >>
> >>> Il giorno 17 ago 2018, alle ore 19:31, Ricardo Ribalda Delgado 
> >>>  ha scritto:
> >>>
> >>> Hello
> >>>
> >>> Coming from: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=906328
> >>>
> >>> bmaptools is kind of a "smart dd" tool, that lets you write images
> >>> very fast and securely. Since the last Debian Kernel update it has
> >>> become 5-8 times slower.
> >>>
> >>> After some debugging, we have figured out that the reason for that
> >>> slowness is the Multi-Queue Block IO Queueing Mechanism.
> >>>
> >>> Debian maintainer has pointed out that in the near future the single
> >>> queue will be deprecated.
> >>>
> >>> My question is if we can get a similar perfomance for bmaptools with
> >>> blk-mq and how?
> >>>
> >>
> >> Have you also checked what happens after switching to a different I/O
> >> scheduler for the involved drive (among none, mq-deadline, bfq and
> >> kyber)?
> >
> > I have only tried mq-deadline and none (because they are enabled by
> > default on Debian). Both produce results in the same range: 5-8 times
> > slower.
> >
> > I could easily enable kyber:
> > cat /boot/config-4.17.0-1-amd64  | grep CONFIG_MQ_IOSCHED
> > CONFIG_MQ_IOSCHED_DEADLINE=y
> > CONFIG_MQ_IOSCHED_KYBER=m
> >
> > But I left the card reader on the office, so any test would have to
> > wait until monday sorry :(
>
> Can someone describe what bmaptools does? IOW, how is it different than
> dd? Does it use multiple threads for both reads and writes?

https://github.com/intel/bmap-tools/blob/master/bmaptools/BmapCopy.py

I am not an author, just a user. But from the code it looks like

1) sets the io scheduler to noop
2) Copy the file in chuncks of block_size (4096 by default)
3) fflush
4) restore io scheduler


>
> --
> Jens Axboe
>


-- 
Ricardo Ribalda


Re: blk-mq 5-8 times slower for bmap-tools

2018-08-17 Thread Paolo Valente



> Il giorno 17 ago 2018, alle ore 19:39, Ricardo Ribalda Delgado 
>  ha scritto:
> 
> Hello Paolo
> On Fri, Aug 17, 2018 at 7:35 PM Paolo Valente  
> wrote:
>> 
>> 
>> 
>>> Il giorno 17 ago 2018, alle ore 19:31, Ricardo Ribalda Delgado 
>>>  ha scritto:
>>> 
>>> Hello
>>> 
>>> Coming from: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=906328
>>> 
>>> bmaptools is kind of a "smart dd" tool, that lets you write images
>>> very fast and securely. Since the last Debian Kernel update it has
>>> become 5-8 times slower.
>>> 
>>> After some debugging, we have figured out that the reason for that
>>> slowness is the Multi-Queue Block IO Queueing Mechanism.
>>> 
>>> Debian maintainer has pointed out that in the near future the single
>>> queue will be deprecated.
>>> 
>>> My question is if we can get a similar perfomance for bmaptools with
>>> blk-mq and how?
>>> 
>> 
>> Have you also checked what happens after switching to a different I/O
>> scheduler for the involved drive (among none, mq-deadline, bfq and
>> kyber)?
> 
> I have only tried mq-deadline and none (because they are enabled by
> default on Debian). Both produce results in the same range: 5-8 times
> slower.
> 
> I could easily enable kyber:
> cat /boot/config-4.17.0-1-amd64  | grep CONFIG_MQ_IOSCHED
> CONFIG_MQ_IOSCHED_DEADLINE=y
> CONFIG_MQ_IOSCHED_KYBER=m
> 
> But I left the card reader on the office, so any test would have to
> wait until monday sorry :(
> 

Try bfq too, as it is probably the one containing also the most old
way heuristics.  If bfq fails too, we will at least know that the issue
is out of the schedulers. Or in every scheduler! :)

Paolo

> 
>> 
>> Paolo
>> 
>>> Thanks!
>>> 
>>> --
>>> Ricardo Ribalda
>> 
> 
> 
> -- 
> Ricardo Ribalda



Re: blk-mq 5-8 times slower for bmap-tools

2018-08-17 Thread Jens Axboe
On 8/17/18 11:39 AM, Ricardo Ribalda Delgado wrote:
> Hello Paolo
> On Fri, Aug 17, 2018 at 7:35 PM Paolo Valente  
> wrote:
>>
>>
>>
>>> Il giorno 17 ago 2018, alle ore 19:31, Ricardo Ribalda Delgado 
>>>  ha scritto:
>>>
>>> Hello
>>>
>>> Coming from: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=906328
>>>
>>> bmaptools is kind of a "smart dd" tool, that lets you write images
>>> very fast and securely. Since the last Debian Kernel update it has
>>> become 5-8 times slower.
>>>
>>> After some debugging, we have figured out that the reason for that
>>> slowness is the Multi-Queue Block IO Queueing Mechanism.
>>>
>>> Debian maintainer has pointed out that in the near future the single
>>> queue will be deprecated.
>>>
>>> My question is if we can get a similar perfomance for bmaptools with
>>> blk-mq and how?
>>>
>>
>> Have you also checked what happens after switching to a different I/O
>> scheduler for the involved drive (among none, mq-deadline, bfq and
>> kyber)?
> 
> I have only tried mq-deadline and none (because they are enabled by
> default on Debian). Both produce results in the same range: 5-8 times
> slower.
> 
> I could easily enable kyber:
> cat /boot/config-4.17.0-1-amd64  | grep CONFIG_MQ_IOSCHED
> CONFIG_MQ_IOSCHED_DEADLINE=y
> CONFIG_MQ_IOSCHED_KYBER=m
> 
> But I left the card reader on the office, so any test would have to
> wait until monday sorry :(

Can someone describe what bmaptools does? IOW, how is it different than
dd? Does it use multiple threads for both reads and writes?

-- 
Jens Axboe



Re: blk-mq 5-8 times slower for bmap-tools

2018-08-17 Thread Ricardo Ribalda Delgado
Hello Paolo
On Fri, Aug 17, 2018 at 7:35 PM Paolo Valente  wrote:
>
>
>
> > Il giorno 17 ago 2018, alle ore 19:31, Ricardo Ribalda Delgado 
> >  ha scritto:
> >
> > Hello
> >
> > Coming from: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=906328
> >
> > bmaptools is kind of a "smart dd" tool, that lets you write images
> > very fast and securely. Since the last Debian Kernel update it has
> > become 5-8 times slower.
> >
> > After some debugging, we have figured out that the reason for that
> > slowness is the Multi-Queue Block IO Queueing Mechanism.
> >
> > Debian maintainer has pointed out that in the near future the single
> > queue will be deprecated.
> >
> > My question is if we can get a similar perfomance for bmaptools with
> > blk-mq and how?
> >
>
> Have you also checked what happens after switching to a different I/O
> scheduler for the involved drive (among none, mq-deadline, bfq and
> kyber)?

I have only tried mq-deadline and none (because they are enabled by
default on Debian). Both produce results in the same range: 5-8 times
slower.

I could easily enable kyber:
cat /boot/config-4.17.0-1-amd64  | grep CONFIG_MQ_IOSCHED
CONFIG_MQ_IOSCHED_DEADLINE=y
CONFIG_MQ_IOSCHED_KYBER=m

But I left the card reader on the office, so any test would have to
wait until monday sorry :(


>
> Paolo
>
> > Thanks!
> >
> > --
> > Ricardo Ribalda
>


-- 
Ricardo Ribalda


Re: blk-mq 5-8 times slower for bmap-tools

2018-08-17 Thread Paolo Valente



> Il giorno 17 ago 2018, alle ore 19:31, Ricardo Ribalda Delgado 
>  ha scritto:
> 
> Hello
> 
> Coming from: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=906328
> 
> bmaptools is kind of a "smart dd" tool, that lets you write images
> very fast and securely. Since the last Debian Kernel update it has
> become 5-8 times slower.
> 
> After some debugging, we have figured out that the reason for that
> slowness is the Multi-Queue Block IO Queueing Mechanism.
> 
> Debian maintainer has pointed out that in the near future the single
> queue will be deprecated.
> 
> My question is if we can get a similar perfomance for bmaptools with
> blk-mq and how?
> 

Have you also checked what happens after switching to a different I/O
scheduler for the involved drive (among none, mq-deadline, bfq and
kyber)?

Paolo

> Thanks!
> 
> -- 
> Ricardo Ribalda



blk-mq 5-8 times slower for bmap-tools

2018-08-17 Thread Ricardo Ribalda Delgado
Hello

Coming from: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=906328

bmaptools is kind of a "smart dd" tool, that lets you write images
very fast and securely. Since the last Debian Kernel update it has
become 5-8 times slower.

After some debugging, we have figured out that the reason for that
slowness is the Multi-Queue Block IO Queueing Mechanism.

Debian maintainer has pointed out that in the near future the single
queue will be deprecated.

My question is if we can get a similar perfomance for bmaptools with
blk-mq and how?

Thanks!

-- 
Ricardo Ribalda