On 2018-06-11 12:07 PM, Ted Cabeen wrote:
I'm seeing similar behavior on my system, but across multiple devices on a SAS drive array (front bays on a Supermicro-based system with onboard mpt3sas card). The Sense Key here doesn't show a medium error, and the multiple-drive behavior makes me think it's more likely either a controller or cable problem. Interestingly, the issue only shows up under heavy load (specifically a ZFS scrub).

During my next downtime window, I'm going to try to re-create the problem while capturing a blktrace.  Any other things to try at that time, or a filter-mask I should apply?

[Wed Jun  6 14:30:19 2018] blk_update_request: I/O error, dev sdn, sector 1757633640
[Wed Jun  6 14:37:10 2018] sd 15:0:5:0: unaligned partial completion avoided (xfer_cnt=3072, sector_sz=4096)
[Wed Jun  6 14:37:10 2018] sd 15:0:5:0: [sdr] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Wed Jun  6 14:37:10 2018] sd 15:0:5:0: [sdr] Sense Key : Aborted Command [current] [descriptor]
[Wed Jun  6 14:37:10 2018] sd 15:0:5:0: [sdr] Add. Sense: Nak received
[Wed Jun  6 14:37:10 2018] sd 15:0:5:0: [sdr] CDB: Read(10) 28 00 07 8a c1 ca 00 00 01 00
[Wed Jun  6 14:37:10 2018] blk_update_request: I/O error, dev sdr, sector 1012272720
[Wed Jun  6 15:20:43 2018] sd 15:0:8:0: unaligned partial completion avoided (xfer_cnt=52224, sector_sz=4096)
[Wed Jun  6 15:20:43 2018] sd 15:0:8:0: [sdu] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Wed Jun  6 15:20:43 2018] sd 15:0:8:0: [sdu] Sense Key : Aborted Command [current] [descriptor]
[Wed Jun  6 15:20:43 2018] sd 15:0:8:0: [sdu] Add. Sense: Nak received
[Wed Jun  6 15:20:43 2018] sd 15:0:8:0: [sdu] CDB: Read(10) 28 00 12 ab dc 52 00 00 19 00
[Wed Jun  6 15:20:43 2018] blk_update_request: I/O error, dev sdu, sector 2506023568
[Wed Jun  6 15:46:20 2018] sd 15:0:2:0: unaligned partial completion avoided (xfer_cnt=11264, sector_sz=4096)
[Wed Jun  6 15:46:20 2018] sd 15:0:2:0: [sdo] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Wed Jun  6 15:46:20 2018] sd 15:0:2:0: [sdo] Sense Key : Aborted Command [current] [descriptor]
[Wed Jun  6 15:46:20 2018] sd 15:0:2:0: [sdo] Add. Sense: Nak received
[Wed Jun  6 15:46:20 2018] sd 15:0:2:0: [sdo] CDB: Read(10) 28 00 40 a8 ef b5 00 00 03 00
[Wed Jun  6 15:46:20 2018] blk_update_request: I/O error, dev sdo, sector 8678505896


I have also seen Aborted Command sense when doing heavy testing on one or
more SAS disks behind a SAS expander. I put it down to a temporary lack of
paths available (on the link between the host's HBA and the expander)
when one of those SAS disks tries to get a connection back to the host
with the data (data-in transfer) from an earlier READ command.

In my code (ddpt and sg_dd) I treat it as a "retry" type error and in
my experience that works. IOW a follow-up READ with the same parameters
is successful.
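
To illustrate the "retry" idea, here is a rough Python sketch. It is not the ddpt or sg_dd code (those are written in C), and the names ScsiReadError, do_read and read_with_retry are hypothetical, made up for this example. It assumes the caller has some function that issues the READ and raises an exception carrying the raw sense buffer on a check condition. The only SCSI details relied on are the sense data layout (response codes 0x70/0x71 for fixed format, 0x72/0x73 for descriptor format) and the Aborted Command sense key value (0x0B).

```python
from typing import Callable, Optional

ABORTED_COMMAND = 0x0B  # SCSI sense key value for "Aborted Command"


class ScsiReadError(Exception):
    """Hypothetical exception carrying the raw sense buffer of a failed READ."""

    def __init__(self, sense: bytes):
        super().__init__("SCSI READ failed")
        self.sense = sense


def sense_key(sense: bytes) -> Optional[int]:
    """Extract the sense key from fixed or descriptor format sense data."""
    if not sense:
        return None
    response_code = sense[0] & 0x7F
    if response_code in (0x72, 0x73):  # descriptor format: key in byte 1
        return sense[1] & 0x0F if len(sense) > 1 else None
    if response_code in (0x70, 0x71):  # fixed format: key in byte 2
        return sense[2] & 0x0F if len(sense) > 2 else None
    return None


def read_with_retry(
    do_read: Callable[[int, int], bytes],
    lba: int,
    num_blocks: int,
    max_retries: int = 3,
) -> bytes:
    """Re-issue the same READ when it fails with Aborted Command.

    do_read is an assumed callable that performs the READ and raises
    ScsiReadError on a check condition. Any other sense key (for example
    Medium Error) is passed straight back to the caller.
    """
    for attempt in range(max_retries + 1):
        try:
            return do_read(lba, num_blocks)
        except ScsiReadError as err:
            if sense_key(err.sense) != ABORTED_COMMAND or attempt == max_retries:
                raise
    raise AssertionError("unreachable")
```

The retry count of 3 is an arbitrary choice for the sketch. The point is only that the follow-up READ uses the same LBA and block count, and that other sense keys are not retried.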

Doug Gilbert
