Hi Himanshu & Co,
On Fri, 2016-02-12 at 00:48 -0800, Nicholas A. Bellinger wrote:
> On Fri, 2016-02-12 at 05:30 +0000, Himanshu Madhani wrote:
<SNIP>
> Thanks for the crash dump output.
>
> So it's a t_state = TRANSPORT_WRITE_PENDING descriptor with
> SAM_STAT_CHECK_CONDITION + cmd_kref.refcount = 0:
>
> struct qla_tgt_cmd {
> se_cmd = {
> scsi_status = 0x2
> se_cmd_flags = 0x80090d,
>
> <SNIP>
>
> cmd_kref = {
> refcount = {
> counter = 0x0
> }
> },
> }
>
> The se_cmd_flags=0x80090d translation to enum se_cmd_flags_table:
>
> - SCF_TRANSPORT_TASK_SENSE
> - SCF_EMULATED_TASK_SENSE
> - SCF_SCSI_DATA_CDB
> - SCF_SE_LUN_CMD
> - SCF_SENT_CHECK_CONDITION
> - SCF_USE_CPUID
>
After grokking your dump some more:
For SAM_STAT_CHECK_CONDITION with t_state = TRANSPORT_WRITE_PENDING, the
se_cmd->transport_state = 0x880 bits set are:
- CMD_T_DEV_ACTIVE
- CMD_T_FABRIC_STOP
and sense buffer = 0x70 00 0b 00 00 00 00 0a 00 00 00 00 29 03 00,
which is the following from sense_info_table[]:
[TCM_CHECK_CONDITION_ABORT_CMD] = {
.key = ABORTED_COMMAND,
.asc = 0x29, /* BUS DEVICE RESET FUNCTION OCCURRED */
.ascq = 0x03,
},
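For reference, both decodes above can be double-checked with a short standalone
sketch. This is not kernel code: the two transport_state bit values below are
taken from the 0x880 decode in this mail (CMD_T_DEV_ACTIVE, CMD_T_FABRIC_STOP),
not read out of a kernel header, and the sense parse follows the standard SPC
fixed-format layout (byte 2 low nibble = sense key, bytes 12/13 = ASC/ASCQ).

```python
# Sketch: decode the transport_state bits and the fixed-format sense
# buffer quoted above. Bit positions are assumptions matching the 0x880
# decode in this mail, not authoritative kernel values.
TRANSPORT_STATE_BITS = {
    1 << 7:  "CMD_T_DEV_ACTIVE",
    1 << 11: "CMD_T_FABRIC_STOP",
}

def decode_transport_state(state):
    return [name for bit, name in TRANSPORT_STATE_BITS.items() if state & bit]

def parse_fixed_sense(buf):
    # SPC fixed-format sense data: byte 0 = response code (0x70 = current),
    # byte 2 low nibble = sense key, bytes 12/13 = ASC/ASCQ.
    return {
        "response_code": buf[0] & 0x7F,
        "sense_key": buf[2] & 0x0F,
        "asc": buf[12],
        "ascq": buf[13],
    }

# The sense buffer from the crash dump:
sense = bytes.fromhex("70000b000000000a00000000290300")
print(decode_transport_state(0x880))  # ['CMD_T_DEV_ACTIVE', 'CMD_T_FABRIC_STOP']
print(parse_fixed_sense(sense))       # sense_key 0xb (ABORTED COMMAND), ASC/ASCQ 0x29/0x03
```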
The descriptor looks like it did make it to tcm_qla2xxx_complete_free()
-> transport_generic_free_cmd() with both qla_tgt_cmd->cmd_sent_to_fw=0
and qla_tgt_cmd->write_data_transferred=0.
As best I can tell, it looks like tcm_qla2xxx_handle_data_work() ->
transport_generic_request_failure() w/ TCM_CHECK_CONDITION_ABORT_CMD is
occurring.
So to confirm, this specific bug was not a result of active I/O
LUN_RESET w/ CMD_T_ABORTED during session disconnect, or otherwise.
>
> > I can recreate this issue at will within 5 minutes of triggering sg_reset
> > with the following steps:
> >
> > 1. Export 4 RAM disk LUNs on each port of a 2-port adapter. The initiator
> > will see 8 RAM disk targets.
> > 2. Start IO with 4K block size and 8 threads, 80% write / 20% read,
> > 100% random.
> > (I am using vdbench for generating IO. I can provide setup/config script
> > if needed)
> > 3. Start sg_reset for each LUN: first device, then bus, then host, with a
> > 120s delay. (I've attached the script that I am using for triggering
> > sg_reset.)
> >
>
> Thanks, will keep looking and try to reproduce with your script.
So here's my test setup with 3x Intel P3600 NVMe/IBLOCK backends, across
dual ISP2532 ports:
o- / ........................................................................ [...]
o- backstores ............................................................... [...]
| o- fileio ................................................... [0 Storage Object]
| o- iblock .................................................. [3 Storage Objects]
| | o- nvme0n1 ............................................ [/dev/nvme0n1, in use]
| | o- nvme1n1 ............................................ [/dev/nvme1n1, in use]
| | o- nvme2n1 ............................................ [/dev/nvme2n1, in use]
| o- pscsi .................................................... [0 Storage Object]
| o- rd_mcp ................................................... [1 Storage Object]
| o- ramdisk ...................................... [16.0G, ramdisk, not in use]
o- qla2xxx ............................................................ [2 Targets]
| o- 21:00:00:24:ff:48:97:7e ........................................... [enabled]
| | o- acls ............................................................. [1 ACL]
| | | o- 21:00:00:24:ff:48:97:7c ............................... [3 Mapped LUNs]
| | | o- mapped_lun0 ............................................... [lun0 (rw)]
| | | o- mapped_lun1 ............................................... [lun1 (rw)]
| | | o- mapped_lun2 ............................................... [lun2 (rw)]
| | o- luns ............................................................ [3 LUNs]
| | o- lun0 ........................................ [iblock/nvme0n1 (/dev/nvme0n1)]
| | o- lun1 ........................................ [iblock/nvme1n1 (/dev/nvme1n1)]
| | o- lun2 ........................................ [iblock/nvme2n1 (/dev/nvme2n1)]
| o- 21:00:00:24:ff:48:97:7f ........................................... [enabled]
| o- acls ............................................................... [1 ACL]
| | o- 21:00:00:24:ff:48:97:7d ................................. [3 Mapped LUNs]
| | o- mapped_lun0 ................................................. [lun0 (rw)]
| | o- mapped_lun1 ................................................. [lun1 (rw)]
| | o- mapped_lun2 ................................................. [lun2 (rw)]
| o- luns .............................................................. [3 LUNs]
| o- lun0 .......................................... [iblock/nvme0n1 (/dev/nvme0n1)]
| o- lun1 .......................................... [iblock/nvme1n1 (/dev/nvme1n1)]
| o- lun2 .......................................... [iblock/nvme2n1 (/dev/nvme2n1)]
Attached is the fio write-verify workload for reference.
Also, a few changes made to your test script:
- Use sg_reset -H (-h is help :) for host reset op
- Sleep 10 seconds between calls instead of 2 mins
- Limit sg_reset SCSI device list to 3x remote-ports
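For anyone reproducing this, here is a minimal sketch of the loop after those
changes. The /dev/sdb..sdd names are examples only; substitute the three
qla2xxx remote-port disks as seen on your initiator. As written it just prints
the plan; drop the leading echo on each line to actually fire the resets.

```shell
#!/bin/sh
# Sketch of the modified sg_reset loop: device, bus, then host reset
# (-H, since -h is help), 10s apart, limited to the 3 remote-port disks.
# /dev/sdb..sdd are example device names, not from the original script.
for dev in /dev/sdb /dev/sdc /dev/sdd; do
    for op in -d -b -H; do
        echo sg_reset "$op" "$dev"   # remove 'echo' to issue the reset
        echo sleep 10                # remove 'echo' to actually wait
    done
done
```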
The last one is to verify with various sg_reset ops across remote-ports
only, separate from any existing active I/O session disconnect bugs that
may exist beyond this specific PATCH-v4 series.
To that end, after verifying tonight with 100x iterations of your script
with the above changes, fio write-verify is still functioning as
expected with remote-port only sg_reset ops using NVMe/IBLOCK backends.
So that said, I'll be pushing what's in target-pending/master as -rc4
code, and continue to debug the hung task as a separate active I/O
shutdown related issue.
Thanks again for your help,
--nab
[global]
thread=1
blocksize_range=4k-256k
direct=1
ioengine=libaio
verify=crc32c-intel
verify_interval=512
iodepth=32
size=1000G
loops=100
numjobs=1
invalidate=0
filename=/dev/sdb
filename=/dev/sdc
filename=/dev/sdd
[verify]
rw=randrw
do_verify=1