Hello Lee,

I saw your reply on the Google Groups list
(https://goo.gl/x8LhFm). I hadn't responded earlier because I was
subscribed only to linux-scsi@, my bad.

Yes, it worked as expected.

From your question:

> Rafael:

> Did you test this change, i.e. shutdowns no longer hang, under test
> circumstances, with this change?

Yes, my starting point was:

https://pastebin.ubuntu.com/26292711/

And the tests during development:

https://pastebin.ubuntu.com/26292701/
https://pastebin.ubuntu.com/26292702/

And finally, with the submitted patch, the expected behavior:

https://pastebin.ubuntu.com/26292706/ -> just 1 session
https://pastebin.ubuntu.com/26292708/ -> multiple sessions

Note:

[   78.427670]  session6: iscsi_eh_cmd_timed_out scsi cmd ffff88b2ef499160 timedout
[   78.427671]  session6: iscsi_eh_cmd_timed_out sc on shutdown, handled
[   78.427671]  session6: iscsi_eh_cmd_timed_out return shutdown or nh

[   78.437637]  session7: iscsi_eh_cmd_timed_out scsi cmd ffff88b2f161c160 timedout
[   78.438366]  session7: iscsi_eh_cmd_timed_out sc on shutdown, handled
[   78.439004]  session7: iscsi_eh_cmd_timed_out return shutdown or nh

[   78.441551]  session8: iscsi_eh_cmd_timed_out scsi cmd ffff88b2ef49a160 timedout
[   78.442278]  session8: iscsi_eh_cmd_timed_out sc on shutdown, handled
[   78.442914]  session8: iscsi_eh_cmd_timed_out return shutdown or nh

[  109.149438]  session2: iscsi_eh_cmd_timed_out scsi cmd ffff88b2ef1fd560 timedout
[  109.150251]  session2: iscsi_eh_cmd_timed_out sc on shutdown, handled
[  109.150969]  session2: iscsi_eh_cmd_timed_out return shutdown or nh

[   78.427506] sd 8:0:0:1: tag#0 Done: TIMEOUT_ERROR Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[   78.427662] sd 7:0:0:1: tag#0 Done: TIMEOUT_ERROR Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[   78.439548] sd 9:0:0:1: tag#0 Done: TIMEOUT_ERROR Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[  109.146728] sd 3:0:0:1: tag#0 Done: TIMEOUT_ERROR Result: hostbyte=DID_OK driverbyte=DRIVER_OK

This is the iscsi_eh_cmd_timed_out logic running after the ping timeouts.

And then:

[   78.427678] sd 7:0:0:1: tag#0 Done: SUCCESS Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[   78.443456] sd 8:0:0:1: tag#0 Done: SUCCESS Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[   78.447592] sd 9:0:0:1: tag#0 Done: SUCCESS Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[  109.151582] sd 3:0:0:1: tag#0 Done: SUCCESS Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK

This is the iscsi_queuecommand logic setting the result to DID_NO_CONNECT
when a command is queued during shutdown on a disconnected transport.

[   78.427683] sd 7:0:0:1: Notifying upper driver of completion (result 10000)
[   78.445899] sd 8:0:0:1: Notifying upper driver of completion (result 10000)
[   78.450035] sd 9:0:0:1: Notifying upper driver of completion (result 10000)
[  109.154495] sd 3:0:0:1: Notifying upper driver of completion (result 10000)
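For anyone reading the "result 10000" value above: a minimal sketch of how
it can be decoded, assuming the kernel prints the result in hexadecimal
(which is consistent with the hostbyte=DID_NO_CONNECT lines just before it).
The function names and the small lookup table are mine, for illustration
only; the table lists just a few of the host-byte codes.

```python
# Illustrative decoder for the 32-bit SCSI result word seen in the log.
# Assumption: the "(result 10000)" value is hexadecimal.
# Layout: driver byte (bits 24-31), host byte (bits 16-23),
# message byte (bits 8-15), status byte (bits 0-7).

# Small subset of host-byte codes; not a complete list.
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
}


def decode_scsi_result(result_hex):
    """Split a hex result string into its four byte fields."""
    result = int(result_hex, 16)
    return {
        "status": result & 0xFF,
        "msg": (result >> 8) & 0xFF,
        "host": (result >> 16) & 0xFF,
        "driver": (result >> 24) & 0xFF,
    }


def host_byte_name(result_hex):
    """Return a symbolic name for the host byte, or a hex fallback."""
    host = decode_scsi_result(result_hex)["host"]
    return HOST_BYTE_NAMES.get(host, hex(host))


if __name__ == "__main__":
    fields = decode_scsi_result("10000")
    print(fields, host_byte_name("10000"))
```

Under that assumption the host byte is 0x01, i.e. DID_NO_CONNECT, and the
other three bytes are zero, matching the "Done:" lines above.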

> [side note: we *really* need an open-iscsi test suite! Anybody?]

I'm interested in creating it or helping (especially now that I have read
a big part of the code because of this bug).

> As long as the upper levels handle this correctly, I'm good with it.

Yes, check it out. At the end:

[  109.354984] sd 8:0:0:1: tag#0 0 sectors total, 0 bytes done.
[  109.355596] sd 8:0:0:1: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[  109.356980] reboot: Restarting system
[  109.357392] reboot: machine restart

You see the "Synchronize Cache(10) failed" message with DID_NO_CONNECT.
It is important because it tells you that the disk could not be synced
and that you might need to fix the userland shutdown order. Since
sd_shutdown tries to sync 3 times, you might see lots of DID_NO_CONNECT
errors, for all sessions, but with this change all of the commands are
handled and the upper layer is informed of the error.

I hope that answers your question. Let me know if you want me to provide
any other information.

Cheers
-Rafael

On Thu, Dec 21, 2017 at 12:39 AM, Martin K. Petersen
<martin.peter...@oracle.com> wrote:
>
>> If, for any reason, userland shuts down iscsi transport interfaces
>> before proper logouts - like when logging in to LUNs manually, without
>> logging out on server shutdown, or when automated scripts can't
>> umount/logout from logged LUNs - kernel will hang forever on its
>> sd_sync_cache() logic, after issuing the SYNCHRONIZE_CACHE cmd to all
>> still existent paths.
>
> Chris and Lee: Please review. Thanks!
>
> --
> Martin K. Petersen      Oracle Linux Engineering
