Re: [PATCH 00/11] First pass at merging Bart's HA work
On Thu, Dec 6, 2012 at 4:27 PM, Or Gerlitz ogerl...@mellanox.com wrote:
[...] looking at the current locks in the system, we see that this kworker task holds four locks, but none of them seems to be mutually held by another task,

That was of course a wrong assertion, as a lock can't be held by two tasks at once... What I meant to say is that, looking at which locks are being held overall, there's no obvious deadlock here, since none of the other lock holders relates to the block/SCSI layers...

-- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/11] First pass at merging Bart's HA work
Alex Turin wrote:
On 12/6/2012 5:04 PM, Bart Van Assche wrote:
On 12/06/12 15:27, Or Gerlitz wrote:
The core problem here seems to be that scsi_remove_host simply never ends.

Hello Or,

The later patches in the srp-ha patch series avoided such behavior by checking whether the connection between SRP initiator and target is unique, and by removing duplicate SCSI hosts for which the transport layer failed. Unfortunately these patches are still under review. Unless someone can come up with a better solution I will post a patch in the next few days that makes ib_srp again fail all commands after host removal started. That will avoid spending a long time doing error recovery. Also, you might have noticed that Hannes Reinecke reported a few days ago that the SCSI error handler may need a lot of time for other transport types - this behavior is not SRP specific.

Bart.

Hello Bart,

In our case we don't have duplicate hosts or targets. We are working with a single SCSI disk. To make scsi_remove_host hang we simply disable an IB port and run dd if=/dev/sdb of=/dev/null count=1.

Hello Bart,

I applied your latest patch, [PATCH for-next] IB/srp: Make SCSI error handling finish, and tested it. Let me capture what I'm seeing.

The host has two paths (scsi_host 7 and 8) to the target through two physical ports, 1 and 2:

[root@rsws42 ~]# multipath -l
size=50G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=active
| `- 7:0:0:11 sdb 8:16 active undef running
`-+- policy='round-robin 0' prio=0 status=enabled
  `- 8:0:0:11 sdc 8:32 active undef running

I simulate a cable pull by disabling port 1. I/Os fail over fine; the problem is the cleanup of scsi_host 7 on the failed path. After the IB RC failure, SCSI error recovery kicks in. srp_reconnect_target() fails and srp_remove_target() runs to remove scsi_host 7; however, I think it gets stuck at device_del(dev) inside __scsi_remove_device(dev). Error recovery happens again and again on scsi host 7 for 9-10 minutes.

scsi_host 7 cannot be cleaned up: its sysfs entry is still there (/sys/class/scsi_host/host7) and its state is SHOST_CANCEL. I brought port 1 back online, but scsi_host 7 cannot reconnect to the target because its state is SRP_TARGET_REMOVED. The scsi_host 7 sysfs entry does not contain the target login info (ioc_guid, id_ext, dgid...). I think srp_daemon can reconnect to the target by creating a new path with a new scsi host; however, I cannot check because I currently don't have a working srp_daemon. I need to manually reconnect to the target with an echo command.

Bottom line: I/Os can fail over and fail back; however, old scsi hosts cannot be removed (the sysfs entry is still there) and stay in state SHOST_CANCEL, and error recovery keeps happening on the old scsi hosts for 10-20 minutes.

thanks,
-vu
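The check Vu describes (is the old host still present in sysfs, and in which state) can be scripted. The following is a minimal sketch, not part of the original report: the function name `scsi_host_states` is hypothetical, and it assumes each `/sys/class/scsi_host/hostN` directory exposes a readable `state` attribute. The exact strings the kernel reports there (for example for SHOST_CANCEL) are not stated in this thread, so the sketch just returns whatever it reads.

```python
from pathlib import Path


def scsi_host_states(sysfs_root="/sys/class/scsi_host"):
    """Return {host_name: state_string} for every hostN directory found.

    Assumption: each host directory contains a text file named 'state'.
    Hosts whose state cannot be read are reported with a value of None.
    """
    states = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return states  # no SCSI hosts visible (or not a Linux system)
    for host_dir in sorted(root.glob("host*")):
        try:
            states[host_dir.name] = (host_dir / "state").read_text().strip()
        except OSError:
            states[host_dir.name] = None
    return states


# Example use on a live system:
#     for name, state in scsi_host_states().items():
#         print(name, state)
```

Polling this while the port is down would show whether a host such as host7 lingers instead of disappearing.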
Re: [PATCH 00/11] First pass at merging Bart's HA work
On 12/05/12 22:32, Or Gerlitz wrote:
On Wed, Dec 5, 2012 at 8:50 PM, Bart Van Assche bvanass...@acm.org wrote:
[...] The only way to make I/O work reliably if a failure can occur at the transport layer is to use multipathd on top of ib_srp. If a connection fails for some reason, then the SRP SCSI host will be removed after the SCSI error handler has finished with its error recovery strategy. And once the transport layer is operational again and srp_daemon detects that the initiator is no longer logged in srp_daemon will make ib_srp log in again. multipathd will then cause I/O to continue over the new path.

Claim basically understood and agreed; however, does this also hold when the link is back again? That is, can't SRP log in via this single path even when there's no multipath on top?

As far as I can remember the behavior of ib_srp has always been to try to reconnect once to the SRP target after the SCSI error handler kicked in. Other SCSI LLDs, e.g. the iSCSI initiator, can be configured to keep trying to reconnect after a transport layer failure. That has the advantage that the SCSI host number is the same after a successful reconnect as it was before the reconnect started.

Bart.
Re: [PATCH 00/11] First pass at merging Bart's HA work
On 06/12/2012 16:10, Bart Van Assche wrote:
On 12/05/12 22:32, Or Gerlitz wrote:
On Wed, Dec 5, 2012 at 8:50 PM, Bart Van Assche bvanass...@acm.org wrote:
[...] The only way to make I/O work reliably if a failure can occur at the transport layer is to use multipathd on top of ib_srp. If a connection fails for some reason, then the SRP SCSI host will be removed after the SCSI error handler has finished with its error recovery strategy. And once the transport layer is operational again and srp_daemon detects that the initiator is no longer logged in srp_daemon will make ib_srp log in again. multipathd will then cause I/O to continue over the new path.

Claim basically understood and agreed; however, does this also hold when the link is back again? That is, can't SRP log in via this single path even when there's no multipath on top?

As far as I can remember the behavior of ib_srp has always been to try to reconnect once to the SRP target after the SCSI error handler kicked in. Other SCSI LLDs, e.g. the iSCSI initiator, can be configured to keep trying to reconnect after a transport layer failure. That has the advantage that the SCSI host number remains the same after reconnecting succeeded as before reconnecting started.

Bart,

The core problem here seems to be that scsi_remove_host simply never ends. Observing all the tasks in the system (e.g. using echo t > /proc/sysrq-trigger), we've noted that none of the SCSI EH threads is currently running; that is, for all of them the trace is the following:

scsi_eh_0 S 0 380 2 0x
88042c31be08 0046 88042c31bfd8 00014380
88042c31a010 00014380 00014380 00014380
88042c31bfd8 00014380 88042f5be5c0 88042bb48c40
Call Trace:
[8139b2c0] ? scsi_unjam_host+0x1f0/0x1f0
[8155c599] schedule+0x29/0x70
[8139b335] scsi_error_handler+0x75/0x1c0
[8139b2c0] ? scsi_unjam_host+0x1f0/0x1f0
[8107cc2e] kthread+0xee/0x100
[8107cb40] ? __init_kthread_worker+0x70/0x70
[8156676c] ret_from_fork+0x7c/0xb0
[8107cb40] ? __init_kthread_worker+0x70/0x70

However, the flow starting in srp_remove_target hangs somewhere in the block layer, waiting for something to happen:

kworker/11:1 D 0 163 2 0x
88082be6f738 0046 88082be6ffd8 00014380
88082be6e010 00014380 00014380 00014380
88082be6ffd8 00014380 88042f5ba580 88082be6c1c0
Call Trace:
[8155c599] schedule+0x29/0x70
[8155a60f] schedule_timeout+0x14f/0x240
[810674f0] ? lock_timer_base+0x70/0x70
[8155c43b] wait_for_common+0x11b/0x170
[81091ab0] ? try_to_wake_up+0x300/0x300
[8155c543] wait_for_completion_timeout+0x13/0x20
[8125ecc3] blk_execute_rq+0x133/0x1c0
[81257830] ? get_request+0x210/0x3d0
[8139dfb8] scsi_execute+0xe8/0x180
[8139e1f7] scsi_execute_req+0xa7/0x110
[a0086498] sd_sync_cache+0xd8/0x130 [sd_mod]
[8137180e] ? __dev_printk+0x3e/0x90
[81371b45] ? dev_printk+0x45/0x50
[a0086700] sd_shutdown+0xd0/0x150 [sd_mod]
[a008691c] sd_remove+0x7c/0xc0 [sd_mod]
[81375dec] __device_release_driver+0x7c/0xe0
[81375f5f] device_release_driver+0x2f/0x50
[81374e46] bus_remove_device+0x126/0x190
[81372bbb] device_del+0x14b/0x250
[813a2878] __scsi_remove_device+0x1b8/0x1d0
[8139eba6] scsi_forget_host+0xf6/0x110
[81396448] scsi_remove_host+0x108/0x1e0
[a0536c38] srp_remove_target+0xb8/0x150 [ib_srp]
[a0536d34] srp_remove_work+0x64/0xa0 [ib_srp]
[81074ce2] process_one_work+0x1c2/0x4a0
[81074c70] ? process_one_work+0x150/0x4a0
[a0536cd0] ? srp_remove_target+0x150/0x150 [ib_srp]
[8107746e] worker_thread+0x12e/0x370
[81077340] ? manage_workers+0x180/0x180
[8107cc2e] kthread+0xee/0x100
[8107cb40] ? __init_kthread_worker+0x70/0x70
[8156676c] ret_from_fork+0x7c/0xb0
[8107cb40] ? __init_kthread_worker+0x70/0x70

Looking at the current locks in the system, we see that this kworker task holds four locks, but none of them seems to be mutually held by another task:

Showing all locks held in the system:
4 locks held by kworker/11:1/163:
 #0: (events_long){.+.+.+}, at: [81074c70] process_one_work+0x150/0x4a0
 #1: ((&target->remove_work)){+.+.+.}, at: [81074c70] process_one_work+0x150/0x4a0
 #2: (&shost->scan_mutex){+.+.+.}, at: [81396374] scsi_remove_host+0x34/0x1e0
 #3: (__lockdep_no_validate__){..}, at: [81375f57] device_release_driver+0x27/0x50
1 lock held by bash/6298:
 #0: (&tty->atomic_read_lock){+.+...}, at: [81339a9e] n_tty_read+0x58e/0x960
1 lock held by mingetty/6319:
 #0: (&tty->atomic_read_lock){+.+...},
Re: [PATCH 00/11] First pass at merging Bart's HA work
On 12/06/12 15:27, Or Gerlitz wrote:
The core problem here seems to be that scsi_remove_host simply never ends.

Hello Or,

The later patches in the srp-ha patch series avoided such behavior by checking whether the connection between SRP initiator and target is unique, and by removing duplicate SCSI hosts for which the transport layer failed. Unfortunately these patches are still under review. Unless someone can come up with a better solution I will post a patch in the next few days that makes ib_srp again fail all commands after host removal has started. That will avoid spending a long time doing error recovery. Also, you might have noticed that Hannes Reinecke reported a few days ago that the SCSI error handler may need a lot of time for other transport types - this behavior is not SRP specific.

Bart.
Re: [PATCH 00/11] First pass at merging Bart's HA work
On 06/12/2012 17:04, Bart Van Assche wrote:
On 12/06/12 15:27, Or Gerlitz wrote:
The core problem here seems to be that scsi_remove_host simply never ends.

Hello Or, The later patches in the srp-ha patch series avoided such behavior by checking whether the connection between SRP initiator and target is unique, and by removing duplicate SCSI hosts for which the transport layer failed. Unfortunately these patches are still under review. Unless someone can come up with a better solution I will post a patch one of the next days that makes ib_srp again fail all commands after host removal started. That will avoid spending a long time doing error recovery. Also, you might have noticed that Hannes Reinecke reported a few days ago that the SCSI error handler may need a lot of time for other transport types - this behavior is not SRP specific.

I'm not sure what exactly you are referring to by duplicate SCSI hosts in this context, or why we would have them. Again, at the time we took the stack trace snapshot from the system none of the SCSI EH threads was active, so I'm also not sure about your comment on spending a long time in the error recovery flow: the flow we've run into seems to simply wait forever.

Or.
Re: [PATCH 00/11] First pass at merging Bart's HA work
On 12/6/2012 5:04 PM, Bart Van Assche wrote:
On 12/06/12 15:27, Or Gerlitz wrote:
The core problem here seems to be that scsi_remove_host simply never ends.

Hello Or, The later patches in the srp-ha patch series avoided such behavior by checking whether the connection between SRP initiator and target is unique, and by removing duplicate SCSI hosts for which the transport layer failed. Unfortunately these patches are still under review. Unless someone can come up with a better solution I will post a patch one of the next days that makes ib_srp again fail all commands after host removal started. That will avoid spending a long time doing error recovery. Also, you might have noticed that Hannes Reinecke reported a few days ago that the SCSI error handler may need a lot of time for other transport types - this behavior is not SRP specific. Bart.

Hello Bart,

In our case we don't have duplicate hosts or targets. We are working with a single SCSI disk. To make scsi_remove_host hang we simply disable an IB port and run dd if=/dev/sdb of=/dev/null count=1.

Alex
Re: [PATCH 00/11] First pass at merging Bart's HA work
On Fri, Nov 30, 2012 at 4:21 AM, David Dillow dillo...@ornl.gov wrote:
[...] Modulo a few style issues (braces around one line if branches, etc.) and having three state variables vs one, I can live with everything up to aabfa852acd27962 at git://github.com/bvanassche/linux.git#srp-ha. Those two are small things that can be fixed later and are not worth holding things up any further. I'll try to spend some time on the final four patches tomorrow afternoon.

Dave, Bart,

My colleague Alex Turin ale...@mellanox.com today tried the bits as they appear in Roland's kernel.org tree / for-next branch up to commit fb57e1dbbd4, and here's some feedback.

Basically, what he did was connect to a target, then take down the IB port on the initiator side, and issue some I/Os (dd if=/dev/sdb of=/dev/null count=1).

Our reconstruction of events from the logs (below) is the following:

1. The queued command gets completion status 5.
2. As part of error handling, srp_reset_host() is called.
3. srp_reset_host() calls srp_reconnect_target(), which fails because the port is down.
4. On failure, srp_reconnect_target() calls srp_queue_remove_work(), which sets target->state to SRP_TARGET_REMOVED.
5. srp_reset_host() is called a second time. It calls srp_reconnect_target(), but target->state == SRP_TARGET_REMOVED; srp_reconnect_target() checks whether target->state != SRP_TARGET_LIVE and returns -EAGAIN.

This probably means that even after enabling the port it will still fail to reconnect?

Or.

Dec 5 16:19:13 rsws42 kernel: scsi host7: ib_srp: failed send status 5
Dec 5 16:19:42 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:19:42 rsws42 kernel: scsi host7: SRP reset_device called
Dec 5 16:19:42 rsws42 kernel: scsi host7: ib_srp: SRP reset_host called
Dec 5 16:19:43 rsws42 kernel: scsi host7: ib_srp: Got failed path rec status -110
Dec 5 16:19:43 rsws42 kernel: scsi host7: ib_srp: Path record query failed
Dec 5 16:19:43 rsws42 kernel: scsi host7: ib_srp: reconnect failed (-110), removing target port.
Dec 5 16:19:43 rsws42 kernel: sd 7:0:0:11: Device offlined - not ready after error recovery
Dec 5 16:19:43 rsws42 kernel: sd 7:0:0:11: [sdb] Synchronizing SCSI cache
Dec 5 16:20:45 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:20:50 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:21:05 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:21:10 rsws42 kernel: scsi host7: SRP reset_device called
Dec 5 16:21:15 rsws42 kernel: scsi host7: ib_srp: SRP reset_host called
Dec 5 16:21:15 rsws42 kernel: sd 7:0:0:11: Device offlined - not ready after error recovery
Dec 5 16:21:15 rsws42 kernel: sd 7:0:0:11: Device offlined - not ready after error recovery

repeating part:

Dec 5 16:22:17 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:22:22 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:22:37 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:22:42 rsws42 kernel: scsi host7: SRP reset_device called
Dec 5 16:22:47 rsws42 kernel: scsi host7: ib_srp: SRP reset_host called
Dec 5 16:22:47 rsws42 kernel: sd 7:0:0:11: Device offlined - not ready after error recovery
Dec 5 16:22:47 rsws42 kernel: sd 7:0:0:11: Device offlined - not ready after error recovery
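To put a number on the "repeating part" of the log, one can measure the spacing between successive reset_host messages. This is a small sketch added for illustration; the function name `reset_host_intervals` and the `year` parameter are assumptions (syslog timestamps carry no year), and the sample lines are copied from the log above.

```python
import re
from datetime import datetime

# "<Mon> <day> <HH:MM:SS> <hostname> kernel: <message>"
LOG_LINE = re.compile(r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: (.*)$")


def reset_host_intervals(lines, year=2012):
    """Return the seconds between consecutive 'SRP reset_host called' lines."""
    stamps = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and "SRP reset_host called" in match.group(2):
            stamps.append(
                datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
            )
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


sample = [
    "Dec 5 16:19:42 rsws42 kernel: scsi host7: ib_srp: SRP reset_host called",
    "Dec 5 16:21:15 rsws42 kernel: scsi host7: ib_srp: SRP reset_host called",
    "Dec 5 16:22:47 rsws42 kernel: scsi host7: ib_srp: SRP reset_host called",
]
print(reset_host_intervals(sample))  # [93, 92]
```

On this excerpt the error handler escalates to a host reset roughly every minute and a half.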
Re: [PATCH 00/11] First pass at merging Bart's HA work
On 12/05/12 19:23, Or Gerlitz wrote:
On Fri, Nov 30, 2012 at 4:21 AM, David Dillow dillo...@ornl.gov wrote:
[...] Modulo a few style issues (braces around one line if branches, etc.) and having three state variables vs one, I can live with everything up to aabfa852acd27962 at git://github.com/bvanassche/linux.git#srp-ha. Those two are small things that can be fixed later and are not worth holding things up any further. I'll try to spend some time on the final four patches tomorrow afternoon.

Dave, Bart My colleague Alex Turin ale...@mellanox.com tried today the bits as they appear in Roland's kernel.org tree / for-next branch up to commit fb57e1dbbd4 and here's some feedback Basically, what he did was connecting to a target, next take down the IB port on the initiator side, and issue some IOs (dd if=/dev/sdb of=/dev/null count=1) Our recollection of events from the logs (below) is the following

1. queued command get completion status 5
2. as part of error handling srp_reset_host() was called,
3. srp_reset_host() calls to srp_reconnect_target() which fails cause port is down.
4. srp_reconnect_target() on failure calls to srp_queue_remove_work() which sets target->state to SRP_TARGET_REMOVED.
5. srp_reset_host() called second time. it calls to srp_reconnect_target() but target->state == SRP_TARGET_REMOVED. srp_reconnect_target() checks if target->state != SRP_TARGET_LIVE and return -EAGAIN.

This probably means that even after enabling port it will still fail to reconnect?

Hello Or,

The only way to make I/O work reliably if a failure can occur at the transport layer is to use multipathd on top of ib_srp. If a connection fails for some reason, then the SRP SCSI host will be removed after the SCSI error handler has finished with its error recovery strategy. And once the transport layer is operational again and srp_daemon detects that the initiator is no longer logged in, srp_daemon will make ib_srp log in again. multipathd will then cause I/O to continue over the new path.

Bart.
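The five-step sequence quoted above can be replayed with a toy state model. This is only a sketch of the behavior as described in the report, not the ib_srp source: `ToyTarget`, `port_up` and `remove_queued` are hypothetical names, the -110 value is the one printed in the kernel log, and the -EAGAIN return follows step 5 of the report.

```python
import errno
from enum import Enum

RECONNECT_FAILED = -110  # value seen in the log: "reconnect failed (-110)"


class TargetState(Enum):
    LIVE = "SRP_TARGET_LIVE"
    REMOVED = "SRP_TARGET_REMOVED"


class ToyTarget:
    """Hypothetical stand-in for an SRP target port; models state only."""

    def __init__(self, port_up=True):
        self.state = TargetState.LIVE
        self.port_up = port_up
        self.remove_queued = False

    def reconnect(self):
        # Step 5: any state other than LIVE is refused outright.
        if self.state is not TargetState.LIVE:
            return -errno.EAGAIN
        if self.port_up:
            return 0
        # Steps 3-4: the single attempt failed, so removal is queued
        # and the target is marked REMOVED.
        self.state = TargetState.REMOVED
        self.remove_queued = True
        return RECONNECT_FAILED

    def reset_host(self):
        # Step 2: the error handler's host reset just tries to reconnect.
        return self.reconnect()


target = ToyTarget(port_up=False)
first = target.reset_host()   # port is down: reconnect fails, target REMOVED
target.port_up = True         # the link comes back
second = target.reset_host()  # still refused because state is not LIVE
print(first, second == -errno.EAGAIN)
```

In this model a later host reset can never succeed once the state has left LIVE, which matches Or's reading of the log; recovery has to come from a fresh login that creates a new SCSI host, as Bart's reply describes.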
Re: [PATCH 00/11] First pass at merging Bart's HA work
On 12/05/12 19:50, Bart Van Assche wrote:
On 12/05/12 19:23, Or Gerlitz wrote:
On Fri, Nov 30, 2012 at 4:21 AM, David Dillow dillo...@ornl.gov wrote:
[...] Modulo a few style issues (braces around one line if branches, etc.) and having three state variables vs one, I can live with everything up to aabfa852acd27962 at git://github.com/bvanassche/linux.git#srp-ha. Those two are small things that can be fixed later and are not worth holding things up any further. I'll try to spend some time on the final four patches tomorrow afternoon.

Dave, Bart My colleague Alex Turin ale...@mellanox.com tried today the bits as they appear in Roland's kernel.org tree / for-next branch up to commit fb57e1dbbd4 and here's some feedback Basically, what he did was connecting to a target, next take down the IB port on the initiator side, and issue some IOs (dd if=/dev/sdb of=/dev/null count=1) Our recollection of events from the logs (below) is the following

1. queued command get completion status 5
2. as part of error handling srp_reset_host() was called,
3. srp_reset_host() calls to srp_reconnect_target() which fails cause port is down.
4. srp_reconnect_target() on failure calls to srp_queue_remove_work() which sets target->state to SRP_TARGET_REMOVED.
5. srp_reset_host() called second time. it calls to srp_reconnect_target() but target->state == SRP_TARGET_REMOVED. srp_reconnect_target() checks if target->state != SRP_TARGET_LIVE and return -EAGAIN.

This probably means that even after enabling port it will still fail to reconnect?

Hello Or, The only way to make I/O work reliably if a failure can occur at the transport layer is to use multipathd on top of ib_srp. If a connection fails for some reason, then the SRP SCSI host will be removed after the SCSI error handler has finished with its error recovery strategy. And once the transport layer is operational again and srp_daemon detects that the initiator is no longer logged in srp_daemon will make ib_srp log in again. multipathd will then cause I/O to continue over the new path.

(replying to my own e-mail)

Another possible approach would be to follow the FC model: block I/O when a port goes down and unblock it once I/O is possible again. Some time ago I posted a patch that went somewhat in this direction, in which ib_srp tried to reconnect to a target repeatedly after a transport layer failure. That patch can be found here: http://www.mail-archive.com/linux-rdma@vger.kernel.org/msg10158.html

Bart.
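The block, retry, unblock idea can be illustrated with a small model. This is a sketch under stated assumptions and is not the linked patch: `ToyRport`, `try_reconnect` and `max_attempts` are hypothetical names, and a real driver would space the attempts out with timers or delayed work rather than looping immediately.

```python
class ToyRport:
    """Toy remote port: block I/O on failure, retry, unblock on success."""

    def __init__(self, try_reconnect, max_attempts):
        self.try_reconnect = try_reconnect  # callable, returns True on success
        self.max_attempts = max_attempts
        self.blocked = False
        self.removed = False

    def transport_failed(self):
        """Handle a transport failure.

        Returns the attempt number that succeeded, or None if every
        attempt failed and the port was given up on.
        """
        self.blocked = True  # hold new I/O instead of failing it
        for attempt in range(1, self.max_attempts + 1):
            if self.try_reconnect():
                self.blocked = False  # same host, I/O resumes
                return attempt
        # Give up: stop blocking so held I/O can be failed upward,
        # where a multipath layer could retry it on another path.
        self.blocked = False
        self.removed = True
        return None


outcomes = iter([False, False, True])  # link returns on the third try
rport = ToyRport(lambda: next(outcomes), max_attempts=5)
print(rport.transport_failed(), rport.blocked, rport.removed)  # 3 False False
```

With `max_attempts=1` the model degenerates to a single reconnect attempt; larger values keep the same host object alive across the outage.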
Re: [PATCH 00/11] First pass at merging Bart's HA work
On Wed, Dec 5, 2012 at 8:50 PM, Bart Van Assche bvanass...@acm.org wrote:
[...] The only way to make I/O work reliably if a failure can occur at the transport layer is to use multipathd on top of ib_srp. If a connection fails for some reason, then the SRP SCSI host will be removed after the SCSI error handler has finished with its error recovery strategy. And once the transport layer is operational again and srp_daemon detects that the initiator is no longer logged in srp_daemon will make ib_srp log in again. multipathd will then cause I/O to continue over the new path.

Claim basically understood and agreed; however, does this also hold when the link is back again? That is, can't SRP log in via this single path even when there's no multipath on top?

Or.
Re: [PATCH 00/11] First pass at merging Bart's HA work
On Mon, Nov 26, 2012 at 8:04 PM, David Dillow dillo...@ornl.gov wrote:
We can push it through James's tree if need be, but Bart's code is pretty self-contained, and going through the SCSI tree will introduce merge dependencies. It'd be much easier to push it all through the RDMA tree, especially if we want to get this landed for 3.8.

OK, I guess all the srp_transport stuff looks quite simple and good to me, so I'm OK merging it. Is there some subset of patches that you and Bart agree are good, which I can pick up now? I'd love to get at least some of the SRP stuff into 3.8, and that window is opening pretty soon.
Re: [PATCH 00/11] First pass at merging Bart's HA work
On Thu, 2012-11-29 at 12:21 -0800, Roland Dreier wrote:
On Mon, Nov 26, 2012 at 8:04 PM, David Dillow dillo...@ornl.gov wrote:
We can push it through James's tree if need be, but Bart's code is pretty self-contained, and going through the SCSI tree will introduce merge dependencies. It'd be much easier to push it all through the RDMA tree, especially if we want to get this landed for 3.8.

OK, I guess all the srp_transport stuff looks quite simple and good to me, so I'm OK merging it. Is there some subset of patches that you and Bart agree are good, which I can pick up now? I'd love to get at least some of the SRP stuff into 3.8, and that window is opening pretty soon.

Modulo a few style issues (braces around one line if branches, etc.) and having three state variables vs one, I can live with everything up to aabfa852acd27962 at git://github.com/bvanassche/linux.git#srp-ha. Those two are small things that can be fixed later and are not worth holding things up any further. I'll try to spend some time on the final four patches tomorrow afternoon.

--
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office
Re: [PATCH 00/11] First pass at merging Bart's HA work
On 11/26/12 05:44, David Dillow wrote:
Here is a first, UNTESTED, pass at preparing a merge of Bart's SRP HA work to upstream. It is not complete, as I have not yet added the transport layer error handling and related patches. It is also currently missing the patch to maintain a single connection for an I_T nexus. I swapped Ishai's code to recreate QP/CQs for each connection, as that adds recovery from fatal hardware errors, and reduces the code needed to avoid stale completions. Similarly, Vu's patch slightly reduces reconnect times. Target blocking/unblocking hasn't been added from Bart's patch that removes the SRP_TARGET_CONNECTING state; I think it is probably the right thing to do, but want to think it over a bit more. Similarly, I'm looking for a clean way to start the reconnection effort as soon as we get an error through the CQ -- we know that any pending commands are lost to us at that point, so we should be able to kick them back upstream quickly. This should allow multipath to reissue them on one of the remaining good paths. This series compiles, but is otherwise UNTESTED. I'll be working on that over the next few days, with an eye on getting as much of Bart's work into 3.8 as possible.

Thanks Dave for doing all this work. A reworked and retested patch series that should address all comments that have been posted so far can be found here: http://github.com/bvanassche/linux/srp-ha. I can repost the entire patch series if you want. The changes compared to the previous time this patch series was posted are:

* Integrated Ishai's and Vu's patches for recreating the QP and the CQs.
* Took advantage of SCSI core changes that make scsi_remove_host() wait until device removal and error handling finished (will post these shortly).

Bart.
Re: [PATCH 00/11] First pass at merging Bart's HA work
On 28 November 2012 03:34, Bart Van Assche bvanass...@acm.org wrote:
On 11/26/12 05:44, David Dillow wrote:
Here is a first, UNTESTED, pass at preparing a merge of Bart's SRP HA work to upstream. It is not complete, as I have not yet added the transport layer error handling and related patches. It is also currently missing the patch to maintain a single connection for an I_T nexus. I swapped Ishai's code to recreate QP/CQs for each connection, as that adds recovery from fatal hardware errors, and reduces the code needed to avoid stale completions. Similarly, Vu's patch slightly reduces reconnect times. Target blocking/unblocking hasn't been added from Bart's patch that removes the SRP_TARGET_CONNECTING state; I think it is probably the right thing to do, but want to think it over a bit more. Similarly, I'm looking for a clean way to start the reconnection effort as soon as we get an error through the CQ -- we know that any pending commands are lost to us at that point, so we should be able to kick them back upstream quickly. This should allow multipath to reissue them on one of the remaining good paths. This series compiles, but is otherwise UNTESTED. I'll be working on that over the next few days, with an eye on getting as much of Bart's work into 3.8 as possible.

Thanks Dave for doing all this work. A reworked and retested patch series that should address all comments that have been posted so far can be found here: http://github.com/bvanassche/linux/srp-ha. I can repost the entire patch series if you want. The changes compared to the previous time this patch series was posted are:

* Integrated Ishai's and Vu's patches for recreating the QP and the CQs.
* Took advantage of SCSI core changes that make scsi_remove_host() wait until device removal and error handling finished (will post these shortly).

Bart.

This is great news; it would be really good to see this merged for 3.8.

Bart - I will test the new srp-ha branch in our environment.

Joseph.

--
CTO | Orion Virtualisation Solutions | www.orionvm.com.au
Phone: 1300 56 99 52 | Mobile: 0428 754 846
Re: [PATCH 00/11] First pass at merging Bart's HA work
This series compiles, but is otherwise UNTESTED. I'll be working on that over the next few days, with an eye on getting as much of Bart's work into 3.8 as possible.

Hi Dave,

Great to have you back. Certainly I'd like to get this stuff into 3.8 too. A couple of comments:

- I think the srp_transport stuff should go through linux-scsi / James B. instead of my tree, esp. since it's shared with the IBM vscsi stuff (I think)
- I see Bart had a few comments about a few of your patches, I'll wait for you guys to hash that out.

Otherwise definitely happy to merge this for 3.8!

Thanks,
Roland
Re: [PATCH 00/11] First pass at merging Bart's HA work
On Mon, 2012-11-26 at 10:50 -0800, Roland Dreier wrote:
- I think the srp_transport stuff should go through linux-scsi / James B. instead of my tree, esp. since it's shared with the IBM vscsi stuff (I think)
- I see Bart had a few comments about a few of your patches, I'll wait for you guys to hash that out.

I'm amenable to that, but we do need an agreed patch set, as Roland says. I also hate to apply the pressure, but I suspect -rc7 was the last -rc, so I'm expecting the merge window to open on 2/12.

James
Re: [PATCH 00/11] First pass at merging Bart's HA work
I'm amenable to that, but we do need an agreed patch set, as Roland says. I also hate to apply the pressure, but I suspect -rc7 was the last -rc, so I'm expecting the merge window to open on 2/12.

I think the srp_transport bits are all simple and non-controversial. So at least from my perspective, OK to merge right now, except that Dave mentioned he screwed up the attribution on one email, but that should be easy to fix with a quick resend.

- R.
Re: [PATCH 00/11] First pass at merging Bart's HA work
On Mon, 2012-11-26 at 23:15 +0400, James Bottomley wrote:
On Mon, 2012-11-26 at 10:50 -0800, Roland Dreier wrote:
- I think the srp_transport stuff should go through linux-scsi / James B. instead of my tree, esp. since it's shared with the IBM vscsi stuff (I think)
- I see Bart had a few comments about a few of your patches, I'll wait for you guys to hash that out.

I'm amenable to that, but we do need an agreed patch set, as Roland says. I also hate to apply the pressure, but I suspect -rc7 was the last -rc, so I'm expecting the merge window to open on 2/12.

We can push it through James's tree if need be, but Bart's code is pretty self-contained, and going through the SCSI tree will introduce merge dependencies. It'd be much easier to push it all through the RDMA tree, especially if we want to get this landed for 3.8.

I'd want Fujita Tomonori, Robert Jennings, and James to ack the changes to the SRP transport code, though they've been pending for a long time w/o comment so perhaps silence is consent?

--
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office
Re: [PATCH 00/11] First pass at merging Bart's HA work
On 27/11/2012 06:04, David Dillow wrote:
We can push it through James's tree if need be, but Bart's code is pretty self-contained, and going through the SCSI tree will introduce merge dependencies. It'd be much easier to push it all through the RDMA tree

Yep, this makes sense to me even without taking into account the time left before the merge window opens. The patches have been around for a long time and relate directly to the SRP code in the IB subsystem, so there's no point in introducing merge dependencies where they can be avoided.

Or.
[PATCH 00/11] First pass at merging Bart's HA work
Here is a first, UNTESTED, pass at preparing a merge of Bart's SRP HA work to upstream. It is not complete, as I have not yet added the transport layer error handling and related patches. It is also currently missing the patch to maintain a single connection for an I_T nexus.

I swapped Ishai's code to recreate QP/CQs for each connection, as that adds recovery from fatal hardware errors, and reduces the code needed to avoid stale completions. Similarly, Vu's patch slightly reduces reconnect times.

Target blocking/unblocking hasn't been added from Bart's patch that removes the SRP_TARGET_CONNECTING state; I think it is probably the right thing to do, but want to think it over a bit more. Similarly, I'm looking for a clean way to start the reconnection effort as soon as we get an error through the CQ -- we know that any pending commands are lost to us at that point, so we should be able to kick them back upstream quickly. This should allow multipath to reissue them on one of the remaining good paths.

This series compiles, but is otherwise UNTESTED. I'll be working on that over the next few days, with an eye on getting as much of Bart's work into 3.8 as possible.
One may also pull this series from github:

  git pull git://github.com/dillow/srp-initiator.git ha-merge-v1

Bart Van Assche (7):
  IB/srp: enlarge block layer timeout
  IB/srp: keep processing commands during host removal
  IB/srp: Document sysfs attributes
  srp_transport: Fix attribute registration
  srp_transport: Simplify attribute initialization code
  srp_transport: Document sysfs attributes
  IB/srp: Allow SRP disconnect through sysfs

David Dillow (2):
  IB/srp: simplify state tracking
  IB/srp: don't send anything on a bad QP

Ishai Rabinovitz (1):
  IB/srp: destroy and recreate QP and CQs on each connection

Vu Pham (1):
  IB/srp: send disconnect request without waiting for CM timewait exit

 Documentation/ABI/stable/sysfs-driver-ib_srp |  156 +++
 Documentation/ABI/stable/sysfs-transport-srp |   19 ++
 drivers/infiniband/ulp/srp/ib_srp.c          |  274 +++---
 drivers/infiniband/ulp/srp/ib_srp.h          |   13 +-
 drivers/scsi/scsi_transport_srp.c            |   51 +++---
 include/scsi/scsi_transport_srp.h            |    8 +
 6 files changed, 378 insertions(+), 143 deletions(-)
 create mode 100644 Documentation/ABI/stable/sysfs-driver-ib_srp
 create mode 100644 Documentation/ABI/stable/sysfs-transport-srp

--
1.7.7.6
Re: [PATCH 00/11] First pass at merging Bart's HA work
On Mon, Nov 26, 2012 at 6:44 AM, David Dillow dillo...@ornl.gov wrote:
One may also pull this series from github: git pull git://github.com/dillow/srp-initiator.git ha-merge-v1

Hi Dave,

The kernel maintainers file specifies the following tree for you: git://git.kernel.org/pub/scm/linux/kernel/git/dad/srp-initiator.git -- I assume you moved to github during the kernel.org downtime. Could you go back to kernel.org? It's nice to have the same look and feel, e.g. through gitweb, for the IB, SRP, networking, SCSI, etc. trees.

Or.