This patch is incomplete. Although it shouldn't hurt to apply, it prepares for a functional change but does not quite get there, and would need a bit of polishing.
Problem - Running "dhcp; sanhook <args>; sanboot" propely initiates an SRP connection, and starts the boot loader. iPXE nevers performs an SRP_I_LOGOUT, even though it's already defined in include/ipxe/srp.h. After the kernel has booted and the sBFT parsed, attempting to start an SRP connection confuses the target machine's kernel, causing SRP commands to abort for about 10-30 seconds. The only workaround I can find once booted is to continually repeat attempting the SRP connection, waiting a second, and retrying. The error I get for 10-30 seconds is: ===== scsi host7: ib_srp: Got failed path rec status -22 scsi host7: ib_srp: Path record query failed scsi host7: ib_srp: Connection 0/4 failed scsi host7: ib_srp: Sending CM DREQ failed ===== After 10-30 seconds, the target kernel shows: ===== [ 115.941418] ib_srpt Relogin - closed existing channel 0x1d534b0003c902000002c903004b5275 [ 115.941470] ib_srpt Received CM DREP message for ch 0x1d534b0003c902000002c903004b5275-522. ===== And the new SRP connection is made. I'm successfully able to make the SRP connection, and successfully boot with the new root. With interesting timing, kernel 4.7 just got released in Arch linux 13 days ago. And, in 4.7, ib_srpt was patched to convert to the generic "RDMA READ/WRITE API" which never closes the existing channel and allows a new connection, but just blocks a new connection forever. So the issue started with < kernel 4.7 as just a 10-30 second delay during bootup, but now with >= 4.7 completely blocks booting. (I'm going to follow up with a kernel regression, but the best way to handle this is to close the initial iPXE connection.) This patch implements the missing srp_initiator_logout(), and modifies srpdev_close() to call it if srpdev->logged_in is set. However, in a successful "dhcp; sanhook <args>; sanboot" sequence, srpdev_close() is never called. (The (3) dbg_printf's I added in srpdev_close, and (1) in ib_srp.c, never execute - only temporarily using dbg_printf to avoid building with DEBUG to only see these debug errors.) The target kernel never gets an SRP_I_LOGOUT command. srpdev_close is only referenced in src/drivers/block/srp.c in these locations: (1) In srpdev_deliver, within the label err. (2) In srp_open(), within the labels err_scsi_open and err_login. (3) In static struct interface_operation srpdev_socket_op[] as INTF_OP ( intf_close, struct srp_device *, srpdev_close ) (4) In static struct interface_operation srpdev_scsi_op[] as INTF_OP ( intf_close, struct srp_device *, srpdev_close ) srpdev_socket_op is only referenced in srp.c within static struct interface_descriptor srpdev_socket_desc as INTF_DESC_PASSTHRU ( struct srp_device, socket, srpdev_socket_op, scsi ) srpdev_scsi_op is only referenced in srp.c within static struct interface_descriptor srpdev_scsi_desc as INTF_DESC_PASSTHRU ( struct srp_device, scsi, srpdev_scsi_op, socket ) srpdev_scsi_desc and srpdev_socket_desc are only referenced in srp.c::srp_open, given to intf_init Looking in net/infiniband/ib_srp.c::ib_srp_close(), it does call intf_shutdown( &ib_srp->cmrc, rc ) and intf_shutdown ( &ib_srp->srp, rc ). At about that point, I started thinking that iPXE probably never knows that it should fully exit, and that it's probably left in lowmem forever. Am I right here? I'm not sure there's a point where iPXE currently knows it can safely fully exit and disable the interrupt hook. So, I'm looking for guidance. Do we need to add to the sBFT specification, to include a memory address of new function which would call srp_initiator_logout with its struct srp_device *srpdev, if I'm correct that iPXE is basically staying in low mem forever and never really exiting? Not sure if I'm understanding that correctly. Patch is below and attached. Submitted by: James P. Harvey <jamespharve...@gmail.com> --- src/drivers/block/srp.c | 59 +++++++++++++++++++++++++++++++++++++++++++++ src/net/infiniband/ib_srp.c | 1 + 2 files changed, 60 insertions(+) diff --git a/src/drivers/block/srp.c b/src/drivers/block/srp.c index 85c7aa8..8ab4a97 100644 --- a/src/drivers/block/srp.c +++ b/src/drivers/block/srp.c @@ -229,15 +229,33 @@ static void srpcmd_close ( struct srp_command *srpcmd, int rc ) { * @v srpdev SRP device * @v rc Reason for close */ +static int srp_new_tag ( struct srp_device *srpdev ); +static int srp_initiator_logout ( struct srp_device *srpdev, uint32_t tag ); static void srpdev_close ( struct srp_device *srpdev, int rc ) { struct srp_command *srpcmd; struct srp_command *tmp; + dbg_printf ( "!srpdev_close()\n" ); if ( rc != 0 ) { DBGC ( srpdev, "SRP %p closed: %s\n", srpdev, strerror ( rc ) ); } + /* If logged in, attempt logging out */ + if ( srpdev->logged_in ) { + dbg_printf ( "!srpdev->logged_in true\n" ); + int logout_rc; + int tag; + + tag = srp_new_tag ( srpdev ); + if ( tag < 0 ) + DBGC ( srpdev, "SRP %p tag %08x invalid tag\n", srpdev, tag ); + else if ( ( logout_rc = srp_initiator_logout ( srpdev, tag ) ) != 0 ) + DBGC ( srpdev, "SRP %p tag %08x cannot logout\n", srpdev, tag ); + } else { + dbg_printf ( "!srpdev->logged_in false\n" ); + } + /* Shut down interfaces */ intf_shutdown ( &srpdev->socket, rc ); intf_shutdown ( &srpdev->scsi, rc ); @@ -393,6 +411,47 @@ static int srp_login_rej ( struct srp_device *srpdev, } /** + * Transmit SRP initiator logout + * + * @v srpdev SRP device + * @v tag Command tag + * @ret rc Return status code + */ +static int srp_initiator_logout ( struct srp_device *srpdev, uint32_t tag ) { + struct io_buffer *iobuf; + struct srp_i_logout *i_logout; + int rc; + + /* Allocate I/O buffer */ + iobuf = xfer_alloc_iob ( &srpdev->socket, sizeof ( *i_logout ) ); + if ( ! iobuf ) + return -ENOMEM; + + /* Construct initiator logout IU */ + i_logout = iob_put ( iobuf, sizeof ( *i_logout ) ); + memset ( i_logout, 0, sizeof ( *i_logout ) ); + i_logout->type = SRP_I_LOGOUT; + i_logout->tag.dwords[0] = htonl ( SRP_TAG_MAGIC ); + i_logout->tag.dwords[1] = htonl ( tag ); + + DBGC ( srpdev, "SRP %p tag %08x I_LOGOUT:\n", srpdev, tag ); + DBGC_HDA ( srpdev, 0, iobuf->data, iob_len ( iobuf ) ); + + /* Send initiator logout IU */ + if ( ( rc = xfer_deliver_iob ( &srpdev->socket, iobuf ) ) != 0 ) { + DBGC ( srpdev, "SRP %p tag %08x could not send I_LOGOUT: " + "%s\n", srpdev, tag, strerror ( rc ) ); + return rc; + } + + /* Mark as logged out */ + srpdev->logged_in = 0; + DBGC ( srpdev, "SRP %p logged out\n", srpdev ); + + return 0; +} + +/** * Transmit SRP SCSI command * * @v srpdev SRP device diff --git a/src/net/infiniband/ib_srp.c b/src/net/infiniband/ib_srp.c index 904a99b..2c718ef 100644 --- a/src/net/infiniband/ib_srp.c +++ b/src/net/infiniband/ib_srp.c @@ -106,6 +106,7 @@ static void ib_srp_free ( struct refcnt *refcnt ) { * @v rc Reason for close */ static void ib_srp_close ( struct ib_srp_device *ib_srp, int rc ) { + dbg_printf ("ib_srp_close\n" ); /* Shut down interfaces */ intf_shutdown ( &ib_srp->cmrc, rc ); intf_shutdown ( &ib_srp->srp, rc ); -- 2.9.0
From c9f2a3fc08ab5a508fa5f42ef9e1b8ddeaa26c0c Mon Sep 17 00:00:00 2001 From: "James P. Harvey" <jamespharve...@gmail.com> Date: Fri, 19 Aug 2016 23:52:41 -0400 Subject: [PATCH] Initial work on closing SRP connection --- src/drivers/block/srp.c | 59 +++++++++++++++++++++++++++++++++++++++++++++ src/net/infiniband/ib_srp.c | 1 + 2 files changed, 60 insertions(+) diff --git a/src/drivers/block/srp.c b/src/drivers/block/srp.c index 85c7aa8..8ab4a97 100644 --- a/src/drivers/block/srp.c +++ b/src/drivers/block/srp.c @@ -229,15 +229,33 @@ static void srpcmd_close ( struct srp_command *srpcmd, int rc ) { * @v srpdev SRP device * @v rc Reason for close */ +static int srp_new_tag ( struct srp_device *srpdev ); +static int srp_initiator_logout ( struct srp_device *srpdev, uint32_t tag ); static void srpdev_close ( struct srp_device *srpdev, int rc ) { struct srp_command *srpcmd; struct srp_command *tmp; + dbg_printf ( "!srpdev_close()\n" ); if ( rc != 0 ) { DBGC ( srpdev, "SRP %p closed: %s\n", srpdev, strerror ( rc ) ); } + /* If logged in, attempt logging out */ + if ( srpdev->logged_in ) { + dbg_printf ( "!srpdev->logged_in true\n" ); + int logout_rc; + int tag; + + tag = srp_new_tag ( srpdev ); + if ( tag < 0 ) + DBGC ( srpdev, "SRP %p tag %08x invalid tag\n", srpdev, tag ); + else if ( ( logout_rc = srp_initiator_logout ( srpdev, tag ) ) != 0 ) + DBGC ( srpdev, "SRP %p tag %08x cannot logout\n", srpdev, tag ); + } else { + dbg_printf ( "!srpdev->logged_in false\n" ); + } + /* Shut down interfaces */ intf_shutdown ( &srpdev->socket, rc ); intf_shutdown ( &srpdev->scsi, rc ); @@ -393,6 +411,47 @@ static int srp_login_rej ( struct srp_device *srpdev, } /** + * Transmit SRP initiator logout + * + * @v srpdev SRP device + * @v tag Command tag + * @ret rc Return status code + */ +static int srp_initiator_logout ( struct srp_device *srpdev, uint32_t tag ) { + struct io_buffer *iobuf; + struct srp_i_logout *i_logout; + int rc; + + /* Allocate I/O buffer */ + iobuf = xfer_alloc_iob ( &srpdev->socket, sizeof ( *i_logout ) ); + if ( ! iobuf ) + return -ENOMEM; + + /* Construct initiator logout IU */ + i_logout = iob_put ( iobuf, sizeof ( *i_logout ) ); + memset ( i_logout, 0, sizeof ( *i_logout ) ); + i_logout->type = SRP_I_LOGOUT; + i_logout->tag.dwords[0] = htonl ( SRP_TAG_MAGIC ); + i_logout->tag.dwords[1] = htonl ( tag ); + + DBGC ( srpdev, "SRP %p tag %08x I_LOGOUT:\n", srpdev, tag ); + DBGC_HDA ( srpdev, 0, iobuf->data, iob_len ( iobuf ) ); + + /* Send initiator logout IU */ + if ( ( rc = xfer_deliver_iob ( &srpdev->socket, iobuf ) ) != 0 ) { + DBGC ( srpdev, "SRP %p tag %08x could not send I_LOGOUT: " + "%s\n", srpdev, tag, strerror ( rc ) ); + return rc; + } + + /* Mark as logged out */ + srpdev->logged_in = 0; + DBGC ( srpdev, "SRP %p logged out\n", srpdev ); + + return 0; +} + +/** * Transmit SRP SCSI command * * @v srpdev SRP device diff --git a/src/net/infiniband/ib_srp.c b/src/net/infiniband/ib_srp.c index 904a99b..2c718ef 100644 --- a/src/net/infiniband/ib_srp.c +++ b/src/net/infiniband/ib_srp.c @@ -106,6 +106,7 @@ static void ib_srp_free ( struct refcnt *refcnt ) { * @v rc Reason for close */ static void ib_srp_close ( struct ib_srp_device *ib_srp, int rc ) { + dbg_printf ("ib_srp_close\n" ); /* Shut down interfaces */ intf_shutdown ( &ib_srp->cmrc, rc ); intf_shutdown ( &ib_srp->srp, rc ); -- 2.9.0
_______________________________________________ ipxe-devel mailing list ipxe-devel@lists.ipxe.org https://lists.ipxe.org/mailman/listinfo.cgi/ipxe-devel