Re: [ofa-general] Re: [PATCH 2/3] Add combined routing support to libibnetdisc
On Thu, May 7, 2009 at 11:58 AM, Ira Weiny wei...@llnl.gov wrote:
> On Thu, 7 May 2009 09:56:38 -0400
> Hal Rosenstock hal.rosenst...@gmail.com wrote:
>
> > Ira,
> >
> > On Wed, May 6, 2009 at 12:33 PM, Ira Weiny wei...@llnl.gov wrote:
> > > On Wed, 6 May 2009 13:07:44 +0300
> > > Sasha Khapyorsky sas...@voltaire.com wrote:
>
> [snip]
>
> > > > And wouldn't it be better, instead of resolving selfport on each
> > > > extend_dpath() call, to keep it already resolved somewhere in the
> > > > fabric structure?
> > >
> > > This will only happen 1 time for each fabric being scanned because
> > > the path is reused...
> > >
> > > Oh wait a minute, I just reviewed the code... For the current use
> > > case the path is reused since I am only scanning 1 node. However, in
> > > the general case this is not true. Sorry about that. A new patch is
> > > below.
> >
> > Does combined routing always fall back to directed routing on failure?
>
> No, not automatically in the library.
>
> > Also, would you summarize the use cases for combined routing in
> > ibnetdiscover?
>
> ibnetdiscover does not use this feature. It does a full scan, which
> results in only DR routing.
>
> iblinkinfo and ibqueryerrors have the ability to request output for a
> single node. The library was written to be able to scan from a given
> portid and a number of hops around that node. However, at first this only
> supported a DR path in the portid. If the user specified something like a
> GUID, iblinkinfo would scan the entire fabric and search the returned
> data for that node. Of course, the problem with that is that on a large
> fabric it could take 8 seconds to come back with a single node of data.
>
> I decided that, if the SM/SA is up and running, it would be better to
> query for the LID of that node and start the scan from there. That is
> what this patch adds. iblinkinfo and ibqueryerrors will call
> ibnd_discover_fabric with from == the portid resolved from the SA and
> hops == 1.
>
> If resolving the GUID or the limited scan fails, ibqueryerrors and
> iblinkinfo then call the library again for a full fabric scan
> (from == NULL) and then search for the node in the fabric data returned.
>
> So that is the use case for doing this in the library. But once again,
> ibnetdiscover does not use this.
>
> The other use case I could think of is doing a more extensive scan of
> multiple hops around a single node. I have not implemented this yet, but
> in my early testing it worked just fine starting with a DR path. I
> believe this will still work with combined routing.
>
> Make sense?

Yes, this makes sense. Thanks for clarifying.

-- Hal

> Ira

___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [ofa-general] Re: [PATCH 2/3] Add combined routing support to libibnetdisc
Ira,

On Wed, May 6, 2009 at 12:33 PM, Ira Weiny wei...@llnl.gov wrote:
> On Wed, 6 May 2009 13:07:44 +0300
> Sasha Khapyorsky sas...@voltaire.com wrote:
>
> > On 14:29 Thu 30 Apr , Ira Weiny wrote:
> > > From: Ira Weiny wei...@llnl.gov
> > > Date: Wed, 29 Apr 2009 10:15:55 -0700
> > > Subject: [PATCH] Add combined routing support to libibnetdisc
> > >
> > > Also allow a scan to start at a switch.
> > >
> > > Signed-off-by: Ira Weiny wei...@llnl.gov
> > > ---
> > >  infiniband-diags/libibnetdisc/src/ibnetdisc.c | 28 ++--
> > >  1 files changed, 21 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/infiniband-diags/libibnetdisc/src/ibnetdisc.c b/infiniband-diags/libibnetdisc/src/ibnetdisc.c
> > > index 0ff5134..fc19633 100644
> > > --- a/infiniband-diags/libibnetdisc/src/ibnetdisc.c
> > > +++ b/infiniband-diags/libibnetdisc/src/ibnetdisc.c
> > > @@ -177,11 +177,26 @@ add_port_to_dpath(ib_dr_path_t *path, int nextport)
> > >  }
> > >
> > >  static int
> > > -extend_dpath(struct ibnd_fabric *f, ib_dr_path_t *path, int nextport)
> > > +extend_dpath(struct ibnd_fabric *f, ib_portid_t *portid, int nextport)
> > >  {
> > > -	int rc = add_port_to_dpath(path, nextport);
> > > -	if ((rc != -1) && (path->cnt > f->fabric.maxhops_discovered))
> > > -		f->fabric.maxhops_discovered = path->cnt;
> > > +	int rc = 0;
> > > +
> > > +	if (portid->lid && !portid->drpath.drslid) {
> > > +		/* If we were LID routed
> > > +		 * AND have not done so already
> > > +		 * we need to set up the drslid
> > > +		 */
> > > +		ib_portid_t selfportid = { 0 };
> > > +		if (ib_resolve_self_via(&selfportid, NULL, NULL, f->fabric.ibmad_port) < 0)
> > > +			return -1;
> >
> > And wouldn't it be better, instead of resolving selfport on each
> > extend_dpath() call, to keep it already resolved somewhere in the
> > fabric structure?
>
> This will only happen 1 time for each fabric being scanned because the
> path is reused...
>
> Oh wait a minute, I just reviewed the code... For the current use case
> the path is reused since I am only scanning 1 node. However, in the
> general case this is not true. Sorry about that. A new patch is below.

Does combined routing always fall back to directed routing on failure?

Also, would you summarize the use cases for combined routing in
ibnetdiscover?

-- Hal

> Ira
>
> From: Ira Weiny wei...@llnl.gov
> Date: Wed, 29 Apr 2009 10:15:55 -0700
> Subject: [PATCH] Fix ibnd_discover when the specified ib_portid_t starts LID routed.
>
> Signed-off-by: Ira Weiny wei...@llnl.gov
> ---
>  infiniband-diags/libibnetdisc/src/ibnetdisc.c | 27 ++--
>  infiniband-diags/libibnetdisc/src/internal.h | 1 +
>  2 files changed, 21 insertions(+), 7 deletions(-)
>
> diff --git a/infiniband-diags/libibnetdisc/src/ibnetdisc.c b/infiniband-diags/libibnetdisc/src/ibnetdisc.c
> index 0ff5134..1e93ff8 100644
> --- a/infiniband-diags/libibnetdisc/src/ibnetdisc.c
> +++ b/infiniband-diags/libibnetdisc/src/ibnetdisc.c
> @@ -177,11 +177,25 @@ add_port_to_dpath(ib_dr_path_t *path, int nextport)
>  }
>
>  static int
> -extend_dpath(struct ibnd_fabric *f, ib_dr_path_t *path, int nextport)
> +extend_dpath(struct ibnd_fabric *f, ib_portid_t *portid, int nextport)
>  {
> -	int rc = add_port_to_dpath(path, nextport);
> -	if ((rc != -1) && (path->cnt > f->fabric.maxhops_discovered))
> -		f->fabric.maxhops_discovered = path->cnt;
> +	int rc = 0;
> +
> +	if (portid->lid) {
> +		/* If we were LID routed we need to set up the drslid */
> +		if (!f->selfportid.lid)
> +			if (ib_resolve_self_via(&f->selfportid, NULL, NULL,
> +					f->fabric.ibmad_port) < 0)
> +				return -1;
> +
> +		portid->drpath.drslid = f->selfportid.lid;
> +		portid->drpath.drdlid = 0x;
> +	}
> +
> +	rc = add_port_to_dpath(&portid->drpath, nextport);
> +
> +	if ((rc != -1) && (portid->drpath.cnt > f->fabric.maxhops_discovered))
> +		f->fabric.maxhops_discovered = portid->drpath.cnt;
>  	return (rc);
>  }
>
> @@ -447,7 +461,7 @@ get_remote_node(struct ibnd_fabric *fabric, struct ibnd_node *node, struct ibnd_
>  			!= IB_PORT_PHYS_STATE_LINKUP)
>  		return -1;
>
> -	if (extend_dpath(fabric, &path->drpath, portnum) < 0)
> +	if (extend_dpath(fabric, path, portnum) < 0)
>  		return -1;
>
>  	if (query_node(fabric, node_buf, port_buf, path)) {
> @@ -546,8 +560,7 @@ ibnd_discover_fabric(struct ibmad_port *ibmad_port, int timeout_ms,
>  	if (!port)
>  		IBPANIC("out of memory");
>
> -	if (node->node.type != IB_NODE_SWITCH &&
> -	    get_remote_node(fabric, node, port, from,
> +	if(get_remote_node(fabric, node, port, from,
>  			mad_get_field(node->node.info, 0, IB_NODE_LOCAL_PORT_F),
>  			0) < 0)
>  		return ((ibnd_fabric_t *)fabric);
> diff --git a/infiniband-diags/libibnetdisc/src/internal.h
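
As a rough illustration of the caching Sasha asked for, the following self-contained sketch resolves the local LID once per fabric and reuses it for every LID-routed path that gets extended. The types and the resolve_self_lid() helper are hypothetical stand-ins for ib_portid_t, ib_dr_path_t and ib_resolve_self_via(); they are not the libibmad definitions. The drdlid value is cut off in the archived patch, so the sketch assumes the InfiniBand permissive LID (0xFFFF) there.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_DR_HOPS    64      /* assumed limit on directed-route hops */
#define PERMISSIVE_LID 0xFFFF  /* assumption: value is truncated in the archived patch */

/* Hypothetical stand-ins for ib_dr_path_t and ib_portid_t. */
struct dr_path {
	int cnt;                 /* number of hops recorded in p[1..cnt] */
	uint8_t p[MAX_DR_HOPS];
	uint16_t drslid;
	uint16_t drdlid;
};

struct port_id {
	int lid;                 /* non-zero when the path starts LID routed */
	struct dr_path drpath;
};

struct fabric {
	int self_lid;            /* cached local LID, 0 until resolved */
	int maxhops_discovered;
	int resolve_calls;       /* only here to show how often we resolve */
};

/* Stand-in for ib_resolve_self_via(); pretends the local LID is 7. */
static int resolve_self_lid(struct fabric *f)
{
	f->resolve_calls++;
	f->self_lid = 7;
	return 0;
}

/* Append nextport to the path; returns the new hop count or -1. */
static int extend_path(struct fabric *f, struct port_id *pid, int nextport)
{
	assert(f && pid);

	if (pid->lid) {
		/* LID-routed start: resolve our own LID only if not cached yet. */
		if (!f->self_lid && resolve_self_lid(f) < 0)
			return -1;
		pid->drpath.drslid = (uint16_t)f->self_lid;
		pid->drpath.drdlid = PERMISSIVE_LID;
	}

	if (pid->drpath.cnt >= MAX_DR_HOPS - 1)
		return -1;
	pid->drpath.p[++pid->drpath.cnt] = (uint8_t)nextport;

	if (pid->drpath.cnt > f->maxhops_discovered)
		f->maxhops_discovered = pid->drpath.cnt;
	return pid->drpath.cnt;
}
```

Extending two different LID-routed paths against the same fabric triggers a single resolve, which is the point of keeping the result in the fabric structure rather than in each path.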
Re: [ofa-general] Re: [PATCH 2/3] Add combined routing support to libibnetdisc
On Thu, 7 May 2009 09:56:38 -0400
Hal Rosenstock hal.rosenst...@gmail.com wrote:

> Ira,
>
> On Wed, May 6, 2009 at 12:33 PM, Ira Weiny wei...@llnl.gov wrote:
> > On Wed, 6 May 2009 13:07:44 +0300
> > Sasha Khapyorsky sas...@voltaire.com wrote:

[snip]

> > > And wouldn't it be better, instead of resolving selfport on each
> > > extend_dpath() call, to keep it already resolved somewhere in the
> > > fabric structure?
> >
> > This will only happen 1 time for each fabric being scanned because the
> > path is reused...
> >
> > Oh wait a minute, I just reviewed the code... For the current use case
> > the path is reused since I am only scanning 1 node. However, in the
> > general case this is not true. Sorry about that. A new patch is below.
>
> Does combined routing always fall back to directed routing on failure?

No, not automatically in the library.

> Also, would you summarize the use cases for combined routing in
> ibnetdiscover?

ibnetdiscover does not use this feature. It does a full scan, which
results in only DR routing.

iblinkinfo and ibqueryerrors have the ability to request output for a
single node. The library was written to be able to scan from a given
portid and a number of hops around that node. However, at first this only
supported a DR path in the portid. If the user specified something like a
GUID, iblinkinfo would scan the entire fabric and search the returned data
for that node. Of course, the problem with that is that on a large fabric
it could take 8 seconds to come back with a single node of data.

I decided that, if the SM/SA is up and running, it would be better to
query for the LID of that node and start the scan from there. That is what
this patch adds. iblinkinfo and ibqueryerrors will call
ibnd_discover_fabric with from == the portid resolved from the SA and
hops == 1.

If resolving the GUID or the limited scan fails, ibqueryerrors and
iblinkinfo then call the library again for a full fabric scan
(from == NULL) and then search for the node in the fabric data returned.

So that is the use case for doing this in the library. But once again,
ibnetdiscover does not use this.

The other use case I could think of is doing a more extensive scan of
multiple hops around a single node. I have not implemented this yet, but
in my early testing it worked just fine starting with a DR path. I believe
this will still work with combined routing.

Make sense?

Ira
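
The fallback described above for iblinkinfo and ibqueryerrors can be sketched roughly as follows. This is a minimal, self-contained illustration and not the tools' actual code: struct scan_ops, scan_single_node() and the stub functions are hypothetical, with the two function pointers standing in for the SA GUID lookup and for ibnd_discover_fabric(). The hops value passed for the full scan (-1 for "no limit") is also an assumption, since the message only specifies from == NULL.

```c
#include <assert.h>
#include <stddef.h>

enum scan_result { SCAN_LIMITED, SCAN_FULL, SCAN_FAILED };

/* Hypothetical hooks; both return 0 on success, negative on failure. */
struct scan_ops {
	int (*resolve_guid)(unsigned long long guid, int *lid);
	int (*discover)(const int *from_lid, int hops); /* from_lid == NULL: full scan */
};

/*
 * Try a one-hop scan starting at the LID the SA reports for the GUID.
 * If the lookup or the limited scan fails, fall back to a full fabric
 * scan; the caller then searches the returned data for the node.
 */
static enum scan_result
scan_single_node(const struct scan_ops *ops, unsigned long long guid)
{
	int lid = 0;

	assert(ops && ops->resolve_guid && ops->discover);

	if (ops->resolve_guid(guid, &lid) == 0 && ops->discover(&lid, 1) == 0)
		return SCAN_LIMITED;

	if (ops->discover(NULL, -1) == 0)
		return SCAN_FULL;

	return SCAN_FAILED;
}

/* Stubs for trying the flow without a fabric. */
static int sa_up(unsigned long long guid, int *lid)
{
	(void)guid;
	*lid = 42;   /* pretend the SA resolved the GUID to LID 42 */
	return 0;
}

static int sa_down(unsigned long long guid, int *lid)
{
	(void)guid;
	(void)lid;
	return -1;   /* pretend no SM/SA is reachable */
}

static int discover_ok(const int *from_lid, int hops)
{
	(void)from_lid;
	(void)hops;
	return 0;
}
```

With the sa_up stub the limited scan is used; with sa_down the same call ends up in the full scan, which matches the statement that the fallback lives in the tools rather than in the library.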