Re: [ofa-general] [PATCH 2/2 v4] opensm: Compression of multicast group according to pkey
On Tue, Sep 29, 2009 at 9:54 AM, Slava Strebkov <sla...@voltaire.com> wrote:
> Additional data structure added:
> 1. Map of all partition keys opened in the fabric.
> 2. Map of all multicast group boxes shared same pkey.
> MLID assignment for multicast groups works in a usual manner, allocating
> free entry for newly created group. Proposed compression algorithm starts
> working when there are no more free entries in the mlid array. List of
> MLIDs for new multicast group will be chosen from the pkey indexed map
> according to the requested pkey. MLID which shares minimum number of
> ports will be given to newly created multicast group.

Other suitability criteria, aside from minimum number of ports (which is debatable), are MTU and rate matching. Are MTU and rate also checked (in addition to pkey)? If not, IMO these checks should be added.

-- Hal

> Signed-off-by: Slava Strebkov <sla...@voltaire.com>

snip...

___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
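The selection rule discussed above (same pkey, fewest ports, plus the MTU and rate checks Hal asks about) can be sketched roughly as follows. This is only an illustration, not code from the patch: `struct mgrp_candidate`, `pick_shared_mlid` and the exact-match treatment of MTU and rate are assumptions made for the example, and returning 0 for "no suitable MLID" is likewise a convention of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a multicast group that already owns an MLID. */
struct mgrp_candidate {
	uint16_t mlid;
	uint16_t pkey;
	uint8_t mtu;		/* encoded MTU of the existing group */
	uint8_t rate;		/* encoded rate of the existing group */
	unsigned num_ports;	/* ports currently joined to this MLID */
};

/*
 * Return the MLID to share, or 0 if no existing group is suitable.
 * Assumption: a candidate must match pkey, MTU and rate exactly;
 * among the matches, the one with the fewest ports wins.
 */
static uint16_t pick_shared_mlid(const struct mgrp_candidate *c, size_t n,
				 uint16_t pkey, uint8_t mtu, uint8_t rate)
{
	const struct mgrp_candidate *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (c[i].pkey != pkey || c[i].mtu != mtu || c[i].rate != rate)
			continue;
		if (!best || c[i].num_ports < best->num_ports)
			best = &c[i];
	}
	return best ? best->mlid : 0;
}
```

Whether exact matching is the right test for MTU and rate (as opposed to a selector-based comparison) is left open here, as it is in the thread.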
Re: [ofa-general] This list expires... tomorrow?
On Tue, Sep 29, 2009 at 8:40 PM, Jeff Becker <jeffrey.c.bec...@nasa.gov> wrote:
> Hal Rosenstock wrote:
> > On Tue, Sep 29, 2009 at 3:06 PM, Jeff Becker <jeffrey.c.bec...@nasa.gov> wrote:
> > > Hi all. I propose the following plan to shutdown the general list:
> > > 1) unsubscribe all current subscribers
> > > 2) set the list to discard any incoming messages with an auto-discard
> > >    message that points you to linux-r...@vger.kernel.org
> > > Please send comments/suggestions.
> >
> > Care should be taken on any patches not cross posted (to linux-rdma)
> > once the cutover takes place. There are quite a number of outstanding
> > patches on general only.
>
> From tomorrow on, the general list will continue to exist with searchable
> archives, but no new messages will be accepted. People who try to send to
> general will be told to send to linux-r...@vger.kernel.org instead. If
> someone posted a patch to general before the switch and it hasn't been
> accepted, they can repost to the new list. Hope this works for everyone.
> Thanks

Sure; these could be reposted to linux-rdma if needed. I was trying to say that care should be taken to check the email addresses prior to hitting reply/reply all so threads will be moved over to linux-rdma.

-- Hal

snip...
[ofa-general] [PATCH] opensm/osm_sa_lft_record.c: In lftr_rcv_new_lftr, handle osm_switch_get_lft_block failure
Signed-off-by: Hal Rosenstock <hal.rosenst...@gmail.com>
---
diff --git a/opensm/opensm/osm_sa_lft_record.c b/opensm/opensm/osm_sa_lft_record.c
index d092129..828b277 100644
--- a/opensm/opensm/osm_sa_lft_record.c
+++ b/opensm/opensm/osm_sa_lft_record.c
@@ -99,8 +99,12 @@ static ib_api_status_t lftr_rcv_new_lftr(IN osm_sa_t * sa,
 	p_rec_item->rec.block_num = cl_hton16(block);

 	/* copy the lft block */
-	osm_switch_get_lft_block(p_sw, block, p_rec_item->rec.lft);
-
+	if (!osm_switch_get_lft_block(p_sw, block, p_rec_item->rec.lft)) {
+		OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 4403: "
+			"osm_switch_get_lft_block failed\n");
+		status = IB_INSUFFICIENT_RESOURCES;
+		goto Exit;
+	}
 	cl_qlist_insert_tail(p_list, &p_rec_item->list_item);

 Exit:
[ofa-general] [PATCHv2] opensm/osm_mesh.c: Add dump_mesh routine at OSM_LOG_DEBUG level
Signed-off-by: Hal Rosenstock <hal.rosenst...@gmail.com>
---
Changes since v1:
Use snprintf rather than sprintf
Also, moved output of ]

diff --git a/opensm/opensm/osm_mesh.c b/opensm/opensm/osm_mesh.c
index 260e2f8..53f0f58 100644
--- a/opensm/opensm/osm_mesh.c
+++ b/opensm/opensm/osm_mesh.c
@@ -1565,6 +1565,63 @@ err:
 	return -1;
 }

+static void dump_mesh(lash_t *p_lash)
+{
+	osm_log_t *p_log = &p_lash->p_osm->log;
+	int sw;
+	int num_switches = p_lash->num_switches;
+	int dimension;
+	int i, j, k, n;
+	switch_t *s, *s2;
+	char buf[256];
+
+	OSM_LOG_ENTER(p_log);
+
+	for (sw = 0; sw < num_switches; sw++) {
+		s = p_lash->switches[sw];
+		dimension = s->node->dimension;
+		n = sprintf(buf, "[");
+		for (i = 0; i < dimension; i++) {
+			n += snprintf(buf + n, sizeof(buf) - n,
+				      "%2d", s->node->coord[i]);
+			if (n > sizeof(buf))
+				n = sizeof(buf);
+			if (i != dimension - 1) {
+				n += snprintf(buf + n, sizeof(buf) - n, "%s", ",");
+				if (n > sizeof(buf))
+					n = sizeof(buf);
+			}
+		}
+		n += snprintf(buf + n, sizeof(buf) - n, "]");
+		if (n > sizeof(buf))
+			n = sizeof(buf);
+		for (j = 0; j < s->node->num_links; j++) {
+			s2 = p_lash->switches[s->node->links[j]->switch_id];
+			n += snprintf(buf + n, sizeof(buf) - n, " [%d]->[", j);
+			if (n > sizeof(buf))
+				n = sizeof(buf);
+			for (k = 0; k < dimension; k++) {
+				n += snprintf(buf + n, sizeof(buf) - n, "%2d",
+					      s2->node->coord[k]);
+				if (n > sizeof(buf))
+					n = sizeof(buf);
+				if (k != dimension - 1) {
+					n += snprintf(buf + n, sizeof(buf) - n,
+						      ",");
+					if (n > sizeof(buf))
+						n = sizeof(buf);
+				}
+			}
+			n += snprintf(buf + n, sizeof(buf) - n, "]");
+			if (n > sizeof(buf))
+				n = sizeof(buf);
+		}
+		OSM_LOG(p_log, OSM_LOG_DEBUG, "%s\n", buf);
+	}
+
+	OSM_LOG_EXIT(p_log);
+}
+
 /*
  * osm_do_mesh_analysis
  */
@@ -1653,6 +1710,9 @@ int osm_do_mesh_analysis(lash_t *p_lash)
 		OSM_LOG(p_log, OSM_LOG_INFO, "%s", buf);
 	}

+	if (osm_log_is_active(p_log, OSM_LOG_DEBUG))
+		dump_mesh(p_lash);
+
 done:
 	mesh_delete(mesh);
 	OSM_LOG_EXIT(p_log);
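The repeated "append with snprintf, then clamp n" pattern in dump_mesh could be factored into a small helper. The following is a sketch of one way to do that, not part of the patch; `buf_append` is a hypothetical name, and it tracks the write position as a `size_t` so the comparison against the buffer size needs no signed/unsigned mixing.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Append formatted text to buf at offset *pos without writing past size.
 * Once the buffer is full, *pos is pinned to size - 1 and later calls
 * return without touching the buffer.
 */
static void buf_append(char *buf, size_t size, size_t *pos, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0 || *pos >= size - 1)
		return;

	va_start(ap, fmt);
	n = vsnprintf(buf + *pos, size - *pos, fmt, ap);
	va_end(ap);

	if (n < 0)
		return;			/* output error: position unchanged */
	if ((size_t)n >= size - *pos)
		*pos = size - 1;	/* output was truncated */
	else
		*pos += (size_t)n;
}
```

With such a helper each coordinate would be emitted by a single call, for example `buf_append(buf, sizeof(buf), &pos, "%2d", coord)`, instead of a call followed by an explicit bounds check.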
Re: [ofa-general] This list expires... tomorrow?
On Tue, Sep 29, 2009 at 3:06 PM, Jeff Becker <jeffrey.c.bec...@nasa.gov> wrote:
> Hi all. I propose the following plan to shutdown the general list:
> 1) unsubscribe all current subscribers
> 2) set the list to discard any incoming messages with an auto-discard
>    message that points you to linux-r...@vger.kernel.org
> Please send comments/suggestions.

It's probably just me but I'm not ready yet. I haven't been able to post a patch to linux-rdma yet :-(

-- Hal

> Thanks.
> -jeff
>
> Jeff Squyres wrote:
> > What happens to this list after tomorrow? (i.e.,
> > general@lists.openfabrics.org) Will mails bounce? The intent is that
> > all mails to the general list should be sent to the linux-rdma list
> > instead, right?
Re: [ofa-general] This list expires... tomorrow?
On Tue, Sep 29, 2009 at 5:20 PM, Roland Dreier <rdre...@cisco.com> wrote:
> > It's probably just me but I'm not ready yet. I haven't been able to
> > post a patch to linux-rdma yet :-(
>
> What is going wrong when you try?

It disappears into the ether without any response. I can see it getting a status=Sent out of my SMTP relay to the linux-rdma list saying "Message accepted for delivery":

Sep 29 06:53:53 hal sm-msp-queue[26670]: n8TAr2Ae026642: to=gene...@lists.openfabrics.org,linux-r...@vger.linux.org,sas...@voltaire.com, ctladdr=hnrose (502/502), delay=00:00:51, xdelay=00:00:00, mailer=relay, pri=182536, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (n8TArlkX026673 Message accepted for delivery)
Sep 29 15:24:19 hal sendmail[28326]: n8TJOIEP028326: from=hnrose, size=2481, class=0, nrcpts=1, msgid=20090929192417.ga28...@comcast.net, relay=hnr...@localhost

-- Hal

> - R.
Re: [ofa-general] Re: [PATCH] ibsim/sim_cmd.c: Only relink port if remote port is currently linked
Hi Sasha,

On Tue, Sep 29, 2009 at 5:53 PM, Sasha Khapyorsky <sas...@voltaire.com> wrote:
> Hi Hal,
>
> On 15:34 Thu 24 Sep , Hal Rosenstock wrote:
> > When multiple switches are unlinked and then a switch is relinked, it
> > should behave like a cable pull or power down of switch so it depends
> > on the state of the remote peer port (as to linked or not). This is
> > not represented in the IB port/port physical state and is additional
> > state.
>
> I'm not sure that I understand what this patch tries to achieve - I
> cannot see any changes related to port physical state handling. I can
> only see that you try to prevent linking with previously unlinked ports,
> and it is not clear for me why. Could you explain?

The failure scenario is to unlink 2 connected switches and then relink the first one. It then relinks the second one even though it still should be unlinked.

> > Signed-off-by: Hal Rosenstock <hal.rosenst...@gmail.com>
> > ---
> > diff --git a/ibsim/sim.h b/ibsim/sim.h
> > index bf85875..52eb73b 100644
> > --- a/ibsim/sim.h
> > +++ b/ibsim/sim.h
> > @@ -210,6 +211,7 @@ struct Port {
> >  	int remoteport;
> >  	Node *previous_remotenode;
> >  	int previous_remoteport;
> > +	int unlinked;
>
> Do you really need this flag? Existence of non NULL previous_remotenode
> pointer should be good indication.

That's how I started (using previous_remotenode) but it didn't work correctly for all cases. It worked with the simple case above (unlink 2 switches and relink the first). It didn't work with a 3 switch case.

-- Hal

> >  	int errrate;
> >  	uint16_t errattr;
> >  	Node *node;
> > diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c
> > index cb6e639..d27ab0f 100644
> > --- a/ibsim/sim_cmd.c
> > +++ b/ibsim/sim_cmd.c
> > @@ -1,5 +1,6 @@
> >  /*
> >   * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved.
> > + * Copyright (c) 2009 HNR Consulting. All rights reserved.
> >   *
> >   * This file is part of ibsim.
> >   *
> > @@ -146,12 +147,18 @@ static int do_link(FILE * f, char *line)
> >
> >  	rport = node_get_port(rnode, rportnum);
> >
> > +	if (rport->unlinked) {
> > +		lport->unlinked = 0;
> > +		return -1;
> > +	}
> > +
>
> Why?
>
> >  	if (link_ports(lport, rport) < 0)
> >  		return -fprintf(f,
> >  				"# can't link: local/remote port are already connected\n");
> >
> >  	lport->previous_remotenode = NULL;
> >  	rport->previous_remotenode = NULL;
> > +	lport->unlinked = 0;
> >
> >  	return 0;
> >  }
> > @@ -194,7 +201,7 @@ static int do_relink(FILE * f, char *line)
> >  		numports++;	// To make the for-loop below run up to last port
> >  	else
> >  		lportnum--;
> > -
> > +
> >  	if (lportnum >= 0) {
> >  		lport = ports + lnode->portsbase + lportnum;
> >
> > @@ -206,12 +213,18 @@ static int do_relink(FILE * f, char *line)
> >  		rport = node_get_port(lport->previous_remotenode,
> >  				      lport->previous_remoteport);
> >
> > +		if (rport->unlinked) {
> > +			lport->unlinked = 0;
> > +			return -1;
> > +		}
> > +
>
> Why?
>
> >  		if (link_ports(lport, rport) < 0)
> >  			return -fprintf(f,
> >  					"# can't link: local/remote port are already connected\n");
> >
> >  		lport->previous_remotenode = NULL;
> >  		rport->previous_remotenode = NULL;
> > +		lport->unlinked = 0;
> >
> >  		return 1;
> >  	}
> > @@ -224,11 +237,17 @@ static int do_relink(FILE * f, char *line)
> >  		rport = node_get_port(lport->previous_remotenode,
> >  				      lport->previous_remoteport);
> >
> > +		if (rport->unlinked) {
> > +			lport->unlinked = 0;
> > +			continue;
> > +		}
> > +
>
> Ditto.
>
> Sasha
>
> >  		if (link_ports(lport, rport) < 0)
> >  			continue;
> >
> >  		lport->previous_remotenode = NULL;
> >  		rport->previous_remotenode = NULL;
> > +		lport->unlinked = 0;
> >
> >  		relinked++;
> >  	}
> > @@ -246,6 +265,7 @@ static void unlink_port(Node * lnode, Port * lport, Node * rnode, int rportnum)
> >  	lport->previous_remoteport = lport->remoteport;
> >  	rport->previous_remotenode = rport->remotenode;
> >  	rport->previous_remoteport = rport->remoteport;
> > +	lport->unlinked = 1;
> >
> >  	lport->remotenode = rport->remotenode = 0;
> >  	lport->remoteport = rport->remoteport = 0;
> > @@ -406,6 +426,7 @@ static int do_unlink(FILE * f, char *line, int clear)
> >  	if (portnum >= 0) {
> >  		port = ports + node->portsbase + portnum;
> >  		if (!clear && !port->remotenode) {
> > +			port->unlinked = 1;
> >  			fprintf(f, "# port %d at nodeid \"%s\" is not linked\n",
> >  				portnum, nodeid);
> >  			return -1;
> > @@ -420,8 +441,10 @@ static int do_unlink(FILE * f, char *line, int clear)
> >
> >  	for (port = ports + node->portsbase, e = port + numports; port < e;
> >  	     port++) {
> > -		if (!clear && !port->remotenode
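The two-switch failure scenario Hal describes can be modelled in a few lines. This is a deliberately simplified sketch and not ibsim code: `struct sim_port`, `sim_connect`, `sim_unlink` and `sim_relink` are hypothetical names, each "switch" is reduced to a single port, and the `unlinked` flag plays the role of the extra per-port state added by the patch.

```c
#include <stddef.h>

/* Hypothetical, minimal stand-in for an ibsim port. */
struct sim_port {
	struct sim_port *remote;	/* current peer, NULL while the link is down */
	struct sim_port *previous;	/* peer remembered across an unlink */
	int unlinked;			/* set while this side is "powered off" */
};

static void sim_connect(struct sim_port *a, struct sim_port *b)
{
	a->remote = b;
	b->remote = a;
	a->previous = b->previous = NULL;
	a->unlinked = b->unlinked = 0;
}

/* Model unlinking (powering off) the switch that owns p. */
static void sim_unlink(struct sim_port *p)
{
	struct sim_port *r = p->remote;

	if (r) {
		p->previous = r;
		r->previous = p;
		p->remote = r->remote = NULL;
	}
	p->unlinked = 1;	/* flagged even if the link was already down */
}

/* Model relinking p; returns 1 if the link came back up, 0 otherwise. */
static int sim_relink(struct sim_port *p)
{
	struct sim_port *r = p->previous;

	p->unlinked = 0;		/* this side is powered on again */
	if (!r || r->unlinked)
		return 0;		/* peer is still off: stay down */
	p->remote = r;
	r->remote = p;
	p->previous = r->previous = NULL;
	return 1;
}
```

In this model, unlinking both ports and relinking only the first leaves the link down; it comes up once the second port is relinked as well, which is the behaviour the patch description asks for.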
Re: [ofa-general] Re: [PATCH] ibsim/sim_cmd.c: Only relink port if remote port is currently linked
Hi again Sasha,

In my previous post, I missed answering some of your (implied) questions.

On Tue, Sep 29, 2009 at 5:53 PM, Sasha Khapyorsky <sas...@voltaire.com> wrote:
> Hi Hal,
>
> On 15:34 Thu 24 Sep , Hal Rosenstock wrote:
> > When multiple switches are unlinked and then a switch is relinked, it
> > should behave like a cable pull or power down of switch so it depends
> > on the state of the remote peer port (as to linked or not). This is
> > not represented in the IB port/port physical state and is additional
> > state.
>
> I'm not sure that I understand what this patch tries to achieve - I
> cannot see any changes related to port physical state handling.

Right; that's because there is none. My point was that this condition (e.g. simulated power off switch) cannot be represented in IB port state or port physical state.

> I can only see that you try to prevent linking with previously unlinked
> ports,

Yes.

-- Hal

> and it is not clear for me why. Could you explain?
>
> > Signed-off-by: Hal Rosenstock <hal.rosenst...@gmail.com>
> > ---
> > diff --git a/ibsim/sim.h b/ibsim/sim.h
> > index bf85875..52eb73b 100644
> > --- a/ibsim/sim.h
> > +++ b/ibsim/sim.h
> > @@ -210,6 +211,7 @@ struct Port {
> >  	int remoteport;
> >  	Node *previous_remotenode;
> >  	int previous_remoteport;
> > +	int unlinked;
>
> Do you really need this flag? Existence of non NULL previous_remotenode
> pointer should be good indication.
>
> >  	int errrate;
> >  	uint16_t errattr;
> >  	Node *node;
> > diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c
> > index cb6e639..d27ab0f 100644
> > --- a/ibsim/sim_cmd.c
> > +++ b/ibsim/sim_cmd.c
> > @@ -1,5 +1,6 @@
> >  /*
> >   * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved.
> > + * Copyright (c) 2009 HNR Consulting. All rights reserved.
> >   *
> >   * This file is part of ibsim.
> >   *
> > @@ -146,12 +147,18 @@ static int do_link(FILE * f, char *line)
> >
> >  	rport = node_get_port(rnode, rportnum);
> >
> > +	if (rport->unlinked) {
> > +		lport->unlinked = 0;
> > +		return -1;
> > +	}
> > +
>
> Why?
>
> >  	if (link_ports(lport, rport) < 0)
> >  		return -fprintf(f,
> >  				"# can't link: local/remote port are already connected\n");
> >
> >  	lport->previous_remotenode = NULL;
> >  	rport->previous_remotenode = NULL;
> > +	lport->unlinked = 0;
> >
> >  	return 0;
> >  }
> > @@ -194,7 +201,7 @@ static int do_relink(FILE * f, char *line)
> >  		numports++;	// To make the for-loop below run up to last port
> >  	else
> >  		lportnum--;
> > -
> > +
> >  	if (lportnum >= 0) {
> >  		lport = ports + lnode->portsbase + lportnum;
> >
> > @@ -206,12 +213,18 @@ static int do_relink(FILE * f, char *line)
> >  		rport = node_get_port(lport->previous_remotenode,
> >  				      lport->previous_remoteport);
> >
> > +		if (rport->unlinked) {
> > +			lport->unlinked = 0;
> > +			return -1;
> > +		}
> > +
>
> Why?
>
> >  		if (link_ports(lport, rport) < 0)
> >  			return -fprintf(f,
> >  					"# can't link: local/remote port are already connected\n");
> >
> >  		lport->previous_remotenode = NULL;
> >  		rport->previous_remotenode = NULL;
> > +		lport->unlinked = 0;
> >
> >  		return 1;
> >  	}
> > @@ -224,11 +237,17 @@ static int do_relink(FILE * f, char *line)
> >  		rport = node_get_port(lport->previous_remotenode,
> >  				      lport->previous_remoteport);
> >
> > +		if (rport->unlinked) {
> > +			lport->unlinked = 0;
> > +			continue;
> > +		}
> > +
>
> Ditto.
>
> Sasha
>
> >  		if (link_ports(lport, rport) < 0)
> >  			continue;
> >
> >  		lport->previous_remotenode = NULL;
> >  		rport->previous_remotenode = NULL;
> > +		lport->unlinked = 0;
> >
> >  		relinked++;
> >  	}
> > @@ -246,6 +265,7 @@ static void unlink_port(Node * lnode, Port * lport, Node * rnode, int rportnum)
> >  	lport->previous_remoteport = lport->remoteport;
> >  	rport->previous_remotenode = rport->remotenode;
> >  	rport->previous_remoteport = rport->remoteport;
> > +	lport->unlinked = 1;
> >
> >  	lport->remotenode = rport->remotenode = 0;
> >  	lport->remoteport = rport->remoteport = 0;
> > @@ -406,6 +426,7 @@ static int do_unlink(FILE * f, char *line, int clear)
> >  	if (portnum >= 0) {
> >  		port = ports + node->portsbase + portnum;
> >  		if (!clear && !port->remotenode) {
> > +			port->unlinked = 1;
> >  			fprintf(f, "# port %d at nodeid \"%s\" is not linked\n",
> >  				portnum, nodeid);
> >  			return -1;
> > @@ -420,8 +441,10 @@ static int do_unlink(FILE * f, char *line, int clear)
> >
> >  	for (port = ports + node->portsbase, e = port + numports; port < e;
> >  	     port++) {
> > -		if (!clear && !port->remotenode)
> > +		if (!clear && !port->remotenode) {
> > +			port->unlinked = 1
Re: [ofa-general] This list expires... tomorrow?
On Tue, Sep 29, 2009 at 3:06 PM, Jeff Becker <jeffrey.c.bec...@nasa.gov> wrote:
> Hi all. I propose the following plan to shutdown the general list:
> 1) unsubscribe all current subscribers
> 2) set the list to discard any incoming messages with an auto-discard
>    message that points you to linux-r...@vger.kernel.org
> Please send comments/suggestions.

Care should be taken on any patches not cross posted (to linux-rdma) once the cutover takes place. There are quite a number of outstanding patches on general only.

-- Hal

> Thanks.
> -jeff
>
> Jeff Squyres wrote:
> > What happens to this list after tomorrow? (i.e.,
> > general@lists.openfabrics.org) Will mails bounce? The intent is that
> > all mails to the general list should be sent to the linux-rdma list
> > instead, right?
[ofa-general] [PATCH] infiniband-diags/perfquery.c: Fix extended counter reset mask
to not have any bits on for reserved components

Signed-off-by: Hal Rosenstock <hal.rosenst...@gmail.com>
---
diff --git a/infiniband-diags/src/perfquery.c b/infiniband-diags/src/perfquery.c
index d70af9e..5d4046b 100644
--- a/infiniband-diags/src/perfquery.c
+++ b/infiniband-diags/src/perfquery.c
@@ -91,6 +91,8 @@ struct perf_count perf_count =
     { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
 struct perf_count_ext perf_count_ext = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };

+int not_def_mask = 0;
+
 #define ALL_PORTS 0xFF

 /* Notes: IB semantics is to cap counters if count has exceeded limits.
@@ -337,8 +339,10 @@ static void reset_counters(int extended, int timeout, int mask,
 					   IB_GSI_PORT_COUNTERS, srcport))
 			IBERROR("perf reset");
 	} else {
-		if (!performance_reset_via(pc, portid, port, mask, timeout,
-					   IB_GSI_PORT_COUNTERS_EXT, srcport))
+		if (!performance_reset_via(pc, portid, port,
+					   not_def_mask ? mask : mask & 0xff,
+					   timeout, IB_GSI_PORT_COUNTERS_EXT,
+					   srcport))
 			IBERROR("perf ext reset");
 	}
 }
@@ -476,8 +480,10 @@ int main(int argc, char **argv)
 	if (argc > 1)
 		port = strtoul(argv[1], 0, 0);
-	if (argc > 2)
+	if (argc > 2) {
 		mask = strtoul(argv[2], 0, 0);
+		not_def_mask = 1;
+	}

 	srcport = mad_rpc_open_port(ibd_ca, ibd_ca_port, mgmt_classes, 4);
 	if (!srcport)
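The mask selection in the patch can be isolated into a small function to make the intent explicit. This is a sketch under the patch's own premise that only the low eight CounterSelect bits are meaningful for the extended counters; `ext_reset_mask` and `EXT_COUNTER_SELECT_VALID` are names invented for the example. Note that in the patch the expression `not_def_mask ? mask : mask & 0xff` parses as `not_def_mask ? mask : (mask & 0xff)`, which is what the helper spells out.

```c
#include <stdint.h>

/* Assumed from the patch: bits above the low byte are reserved for the
 * extended counters reset. */
#define EXT_COUNTER_SELECT_VALID 0xffu

/*
 * A mask given explicitly by the user is passed through unchanged; the
 * built-in default mask is trimmed so no reserved bits are set.
 */
static uint32_t ext_reset_mask(uint32_t mask, int user_supplied)
{
	return user_supplied ? mask : (mask & EXT_COUNTER_SELECT_VALID);
}
```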
Re: [ofa-general] [PATCH] infiniband-diags/src/ibqueryerrors: Add clear errors and counters options
Ira,

See one minor comment below:

On Fri, Sep 25, 2009 at 2:50 AM, Ira Weiny <wei...@llnl.gov> wrote:
> Sasha,
>
> This applies after "infiniband-diags/src/ibqueryerrors: move --all option
> and replace it with --switch, --ca, --router"
>
> From: Ira Weiny <wei...@llnl.gov>
> Date: Thu, 24 Sep 2009 20:39:29 -0700
> Subject: [PATCH] infiniband-diags/src/ibqueryerrors: Add clear errors and counters options
>
> Add -k and -K options to clear errors and counters. If both are
> specified they will both be cleared.

Nice efficiency improvement over running a subsequent ibclearerrors/counters :-)

> Update man page
>
> In addition fix 2 bugs
>   fix the printing of Xmt Wait errors
>   properly skip the counter select field.
>
> Signed-off-by: Ira Weiny <wei...@llnl.gov>
> ---
>  infiniband-diags/man/ibqueryerrors.8 | 20 +--
>  infiniband-diags/src/ibqueryerrors.c | 91 +
>  2 files changed, 94 insertions(+), 17 deletions(-)

snip...

> diff --git a/infiniband-diags/src/ibqueryerrors.c b/infiniband-diags/src/ibqueryerrors.c
> index ecfd662..e379a42 100644
> --- a/infiniband-diags/src/ibqueryerrors.c
> +++ b/infiniband-diags/src/ibqueryerrors.c

snip...

> +static void clear_port(ib_portid_t * portid, uint16_t cap_mask,
> +		       ibnd_node_t * node, int port)
> +{
> +	uint8_t pc[1024];
> +	/* bits defined in Table 228 PortCounters CounterSelect and
> +	 * CounterSelect2
> +	 */
> +	uint32_t mask = 0;
> +
> +	if (!clear_errors && !clear_counts)
> +		return;
> +
> +	if (clear_errors)
> +		mask |= 0x10FFF;

Since PortXmitWait setting is new, shouldn't the setting of this bit in the mask be conditionalized on the CapabilityMask indicating that this is supported? That seems safer to me.

-- Hal

> +	if (clear_counts)
> +		mask |= 0xF000;
> +
> +	if (!performance_reset_via(pc, portid, port, mask, ibd_timeout,
> +				   IB_GSI_PORT_COUNTERS, ibmad_port))
> +		IBERROR("Failed to reset errors %s port %d",
> +			node->nodedesc, port);
> +}
> +

snip...
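Hal's suggestion, to request the PortXmitWait reset only when the port's CapabilityMask advertises support, might look roughly like the sketch below. It is not taken from any posted patch: the split of 0x10FFF into an error-counter part and an XmitWait select bit follows the values quoted above, while `CAP_XMIT_WAIT_SUP` is a placeholder whose real value (and byte order) would have to come from the PMA ClassPortInfo definitions in the headers actually used.

```c
#include <stdint.h>

#define ERR_COUNTERS_MASK	0x00FFFu	/* low 12 select bits of the quoted 0x10FFF */
#define XMIT_WAIT_SELECT	0x10000u	/* remaining bit of the quoted 0x10FFF */
#define DATA_COUNTERS_MASK	0x0F000u	/* data counters, as in the quoted code */

/* Placeholder capability bit, assumed for illustration only. */
#define CAP_XMIT_WAIT_SUP	0x1000u

/* Build the CounterSelect mask, adding XmitWait only when supported. */
static uint32_t build_clear_mask(int clear_errors, int clear_counts,
				 uint16_t cap_mask)
{
	uint32_t mask = 0;

	if (clear_errors) {
		mask |= ERR_COUNTERS_MASK;
		if (cap_mask & CAP_XMIT_WAIT_SUP)
			mask |= XMIT_WAIT_SELECT;
	}
	if (clear_counts)
		mask |= DATA_COUNTERS_MASK;
	return mask;
}
```

The sketch assumes `cap_mask` is already in host byte order; if it is kept in network order, the test against the capability bit needs to account for that.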
[ofa-general] Re: [PATCH] opensm/osm_ucast_lash: fix use after free bug
On 9/25/09, Sasha Khapyorsky <sas...@voltaire.com> wrote:
> When LASH runs its switch structures cleanup OpenSM can rediscover a
> subnet and 'p_sw' pointer may refer already freed memory, so don't touch
> it, just free our own stuff. (Note also that for valids OpenSM switches
> objects' 'priv' pointers are cleared on lash_cleanup()).
>
> Signed-off-by: Sasha Khapyorsky <sas...@voltaire.com>

Tested-by: Hal Rosenstock <hal.rosenst...@gmail.com>

> ---
>  opensm/opensm/osm_ucast_lash.c |    5 +----
>  1 files changed, 1 insertions(+), 4 deletions(-)
>
> diff --git a/opensm/opensm/osm_ucast_lash.c b/opensm/opensm/osm_ucast_lash.c
> index dbc6bcc..3c424cb 100644
> --- a/opensm/opensm/osm_ucast_lash.c
> +++ b/opensm/opensm/osm_ucast_lash.c
> @@ -628,8 +628,7 @@ static switch_t *switch_create(lash_t * p_lash, unsigned id, osm_switch_t * p_sw
>  	}
>
>  	sw->p_sw = p_sw;
> -	if (p_sw)
> -		p_sw->priv = sw;
> +	p_sw->priv = sw;
>
>  	if (osm_mesh_node_create(p_lash, sw)) {
>  		free(sw->dij_channels);
> @@ -644,8 +643,6 @@ static void switch_delete(lash_t *p_lash, switch_t * sw)
>  {
>  	if (sw->dij_channels)
>  		free(sw->dij_channels);
> -	if (sw->p_sw)
> -		sw->p_sw->priv = NULL;
>  	free(sw);
>  }
>
> --
> 1.6.5.rc1
[ofa-general] [PATCH] ibsim/sim_cmd.c: Only relink port if remote port is currently linked
When multiple switches are unlinked and then a switch is relinked, it should behave like a cable pull or power down of switch so it depends on the state of the remote peer port (as to linked or not). This is not represented in the IB port/port physical state and is additional state.

Signed-off-by: Hal Rosenstock <hal.rosenst...@gmail.com>
---
diff --git a/ibsim/sim.h b/ibsim/sim.h
index bf85875..52eb73b 100644
--- a/ibsim/sim.h
+++ b/ibsim/sim.h
@@ -210,6 +211,7 @@ struct Port {
 	int remoteport;
 	Node *previous_remotenode;
 	int previous_remoteport;
+	int unlinked;
 	int errrate;
 	uint16_t errattr;
 	Node *node;
diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c
index cb6e639..d27ab0f 100644
--- a/ibsim/sim_cmd.c
+++ b/ibsim/sim_cmd.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2009 HNR Consulting. All rights reserved.
  *
  * This file is part of ibsim.
  *
@@ -146,12 +147,18 @@ static int do_link(FILE * f, char *line)

 	rport = node_get_port(rnode, rportnum);

+	if (rport->unlinked) {
+		lport->unlinked = 0;
+		return -1;
+	}
+
 	if (link_ports(lport, rport) < 0)
 		return -fprintf(f,
 				"# can't link: local/remote port are already connected\n");

 	lport->previous_remotenode = NULL;
 	rport->previous_remotenode = NULL;
+	lport->unlinked = 0;

 	return 0;
 }
@@ -194,7 +201,7 @@ static int do_relink(FILE * f, char *line)
 		numports++;	// To make the for-loop below run up to last port
 	else
 		lportnum--;
-
+
 	if (lportnum >= 0) {
 		lport = ports + lnode->portsbase + lportnum;

@@ -206,12 +213,18 @@ static int do_relink(FILE * f, char *line)
 		rport = node_get_port(lport->previous_remotenode,
 				      lport->previous_remoteport);

+		if (rport->unlinked) {
+			lport->unlinked = 0;
+			return -1;
+		}
+
 		if (link_ports(lport, rport) < 0)
 			return -fprintf(f,
 					"# can't link: local/remote port are already connected\n");

 		lport->previous_remotenode = NULL;
 		rport->previous_remotenode = NULL;
+		lport->unlinked = 0;

 		return 1;
 	}
@@ -224,11 +237,17 @@ static int do_relink(FILE * f, char *line)
 		rport = node_get_port(lport->previous_remotenode,
 				      lport->previous_remoteport);

+		if (rport->unlinked) {
+			lport->unlinked = 0;
+			continue;
+		}
+
 		if (link_ports(lport, rport) < 0)
 			continue;

 		lport->previous_remotenode = NULL;
 		rport->previous_remotenode = NULL;
+		lport->unlinked = 0;

 		relinked++;
 	}
@@ -246,6 +265,7 @@ static void unlink_port(Node * lnode, Port * lport, Node * rnode, int rportnum)
 	lport->previous_remoteport = lport->remoteport;
 	rport->previous_remotenode = rport->remotenode;
 	rport->previous_remoteport = rport->remoteport;
+	lport->unlinked = 1;

 	lport->remotenode = rport->remotenode = 0;
 	lport->remoteport = rport->remoteport = 0;
@@ -406,6 +426,7 @@ static int do_unlink(FILE * f, char *line, int clear)
 	if (portnum >= 0) {
 		port = ports + node->portsbase + portnum;
 		if (!clear && !port->remotenode) {
+			port->unlinked = 1;
 			fprintf(f, "# port %d at nodeid \"%s\" is not linked\n",
 				portnum, nodeid);
 			return -1;
@@ -420,8 +441,10 @@ static int do_unlink(FILE * f, char *line, int clear)

 	for (port = ports + node->portsbase, e = port + numports; port < e;
 	     port++) {
-		if (!clear && !port->remotenode)
+		if (!clear && !port->remotenode) {
+			port->unlinked = 1;
 			continue;
+		}
 		if (port->remotenode)
 			unlink_port(node, port, port->remotenode,
 				    port->remoteport);
diff --git a/ibsim/sim_net.c b/ibsim/sim_net.c
index 8a5d281..0092068 100644
--- a/ibsim/sim_net.c
+++ b/ibsim/sim_net.c
@@ -492,6 +492,7 @@ static void init_ports(Node * node, int type, int maxports)
 		port->linkwidth = LINKWIDTH_4x;
 		port->linkspeedena = netspeed;
 		port->linkspeed = LINKSPEED_SDR;
+		port->unlinked = 0;

 		size = (type == SWITCH_NODE && i) ? sw_pkey_size : ca_pkey_size;
 		if (size) {
[ofa-general] Re: osm_link_mgr.c:link_mgr_get_smsl question
On Tue, Sep 22, 2009 at 10:53 PM, Sasha Khapyorsky <sas...@voltaire.com> wrote:
> On 16:44 Tue 22 Sep , Hal Rosenstock wrote:
> > Yeah, the port lid table will be OK but port's PortInfo won't (so base
> > LID/LMC will be broken) for this scenario
>
> but it wouldn't affect this code in this way.

Let me try this again...

The port LID table is fine but the lookup is done based on the LID in the received portInfo as it is the result of osm_physp_get_base_lid() (osm_link_mgr.c:link_mgr_get_smsl line 83). In the case of failed Sets, this is invalid so LID 0 is used and that's what causes the NULL p_src_port which in turn causes the seg fault.

So I'm back to: I can see two ways to fix this:
1. Replace with port GUID search
2. Have osm_get_lash_sl handle NULL for p_src_port

Maybe you see other ways to deal with this. Do you have a preferred approach?

-- Hal
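Option 2 above (tolerate a NULL source port) can be illustrated with a stripped-down model. None of the names below are OpenSM's: `struct port`, `port_by_lid`, `get_sl_checked` and `DEFAULT_SL` are hypothetical, the stored `sl` field merely stands in for whatever LASH would compute, and falling back to a default SL is an assumption about what "handle NULL" should mean.

```c
#include <stddef.h>
#include <stdint.h>

#define DEFAULT_SL 0	/* assumed fallback when no SL can be computed */

/* Hypothetical, minimal port record. */
struct port {
	uint16_t base_lid;
	uint8_t sl;	/* stand-in for the SL the routing engine would pick */
};

/* LID lookup: LID 0 (e.g. after a failed PortInfo Set) never matches. */
static const struct port *port_by_lid(const struct port *tbl, size_t n,
				      uint16_t lid)
{
	size_t i;

	if (lid == 0)
		return NULL;
	for (i = 0; i < n; i++)
		if (tbl[i].base_lid == lid)
			return &tbl[i];
	return NULL;
}

/* SL query that checks both ports instead of dereferencing blindly. */
static uint8_t get_sl_checked(const struct port *p_src_port,
			      const struct port *p_dst_port)
{
	if (!p_src_port || !p_dst_port)
		return DEFAULT_SL;
	return p_src_port->sl;
}
```

Option 1 (searching by port GUID) would avoid the failed lookup in the first place; the guard shown here only keeps a failed lookup from crashing.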
[ofa-general] [PATCH] libibmad: Add support for PortXmitDiscardDetails
Also, some additional commentary changes to mad.h and fields.c

Signed-off-by: Hal Rosenstock <hal.rosenst...@gmail.com>
---
diff --git a/libibmad/include/infiniband/mad.h b/libibmad/include/infiniband/mad.h
index 94b64cf..cfa9105 100644
--- a/libibmad/include/infiniband/mad.h
+++ b/libibmad/include/infiniband/mad.h
@@ -168,6 +168,7 @@ enum GSI_ATTR_ID {
 	IB_GSI_PORT_SAMPLES_CONTROL = 0x10,
 	IB_GSI_PORT_SAMPLES_RESULT = 0x11,
 	IB_GSI_PORT_COUNTERS = 0x12,
+	IB_GSI_PORT_XMIT_DISCARD_DETAILS = 0x16,
 	IB_GSI_PORT_COUNTERS_EXT = 0x1D,
 	IB_GSI_PORT_XMIT_DATA_SL = 0x36,
 	IB_GSI_PORT_RCV_DATA_SL = 0x37,
@@ -604,6 +605,9 @@ enum MAD_FIELDS {
 	IB_CPI_TRAP_QP_F,
 	IB_CPI_TRAP_QKEY_F,

+	/*
+	 * PortXmitDataSL fields
+	 */
 	IB_PC_XMT_DATA_SL_FIRST_F,
 	IB_PC_XMT_DATA_SL0_F = IB_PC_XMT_DATA_SL_FIRST_F,
 	IB_PC_XMT_DATA_SL1_F,
@@ -623,6 +627,9 @@ enum MAD_FIELDS {
 	IB_PC_XMT_DATA_SL15_F,
 	IB_PC_XMT_DATA_SL_LAST_F,

+	/*
+	 * PortRcvDataSL fields
+	 */
 	IB_PC_RCV_DATA_SL_FIRST_F,
 	IB_PC_RCV_DATA_SL0_F = IB_PC_RCV_DATA_SL_FIRST_F,
 	IB_PC_RCV_DATA_SL1_F,
@@ -642,6 +649,15 @@ enum MAD_FIELDS {
 	IB_PC_RCV_DATA_SL15_F,
 	IB_PC_RCV_DATA_SL_LAST_F,

+	/*
+	 * PortXmitDiscardDetails fields
+	 */
+	IB_PC_XMT_INACT_DISC_F,
+	IB_PC_XMT_NEIGH_MTU_DISC_F,
+	IB_PC_XMT_SW_LIFE_DISC_F,
+	IB_PC_XMT_SW_HOL_DISC_F,
+	IB_PC_XMT_DISC_LAST_F,
+
 	IB_FIELD_LAST_		/* must be last */
 };

@@ -963,7 +979,8 @@ MAD_EXPORT ib_mad_dump_fn mad_dump_node_type,
     mad_dump_sltovl, mad_dump_vlarbitration, mad_dump_nodedesc,
     mad_dump_nodeinfo, mad_dump_portinfo, mad_dump_switchinfo,
     mad_dump_perfcounters, mad_dump_perfcounters_ext,
-    mad_dump_perfcounters_xmt_sl, mad_dump_perfcounters_rcv_sl;
+    mad_dump_perfcounters_xmt_sl, mad_dump_perfcounters_rcv_sl,
+    mad_dump_perfcounters_xmt_disc;

 MAD_EXPORT int ibdebug;

diff --git a/libibmad/src/dump.c b/libibmad/src/dump.c
index 5151882..48f59ab 100644
--- a/libibmad/src/dump.c
+++ b/libibmad/src/dump.c
@@ -729,6 +729,16 @@ void mad_dump_perfcounters_rcv_sl(char *buf, int bufsz, void *val, int valsz)
 		     IB_PC_RCV_DATA_SL_LAST_F);
 }

+void mad_dump_perfcounters_xmt_disc(char *buf, int bufsz, void *val, int valsz)
+{
+	int cnt;
+
+	cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F,
+			   IB_PC_EXT_XMT_BYTES_F);
+	_dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_XMT_INACT_DISC_F,
+		     IB_PC_XMT_DISC_LAST_F);
+}
+
 void xdump(FILE * file, char *msg, void *p, int size)
 {
 #define HEX(x)  ((x) < 10 ? '0' + (x) : 'a' + ((x) -10))
diff --git a/libibmad/src/fields.c b/libibmad/src/fields.c
index 5f30116..f274aff 100644
--- a/libibmad/src/fields.c
+++ b/libibmad/src/fields.c
@@ -406,6 +406,9 @@ static const ib_field_t ib_mad_f[] = {
 	{BITSOFFS(520, 24), "TrapQP", mad_dump_hex},
 	{544, 32, "TrapQKey", mad_dump_hex},

+	/*
+	 * PortXmitDataSL fields
+	 */
 	{32, 32, "XmtDataSL0", mad_dump_uint},
 	{64, 32, "XmtDataSL1", mad_dump_uint},
 	{96, 32, "XmtDataSL2", mad_dump_uint},
@@ -424,6 +427,9 @@ static const ib_field_t ib_mad_f[] = {
 	{512, 32, "XmtDataSL15", mad_dump_uint},
 	{0, 0},			/* IB_PC_XMT_DATA_SL_LAST_F */

+	/*
+	 * PortRcvDataSL fields
+	 */
 	{32, 32, "RcvDataSL0", mad_dump_uint},
 	{64, 32, "RcvDataSL1", mad_dump_uint},
 	{96, 32, "RcvDataSL2", mad_dump_uint},
@@ -442,6 +448,15 @@ static const ib_field_t ib_mad_f[] = {
 	{512, 32, "RcvDataSL15", mad_dump_uint},
 	{0, 0},			/* IB_PC_RCV_DATA_SL_LAST_F */

+	/*
+	 * PortXmitDiscardDetails fields
+	 */
+	{32, 16, "PortInactiveDiscards", mad_dump_uint},
+	{48, 16, "PortNeighborMTUDiscards", mad_dump_uint},
+	{64, 16, "PortSwLifetimeLimitDiscards", mad_dump_uint},
+	{80, 16, "PortSwHOQLifetimeLimitDiscards", mad_dump_uint},
+	{0, 0},			/* IB_PC_XMT_DISC_LAST_F */
+
 	{0, 0}			/* IB_FIELD_LAST_ */
 };

diff --git a/libibmad/src/libibmad.map b/libibmad/src/libibmad.map
index b9a890c..2a6a253 100644
--- a/libibmad/src/libibmad.map
+++ b/libibmad/src/libibmad.map
@@ -24,6 +24,7 @@ IBMAD_1.3 {
 		mad_dump_perfcounters_ext;
 		mad_dump_perfcounters_xmt_sl;
 		mad_dump_perfcounters_rcv_sl;
+		mad_dump_perfcounters_xmt_disc;
 		mad_dump_physportstate;
 		mad_dump_portcapmask;
 		mad_dump_portinfo;
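To make the new field table concrete, here is a standalone sketch that reads the four 16-bit discard counters at the bit offsets listed in the patch (32, 48, 64, 80). It is not libibmad's accessor: `struct field16` and `get_field16` are hypothetical, and the sketch assumes the offsets are counted from the start of the attribute data and that each counter is stored in plain big-endian wire order. libibmad's own field handling may treat sub-word offsets differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: bit offset plus display name. */
struct field16 {
	unsigned bitoffs;
	const char *name;
};

/* Offsets and names as listed in the fields.c hunk above. */
static const struct field16 xmt_disc_fields[] = {
	{32, "PortInactiveDiscards"},
	{48, "PortNeighborMTUDiscards"},
	{64, "PortSwLifetimeLimitDiscards"},
	{80, "PortSwHOQLifetimeLimitDiscards"},
};

/* Read a byte-aligned, big-endian 16-bit value at the given bit offset. */
static uint16_t get_field16(const uint8_t *buf, unsigned bitoffs)
{
	const uint8_t *p = buf + bitoffs / 8;

	return (uint16_t)((p[0] << 8) | p[1]);
}
```

A dump routine built on this would loop over `xmt_disc_fields` and print `name` together with `get_field16(data, bitoffs)` for each entry.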
[ofa-general] [PATCH] infiniband-diags/perfquery: Add support for optional PortXmitDiscardDetails counter
Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com --- diff --git a/infiniband-diags/man/perfquery.8 b/infiniband-diags/man/perfquery.8 index 2a80f30..4510e7d 100644 --- a/infiniband-diags/man/perfquery.8 +++ b/infiniband-diags/man/perfquery.8 @@ -1,4 +1,4 @@ -.TH PERFQUERY 8 March 10, 2009 OpenIB OpenIB Diagnostics +.TH PERFQUERY 8 September 21, 2009 OpenIB OpenIB Diagnostics .SH NAME perfquery \- query InfiniBand port counters @@ -6,6 +6,7 @@ perfquery \- query InfiniBand port counters .SH SYNOPSIS .B perfquery [\-d(ebug)] [\-G(uid)] [\-x|\-\-extended] [\-X|\-\-xmtsl] [\-S|\-\-rcvsl] +[\-D|\-\-xmtdisc] [-a(ll_ports)] [-l(oop_ports)] [-r(eset_after_read)] [-R(eset_only)] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] [\-V(ersion)] [\-h(elp)] [lid|guid [[port] [reset_mask]]] @@ -38,6 +39,9 @@ show transmit data SL counter. This is an optional counter for QoS. \fB\-S\fR, \fB\-\-rcvsl\fR show receive data SL counter. This is an optional counter for QoS. .TP +\fB\-D\fR, \fB\-\-xmtdisc\fR +show transmit discard details. This is an optional counter. +.TP \fB\-a\fR, \fB\-\-all_ports\fR show aggregated counters for all ports of the destination lid or reset all counters for all ports. 
If the destination lid diff --git a/infiniband-diags/src/perfquery.c b/infiniband-diags/src/perfquery.c index d70af9e..74f9235 100644 --- a/infiniband-diags/src/perfquery.c +++ b/infiniband-diags/src/perfquery.c @@ -344,7 +344,7 @@ static void reset_counters(int extended, int timeout, int mask, } static int reset, reset_only, all_ports, loop_ports, port, extended, xmt_sl, -rcv_sl; +rcv_sl, xmt_disc; void xmt_sl_query(ib_portid_t * portid, int port, int mask) { @@ -396,6 +396,33 @@ void rcv_sl_query(ib_portid_t * portid, int port, int mask) IBERROR(perfslreset); } +void xmt_disc_query(ib_portid_t * portid, int port, int mask) +{ + char buf[1024]; + + if (reset_only) { + if (!performance_reset_via(pc, portid, port, mask, ibd_timeout, + IB_GSI_PORT_XMIT_DISCARD_DETAILS, + srcport)) + IBERROR(xmtdiscreset); + return; + } + + if (!pma_query_via(pc, portid, port, ibd_timeout, + IB_GSI_PORT_XMIT_DISCARD_DETAILS, srcport)) + IBERROR(xmtdiscquery); + + mad_dump_perfcounters_xmt_disc(buf, sizeof buf, pc, sizeof pc); + printf(# PortXmitDiscardDetails: %s port %d\n%s, portid2str(portid), + port, buf); + + if (reset) + if (!performance_reset_via(pc, portid, port, mask, ibd_timeout, + IB_GSI_PORT_XMIT_DISCARD_DETAILS, + srcport)) + IBERROR(xmtdiscreset); +} + static int process_opt(void *context, int ch, char *optarg) { switch (ch) { @@ -408,6 +435,9 @@ static int process_opt(void *context, int ch, char *optarg) case 'S': rcv_sl = 1; break; + case 'D': + xmt_disc = 1; + break; case 'a': all_ports++; port = ALL_PORTS; @@ -446,6 +476,7 @@ int main(int argc, char **argv) {extended, 'x', 0, NULL, show extended port counters}, {xmtsl, 'X', 0, NULL, show Xmt SL port counters}, {rcvsl, 'S', 0, NULL, show Rcv SL port counters}, + {xmtdisc, 'D', 0, NULL, show Xmt Discard Details}, {all_ports, 'a', 0, NULL, show aggregated counters}, {loop_ports, 'l', 0, NULL, iterate through each port}, {reset_after_read, 'r', 0, NULL, reset counters after read}, @@ -516,6 +547,11 @@ int main(int argc, 
char **argv)
 		goto done;
 	}
 
+	if (xmt_disc) {
+		xmt_disc_query(portid, port, mask);
+		goto done;
+	}
+
 	if (all_ports_loop || (loop_ports && (all_ports || port == ALL_PORTS))) {
 		if (smp_query_via(data, portid, IB_ATTR_NODE_INFO, 0, 0,
 				  srcport) < 0)
[ofa-general] [PATCH] opensm/osm_link_mgr.c: In link_mgr_set_physp_pi, only call link_mgr_get_smsl when LID valid
Fix seg fault which occurs when get_osm_switch_from_port is called with NULL port (which in this case was caused by calling cl_ptr_vector_get on port LID table with LID 0) Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com --- diff --git a/opensm/opensm/osm_link_mgr.c b/opensm/opensm/osm_link_mgr.c index c9bdfee..35f83e2 100644 --- a/opensm/opensm/osm_link_mgr.c +++ b/opensm/opensm/osm_link_mgr.c @@ -131,27 +131,32 @@ static int link_mgr_set_physp_pi(osm_sm_t * sm, IN osm_physp_t * p_physp, if (ib_switch_info_is_enhanced_port0(p_node-sw-switch_info) == FALSE) { - /* Even for base port 0 we might have to set smsl - (if we are using lash routing) */ - smsl = link_mgr_get_smsl(sm, p_physp); - if (smsl != ib_port_info_get_master_smsl(p_old_pi)) { - send_set = TRUE; - OSM_LOG(sm-p_log, OSM_LOG_DEBUG, - Setting SMSL to %d on port 0 GUID 0x%016 - PRIx64 \n, smsl, - cl_ntoh64(osm_physp_get_port_guid - (p_physp))); - } else { - /* This means the switch doesn't support - enhanced port 0 and we don't need to - change SMSL. Can skip it. */ - OSM_LOG(sm-p_log, OSM_LOG_DEBUG, - Skipping port 0, GUID 0x%016 PRIx64 - \n, - cl_ntoh64(osm_physp_get_port_guid - (p_physp))); - goto Exit; + /* Make sure LID is valid prior to calling link_mgr_get_smsl */ + if (osm_physp_get_base_lid(p_physp)) { + + /* Even for base port 0 we might have to set + smsl (if we are using lash routing) */ + smsl = link_mgr_get_smsl(sm, p_physp); + if (smsl != ib_port_info_get_master_smsl(p_old_pi)) { + send_set = TRUE; + OSM_LOG(sm-p_log, OSM_LOG_DEBUG, + Setting SMSL to %d on port 0 + GUID 0x%016 PRIx64 \n, smsl, + cl_ntoh64(osm_physp_get_port_guid + (p_physp))); + } else { + /* This means the switch doesn't support + enhanced port 0 and we don't need to + change SMSL. Can skip it. 
*/ + OSM_LOG(sm-p_log, OSM_LOG_DEBUG, + Skipping port 0, GUID 0x%016 + PRIx64 \n, + cl_ntoh64(osm_physp_get_port_guid + (p_physp))); + goto Exit; + } } + } else esp0 = TRUE; } @@ -217,18 +222,22 @@ static int link_mgr_set_physp_pi(osm_sm_t * sm, IN osm_physp_t * p_physp, sizeof(p_pi-master_sm_base_lid))) send_set = TRUE; - smsl = link_mgr_get_smsl(sm, p_physp); - if (smsl != ib_port_info_get_master_smsl(p_old_pi)) { + /* Make sure LID is valid prior to calling link_mgr_get_smsl */ + if (osm_physp_get_base_lid(p_physp)) { + smsl = link_mgr_get_smsl(sm, p_physp); + if (smsl != ib_port_info_get_master_smsl(p_old_pi)) { - ib_port_info_set_master_smsl(p_pi, smsl); + ib_port_info_set_master_smsl(p_pi, smsl); - OSM_LOG(sm-p_log, OSM_LOG_DEBUG, - Setting SMSL to %d on GUID 0x%016 - PRIx64 , port %d\n, smsl, - cl_ntoh64(osm_physp_get_port_guid - (p_physp)), port_num); + OSM_LOG(sm-p_log, OSM_LOG_DEBUG, + Setting SMSL to %d on GUID + 0x%016
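Editorial sketch (not part of the patch): the crash described in the commit message comes from looking up LID 0 in the port LID table and then dereferencing the NULL result. The minimal, self-contained C example below illustrates the same kind of guard. The types and the names port_by_lid and get_smsl_or_default are hypothetical and are not OpenSM APIs. Unlike the patch, which skips the SMSL update entirely when the base LID is 0, this sketch returns a caller-supplied default.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a port object; only the field we need. */
struct port {
	uint8_t sm_sl;
};

/* Hypothetical LID-indexed table; entry 0 is never a valid port. */
struct lid_table {
	struct port **entries;
	size_t size;
};

/* Return the port for a LID, or NULL if the LID is 0, out of range,
 * or has no port assigned. */
static struct port *port_by_lid(const struct lid_table *tbl, uint16_t lid)
{
	if (lid == 0 || lid >= tbl->size)
		return NULL;
	return tbl->entries[lid];
}

/* Look up the SL for SM traffic, falling back to default_sl instead of
 * dereferencing a NULL port when the base LID is not (yet) valid. */
static uint8_t get_smsl_or_default(const struct lid_table *tbl,
				   uint16_t base_lid, uint8_t default_sl)
{
	const struct port *p = port_by_lid(tbl, base_lid);

	return p ? p->sm_sl : default_sl;
}
```

The point is only that the validity check happens before the table lookup result is used; where exactly to place that check (caller or callee) is a design choice the thread leaves open.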
[ofa-general] [PATCHv2] infiniband-diags/perfquery: Add support for optional PortXmitDiscardDetails counter
Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com --- Changes since v1: Fix typo in [PATCH] subject diff --git a/infiniband-diags/man/perfquery.8 b/infiniband-diags/man/perfquery.8 index 2a80f30..4510e7d 100644 --- a/infiniband-diags/man/perfquery.8 +++ b/infiniband-diags/man/perfquery.8 @@ -1,4 +1,4 @@ -.TH PERFQUERY 8 March 10, 2009 OpenIB OpenIB Diagnostics +.TH PERFQUERY 8 September 21, 2009 OpenIB OpenIB Diagnostics .SH NAME perfquery \- query InfiniBand port counters @@ -6,6 +6,7 @@ perfquery \- query InfiniBand port counters .SH SYNOPSIS .B perfquery [\-d(ebug)] [\-G(uid)] [\-x|\-\-extended] [\-X|\-\-xmtsl] [\-S|\-\-rcvsl] +[\-D|\-\-xmtdisc] [-a(ll_ports)] [-l(oop_ports)] [-r(eset_after_read)] [-R(eset_only)] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] [\-V(ersion)] [\-h(elp)] [lid|guid [[port] [reset_mask]]] @@ -38,6 +39,9 @@ show transmit data SL counter. This is an optional counter for QoS. \fB\-S\fR, \fB\-\-rcvsl\fR show receive data SL counter. This is an optional counter for QoS. .TP +\fB\-D\fR, \fB\-\-xmtdisc\fR +show transmit discard details. This is an optional counter. +.TP \fB\-a\fR, \fB\-\-all_ports\fR show aggregated counters for all ports of the destination lid or reset all counters for all ports. 
If the destination lid diff --git a/infiniband-diags/src/perfquery.c b/infiniband-diags/src/perfquery.c index d70af9e..74f9235 100644 --- a/infiniband-diags/src/perfquery.c +++ b/infiniband-diags/src/perfquery.c @@ -344,7 +344,7 @@ static void reset_counters(int extended, int timeout, int mask, } static int reset, reset_only, all_ports, loop_ports, port, extended, xmt_sl, -rcv_sl; +rcv_sl, xmt_disc; void xmt_sl_query(ib_portid_t * portid, int port, int mask) { @@ -396,6 +396,33 @@ void rcv_sl_query(ib_portid_t * portid, int port, int mask) IBERROR(perfslreset); } +void xmt_disc_query(ib_portid_t * portid, int port, int mask) +{ + char buf[1024]; + + if (reset_only) { + if (!performance_reset_via(pc, portid, port, mask, ibd_timeout, + IB_GSI_PORT_XMIT_DISCARD_DETAILS, + srcport)) + IBERROR(xmtdiscreset); + return; + } + + if (!pma_query_via(pc, portid, port, ibd_timeout, + IB_GSI_PORT_XMIT_DISCARD_DETAILS, srcport)) + IBERROR(xmtdiscquery); + + mad_dump_perfcounters_xmt_disc(buf, sizeof buf, pc, sizeof pc); + printf(# PortXmitDiscardDetails: %s port %d\n%s, portid2str(portid), + port, buf); + + if (reset) + if (!performance_reset_via(pc, portid, port, mask, ibd_timeout, + IB_GSI_PORT_XMIT_DISCARD_DETAILS, + srcport)) + IBERROR(xmtdiscreset); +} + static int process_opt(void *context, int ch, char *optarg) { switch (ch) { @@ -408,6 +435,9 @@ static int process_opt(void *context, int ch, char *optarg) case 'S': rcv_sl = 1; break; + case 'D': + xmt_disc = 1; + break; case 'a': all_ports++; port = ALL_PORTS; @@ -446,6 +476,7 @@ int main(int argc, char **argv) {extended, 'x', 0, NULL, show extended port counters}, {xmtsl, 'X', 0, NULL, show Xmt SL port counters}, {rcvsl, 'S', 0, NULL, show Rcv SL port counters}, + {xmtdisc, 'D', 0, NULL, show Xmt Discard Details}, {all_ports, 'a', 0, NULL, show aggregated counters}, {loop_ports, 'l', 0, NULL, iterate through each port}, {reset_after_read, 'r', 0, NULL, reset counters after read}, @@ -516,6 +547,11 @@ int main(int argc, 
char **argv)
 		goto done;
 	}
 
+	if (xmt_disc) {
+		xmt_disc_query(portid, port, mask);
+		goto done;
+	}
+
 	if (all_ports_loop || (loop_ports && (all_ports || port == ALL_PORTS))) {
 		if (smp_query_via(data, portid, IB_ATTR_NODE_INFO, 0, 0,
 				  srcport) < 0)
[ofa-general] Re: Possible process deadlock in RMPP flow
On Wed, Sep 23, 2009 at 12:08 PM, Sean Hefty sean.he...@intel.com wrote:

>> ibnetdiscover D 80149b8d 0 26968 26544 (L-TLB)
>> 8102c900bd88 0046 81037e8e 81037e8e02e8 8102c900bd78 000a 8102c5b50820 81038a929820 011837bf6105 0ede 8102c5b50a08 0001
>> Call Trace:
>> [80064207] wait_for_completion+0x79/0xa2
>> [8008b4cc] default_wake_function+0x0/0xe
>> [882271d9] :ib_mad:ib_cancel_rmpp_recvs+0x87/0xde
>> [88224485] :ib_mad:ib_unregister_mad_agent+0x30d/0x424
>> [883983e9] :ib_umad:ib_umad_close+0x9d/0xd6
>> [80012e22] __fput+0xae/0x198
>> [80023de6] filp_close+0x5c/0x64
>> [800393df] put_files_struct+0x63/0xae
>> [80015b26] do_exit+0x31c/0x911
>> [8004971a] cpuset_exit+0x0/0x6c
>> [8005e116] system_call+0x7e/0x83
>>
>> From the dump it seems that the process is waiting on the call to
>> flush_workqueue() in ib_cancel_rmpp_recvs(). The package they use is
>> OFED 1.4.2.
>
> Roland just submitted a patch in this area yesterday. I don't know if
> the patch would fix their issue, but it may be worth trying. What
> kernel does 1.4.2 map to?
>
> What RMPP messages does ibnetdiscover use?

None AFAIK.

-- Hal

> If the program is completing successfully, there may be a different
> race with the rmpp cleanup. I'll see if anything else stands out in
> that area.
>
> - Sean
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
[ofa-general] [PATCH] libibmad/dump.c: Fix typo
Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/libibmad/src/dump.c b/libibmad/src/dump.c
index 1b287c0..5151882 100644
--- a/libibmad/src/dump.c
+++ b/libibmad/src/dump.c
@@ -523,7 +523,7 @@ void mad_dump_portcapmask(char *buf, int bufsz, void *val, int valsz)
 	if (mask & (1 << 28))
 		s += sprintf(s, "\t\t\t\tIsVendorSpecificMadsTableSupported\n");
 	if (mask & (1 << 29))
-		s += sprintf(s, "\t\t\t\tIsiMcastPkeyTrapSuppressionSupported\n");
+		s += sprintf(s, "\t\t\t\tIsMcastPkeyTrapSuppressionSupported\n");
 	if (mask & (1 << 30))
 		s += sprintf(s, "\t\t\t\tIsMulticastFDBTopSupported\n");
 	if (mask & (1 << 31))
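Editorial sketch (not part of the patch): the function being patched decodes a 32-bit capability mask one bit at a time. A table-driven variant of the same idea might look like the following. The three names and bit positions are taken from the hunk in this message; cap_bits and dump_capmask are hypothetical and are not libibmad functions. The sketch uses an unsigned constant for the shift so that testing bit 31 does not shift into the sign bit of an int, and it bounds every write with snprintf.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bit positions and names as they appear in the hunk in this message. */
struct cap_bit {
	unsigned bit;
	const char *name;
};

static const struct cap_bit cap_bits[] = {
	{28, "IsVendorSpecificMadsTableSupported"},
	{29, "IsMcastPkeyTrapSuppressionSupported"},
	{30, "IsMulticastFDBTopSupported"},
};

/* Write one line per set bit into buf (always NUL-terminated when
 * size > 0). Returns the number of names written in full. */
static int dump_capmask(char *buf, size_t size, uint32_t mask)
{
	size_t used = 0, i;
	int named = 0;

	if (size == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < sizeof(cap_bits) / sizeof(cap_bits[0]); i++) {
		int n;

		if (!(mask & (UINT32_C(1) << cap_bits[i].bit)))
			continue;
		n = snprintf(buf + used, size - used, "%s\n", cap_bits[i].name);
		if (n < 0 || (size_t)n >= size - used) {
			buf[used] = '\0';	/* drop the partial name */
			break;
		}
		used += (size_t)n;
		named++;
	}
	return named;
}
```

With a table like this, a misspelled capability name is a one-line data fix rather than an edit to control flow.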
[ofa-general] [PATCH] opensm/osm_mesh.c: Add dump_mesh routine at OSM_LOG_DEBUG log level
Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/opensm/opensm/osm_mesh.c b/opensm/opensm/osm_mesh.c
index 260e2f8..beb6bd7 100644
--- a/opensm/opensm/osm_mesh.c
+++ b/opensm/opensm/osm_mesh.c
@@ -1565,6 +1565,39 @@ err:
 	return -1;
 }
 
+static void dump_mesh(lash_t *p_lash)
+{
+	osm_log_t *p_log = &p_lash->p_osm->log;
+	int sw;
+	int num_switches = p_lash->num_switches;
+	int dimension;
+	int i, j, k;
+	switch_t *s, *s2;
+	char buf[256], *p;
+
+	OSM_LOG_ENTER(p_log);
+
+	for (sw = 0; sw < num_switches; sw++) {
+		p = buf;
+		s = p_lash->switches[sw];
+		dimension = s->node->dimension;
+		p += sprintf(p, "[");
+		for (i = 0; i < dimension; i++)
+			p += sprintf(p, "%2d%s", s->node->coord[i],
+				     (i == dimension - 1) ? "]" : ",");
+		for (j = 0; j < s->node->num_links; j++) {
+			s2 = p_lash->switches[s->node->links[j]->switch_id];
+			p += sprintf(p, "[%d]->[", j);
+			for (k = 0; k < dimension; k++)
+				p += sprintf(p, "%2d%s", s2->node->coord[k],
+					     (k == dimension - 1) ? "]" : ",");
+		}
+		OSM_LOG(p_log, OSM_LOG_DEBUG, "%s\n", buf);
+	}
+
+	OSM_LOG_EXIT(p_log);
+}
+
 /*
  * osm_do_mesh_analysis
  */
@@ -1653,6 +1686,9 @@ int osm_do_mesh_analysis(lash_t *p_lash)
 		OSM_LOG(p_log, OSM_LOG_INFO, "%s", buf);
 	}
 
+	if (osm_log_is_active(p_log, OSM_LOG_DEBUG))
+		dump_mesh(p_lash);
+
 done:
 	mesh_delete(mesh);
 	OSM_LOG_EXIT(p_log);
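Editorial sketch (not part of the patch): dump_mesh builds each log line by repeatedly advancing a pointer with sprintf into a fixed 256-byte buffer, so a switch with many links or dimensions could in principle write past the end. One way to bound such appends is shown below. buf_appendf and format_coord are hypothetical helpers, not OpenSM functions; the coordinate format mirrors the one used in the patch.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Append formatted text to buf starting at offset `used`.
 * Returns the new length, which never exceeds size - 1. */
static size_t buf_appendf(char *buf, size_t size, size_t used,
			  const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0 || used >= size - 1)
		return used;	/* no room left */

	va_start(ap, fmt);
	n = vsnprintf(buf + used, size - used, fmt, ap);
	va_end(ap);

	if (n < 0)
		return used;	/* formatting error, nothing counted */
	if ((size_t)n >= size - used)
		return size - 1;	/* output was truncated */
	return used + (size_t)n;
}

/* Append coordinates in the style used by dump_mesh, e.g. "[ 1, 2, 3]". */
static size_t format_coord(char *buf, size_t size, size_t used,
			   const int *coord, int dimension)
{
	int i;

	used = buf_appendf(buf, size, used, "[");
	for (i = 0; i < dimension; i++)
		used = buf_appendf(buf, size, used, "%2d%s", coord[i],
				   (i == dimension - 1) ? "]" : ",");
	return used;
}
```

When the buffer fills up, the line is silently truncated rather than overflowing; whether truncation or a larger buffer is preferable for debug output is a judgment call.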
[ofa-general] Re: osm_link_mgr.c:link_mgr_get_smsl question
On Tue, Sep 22, 2009 at 4:33 PM, Sasha Khapyorsky sas...@voltaire.com wrote:

> On 07:32 Thu 17 Sep , Hal Rosenstock wrote:
>>
>> Is that (lids in place) always the case ?
>
> I don't see immediately how it could be not.
>
>> What about if the sets of PortInfo for LID fail.
>
> Set can fail, but internal OpenSM port_lid_tbl will be up to date.

Yeah, the port lid table will be OK but port's PortInfo won't (so base
LID/LMC will be broken) for this scenario but it wouldn't affect this
code in this way.

So I don't have any theories as to how this could occur. Do you ?

-- Hal

> Sasha
[ofa-general] [PATCH] infiniband-diags/ibportstate.c: Eliminate uninitialized variable compile warning
Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/infiniband-diags/src/ibportstate.c b/infiniband-diags/src/ibportstate.c
index 6fb97a8..55e1dd5 100644
--- a/infiniband-diags/src/ibportstate.c
+++ b/infiniband-diags/src/ibportstate.c
@@ -208,7 +208,7 @@ int main(int argc, char **argv)
 	int state, physstate, lwe, lws, lwa, lse, lss, lsa;
 	int peerlocalportnum, peerlwe, peerlws, peerlwa, peerlse, peerlss,
 	    peerlsa;
-	int width, peerwidth, peerspeed;
+	int width = 255, peerwidth, peerspeed;
 	uint8_t data[IB_SMP_DATA_SIZE];
 	ib_portid_t peerportid = { 0 };
 	int portnum = 0;
Re: [ofa-general] Re: [PATCH] osmtest: Add SA get PathRecord stress test
Hi Sasha,

On Sun, Sep 20, 2009 at 6:20 AM, Sasha Khapyorsky sas...@voltaire.com wrote:

snip...

>> diff --git a/opensm/osmtest/osmtest.c b/opensm/osmtest/osmtest.c
>> index 986a8d2..8357d90 100644
>> --- a/opensm/osmtest/osmtest.c
>> +++ b/opensm/osmtest/osmtest.c

snip...

>> +
>> +	/*
>> +	 * Do a blocking query for the PathRecord.
>> +	 */
>> +	status = osmtest_get_path_rec_by_lid_pair(p_osmt, slid, dlid, &context);
>> +	if (status != IB_SUCCESS) {
>> +		OSM_LOG(&p_osmt->log, OSM_LOG_ERROR, "ERR 000A: "
>> +			"osmtest_get_path_rec_by_lid_pair failed (%s)\n",
>> +			ib_get_err_str(status));
>> +		goto Exit;
>> +	}
>
> It is not really stress testing, just pinging.

So are the other tests (additionally those use RMPP). Isn't repetitive
pinging a stress of a kind ?

> Shouldn't it be clarified in test description?

Same level of description as other tests. They all could be made more
descriptive.

-- Hal
[ofa-general] [PATCHv2] osmtest: Add SA get PathRecord stress test
Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com --- Changes since v1: Removed unneeded mode parameter diff --git a/opensm/man/osmtest.8 b/opensm/man/osmtest.8 index fa0cd52..f0d6323 100644 --- a/opensm/man/osmtest.8 +++ b/opensm/man/osmtest.8 @@ -1,4 +1,4 @@ -.TH OSMTEST 8 August 11, 2008 OpenIB OpenIB Management +.TH OSMTEST 8 August 31, 2009 OpenIB OpenIB Management .SH NAME osmtest \- InfiniBand subnet manager and administration (SM/SA) test program @@ -108,9 +108,10 @@ Stress test options are as follows: OPTDescription ---- - -s1 - Single-MAD response SA queries + -s1 - Single-MAD (RMPP) response SA queries -s2 - Multi-MAD (RMPP) response SA queries -s3 - Multi-MAD (RMPP) Path Record SA queries + -s4 - Single-MAD (non RMPP) get Path Record SA queries Without -s, stress testing is not performed .TP diff --git a/opensm/osmtest/include/osmtest_base.h b/opensm/osmtest/include/osmtest_base.h index 7c33da3..cda3a31 100644 --- a/opensm/osmtest/include/osmtest_base.h +++ b/opensm/osmtest/include/osmtest_base.h @@ -56,11 +56,12 @@ #define STRESS_SMALL_RMPP_THR 10 /* -Take long times when quering big clusters (over 40 nodes) , an average of : 0.25 sec for query +Take long times when querying big clusters (over 40 nodes), an average of : 0.25 sec for query each query receives 1000 records */ #define STRESS_LARGE_RMPP_THR 4000 #define STRESS_LARGE_PR_RMPP_THR 2 +#define STRESS_GET_PR 10 extern const char *const p_file; diff --git a/opensm/osmtest/main.c b/opensm/osmtest/main.c index bb2d6bc..4bb9f82 100644 --- a/opensm/osmtest/main.c +++ b/opensm/osmtest/main.c @@ -143,9 +143,10 @@ void show_usage() Stress test options are as follows:\n OPTDescription\n ----\n --s1 - Single-MAD response SA queries\n +-s1 - Single-MAD (RMPP) response SA queries\n -s2 - Multi-MAD (RMPP) response SA queries\n -s3 - Multi-MAD (RMPP) Path Record SA queries\n +-s4 - Single-MAD (non RMPP) get Path Record SA queries\n Without -s, stress testing is not performed\n\n); printf(-M\n 
--Multicast_Mode\n @@ -499,6 +500,9 @@ int main(int argc, char *argv[]) case 3: printf(Large Path Record SA queries\n); break; + case 4: + printf(SA Get Path Record queries\n); + break; default: printf(Unknown value %u (ignored)\n, opt.stress); diff --git a/opensm/osmtest/osmtest.c b/opensm/osmtest/osmtest.c index 986a8d2..c6ec955 100644 --- a/opensm/osmtest/osmtest.c +++ b/opensm/osmtest/osmtest.c @@ -2882,6 +2882,146 @@ Exit: /** **/ +ib_api_status_t +osmtest_stress_path_recs_by_lid(IN osmtest_t * const p_osmt, + OUT uint32_t * const p_num_recs, + OUT uint32_t * const p_num_queries) +{ + osmtest_req_context_t context; + ib_path_rec_t *p_rec; + cl_status_t status; + ib_net16_t dlid, slid; + int num_recs, i; + + OSM_LOG_ENTER(p_osmt-log); + + memset(context, 0, sizeof(context)); + + slid = cl_ntoh16(p_osmt-local_port.lid); + dlid = cl_ntoh16(p_osmt-local_port.sm_lid); + + /* +* Do a blocking query for the PathRecord. +*/ + status = osmtest_get_path_rec_by_lid_pair(p_osmt, slid, dlid, context); + if (status != IB_SUCCESS) { + OSM_LOG(p_osmt-log, OSM_LOG_ERROR, ERR 000A: + osmtest_get_path_rec_by_lid_pair failed (%s)\n, + ib_get_err_str(status)); + goto Exit; + } + + /* +* Populate the database with the received records. +*/ + num_recs = context.result.result_cnt; + *p_num_recs += num_recs; + ++*p_num_queries; + + if (osm_log_is_active(p_osmt-log, OSM_LOG_VERBOSE)) { + OSM_LOG(p_osmt-log, OSM_LOG_VERBOSE, + Received %u records\n, num_recs); + + for (i = 0; i num_recs; i++) { + p_rec = osmv_get_query_path_rec(context.result.p_result_madw, 0); + osm_dump_path_record(p_osmt-log, p_rec, OSM_LOG_VERBOSE); + } + } + +Exit: + /* +* Return the IB query MAD to the pool as necessary. +*/ + if (context.result.p_result_madw != NULL) { + osm_mad_pool_put(p_osmt-mad_pool
[ofa-general] [PATCHv2] opensm/osm_perfmgr_db.c: Fix memory leak of db nodes
Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
Changes since v1:
Fix use after free issue

diff --git a/opensm/opensm/osm_perfmgr_db.c b/opensm/opensm/osm_perfmgr_db.c
index e5dfc19..03f988d 100644
--- a/opensm/opensm/osm_perfmgr_db.c
+++ b/opensm/opensm/osm_perfmgr_db.c
@@ -49,6 +49,8 @@
 #include <opensm/osm_perfmgr.h>
 #include <opensm/osm_opensm.h>
 
+static void free_node(db_node_t * node);
+
 /** =
  */
 perfmgr_db_t *perfmgr_db_construct(osm_perfmgr_t *perfmgr)
@@ -68,7 +70,17 @@ perfmgr_db_t *perfmgr_db_construct(osm_perfmgr_t *perfmgr)
  */
 void perfmgr_db_destroy(perfmgr_db_t * db)
 {
+	cl_map_item_t *item, *next_item;
+	db_node_t *node;
+
 	if (db) {
+		item = cl_qmap_head(&db->pc_data);
+		while (item != cl_qmap_end(&db->pc_data)) {
+			node = (db_node_t *)item;
+			next_item = cl_qmap_next(item);
+			free_node(node);
+			item = next_item;
+		}
 		cl_plock_destroy(&db->lock);
 		free(db);
 	}
Re: [ofa-general] Re: [PATCHv2] opensm/osm_mesh.c: Remove edges in lash matrix
Hi Sasha,

On Sun, Aug 30, 2009 at 6:36 AM, Sasha Khapyorsky sas...@voltaire.com wrote:

snip...

>> @@ -878,6 +950,12 @@ static void make_geometry(lash_t *p_lash, int sw)
>>  		n = s1->node->num_links;
>>  
>>  		/*
>> +		 * ignore chain fragments
>> +		 */
>> +		if (n < seed->node->num_links && n <= 2)
>> +			continue;
>> +
>> +		/*
>>  		 * only process 'mesh' switches
>>  		 */
>>  		if (!s1->node->matrix)
>> @@ -908,7 +986,8 @@ static void make_geometry(lash_t *p_lash, int sw)
>>  				if (j == i)
>>  					continue;
>>  
>> -				if (s1->node->matrix[i][j] != 2) {
>> +				if (s1->node->matrix[i][j] != 2 &&
>> +				    s1->node->matrix[i][j] <= 4) {
>
> What does this '<= 4' check?

It's to rule out opposite nodes when distance is greater than 4. I've
added a comment to the next version of the patch for this.

-- Hal
[ofa-general] [PATCHv3] opensm/osm_mesh.c: Remove edges in lash matrix
The intent of this change is to remove edge nodes (by not counting them). The point of this heuristic is to deal with the case of small lattices which can easily have more surface than interior, which leads to choosing a non representative seed. This causes impossible counts to get reported. Signed-off-by: Robert Pearson rpear...@systemfabricworks.com Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com --- Changes since v2: In make_geometry, added comment on meaning of magic number 4 In seed_axes, made log level DEBUG and placed in osm_log_is_active clause Changes since v1: Replaced printfs with OSM_LOG calls diff --git a/opensm/opensm/osm_mesh.c b/opensm/opensm/osm_mesh.c index 72a9aa9..260e2f8 100644 --- a/opensm/opensm/osm_mesh.c +++ b/opensm/opensm/osm_mesh.c @@ -170,6 +170,11 @@ static const struct mesh_info { {8, {2, 2, 2, 2, 2, 2, 2, 2}, 8, {-1792, -6144, -8960, -7168, -3360, -896, -112, 0, 1}, }, + /* +* mesh errors +*/ + {2, {6, 6}, 4, {-192, -256, -80, 0, 1}, }, + {-1, {0,}, 0, {0, },}, }; @@ -727,6 +732,42 @@ done: } /* + * remove_edges + * + * remove type from nodes that have fewer links + * than adjacent nodes + */ +static void remove_edges(lash_t *p_lash) +{ + osm_log_t *p_log = p_lash-p_osm-log; + int sw; + mesh_node_t *n, *nn; + int i; + + OSM_LOG_ENTER(p_log); + + for (sw = 0; sw p_lash-num_switches; sw++) { + n = p_lash-switches[sw]-node; + if (!n-type) + continue; + + for (i = 0; i n-num_links; i++) { + nn = p_lash-switches[n-links[i]-switch_id]-node; + + if (nn-num_links n-num_links) { + OSM_LOG(p_log, OSM_LOG_DEBUG, + removed edge switch %s\n, + p_lash-switches[sw]-p_sw-p_node-print_desc); + n-type = -1; + break; + } + } + } + + OSM_LOG_EXIT(p_log); +} + +/* * get_local_geometry * * analyze the local geometry around each switch @@ -735,6 +776,7 @@ static int get_local_geometry(lash_t *p_lash, mesh_t *mesh) { osm_log_t *p_log = p_lash-p_osm-log; int sw; + int status = 0; OSM_LOG_ENTER(p_log); @@ -747,15 +789,38 @@ static int 
get_local_geometry(lash_t *p_lash, mesh_t *mesh) continue; if (get_switch_metric(p_lash, sw)) { - OSM_LOG_EXIT(p_log); - return -1; + status = -1; + goto Exit; } - classify_switch(p_lash, mesh, sw); classify_mesh_type(p_lash, sw); } + remove_edges(p_lash); + + for (sw = 0; sw p_lash-num_switches; sw++) { + if (p_lash-switches[sw]-node-type 0) + continue; + classify_switch(p_lash, mesh, sw); + } + +Exit: OSM_LOG_EXIT(p_log); - return 0; + return status; +} + +static void print_axis(lash_t *p_lash, char *p, int sw, int port) +{ + mesh_node_t *node = p_lash-switches[sw]-node; + char *name = p_lash-switches[sw]-p_sw-p_node-print_desc; + int c = node-axes[port]; + + p += sprintf(p, %s[%d] = , name, port); + if (c) + p += sprintf(p, %s%c - , ((c - 1) 1) ? - : +, 'X' + (c - 1)/2); + else + p += sprintf(p, N/A - ); + p += sprintf(p, %s\n, + p_lash-switches[node-links[port]-switch_id]-p_sw-p_node-print_desc); } /* @@ -775,6 +840,7 @@ static void seed_axes(lash_t *p_lash, int sw) int i, j, c; OSM_LOG_ENTER(p_log); + if (!node-matrix || !node-dimension) goto done; @@ -805,6 +871,16 @@ static void seed_axes(lash_t *p_lash, int sw) } } + if (osm_log_is_active(p_log, OSM_LOG_DEBUG)) { + char buf[256], *p; + + for (i = 0; i n; i++) { + p = buf; + print_axis(p_lash, p, sw, i); + OSM_LOG(p_log, OSM_LOG_DEBUG, %s, buf); + } + } + done: OSM_LOG_EXIT(p_log); } @@ -878,6 +954,12 @@ static void make_geometry(lash_t *p_lash, int sw) n = s1-node-num_links; /* +* ignore chain fragments +*/ + if (n seed-node-num_links n = 2) + continue; + + /* * only process 'mesh' switches */ if (!s1-node-matrix) @@ -908,7 +990,9 @@ static void make_geometry(lash_t *p_lash, int
Re: [ofa-general] Re: [PATCH] opensm: Add support for MulticastFDBTop
On Mon, Sep 21, 2009 at 8:59 AM, Sasha Khapyorsky sas...@voltaire.comwrote: On 10:31 Wed 02 Sep , Hal Rosenstock wrote: Add support for SwitchInfo:MulticastFDBTop Added by MgtWG errata #4505-4508 Also, per MgtWG RefID #4640, MulticastFDBTop value of 0xbfff means no entries In osm_mcast_mgr.c:mcast_mgr_set_mftables call new routine mcast_mgr_set_mfttop to set MulticastFDBTop in SwitchInfo based on max_block_in_use when switch port 0 indicates IsMulticastFDBTop is supported. Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com --- diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c index d7c5ce1..3671e08 100644 --- a/opensm/opensm/osm_mcast_mgr.c +++ b/opensm/opensm/osm_mcast_mgr.c @@ -1066,6 +1066,83 @@ Exit: /** **/ +static void mcast_mgr_set_mfttop(IN osm_sm_t * sm, IN osm_switch_t * p_sw) +{ + osm_node_t *p_node; + osm_dr_path_t *p_path; + osm_physp_t *p_physp; + osm_mcast_tbl_t *p_tbl; + osm_madw_context_t context; + ib_api_status_t status; + ib_switch_info_t si; + boolean_t set_swinfo_require = FALSE; + uint16_t mcast_top; + uint8_t life_state; + + OSM_LOG_ENTER(sm-p_log); + + CL_ASSERT(p_sw); + + p_node = p_sw-p_node; + + CL_ASSERT(p_node); + + p_physp = osm_node_get_physp_ptr(p_node, 0); + p_path = osm_physp_get_dr_path_ptr(p_physp); + p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw); + + if (p_physp-port_info.capability_mask IB_PORT_CAP_HAS_MCAST_FDB_TOP) { BTW any reason why this capability bit if placed in PortInfo and not in SwitchInfo (it is not port but switch related feature)? I don't recall. + /* +Set the top of the multicast forwarding table. + */ + si = p_sw-switch_info; + if (p_tbl-max_block_in_use == -1) + mcast_top = cl_hton16(IB_LID_MCAST_START_HO - 1); + else + mcast_top = cl_hton16(IB_LID_MCAST_START_HO + + (p_tbl-max_block_in_use + 1) * IB_MCAST_BLOCK_SIZE - 1); + if (mcast_top != si.mcast_top) { + set_swinfo_require = TRUE; + si.mcast_top = mcast_top; + } + + /* check to see if the change state bit is on. 
If it is - then +we need to clear it. */ + if (ib_switch_info_get_state_change(si)) + life_state = ((sm-p_subn-opt.packet_life_time 3) + | (si.life_state IB_SWITCH_PSC)) 0xfc; + else + life_state = (sm-p_subn-opt.packet_life_time 3) 0xf8; + + if (life_state != si.life_state || + ib_switch_info_get_state_change(si)) { + set_swinfo_require = TRUE; + si.life_state = life_state; + } Switch's StateChange and LifeState are handled when unicast routing is configured. Why do we need duplicate it here? I thought we could lose a PortStateChange but it looks like just making sure that this bit is 0 on set should be fine. I'll send a revised patch shortly. -- Hal ___ general mailing list general@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
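Editorial sketch (not part of the patch): the MulticastFDBTop value computed in the quoted hunk is the last MLID covered by the highest multicast forwarding table block in use, with 0xbfff meaning "no entries". The arithmetic can be checked in isolation with the small example below. mft_top_ho is a hypothetical helper; the constants 0xC000 and 32 are assumed here to match IB_LID_MCAST_START_HO and IB_MCAST_BLOCK_SIZE in the OpenSM headers. The result is in host byte order; the patch converts it with cl_hton16 before storing it in SwitchInfo.

```c
#include <stdint.h>

/* Assumed to match IB_LID_MCAST_START_HO: first multicast LID. */
#define MCAST_LID_START 0xC000
/* Assumed to match IB_MCAST_BLOCK_SIZE: MLIDs per MFT block. */
#define MCAST_BLOCK_SIZE 32

/* Return MulticastFDBTop (host order) for the highest MFT block in use.
 * A negative max_block_in_use means no block is in use, which maps to
 * 0xbfff, i.e. one below the first multicast LID. */
static uint16_t mft_top_ho(int max_block_in_use)
{
	if (max_block_in_use < 0)
		return MCAST_LID_START - 1;
	return (uint16_t)(MCAST_LID_START +
			  (max_block_in_use + 1) * MCAST_BLOCK_SIZE - 1);
}
```

For example, with only block 0 in use the top is 0xC01F, and with blocks 0 and 1 in use it is 0xC03F.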
Re: [ofa-general] Re: [PATCH] osmtest: Add SA get PathRecord stress test
On Mon, Sep 21, 2009 at 10:28 AM, Sasha Khapyorsky sas...@voltaire.com wrote:

> On 09:20 Mon 21 Sep , Hal Rosenstock wrote:

snip...

>>>> +
>>>> +	/*
>>>> +	 * Do a blocking query for the PathRecord.
>>>> +	 */
>>>> +	status = osmtest_get_path_rec_by_lid_pair(p_osmt, slid, dlid, &context);
>>>> +	if (status != IB_SUCCESS) {
>>>> +		OSM_LOG(&p_osmt->log, OSM_LOG_ERROR, "ERR 000A: "
>>>> +			"osmtest_get_path_rec_by_lid_pair failed (%s)\n",
>>>> +			ib_get_err_str(status));
>>>> +		goto Exit;
>>>> +	}
>>>
>>> It is not really stress testing, just pinging.
>>
>> So are the other tests (additionally those use RMPP). Isn't repetitive
>> pinging a stress of a kind ?
>
> No. Stress test assumes full load.

What do you mean full load ?

> In the ping case only one thread is loaded, and only during request
> processing time.

I'm not following what you mean by this.

>>> Shouldn't it be clarified in test description?
>>
>> Same level of description as other tests. They all could be made more
>> descriptive.
>
> Agree. And we need to start somewhere.

Separate patch ?

-- Hal

> Sasha
[ofa-general] [PATCHv2] opensm: Add support for MulticastFDBTop
Add support for SwitchInfo:MulticastFDBTop
Added by MgtWG errata #4505-4508

Also, per MgtWG RefID #4640, MulticastFDBTop value of 0xbfff means no entries

In osm_sm.c:osm_sm_set_mcast_tbl, when switch port 0 indicates
IsMulticastFDBTop supported, set MulticastFDBTop in SwitchInfo
based on max_block_in_use

Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
Changes since v1:
In mcast_mgr_set_mfttop, eliminated PortStateChange checking

diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c
index c1d1916..0da0ef1 100644
--- a/opensm/opensm/osm_mcast_mgr.c
+++ b/opensm/opensm/osm_mcast_mgr.c
@@ -1044,6 +1044,64 @@ static ib_api_status_t mcast_mgr_process_mlid(osm_sm_t * sm, uint16_t mlid)
 
 /**
  **/
+static void mcast_mgr_set_mfttop(IN osm_sm_t * sm, IN osm_switch_t * p_sw)
+{
+	osm_node_t *p_node;
+	osm_dr_path_t *p_path;
+	osm_physp_t *p_physp;
+	osm_mcast_tbl_t *p_tbl;
+	osm_madw_context_t context;
+	ib_api_status_t status;
+	ib_switch_info_t si;
+	uint16_t mcast_top;
+
+	OSM_LOG_ENTER(sm->p_log);
+
+	CL_ASSERT(p_sw);
+
+	p_node = p_sw->p_node;
+
+	CL_ASSERT(p_node);
+
+	p_physp = osm_node_get_physp_ptr(p_node, 0);
+	p_path = osm_physp_get_dr_path_ptr(p_physp);
+	p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw);
+
+	if (p_physp->port_info.capability_mask & IB_PORT_CAP_HAS_MCAST_FDB_TOP) {
+		/*
+		   Set the top of the multicast forwarding table.
+		 */
+		si = p_sw->switch_info;
+		if (p_tbl->max_block_in_use == -1)
+			mcast_top = cl_hton16(IB_LID_MCAST_START_HO - 1);
+		else
+			mcast_top = cl_hton16(IB_LID_MCAST_START_HO +
+				(p_tbl->max_block_in_use + 1) * IB_MCAST_BLOCK_SIZE - 1);
+		if (mcast_top != si.mcast_top) {
+			si.mcast_top = mcast_top;
+
+			OSM_LOG(sm->p_log, OSM_LOG_DEBUG,
+				"Setting switch MFT top to MLID 0x%x\n",
+				cl_ntoh16(si.mcast_top));
+
+			context.si_context.light_sweep = FALSE;
+			context.si_context.node_guid = osm_node_get_node_guid(p_node);
+			context.si_context.set_method = TRUE;
+
+			status = osm_req_set(sm, p_path, (uint8_t *) & si,
+					     sizeof(si), IB_MAD_ATTR_SWITCH_INFO,
+					     0, CL_DISP_MSGID_NONE, &context);
+
+			if (status != IB_SUCCESS)
+				OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A1B: "
+					"Sending SwitchInfo attribute failed (%s)\n",
+					ib_get_err_str(status));
+		}
+	}
+}
+
+/**
+ **/
 static int mcast_mgr_set_mftables(osm_sm_t * sm)
 {
 	cl_qmap_t *p_sw_tbl = &sm->p_subn->sw_guid_tbl;
@@ -1059,6 +1117,7 @@ static int mcast_mgr_set_mftables(osm_sm_t * sm)
 		p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw);
 		if (osm_mcast_tbl_get_max_block_in_use(p_tbl) > max_block)
 			max_block = osm_mcast_tbl_get_max_block_in_use(p_tbl);
+		mcast_mgr_set_mfttop(sm, p_sw);
 		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
 	}
 
diff --git a/opensm/opensm/osm_sa_class_port_info.c b/opensm/opensm/osm_sa_class_port_info.c
index d2ab96a..fb58fe5 100644
--- a/opensm/opensm/osm_sa_class_port_info.c
+++ b/opensm/opensm/osm_sa_class_port_info.c
@@ -1,6 +1,6 @@
 /*
  * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved.
- * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
+ * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
  * This software is available to you under a choice of one of two
@@ -159,8 +159,10 @@ static void cpi_rcv_respond(IN osm_sa_t * sa, IN const osm_madw_t * p_madw)
 	    OSM_CAP_IS_PORT_INFO_CAPMASK_MATCH_SUPPORTED;
 #endif
 	if (sa->p_subn->opt.qos)
-		ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_QOS_SUPPORTED);
-
+		ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_QOS_SUPPORTED |
+				       OSM_CAP2_IS_MCAST_TOP_SUPPORTED);
+	else
+		ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_MCAST_TOP_SUPPORTED);
 	if (!sa->p_subn->opt.disable_multicast)
 		p_resp_cpi->cap_mask |= OSM_CAP_IS_UD_MCAST_SUP
[ofa-general] [PATCHv3] opensm/osm_perfmgr_db.c: Fix memory leak of db nodes
Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
Changes since v2:
Eliminated node variable

Changes since v1:
Fix use after free issue

diff --git a/opensm/opensm/osm_perfmgr_db.c b/opensm/opensm/osm_perfmgr_db.c
index e5dfc19..5321c59 100644
--- a/opensm/opensm/osm_perfmgr_db.c
+++ b/opensm/opensm/osm_perfmgr_db.c
@@ -49,6 +49,8 @@
 #include <opensm/osm_perfmgr.h>
 #include <opensm/osm_opensm.h>
 
+static void free_node(db_node_t * node);
+
 /** =
  */
 perfmgr_db_t *perfmgr_db_construct(osm_perfmgr_t *perfmgr)
@@ -68,7 +70,15 @@ perfmgr_db_t *perfmgr_db_construct(osm_perfmgr_t *perfmgr)
  */
 void perfmgr_db_destroy(perfmgr_db_t * db)
 {
+	cl_map_item_t *item, *next_item;
+
 	if (db) {
+		item = cl_qmap_head(&db->pc_data);
+		while (item != cl_qmap_end(&db->pc_data)) {
+			next_item = cl_qmap_next(item);
+			free_node((db_node_t *)item);
+			item = next_item;
+		}
 		cl_plock_destroy(&db->lock);
 		free(db);
 	}
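
Editorial note on the v3 change above: the loop fetches the next map item before it calls free_node() on the current one, so it never reads memory that has already been released. That appears to be the use after free issue named in the v1 changelog. The sketch below shows the same pattern on a plain singly linked list instead of OpenSM's cl_qmap. The names struct node, make_list() and free_all() are hypothetical and exist only for this illustration; they are not OpenSM or complib APIs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a db node; this is NOT OpenSM's db_node_t. */
struct node {
	struct node *next;
	int payload;
};

/*
 * Free every node in the list and return how many were freed.
 * The link to the following node is read BEFORE the current node is
 * freed, so the loop never dereferences released memory.
 */
static size_t free_all(struct node *head)
{
	size_t freed = 0;
	struct node *item = head, *next_item;

	while (item != NULL) {
		next_item = item->next;	/* save the link first */
		free(item);		/* item must not be touched after this */
		item = next_item;
		freed++;
	}
	return freed;
}

/* Build a list of n nodes (illustration helper, assumed for this sketch). */
static struct node *make_list(size_t n)
{
	struct node *head = NULL;

	while (n-- > 0) {
		struct node *p = malloc(sizeof(*p));

		assert(p != NULL);
		p->payload = (int)n;
		p->next = head;
		head = p;
	}
	return head;
}
```

Writing the loop as free(item); item = item->next; would compile, but it reads item->next from a node that was just freed, which is the shape the first version of the patch had with cl_qmap_next(item).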
Re: [ofa-general] Re: [PATCH] osmtest: Add SA get PathRecord stress test
On Mon, Sep 21, 2009 at 10:57 AM, Sasha Khapyorsky sas...@voltaire.com wrote:

snip...

> How this ping test's timeline looks? (1) client sends one request, (2) it
> travels to a server, (3) server processes it and replies, (4) the response
> travels to client, (5) client gets it and continue from beginning. Ok? The
> server works only in (3) and does nothing in other test stages.

How is this different from the other stress tests? Aren't they all blocking too?

-- Hal
Re: [ofa-general] Re: [PATCH] opensm: Add support for MulticastFDBTop
On Mon, Sep 21, 2009 at 11:09 AM, Sasha Khapyorsky sas...@voltaire.com wrote:

> On 10:20 Mon 21 Sep , Hal Rosenstock wrote:
>>>> +
>>>> +	if (p_physp->port_info.capability_mask & IB_PORT_CAP_HAS_MCAST_FDB_TOP) {
>>>
>>> BTW any reason why this capability bit if placed in PortInfo and not in
>>> SwitchInfo (it is not port but switch related feature)?
>>
>> I don't recall.
>
> Could this be verified?

I'll try.

> For me it does not look very reasonable to leak PortInfo:CapabilityMask
> bits for this purpose, it is meaningless for CA and switch external ports.

Right; it would never be set for such ports.

>>> Switch's StateChange and LifeState are handled when unicast routing is
>>> configured. Why do we need duplicate it here?
>>
>> I thought we could lose a PortStateChange
>
> Basically we could lose this bit when doing reset twice - link state can
> change in window between two resets.

I removed this in the latest patch version.

-- Hal

> Sasha
Re: [ofa-general] Re: [PATCH] opensm: Add support for MulticastFDBTop
On Mon, Sep 21, 2009 at 11:48 AM, Sasha Khapyorsky sas...@voltaire.com wrote:

> On 11:22 Mon 21 Sep , Hal Rosenstock wrote:
>>
>> I'll try.
>
> Thanks.
>
>> I removed this in the latest patch version.
>
> Ok. Let's wait with this patch up to capability mask
> clarification/resolution.

Other than this, is the patch acceptable? I want to get this as ready as possible.

-- Hal

> Sasha
[ofa-general] [PATCHv3] opensm: Add support for MulticastFDBTop
Add support for SwitchInfo:MulticastFDBTop
Added by MgtWG errata #4505-4508

Also, per MgtWG RefID #4640, MulticastFDBTop value of 0xbfff means no entries

In osm_sm.c:osm_sm_set_mcast_tbl, when switch port 0 indicates
IsMulticastFDBTop supported, set MulticastFDBTop in SwitchInfo
based on max_block_in_use

Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
Changes since v2:
In mcast_mgr_set_mfttop, reverse sense of mft top test so
can remove indentation of code doing update

Changes since v1:
In mcast_mgr_set_mfttop, eliminated PortStateChange checking

diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c
index c1d1916..c6c6d6d 100644
--- a/opensm/opensm/osm_mcast_mgr.c
+++ b/opensm/opensm/osm_mcast_mgr.c
@@ -1044,6 +1044,65 @@ static ib_api_status_t mcast_mgr_process_mlid(osm_sm_t * sm, uint16_t mlid)
 
 /**
  **/
+static void mcast_mgr_set_mfttop(IN osm_sm_t * sm, IN osm_switch_t * p_sw)
+{
+	osm_node_t *p_node;
+	osm_dr_path_t *p_path;
+	osm_physp_t *p_physp;
+	osm_mcast_tbl_t *p_tbl;
+	osm_madw_context_t context;
+	ib_api_status_t status;
+	ib_switch_info_t si;
+	uint16_t mcast_top;
+
+	OSM_LOG_ENTER(sm->p_log);
+
+	CL_ASSERT(p_sw);
+
+	p_node = p_sw->p_node;
+
+	CL_ASSERT(p_node);
+
+	p_physp = osm_node_get_physp_ptr(p_node, 0);
+	p_path = osm_physp_get_dr_path_ptr(p_physp);
+	p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw);
+
+	if (p_physp->port_info.capability_mask & IB_PORT_CAP_HAS_MCAST_FDB_TOP) {
+		/*
+		   Set the top of the multicast forwarding table.
+		 */
+		si = p_sw->switch_info;
+		if (p_tbl->max_block_in_use == -1)
+			mcast_top = cl_hton16(IB_LID_MCAST_START_HO - 1);
+		else
+			mcast_top = cl_hton16(IB_LID_MCAST_START_HO +
+				(p_tbl->max_block_in_use + 1) * IB_MCAST_BLOCK_SIZE - 1);
+		if (mcast_top == si.mcast_top)
+			return;
+
+		si.mcast_top = mcast_top;
+
+		OSM_LOG(sm->p_log, OSM_LOG_DEBUG,
+			"Setting switch MFT top to MLID 0x%x\n",
+			cl_ntoh16(si.mcast_top));
+
+		context.si_context.light_sweep = FALSE;
+		context.si_context.node_guid = osm_node_get_node_guid(p_node);
+		context.si_context.set_method = TRUE;
+
+		status = osm_req_set(sm, p_path, (uint8_t *) & si,
+				     sizeof(si), IB_MAD_ATTR_SWITCH_INFO,
+				     0, CL_DISP_MSGID_NONE, &context);
+
+		if (status != IB_SUCCESS)
+			OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A1B: "
+				"Sending SwitchInfo attribute failed (%s)\n",
+				ib_get_err_str(status));
+	}
+}
+
+/**
+ **/
 static int mcast_mgr_set_mftables(osm_sm_t * sm)
 {
 	cl_qmap_t *p_sw_tbl = &sm->p_subn->sw_guid_tbl;
@@ -1059,6 +1118,7 @@ static int mcast_mgr_set_mftables(osm_sm_t * sm)
 		p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw);
 		if (osm_mcast_tbl_get_max_block_in_use(p_tbl) > max_block)
 			max_block = osm_mcast_tbl_get_max_block_in_use(p_tbl);
+		mcast_mgr_set_mfttop(sm, p_sw);
 		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
 	}
 
diff --git a/opensm/opensm/osm_sa_class_port_info.c b/opensm/opensm/osm_sa_class_port_info.c
index d2ab96a..fb58fe5 100644
--- a/opensm/opensm/osm_sa_class_port_info.c
+++ b/opensm/opensm/osm_sa_class_port_info.c
@@ -1,6 +1,6 @@
 /*
  * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved.
- * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
+ * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
  * This software is available to you under a choice of one of two
@@ -159,8 +159,10 @@ static void cpi_rcv_respond(IN osm_sa_t * sa, IN const osm_madw_t * p_madw)
 	    OSM_CAP_IS_PORT_INFO_CAPMASK_MATCH_SUPPORTED;
 #endif
 	if (sa->p_subn->opt.qos)
-		ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_QOS_SUPPORTED);
-
+		ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_QOS_SUPPORTED |
+				       OSM_CAP2_IS_MCAST_TOP_SUPPORTED);
+	else
+		ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_MCAST_TOP_SUPPORTED);
 	if (!sa->p_subn->opt.disable_multicast)
 		p_resp_cpi->cap_mask
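
Editorial note on the MulticastFDBTop arithmetic in the patch above: the top value is the last MLID covered by the highest multicast forwarding table block in use, and a table with no blocks in use maps to one below the first multicast LID. The sketch below reproduces only that calculation in host byte order. It assumes IB_LID_MCAST_START_HO is 0xC000 and IB_MCAST_BLOCK_SIZE is 32; those values are not shown in the patch and are restated here as local macros. The function name mft_top_ho() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of IB_LID_MCAST_START_HO: first multicast LID, host order. */
#define MCAST_START_HO 0xC000
/* Assumed value of IB_MCAST_BLOCK_SIZE: MLIDs per forwarding table block. */
#define MCAST_BLOCK_SIZE 32

/*
 * Return the MulticastFDBTop value (host order) for a table whose highest
 * block in use is max_block_in_use, where -1 means no block is in use.
 */
static uint16_t mft_top_ho(int max_block_in_use)
{
	assert(max_block_in_use >= -1);

	if (max_block_in_use == -1)
		return MCAST_START_HO - 1;	/* 0xbfff: no entries */

	/* Last MLID of the highest block in use. */
	return (uint16_t)(MCAST_START_HO +
			  (max_block_in_use + 1) * MCAST_BLOCK_SIZE - 1);
}
```

Under these assumptions an empty table gives 0xbfff, which matches the "no entries" value cited from MgtWG RefID #4640, and block 0 gives 0xc01f. In this sketch the general formula also evaluates to 0xbfff for -1, so the separate branch mainly documents intent. The patch itself additionally converts the result with cl_hton16() before storing it in SwitchInfo.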
[ofa-general] [PATCH] opensm/osm_perfmgr_db.c: Fix memory leak of db nodes
Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/opensm/opensm/osm_perfmgr_db.c b/opensm/opensm/osm_perfmgr_db.c
index e5dfc19..329743a 100644
--- a/opensm/opensm/osm_perfmgr_db.c
+++ b/opensm/opensm/osm_perfmgr_db.c
@@ -49,6 +49,8 @@
 #include <opensm/osm_perfmgr.h>
 #include <opensm/osm_opensm.h>
 
+static void free_node(db_node_t * node);
+
 /** =
  */
 perfmgr_db_t *perfmgr_db_construct(osm_perfmgr_t *perfmgr)
@@ -68,7 +70,16 @@ perfmgr_db_t *perfmgr_db_construct(osm_perfmgr_t *perfmgr)
  */
 void perfmgr_db_destroy(perfmgr_db_t * db)
 {
+	cl_map_item_t *item;
+	db_node_t *node;
+
 	if (db) {
+		item = cl_qmap_head(&db->pc_data);
+		while (item != cl_qmap_end(&db->pc_data)) {
+			node = (db_node_t *)item;
+			free_node(node);
+			item = cl_qmap_next(item);
+		}
 		cl_plock_destroy(&db->lock);
 		free(db);
 	}
[ofa-general] [PATCH] ibsim/sim_cmd.c: Cosmetic change to error message
Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c
index cb6e639..6d3a893 100644
--- a/ibsim/sim_cmd.c
+++ b/ibsim/sim_cmd.c
@@ -295,7 +295,7 @@ static int do_seterror(FILE * f, char *line)
 
 	orig = strsep(&s, "\"");
 	if (!s) {
-		fprintf(f, "# unlink: bad parameter in \"%s\"\n", line);
+		fprintf(f, "# set error: bad parameter in \"%s\"\n", line);
 		return -1;
 	}
 
[ofa-general] Re: osm_link_mgr.c:link_mgr_get_smsl question
On Sun, Aug 30, 2009 at 8:00 AM, Sasha Khapyorsky sas...@voltaire.com wrote:

> On 07:32 Sun 30 Aug , Hal Rosenstock wrote:
>> osm_link_mgr.c:link_mgr_get_smsl has the following:
>>
>>	/* Find osm_port of the source = p_physp */
>>	slid = osm_physp_get_base_lid(p_physp);
>>	p_src_port =
>>	    cl_ptr_vector_get(&sm->p_subn->port_lid_tbl, cl_ntoh16(slid));
>>
>>	/* Call lash to find proper SL */
>>	sl = osm_get_lash_sl(p_osm, p_src_port, p_sm_port);
>>
>> It may be that this code is invoked prior to the LID being assigned
>
> How is it possible? In the code I can see that link_mgr_process() is
> always executed after lid_mgr run.
>
>> When nodes use gPXE, the LID is not passed from the gPXE to the Linux
>> environment.
>
> How is it related to gPXE? OpenSM's lid manager runs and assigns lids to
> all available endports, only after this link manager runs and try with
> SMSL - at this point all lids should be in place and
> p_subn->port_lid_tbl should be fine.

Is that (lids in place) always the case? What about if the sets of PortInfo for LID fail?

-- Hal

> Am I missing something?
>
> Sasha
Re: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm
On Thu, Sep 17, 2009 at 4:20 PM, Ira Weiny wei...@llnl.gov wrote:

> On Thu, 17 Sep 2009 10:35:39 -0700
> Sean Hefty sean.he...@intel.com wrote:
>
>>>> #define IB_PATH_RECORD_REVERSIBLE 0x80
>>>>
>>>> struct ib_path_record {
>>>>	uint64_t service_id;
>>>>	union ibv_gid dgid;
>>>>	union ibv_gid sgid;
>>>>	uint16_t dlid;
>>>>	uint16_t slid;
>>>>	uint32_t flowlabel_hoplimit; /* resv-31:28 flow label-27:8 hop limit-7:0*/
>>>>	uint8_t tclass;
>>>>	uint8_t reversible_numpath; /* reversible-7:7 num path-6:0 */
>>>>	uint16_t pkey;
>>>>	uint16_t qosclass_sl; /* qos class-15:4 sl-3:0 */
>>>>	uint8_t mtu; /* mtu selector-7:6 mtu-5:0 */
>>>>	uint8_t rate; /* rate selector-7:6 rate-5:0 */
>>>>	uint8_t packetlifetime; /* lifetime selector-7:6 lifetime-5:0 */
>>>>	uint8_t preference;
>>>>	uint8_t reserved[6];
>>>> };
>>>
>>> I would prefer to use the structures already defined in ib_types.h...
>>>
>>> I understand your not wanting to make ACM dependant on the OpenSM
>>> packages so is it time to move ib_types.h out of the OpenSM tree and
>>> somewhere more generic? Perhaps libibumad? This also applies to
>>> ib_sa_mad in your 5th patch.
>>>
>>> OTOH, ib_types.h is a 10K line file with multiple long (10 lines)
>>> inlined functions. Perhaps it deserves it's own library?
>>
>> Defining some of these types in libibumad isn't a bad idea. Although,
>> WinOF actually has 2 copies of ib_types.h (that differ...) I find using
>> ib_types.h painful given its size; separate header files may help.
>
> Yes I was thinking multiple headers. There seems like there is already
> some precedent in ib_cm_types.h (although that entire file seems to be
> enclosed in a #ifndef WIN32 clause? So am I wrong on this?) In the end I
> would like to make ib_types.h just list the specific headers.
>
> Sasha, would you be willing to accept such a patch? First move ib_types.h
> to umad

I'm not sure this is a good idea. ibutils (ibis and ibmgtsim) wants ib_types.h but does not want libibumad.

-- Hal

> and then move the long inline functions into the lib and separate out
> the remaining header. Or would you prefer a new library? I think there
> is enough code there but I leave it up to you.
>
> Ira
>
>> - Sean
>
> --
> Ira Weiny
> Math Programmer/Computer Scientist
> Lawrence Livermore National Lab
> 925-423-8008
> wei...@llnl.gov
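
Editorial note on the struct quoted in this message: several of its members pack two or three values into one integer, with the bit ranges given in the trailing comments. The sketch below shows how those ranges translate into shifts and masks. It assumes the values have already been converted to host byte order, and every helper name (pr_reversible(), pr_num_path() and so on) is hypothetical; none of them comes from libibacm or ib_types.h.

```c
#include <assert.h>
#include <stdint.h>

/* Same bit as IB_PATH_RECORD_REVERSIBLE in the quoted header. */
#define PR_REVERSIBLE 0x80

/* reversible_numpath: reversible-7:7 num path-6:0 */
static int pr_reversible(uint8_t reversible_numpath)
{
	assert(PR_REVERSIBLE == 1 << 7);
	return (reversible_numpath & PR_REVERSIBLE) != 0;
}

static uint8_t pr_num_path(uint8_t reversible_numpath)
{
	return reversible_numpath & 0x7f;
}

/* qosclass_sl (host order): qos class-15:4 sl-3:0 */
static uint16_t pr_qos_class(uint16_t qosclass_sl)
{
	return qosclass_sl >> 4;
}

static uint8_t pr_sl(uint16_t qosclass_sl)
{
	return qosclass_sl & 0xf;
}

/* mtu, rate, packetlifetime: selector-7:6 value-5:0 */
static uint8_t pr_selector(uint8_t field)
{
	return field >> 6;
}

static uint8_t pr_value(uint8_t field)
{
	return field & 0x3f;
}

/* flowlabel_hoplimit (host order): resv-31:28 flow label-27:8 hop limit-7:0 */
static uint32_t pr_flow_label(uint32_t flowlabel_hoplimit)
{
	return (flowlabel_hoplimit >> 8) & 0xfffff;
}

static uint8_t pr_hop_limit(uint32_t flowlabel_hoplimit)
{
	return flowlabel_hoplimit & 0xff;
}
```

On the wire the multi-byte members are big-endian, so a caller would typically apply ntohs() or ntohl() before using the 16-bit and 32-bit helpers; the single-byte members need no conversion.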
Re: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm
On Thu, Sep 17, 2009 at 5:40 PM, Ira Weiny wei...@llnl.gov wrote:

> On Thu, 17 Sep 2009 17:41:30 -0400
> Hal Rosenstock hal.rosenst...@gmail.com wrote:
>
>> On Thu, Sep 17, 2009 at 4:20 PM, Ira Weiny wei...@llnl.gov wrote:
>>
>>> On Thu, 17 Sep 2009 10:35:39 -0700
>>> Sean Hefty sean.he...@intel.com wrote:
>>>
>>>>>> #define IB_PATH_RECORD_REVERSIBLE 0x80
>>>>>>
>>>>>> struct ib_path_record {
>>>>>>	uint64_t service_id;
>>>>>>	union ibv_gid dgid;
>>>>>>	union ibv_gid sgid;
>>>>>>	uint16_t dlid;
>>>>>>	uint16_t slid;
>>>>>>	uint32_t flowlabel_hoplimit; /* resv-31:28 flow label-27:8 hop limit-7:0*/
>>>>>>	uint8_t tclass;
>>>>>>	uint8_t reversible_numpath; /* reversible-7:7 num path-6:0 */
>>>>>>	uint16_t pkey;
>>>>>>	uint16_t qosclass_sl; /* qos class-15:4 sl-3:0 */
>>>>>>	uint8_t mtu; /* mtu selector-7:6 mtu-5:0 */
>>>>>>	uint8_t rate; /* rate selector-7:6 rate-5:0 */
>>>>>>	uint8_t packetlifetime; /* lifetime selector-7:6 lifetime-5:0 */
>>>>>>	uint8_t preference;
>>>>>>	uint8_t reserved[6];
>>>>>> };
>>>>>
>>>>> I would prefer to use the structures already defined in ib_types.h...
>>>>>
>>>>> I understand your not wanting to make ACM dependant on the OpenSM
>>>>> packages so is it time to move ib_types.h out of the OpenSM tree and
>>>>> somewhere more generic? Perhaps libibumad? This also applies to
>>>>> ib_sa_mad in your 5th patch.
>>>>>
>>>>> OTOH, ib_types.h is a 10K line file with multiple long (10 lines)
>>>>> inlined functions. Perhaps it deserves it's own library?
>>>>
>>>> Defining some of these types in libibumad isn't a bad idea. Although,
>>>> WinOF actually has 2 copies of ib_types.h (that differ...) I find
>>>> using ib_types.h painful given its size; separate header files may
>>>> help.
>>>
>>> Yes I was thinking multiple headers. There seems like there is already
>>> some precedent in ib_cm_types.h (although that entire file seems to be
>>> enclosed in a #ifndef WIN32 clause? So am I wrong on this?) In the end
>>> I would like to make ib_types.h just list the specific headers.
>>>
>>> Sasha, would you be willing to accept such a patch? First move
>>> ib_types.h to umad
>>
>> I'm not sure this is a good idea. ibutils (ibis and ibmgtsim) wants
>> ib_types.h but does not want libibumad.

I miswrote about ibis as it uses osm_vendor layer so it can use libibumad but there are other vendor layers other than osm_vendor_ibumad in use. There are other combinations where umad isn't used (even Windows is not fully moved over still).

> Would a separate library be a better solution then?

Maybe but what aside from the header would be in the library?

-- Hal

> I would prefer that as well.
>
> Ira
>
>> -- Hal
>>
>>> and then move the long inline functions into the lib and separate out
>>> the remaining header. Or would you prefer a new library? I think there
>>> is enough code there but I leave it up to you.
>>>
>>> Ira
>>>
>>>> - Sean
>>>
>>> --
>>> Ira Weiny
>>> Math Programmer/Computer Scientist
>>> Lawrence Livermore National Lab
>>> 925-423-8008
>>> wei...@llnl.gov
>
> --
> Ira Weiny
> Math Programmer/Computer Scientist
> Lawrence Livermore National Lab
> 925-423-8008
> wei...@llnl.gov
Re: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm
On Thu, Sep 17, 2009 at 5:49 PM, Sean Hefty sean.he...@intel.com wrote:

>> I'm not sure this is a good idea. ibutils (ibis and ibmgtsim) wants
>> ib_types.h but does not want libibumad.
>
> Well, libibumad is pretty useless without some network structure
> definitions.

ib_types.h is more akin to what is in libibmad rather than libibumad.

> Currently, the alternatives are to install opensm, which also requires
> installing libibmad, libibcommon, and complib, or for the app to define
> what they need, which is what was done here. I'm not sure how you pick up
> ib_types.h without libibumad getting installed, but you can make a
> reasonable argument that libibumad should define the MAD and SA attribute
> structures.

libibumad is currently transparent to the MAD details. It's the MAD library which knows this and that's more a diagnostic library. In a different world, this might all just be one library...

-- Hal

> - Sean
Re: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm
On Thu, Sep 17, 2009 at 6:11 PM, Ira Weiny wei...@llnl.gov wrote:

> On Thu, 17 Sep 2009 14:49:50 -0700
> Sean Hefty sean.he...@intel.com wrote:
>
>>> I'm not sure this is a good idea. ibutils (ibis and ibmgtsim) wants
>>> ib_types.h but does not want libibumad.
>>
>> Well, libibumad is pretty useless without some network structure
>> definitions.
>>
>> Currently, the alternatives are to install opensm, which also requires
>> installing libibmad, libibcommon, and complib, or for the app to define
>> what they need, which is what was done here. I'm not sure how you pick
>> up ib_types.h without libibumad getting installed, but you can make a
>> reasonable argument that libibumad should define the MAD and SA
>> attribute structures.
>
> Actually, now that I think about it... does ibutils depend on OpenSM
> then?

I think it has to as it uses the OpenSM vendor layer (at least ibis). ibmgtsim is another story.

> I would think that it would be better to have it depend on ibumad rather
> than OpenSM...

This may be historical but it was built on the OpenSM vendor layer before there was umad. Mellanox is best to comment on these aspects.

-- Hal

> :-/
>
> Ok I think I am starting to see why you mention this... Does ibutils
> actually link with anything? It looks like ibutils is using the inline
> functions to effectively make a static link to this functionality? I
> don't see any dependencies on any libs in the Makefile.am's. Is that
> correct? :-/
>
> In this case I don't know that it matters if we move the header. However,
> it would matter if we moved the inline functions... Does ibutils form
> it's own packets and open the mad devices on it's own, outside of ibumad?
> From my quick look it seems it would have to.
>
> Ira
>
>> - Sean
>
> --
> Ira Weiny
> Math Programmer/Computer Scientist
> Lawrence Livermore National Lab
> 925-423-8008
> wei...@llnl.gov
Re: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm
On Thu, Sep 17, 2009 at 7:16 PM, Hal Rosenstock hal.rosenst...@gmail.com wrote:

> On Thu, Sep 17, 2009 at 6:11 PM, Ira Weiny wei...@llnl.gov wrote:
>
>> On Thu, 17 Sep 2009 14:49:50 -0700
>> Sean Hefty sean.he...@intel.com wrote:
>>
>>>> I'm not sure this is a good idea. ibutils (ibis and ibmgtsim) wants
>>>> ib_types.h but does not want libibumad.
>>>
>>> Well, libibumad is pretty useless without some network structure
>>> definitions.
>>>
>>> Currently, the alternatives are to install opensm, which also requires
>>> installing libibmad, libibcommon, and complib, or for the app to
>>> define what they need, which is what was done here. I'm not sure how
>>> you pick up ib_types.h without libibumad getting installed, but you
>>> can make a reasonable argument that libibumad should define the MAD
>>> and SA attribute structures.
>>
>> Actually, now that I think about it... does ibutils depend on OpenSM
>> then?
>
> I think it has to as it uses the OpenSM vendor layer (at least ibis).
> ibmgtsim is another story.

Also, configure takes --with-osm for OpenSM location.

>> I would think that it would be better to have it depend on ibumad
>> rather than OpenSM...
>
> This may be historical but it was built on the OpenSM vendor layer before
> there was umad. Mellanox is best to comment on these aspects.
>
> -- Hal
>
>> :-/
>>
>> Ok I think I am starting to see why you mention this... Does ibutils
>> actually link with anything? It looks like ibutils is using the inline
>> functions to effectively make a static link to this functionality? I
>> don't see any dependencies on any libs in the Makefile.am's. Is that
>> correct? :-/
>>
>> In this case I don't know that it matters if we move the header.
>> However, it would matter if we moved the inline functions... Does
>> ibutils form it's own packets and open the mad devices on it's own,
>> outside of ibumad? From my quick look it seems it would have to.
>>
>> Ira
>>
>>> - Sean
>>
>> --
>> Ira Weiny
>> Math Programmer/Computer Scientist
>> Lawrence Livermore National Lab
>> 925-423-8008
>> wei...@llnl.gov
Re: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm
On Thu, Sep 17, 2009 at 7:12 PM, Hal Rosenstock hal.rosenst...@gmail.com wrote:

> On Thu, Sep 17, 2009 at 5:49 PM, Sean Hefty sean.he...@intel.com wrote:
>
>>> I'm not sure this is a good idea. ibutils (ibis and ibmgtsim) wants
>>> ib_types.h but does not want libibumad.
>>
>> Well, libibumad is pretty useless without some network structure
>> definitions.
>
> ib_types.h is more akin to what is in libibmad rather than libibumad.
>
>> Currently, the alternatives are to install opensm, which also requires
>> installing libibmad, libibcommon, and complib, or for the app to define
>> what they need, which is what was done here. I'm not sure how you pick
>> up ib_types.h without libibumad getting installed, but you can make a
>> reasonable argument that libibumad should define the MAD and SA
>> attribute structures.
>
> libibumad is currently transparent to the MAD details. It's the MAD
> library which knows this and that's more a diagnostic library. In a
> different world, this might all just be one library...

Although not a fit IMO, the pragmatic solution is to move ib_types.h into libibumad. I think it is better there than OpenSM which was never quite right either. That can at least start to eliminate the duplications in this area.

-- Hal

> -- Hal
>
>> - Sean
Re: [ofa-general] [PATCH] opensm: use mgrp pointer in port mcm_info
On Tue, Sep 15, 2009 at 7:26 AM, Hal Rosenstock hal.rosenst...@gmail.com wrote:

> On Tue, Sep 15, 2009 at 6:08 AM, Sasha Khapyorsky sas...@voltaire.com wrote:
>
>> On 08:45 Mon 14 Sep , Hal Rosenstock wrote:
>>>
>>> Does this mean consolidate_ipv6_snm_req does not work now ?
>>
>> No, it doesn't. As you may remember 'consolidate_ipv6_snm_req'
>> workaround does nothing with MGIDs to MLID mapping, but instead enforces
>> all IPv6 SNM matching requests to join a single multicast group (MGID).
>
> Is consolidate_ipv6_snm_req working for you ?

Never mind; My bad. It's working...

-- Hal

> -- Hal
>
>> Sasha
Re: [ofa-general] [PATCH] opensm: use mgrp pointer in port mcm_info
On Tue, Sep 15, 2009 at 6:08 AM, Sasha Khapyorsky sas...@voltaire.com wrote:

> On 08:45 Mon 14 Sep , Hal Rosenstock wrote:
>>
>> Does this mean consolidate_ipv6_snm_req does not work now ?
>
> No, it doesn't. As you may remember 'consolidate_ipv6_snm_req' workaround
> does nothing with MGIDs to MLID mapping, but instead enforces all IPv6
> SNM matching requests to join a single multicast group (MGID).

Is consolidate_ipv6_snm_req working for you?

-- Hal

> Sasha
[ofa-general] [PATCH] perftest: Remove unneeded executable permissions
Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/COPYING b/COPYING
old mode 100755
new mode 100644
diff --git a/Makefile b/Makefile
old mode 100755
new mode 100644
diff --git a/README b/README
old mode 100755
new mode 100644
diff --git a/clock_test.c b/clock_test.c
old mode 100755
new mode 100644
diff --git a/get_clock.c b/get_clock.c
old mode 100755
new mode 100644
diff --git a/get_clock.h b/get_clock.h
old mode 100755
new mode 100644
diff --git a/perftest.spec b/perftest.spec
old mode 100755
new mode 100644
diff --git a/rdma_bw.c b/rdma_bw.c
old mode 100755
new mode 100644
diff --git a/rdma_lat.c b/rdma_lat.c
old mode 100755
new mode 100644
diff --git a/read_bw.c b/read_bw.c
old mode 100755
new mode 100644
diff --git a/read_lat.c b/read_lat.c
old mode 100755
new mode 100644
diff --git a/send_bw.c b/send_bw.c
old mode 100755
new mode 100644
diff --git a/send_lat.c b/send_lat.c
old mode 100755
new mode 100644
diff --git a/write_bw.c b/write_bw.c
old mode 100755
new mode 100644
diff --git a/write_bw_postlist.c b/write_bw_postlist.c
old mode 100755
new mode 100644
diff --git a/write_lat.c b/write_lat.c
old mode 100755
new mode 100644
[ofa-general] [PATCH] perftest: Make rdma_lat, rdma_bw, and clock_test executable names rdma neutral
Since rdma_lat and rdma_bw use RDMA CM, they can be used with both IB and
iWARP so make their executable names neutral (by removing ib_)
IB only tests only require linking with libibverbs
Also, spec file change for executable name changes

Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/Makefile b/Makefile
index 8042531..83c22c3 100755
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,8 @@
-TESTS = write_bw_postlist rdma_lat rdma_bw send_lat send_bw write_lat write_bw read_lat read_bw
+RDMACM_TESTS = rdma_lat rdma_bw
+TESTS = write_bw_postlist send_lat send_bw write_lat write_bw read_lat read_bw
 UTILS = clock_test
 
-all: ${TESTS} ${UTILS}
+all: ${RDMACM_TESTS} ${TESTS} ${UTILS}
 
 CFLAGS += -Wall -g -D_GNU_SOURCE -O2
 EXTRA_FILES = get_clock.c
@@ -10,11 +11,18 @@ EXTRA_HEADERS = get_clock.h
 LOADLIBES +=
 LDFLAGS +=
 
-${TESTS}: LOADLIBES += -libverbs -lrdmacm
+${RDMACM_TESTS} ${UTILS}: LOADLIBES += -libverbs -lrdmacm
+${TESTS}: LOADLIBES += -libverbs
 
-${TESTS} ${UTILS}: %: %.c ${EXTRA_FILES} ${EXTRA_HEADERS}
+${RDMACM_TESTS}: %: %.c ${EXTRA_FILES} ${EXTRA_HEADERS}
+	$(CC) $(CPPFLAGS) $(CFLAGS) $(LDFLAGS) $< ${EXTRA_FILES} $(LOADLIBES) $(LDLIBS) -o $@
+${TESTS}: %: %.c ${EXTRA_FILES} ${EXTRA_HEADERS}
 	$(CC) $(CPPFLAGS) $(CFLAGS) $(LDFLAGS) $< ${EXTRA_FILES} $(LOADLIBES) $(LDLIBS) -o ib_$@
+${UTILS}: %: %.c ${EXTRA_FILES} ${EXTRA_HEADERS}
+	$(CC) $(CPPFLAGS) $(CFLAGS) $(LDFLAGS) $< ${EXTRA_FILES} $(LOADLIBES) $(LDLIBS) -o rdma_$@
+
 clean:
-	$(foreach fname,${TESTS} ${UTILS}, rm -f ib_${fname})
+	$(foreach fname,${RDMACM_TESTS} ${UTILS}, rm -f ${fname})
+	$(foreach fname,${TESTS}, rm -f ib_${fname})
 .DELETE_ON_ERROR:
 .PHONY: all clean
diff --git a/perftest.spec b/perftest.spec
index bd234e1..81ca90a 100755
--- a/perftest.spec
+++ b/perftest.spec
@@ -23,8 +23,8 @@ export CFLAGS=$RPM_OPT_FLAGS
 chmod -x runme
 
 %install
-install -D -m 0755 ib_rdma_lat $RPM_BUILD_ROOT%{_bindir}/ib_rdma_lat
-install -D -m 0755 ib_rdma_bw $RPM_BUILD_ROOT%{_bindir}/ib_rdma_bw
+install -D -m 0755 rdma_lat $RPM_BUILD_ROOT%{_bindir}/rdma_lat
+install -D -m 0755 rdma_bw $RPM_BUILD_ROOT%{_bindir}/rdma_bw
 install -D -m 0755 ib_write_lat $RPM_BUILD_ROOT%{_bindir}/ib_write_lat
 install -D -m 0755 ib_write_bw $RPM_BUILD_ROOT%{_bindir}/ib_write_bw
 install -D -m 0755 ib_send_lat $RPM_BUILD_ROOT%{_bindir}/ib_send_lat
@@ -32,7 +32,7 @@ install -D -m 0755 ib_send_bw $RPM_BUILD_ROOT%{_bindir}/ib_send_bw
 install -D -m 0755 ib_read_lat $RPM_BUILD_ROOT%{_bindir}/ib_read_lat
 install -D -m 0755 ib_read_bw $RPM_BUILD_ROOT%{_bindir}/ib_read_bw
 install -D -m 0755 ib_write_bw_postlist $RPM_BUILD_ROOT%{_bindir}/ib_write_bw_postlist
-install -D -m 0755 ib_clock_test $RPM_BUILD_ROOT%{_bindir}/ib_clock_test
+install -D -m 0755 rdma_clock_test $RPM_BUILD_ROOT%{_bindir}/rdma_clock_test
 
 %clean
 rm -rf ${RPM_BUILD_ROOT}
@@ -43,6 +43,8 @@ rm -rf ${RPM_BUILD_ROOT}
 %_bindir/*
 
 %changelog
+* Sat Apr 18 2009 - hal.rosenst...@gmail.com
+- Change executable names for rdma_lat, rdma_bw, and clock_test
 * Mon Jul 09 2007 - hvo...@suse.de
 - Use correct version
 * Wed Jul 04 2007 - hvo...@suse.de
[ofa-general] [PATCH] opensm/opensm.8.in: Indicate default rule for Default partition
Also, similar change to doc/partition-config.txt

Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/opensm/doc/partition-config.txt b/opensm/doc/partition-config.txt
index f855268..ae3d6f6 100644
--- a/opensm/doc/partition-config.txt
+++ b/opensm/doc/partition-config.txt
@@ -3,10 +3,13 @@ OpenSM Partition configuration
 
 The default name of OpenSM partitions configuration file is
 '/etc/opensm/partitions.conf'. The default may be changed by
-using --Pconfig (-P) option with OpenSM.
+using the --Pconfig (-P) option with OpenSM.
 
 The default partition will be created by OpenSM unconditionally even
 when partition configuration file does not exist or cannot be accessed.
+Effectively, this amounts to the same as if the following line appears in the
+partitions config file:
+Default=0x7fff : ALL=limited, SELF=full ;
 
 The default partition has P_Key value 0x7fff. OpenSM's port will have
 full membership in default partition. All other end ports will have
diff --git a/opensm/man/opensm.8.in b/opensm/man/opensm.8.in
index fcdc168..caee2ef 100644
--- a/opensm/man/opensm.8.in
+++ b/opensm/man/opensm.8.in
@@ -1,4 +1,4 @@
-.TH OPENSM 8 "September 3, 2009" "OpenIB" "OpenIB Management"
+.TH OPENSM 8 "September 13, 2009" "OpenIB" "OpenIB Management"
 
 .SH NAME
 opensm \- InfiniBand subnet manager and administration (SM/SA)
@@ -418,11 +418,15 @@ logrotate purposes.
 .SH PARTITION CONFIGURATION
 .PP
 The default name of OpenSM partitions configuration file is
-\f...@opensm_config_dir@/@partition_config_f...@\fp. The default may be changed by using
---Pconfig (-P) option with OpenSM.
+\f...@opensm_config_dir@/@partition_config_f...@\fp. The default may be changed
+by using the --Pconfig (-P) option with OpenSM.
 
 The default partition will be created by OpenSM unconditionally even
 when partition configuration file does not exist or cannot be accessed.
+Effectively, this amounts to the same as if the following line appears in the
+partitions config file:
+
+Default=0x7fff : ALL=limited, SELF=full ;
 
 The default partition has P_Key value 0x7fff. OpenSM\'s port will have
 full membership in default partition. All other end ports will have
Re: [ofa-general] [PATCH] opensm: use mgrp pointer in port mcm_info
On Sun, Sep 6, 2009 at 11:49 AM, Sasha Khapyorsky sas...@voltaire.com wrote: Port needs to access multicast groups where it is joined to. Now it is implemented by keeping list of mcm_info elements where MLID of each multicast group is stored. Obviously this assumes single MGID to MLID mapping model. Does this mean consolidate_ipv6_snm_req does not work now ? If so, did OFED 1.5 Beta go out this way ? Also, what is the plan/timeframe to restore this functionality ? -- Hal This patch changes this so that instead of MLID mcm_info stores pointer to multicast group object (mgrp). Such model makes it possible to have MGIDs to MLID compression. Signed-off-by: Sasha Khapyorsky sas...@voltaire.com --- opensm/include/opensm/osm_mcm_info.h | 13 +++-- opensm/include/opensm/osm_port.h | 13 +++-- opensm/opensm/osm_drop_mgr.c | 10 +++--- opensm/opensm/osm_mcm_info.c |8 opensm/opensm/osm_port.c | 10 +- opensm/opensm/osm_sm.c |6 +++--- 6 files changed, 29 insertions(+), 31 deletions(-) diff --git a/opensm/include/opensm/osm_mcm_info.h b/opensm/include/opensm/osm_mcm_info.h index dec607f..62ae326 100644 --- a/opensm/include/opensm/osm_mcm_info.h +++ b/opensm/include/opensm/osm_mcm_info.h @@ -47,6 +47,7 @@ #include <iba/ib_types.h> #include <complib/cl_qlist.h> #include <opensm/osm_base.h> +#include <opensm/osm_multicast.h> #ifdef __cplusplus # define BEGIN_C_DECLS extern "C" { @@ -73,15 +74,15 @@ BEGIN_C_DECLS */ typedef struct osm_mcm_info { cl_list_item_t list_item; - ib_net16_t mlid; + osm_mgrp_t *mgrp; } osm_mcm_info_t; /* * FIELDS * list_item * Linkage structure for cl_qlist. MUST BE FIRST MEMBER! * -* mlid -* MLID of this multicast group. +* mgrp +* The pointer to multicast group where this port is member of * * SEE ALSO */ @@ -95,11 +96,11 @@ typedef struct osm_mcm_info { * * SYNOPSIS */ -osm_mcm_info_t *osm_mcm_info_new(IN const ib_net16_t mlid); +osm_mcm_info_t *osm_mcm_info_new(IN osm_mgrp_t *mgrp); /* * PARAMETERS -* mlid -* [in] MLID value for this multicast group. 
+* mgrp +* [in] the pointer to multicast group. * * RETURN VALUES * Pointer to an initialized tree node. diff --git a/opensm/include/opensm/osm_port.h b/opensm/include/opensm/osm_port.h index 7079e74..0e0d3d2 100644 --- a/opensm/include/opensm/osm_port.h +++ b/opensm/include/opensm/osm_port.h @@ -65,6 +65,7 @@ BEGIN_C_DECLS */ struct osm_port; struct osm_node; +struct osm_mgrp; /h* OpenSM/Physical Port * NAME @@ -1420,14 +1421,14 @@ osm_get_port_by_base_lid(IN const osm_subn_t * const p_subn, * SYNOPSIS */ ib_api_status_t -osm_port_add_mgrp(IN osm_port_t * const p_port, IN const ib_net16_t mlid); +osm_port_add_mgrp(IN osm_port_t * const p_port, IN struct osm_mgrp *mgrp); /* * PARAMETERS * p_port * [in] Pointer to an osm_port_t object. * -* mlid -* [in] MLID of the multicast group. +* mgrp +* [in] Pointer to the multicast group. * * RETURN VALUES * IB_SUCCESS @@ -1449,14 +1450,14 @@ osm_port_add_mgrp(IN osm_port_t * const p_port, IN const ib_net16_t mlid); * SYNOPSIS */ void -osm_port_remove_mgrp(IN osm_port_t * const p_port, IN const ib_net16_t mlid); +osm_port_remove_mgrp(IN osm_port_t * const p_port, IN struct osm_mgrp *mgrp); /* * PARAMETERS * p_port * [in] Pointer to an osm_port_t object. * -* mlid -* [in] MLID of the multicast group. +* mgrp +* [in] Pointer to the multicast group. * * RETURN VALUES * None. 
diff --git a/opensm/opensm/osm_drop_mgr.c b/opensm/opensm/osm_drop_mgr.c index c9a4f33..4891bb8 100644 --- a/opensm/opensm/osm_drop_mgr.c +++ b/opensm/opensm/osm_drop_mgr.c @@ -158,7 +158,6 @@ static void drop_mgr_remove_port(osm_sm_t * sm, IN osm_port_t * p_port) osm_port_t *p_port_check; cl_qmap_t *p_sm_guid_tbl; osm_mcm_info_t *p_mcm; - osm_mgrp_t *p_mgrp; cl_ptr_vector_t *p_port_lid_tbl; uint16_t min_lid_ho; uint16_t max_lid_ho; @@ -212,12 +211,9 @@ static void drop_mgr_remove_port(osm_sm_t * sm, IN osm_port_t * p_port) p_mcm = (osm_mcm_info_t *) cl_qlist_remove_head(&p_port->mcm_list); while (p_mcm != (osm_mcm_info_t *) cl_qlist_end(&p_port->mcm_list)) { - p_mgrp = osm_get_mgrp_by_mlid(sm->p_subn, p_mcm->mlid); - if (p_mgrp) { - osm_mgrp_delete_port(sm->p_subn, sm->p_log, -p_mgrp, p_port->guid); - osm_mcm_info_delete((osm_mcm_info_t *) p_mcm); - } + osm_mgrp_delete_port(sm->p_subn, sm->p_log, p_mcm->mgrp,
Re: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts
On Sun, Sep 13, 2009 at 3:55 AM, Doron Shoham dor...@voltaire.com wrote: Hal Rosenstock wrote: On Thu, Sep 10, 2009 at 7:56 AM, Doron Shoham dor...@voltaire.com wrote: ibcheckroutes validates route between all hosts in the fabric. This script finds all leaf switches (switches that are connected to HCAs) This script parses the output of ibnetdiscover. It finds all leaf switches (from the topology file generated by ibnetdiscover). Then it checks if a route exists between all leaf switches using ibtracert. Why leaf switches (and not CAs) ? How are they determined (from the ibnetdiscover output) ? CAs or HCAs ? CAs What about switch port 0s ? It checks connectivity only between leaf switches (not all switches). I assume that traffic is generated only between CAs and therefore connectivity between other switches (not leaf switches) is not important. It's important for a couple of reasons: first PMA access on switches and secondly it's an IBA requirement although some OpenSM routing protocols ignore this. IMO it should be an option (not the default) to add these LIDs in too to the ones checked. and runs ibtracert between them. When using various routing algorithms (e.g. up-down), With which routing algorithms has this been tried ? I assume that from complexity perspective, the routing algorithms calculate routes only between leaf switches and not between all CAs. Then it adds one hop for all CAs connected to the leaf switches. It depends on the routing algorithm (some violate this) but the basic IBA requirement is: C14-62.1.4: From every endport within the subnet, the SM shall provide at least one reversible path to every other endport. -- Hal I've tested it with up-down but it really doesn't matter which routing algorithm you are using. It just checks the routes between leaf switches (and if the routing algorithm behaves as above, it means that it checks all CAs connectivity). 
-- Hal if fabric topology is not suitable there will be no routes between some nodes. It reports when the route exists between source and destination LIDs. Signed-off-by: Doron Shoham dor...@voltaire.com snip...
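As a rough illustration of the leaf-switch approach described in this message (a leaf switch being a switch with at least one CA attached), the following Python sketch picks out the leaf switches from a small topology and compares the number of ordered switch pairs with the number of ordered CA pairs. The topology dictionary and every name in it are hypothetical, parsing of real ibnetdiscover output is not shown, and this is not the code from the ibcheckroutes script.

```python
from itertools import permutations

# Hypothetical topology: switch name -> list of (peer_type, peer_name).
# In practice this would be derived from ibnetdiscover output (not shown).
TOPOLOGY = {
    "sw-leaf-1": [("CA", "host-a"), ("CA", "host-b"), ("SW", "sw-core")],
    "sw-leaf-2": [("CA", "host-c"), ("CA", "host-d"), ("SW", "sw-core")],
    "sw-core": [("SW", "sw-leaf-1"), ("SW", "sw-leaf-2")],
}


def leaf_switches(topology):
    """Return switches with at least one attached CA, sorted by name."""
    return sorted(
        sw for sw, peers in topology.items()
        if any(kind == "CA" for kind, _ in peers)
    )


def attached_cas(topology):
    """Return all CAs attached to any switch, sorted by name."""
    return sorted({name for peers in topology.values()
                   for kind, name in peers if kind == "CA"})


def ordered_pairs(items):
    """All (source, destination) pairs with source != destination."""
    return list(permutations(items, 2))


leaves = leaf_switches(TOPOLOGY)
cas = attached_cas(TOPOLOGY)
print(len(ordered_pairs(leaves)), "leaf-switch pairs vs",
      len(ordered_pairs(cas)), "CA pairs")
```

Checking only leaf-switch pairs means fewer traces to run, but it leaves out CA LIDs and the port 0 LIDs of non-leaf switches, which is the gap raised in the replies above.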
Re: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts
On Mon, Sep 14, 2009 at 2:20 AM, Keshetti Mahesh keshetti.mah...@gmail.com wrote: I have a small question. If there are all 5 Gbps (maximum supported speed) ports except one with 10 Gbps in a subnet then what is the expected behavior of OpenSM while setting active link speed ? It depends on the peer port and the link between them. Does OpenSM force the port with 10 Gbps to operate at 5 Gbps or not ? SM (including OpenSM) sets PortInfo enabled components based on peer ports' supported components and link negotiation determines the active components. So in the case where one port supports 10 Gbps speed and its peer port only supports 5, the SM sets LinkSpeedEnabled components on the peer ports to 5 Gbps (encoded as 3). In the case where the peer port supports 10 Gbps, it is set to 10 Gbps (encoded as 5 or 7 depending on what is supported). The link then negotiates to one of the enabled speeds and sets LinkSpeedActive accordingly. -- Hal -- Keshetti Mahesh On Thu, Sep 10, 2009 at 9:32 PM, Ira Weiny wei...@llnl.gov wrote: Also, iblinkinfo will report links which it finds capable of either faster or wider operation. iblinkinfo checks both ends of the link as Hal mentions. It reports this with output like. Switch 0x0005ad092106 Cisco Switch SFS7000D: ... 78[ ] ==( 4X 2.5 Gbps Active/ LinkUp)== 8 12[ ] MT47396 Infiniscale-III Mellanox Technologies ( Could be 5.0 Gbps) ... Also the portstatus console command in OpenSM will report links which are running at reduced speed or width. Although this does not check the remote port. 
OpenSM $ help portstatus portstatus [ca|switch|router] summarize port status [ca|switch|router] -- limit the results to the node type specified OpenSM $ portstatus ALL port status: 115 port(s) scanned on 9 nodes in 26 us 85 down 30 active 32 at 4X 22 at 2.5 Gbps 8 at 5.0 Gbps 2 at 10.0 Gbps Possible issues: 2 disabled 0x0008f10400411b18 5 (ISR9024D Voltaire) 0x0005ad092106 13 (Cisco Switch SFS7000D) 6 with reduced speed 0x0008f10500200220 33 (Voltaire 4036 - 36 QDR ports switch) 0x0008f10500200220 19 (Voltaire 4036 - 36 QDR ports switch) 0x0005ad092106 21 (Cisco Switch SFS7000D) 0x0005ad092106 20 (Cisco Switch SFS7000D) 0x0005ad092106 9 (Cisco Switch SFS7000D) 0x0005ad092106 8 (Cisco Switch SFS7000D) Ira On Thu, 10 Sep 2009 09:23:35 -0400 Hal Rosenstock hal.rosenst...@gmail.com wrote: On Thu, Sep 10, 2009 at 9:02 AM, Keshetti Mahesh keshetti.mah...@gmail.com wrote: Added 'ibcheckspeed' and 'ibcheckportspeed': Similar to 'ibcheckwidth/ibcheckportwidth' in functionality and implementation. Reports error/warning messages if the LinkSpeedActive is configured as 2.5 Gbps when the LinkSpeedSupported is more than 2.5 Gbps. ibportstate checks for more than this in terms of speed (and width) anomalies. Would it be better for these scripts to use that tool now ? Alternatively, the additional speed/width anomaly checks could be implemented in these scripts but it does involve checking the peer port so there's a little more to it. -- Hal Signed-off-by: Keshetti Mahesh keshetti.mah...@gmail.com --- infiniband-diags/scripts/ibcheckportspeed.in | 146 ++ infiniband-diags/scripts/ibcheckportwidth.in |2 +- infiniband-diags/scripts/ibcheckspeed.in | 135 3 files changed, 282 insertions(+), 1 deletions(-) create mode 100644 infiniband-diags/scripts/ibcheckportspeed.in create mode 100644 infiniband-diags/scripts/ibcheckspeed.in snip... 
-- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 wei...@llnl.gov
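The encodings mentioned in the reply above (3, 5 and 7) are bitmasks in which bit value 1 stands for 2.5 Gbps, 2 for 5.0 Gbps and 4 for 10.0 Gbps. The Python sketch below decodes such masks and models the behaviour described in the reply in a simplified way. The functions enabled_mask and negotiated_speed are assumptions made for illustration (enabled speeds taken as the intersection of both ends' supported speeds, and the link settling on the fastest common speed); they are not OpenSM code.

```python
# Link speed bits as used in PortInfo LinkSpeedSupported / LinkSpeedEnabled.
SPEED_BITS = {1: 2.5, 2: 5.0, 4: 10.0}  # bit value -> Gbps


def decode_speeds(mask):
    """List the speeds (in Gbps) encoded in a link speed bitmask."""
    return [gbps for bit, gbps in sorted(SPEED_BITS.items()) if mask & bit]


def enabled_mask(local_supported, peer_supported):
    """ASSUMPTION: model LinkSpeedEnabled as the speeds both ends support."""
    return local_supported & peer_supported


def negotiated_speed(enabled_a, enabled_b):
    """ASSUMPTION: the link settles on the fastest speed enabled on both ends."""
    common = decode_speeds(enabled_a & enabled_b)
    return max(common) if common else None


# A port supporting 2.5/5/10 Gbps (mask 7) facing a peer limited to 5 Gbps (mask 3).
mask = enabled_mask(7, 3)
print(mask, decode_speeds(mask), negotiated_speed(mask, mask))
```

With both ends supporting mask 7, the same functions yield an enabled mask of 7 and a negotiated speed of 10.0 Gbps.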
Re: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts
On Thu, Sep 10, 2009 at 7:56 AM, Doron Shoham dor...@voltaire.com wrote: ibcheckroutes validates route between all hosts in the fabric. This script finds all leaf switches (switches that are connected to HCAs) and runs ibtracert between them. When using various routing algorithms (e.g. up-down), if fabric topology is not suitable there will be no routes between some nodes. It reports when the route exists between source and destination LIDs. Signed-off-by: Doron Shoham dor...@voltaire.com --- infiniband-diags/Makefile.am |4 +- infiniband-diags/configure.in |1 + infiniband-diags/man/ibcheckroutes.8 | 39 +++ infiniband-diags/scripts/ibcheckroutes.in | 101 + 4 files changed, 143 insertions(+), 2 deletions(-) create mode 100644 infiniband-diags/man/ibcheckroutes.8 create mode 100755 infiniband-diags/scripts/ibcheckroutes.in diff --git a/infiniband-diags/Makefile.am b/infiniband-diags/Makefile.am index 1cdb60e..57363c4 100644 --- a/infiniband-diags/Makefile.am +++ b/infiniband-diags/Makefile.am @@ -33,7 +33,7 @@ sbin_SCRIPTS = scripts/ibcheckerrs scripts/ibchecknet scripts/ibchecknode \ scripts/iblinkinfo.pl scripts/ibprintswitch.pl \ scripts/ibprintca.pl scripts/ibprintrt.pl \ scripts/ibfindnodesusing.pl scripts/ibidsverify.pl \ - scripts/check_lft_balance.pl + scripts/check_lft_balance.pl scripts/ibcheckroutes noinst_LIBRARIES = libcommon.a @@ -76,7 +76,7 @@ man_MANS = man/ibaddr.8 man/ibcheckerrors.8 man/ibcheckerrs.8 \ man/ibprintswitch.8 man/ibprintca.8 man/ibfindnodesusing.8 \ man/ibdatacounts.8 man/ibdatacounters.8 \ man/ibrouters.8 man/ibprintrt.8 man/ibidsverify.8 \ - man/check_lft_balance.8 + man/check_lft_balance.8 man/ibcheckroutes.8 BUILT_SOURCES = ibdiag_version ibdiag_version: diff --git a/infiniband-diags/configure.in b/infiniband-diags/configure.in index 3ef35cc..aa178c5 100644 --- a/infiniband-diags/configure.in +++ b/infiniband-diags/configure.in @@ -158,6 +158,7 @@ AC_CONFIG_FILES([\ scripts/ibcheckportwidth \ scripts/ibcheckstate \ 
scripts/ibcheckwidth \ + scripts/ibcheckroutes \ scripts/ibclearcounters \ scripts/ibclearerrors \ scripts/ibdatacounts \ diff --git a/infiniband-diags/man/ibcheckroutes.8 b/infiniband-diags/man/ibcheckroutes.8 new file mode 100644 index 000..a6a073f --- /dev/null +++ b/infiniband-diags/man/ibcheckroutes.8 @@ -0,0 +1,39 @@ +.TH IBCHECKPORT 8 September 10, 2009 OpenIB OpenIB Diagnostics + +.SH NAME +ibcheckroutes \- validates routes between all hosts in fabric + +.SH SYNOPSIS +.B ibcheckroutes +[\-h] [\-N] [\-b] [\-e] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] + +.SH DESCRIPTION +.PP +ibcheckroutes is a script which uses a full topology file that was created by ibnetdiscover, +scans the network to validate routes between all hosts in the fabric. Based on what has been discussed, this really isn't the case. It only validates routes between leaf switches (at least currently). + +.SH OPTIONS +.PP +\-h Show help. +.PP +\-N Use mono rather than color mode. +.PP +\-b Suppress output. +.PP +\-e Show errors only. +.PP +\-C ca_nameUse the specified ca_name. +.PP +\-P ca_portUse the specified ca_port. +.PP +\-t timeout_ms Override the default timeout for the solicited mads. + +.SH SEE ALSO +.BR ibnetdiscover(8), +.BR ibtracert(8), +.BR ibroute(8) Is ibroute used ? 
+ +.SH AUTHOR +.TP +Doron Shoham +.RI dor...@voltaire.com diff --git a/infiniband-diags/scripts/ibcheckroutes.inb/infiniband-diags/scripts/ ibcheckroutes.in new file mode 100755 index 000..eb3ad30 --- /dev/null +++ b/infiniband-diags/scripts/ibcheckroutes.in @@ -0,0 +1,101 @@ +#!/bin/sh + +IBPATH=${IBPATH:-...@ibscriptpath@} + +function usage() { + echo Usage: `basename $0` [-h] [-N] [-b] [-e] [-C ca_name] [-P ca_port] [-t(imeout) timeout_ms] + echo -evalidate routes between all hosts in fabric + echo -e -h - Show help + echo -e -N - Use mono rather than color mode + echo -e -b - Suppress output + echo -e -e - Show errors only + echo -e -C - Use the specified ca_name + echo -e -P - Use the specified ca_port + echo -e -t - Override the default timeout for the solicited add mads to the end of this + exit -1 +} + +function user_abort() { + echo Aborted + exit 1 +} + +function green() { + if [ $bw = yes ]; then + printf ${res_col}[OK]\n $1 + return + fi + printf \033[1;032m${res_col}[OK]\033[0;39m\n $1 +} + +function red() { + if [ $bw = yes ]; then + printf ${res_col}[FAILED]\n $1 + return + fi + printf
Re: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts
On Mon, Sep 14, 2009 at 11:05 AM, Eli Dorfman (Voltaire) dorfman@gmail.com wrote: Hal Rosenstock wrote: On Mon, Sep 14, 2009 at 10:32 AM, Eli Dorfman (Voltaire) dorfman@gmail.com wrote: Hal Rosenstock wrote: On Sun, Sep 13, 2009 at 3:55 AM, Doron Shoham dor...@voltaire.com wrote: Hal Rosenstock wrote: On Thu, Sep 10, 2009 at 7:56 AM, Doron Shoham dor...@voltaire.com wrote: ibcheckroutes validates route between all hosts in the fabric. This script finds all leaf switches (switches that are connected to HCAs) This script parses the output of ibnetdiscover. It finds all leaf switches (from the topology file generated by ibnetdiscover). Then it checks if a route exists between all leaf switches using ibtracert. Why leaf switches (and not CAs) ? How are they determined (from the ibnetdiscover output) ? How are the leaf switches determined (from core switches) in the ibnetdiscover output ? Is it any switch which has an attached CA versus any switch which has no attached CAs ? yes because there are much less combinations (routes) of leaf switches than CAs. So is the check that there are routes between all the leaf switches ? yes And since we assume that opensm routing builds lid matrix based on switch connectivity than if two switches have route between each other then all CAs that are connected to them will have route to each other. I can't parse this sentence. Also, this should have nothing to do with OpenSM as it is SM independent AFAIT. since we check routes between leaf switches LIDs, for opensm this also assures that we have route between CAs that are attached to them. Not quite. 
It assures there should be a path but not necessarily a route as all the LFTs are not checked with the CA port LIDs. -- Hal In ibnetdiscover you can see to which switch (LID) each CA is connected. Sure. CAs or HCAs ? CAs What about switch port 0s ? It checks connectivity only between leaf switches (not all switches). I assume that traffic is generated only between CAs and therefore connectivity between other switches (not leaf switches) is not important. It's important for a couple of reasons: first PMA access on switches and secondly it's an IBA requirement although some OpenSM routing protocols ignore this. IMO it should be an option (not the default) to add these LIDs in too to the ones checked. Ok, we can add this option once this patch is applied. I have some other specific comments on the patch. Also it may be better to provide the switch LID(s) from which PM is running to reduce number of tested routes. This is in the vein of only checking leaf switch connectivity but is not the IBA general requirement. -- Hal Eli and runs ibtracert between them. When using various routing algorithms (e.g. up-down), With which routing algorithms has this been tried ? I assume that from complexity perspective, the routing algorithms calculate routes only between leaf switches and not between all CAs. Then it adds one hop for all CAs connected to the leaf switches. It depends on the routing algorithm (some violate this) but the basic IBA requirement is: C14-62.1.4: From every endport within the subnet, the SM shall provide at least one reversible path to every other endport. -- Hal I've tested it with up-down but it really doesn't matter which routing algorithm you are using. It just checks the routes between leaf switches (and if the routing algorithm behaves as above, it means that it checks all CAs connectivity). -- Hal if fabric topology is not suitable there will be no routes between some nodes. It reports when the route exists between source
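To illustrate the distinction drawn above between a path and a route, here is a small Python sketch that follows linear forwarding table (LFT) entries hop by hop toward a destination LID. The fabric, the LIDs and the table contents are entirely made up, and the trace function is a simplified stand-in for what a tool such as ibtracert does, not its implementation. In this example every switch LID is reachable, yet the core switch has no LFT entry for the CA's LID, so the trace toward the CA fails even though the physical path exists.

```python
# Hypothetical fabric: two leaf switches joined through one core switch,
# with one CA ("ca-c", LID 10) hanging off leaf2.
# LINKS maps (switch, out_port) -> neighbour node.
LINKS = {
    ("leaf1", 1): "core", ("core", 1): "leaf1",
    ("leaf2", 1): "core", ("core", 2): "leaf2",
    ("leaf2", 2): "ca-c",
}

# LFTS maps switch -> {destination LID: output port}; port 0 is the switch itself.
# Switch LIDs: leaf1 = 1, core = 2, leaf2 = 3.
LFTS = {
    "leaf1": {1: 0, 2: 1, 3: 1, 10: 1},
    "core": {1: 1, 2: 0, 3: 2},           # no entry for LID 10 (ca-c)
    "leaf2": {1: 1, 2: 1, 3: 0, 10: 2},
}


def trace(src_switch, dlid, max_hops=16):
    """Follow LFT entries from src_switch toward dlid.

    Returns the list of nodes visited when the walk terminates at a
    destination, or None when some LFT lacks an entry or a port is unlinked.
    The sketch does not verify that the final end port really owns dlid.
    """
    node, hops = src_switch, [src_switch]
    for _ in range(max_hops):
        out_port = LFTS[node].get(dlid)
        if out_port is None:
            return None                    # this switch cannot forward dlid
        if out_port == 0:
            return hops                    # delivered to the switch itself
        node = LINKS.get((node, out_port))
        if node is None:
            return None                    # output port is not connected
        hops.append(node)
        if node not in LFTS:
            return hops                    # reached an end port (a CA)
    return None                            # gave up: probable forwarding loop


print(trace("leaf1", 3))    # leaf switch to leaf switch
print(trace("leaf1", 10))   # leaf switch to the CA behind leaf2
```

A check limited to switch LIDs would pass on this fabric, which is why tracing toward the CA port LIDs themselves gives a stronger guarantee.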
[ofa-general] [PATCHv2] opensm/opensm.8.in: Indicate default rule for Default partition
Also, similar change to doc/partition-config.txt Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com --- Changes since v1: Fixed Default rule for non SM ports based on comment from Eli diff --git a/opensm/doc/partition-config.txt b/opensm/doc/partition-config.txt index f855268..cb3bcf7 100644 --- a/opensm/doc/partition-config.txt +++ b/opensm/doc/partition-config.txt @@ -3,14 +3,23 @@ OpenSM Partition configuration The default name of OpenSM partitions configuration file is '/etc/opensm/partitions.conf'. The default may be changed by -using --Pconfig (-P) option with OpenSM. +using the --Pconfig (-P) option with OpenSM. The default partition will be created by OpenSM unconditionally even when partition configuration file does not exist or cannot be accessed. -The default partition has P_Key value 0x7fff. OpenSM's port will have -full membership in default partition. All other end ports will have -limited membership. +The default partition has P_Key value 0x7fff. OpenSM's port will always +have full membership in default partition. All other end ports will have +full membership if the partition configuration file is not found or cannot +be accessed, or limited membership if the file exists and can be accessed +but there is no rule for the Default partition. 
+ +Effectively, this amounts to the same as if one of the following rules +below appear in the partition configuration file: +In the case of no rule for the Default partition: +Default=0x7fff : ALL=limited, SELF=full ; +In the case of no partition configuration file or file cannot be accessed: +Default=0x7fff : ALL=full ; File Format diff --git a/opensm/man/opensm.8.in b/opensm/man/opensm.8.in index fcdc168..03002c0 100644 --- a/opensm/man/opensm.8.in +++ b/opensm/man/opensm.8.in @@ -1,4 +1,4 @@ -.TH OPENSM 8 September 3, 2009 OpenIB OpenIB Management +.TH OPENSM 8 September 14, 2009 OpenIB OpenIB Management .SH NAME opensm \- InfiniBand subnet manager and administration (SM/SA) @@ -418,15 +418,29 @@ logrotate purposes. .SH PARTITION CONFIGURATION .PP The default name of OpenSM partitions configuration file is -\f...@opensm_config_dir@/@partition_config_f...@\fp. The default may be changed by using ---Pconfig (-P) option with OpenSM. +\f...@opensm_config_dir@/@partition_config_f...@\fp. The default may be changed +by using the --Pconfig (-P) option with OpenSM. The default partition will be created by OpenSM unconditionally even when partition configuration file does not exist or cannot be accessed. -The default partition has P_Key value 0x7fff. OpenSM\'s port will have -full membership in default partition. All other end ports will have -limited membership. +The default partition has P_Key value 0x7fff. OpenSM\'s port will always +have full membership in default partition. All other end ports will have +full membership if the partition configuration file is not found or cannot +be accessed, or limited membership if the file exists and can be accessed +but there is no rule for the Default partition. + +Effectively, this amounts to the same as if one of the following rules +below appear in the partition configuration file. 
+ +In the case of no rule for the Default partition: + +Default=0x7fff : ALL=limited, SELF=full ; + +In the case of no partition configuration file or file cannot be accessed: + +Default=0x7fff : ALL=full ; + File Format
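The two cases documented by this patch can be summarised as a small decision function. The Python sketch below only restates the wording of the patch; the function name and its two boolean parameters are hypothetical and do not correspond to anything in the OpenSM sources.

```python
def implicit_default_rule(file_accessible, has_default_rule):
    """Return the rule that is effectively applied for the Default partition,
    following the documentation text above, or None when the configuration
    file supplies its own rule for the Default partition."""
    if not file_accessible:
        # No partition configuration file, or it cannot be accessed.
        return "Default=0x7fff : ALL=full ;"
    if not has_default_rule:
        # File exists and is readable but has no rule for Default.
        return "Default=0x7fff : ALL=limited, SELF=full ;"
    return None


print(implicit_default_rule(False, False))
print(implicit_default_rule(True, False))
```

In both implicit cases OpenSM's own port ends up with full membership, either through ALL=full or through SELF=full.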
Re: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts
On Mon, Sep 14, 2009 at 10:32 AM, Eli Dorfman (Voltaire) dorfman@gmail.com wrote: Hal Rosenstock wrote: On Sun, Sep 13, 2009 at 3:55 AM, Doron Shoham dor...@voltaire.com wrote: Hal Rosenstock wrote: On Thu, Sep 10, 2009 at 7:56 AM, Doron Shoham dor...@voltaire.com wrote: ibcheckroutes validates route between all hosts in the fabric. This script finds all leaf switches (switches that are connected to HCAs) This script parses the output of ibnetdiscover. It finds all leaf switches (from the topology file generated by ibnetdiscover). Then it checks if a route exists between all leaf switches using ibtracert. Why leaf switches (and not CAs) ? How are they determined (from the ibnetdiscover output) ? How are the leaf switches determined (from core switches) in the ibnetdiscover output ? Is it any switch which has an attached CA versus any switch which has no attached CAs ? because there are much less combinations (routes) of leaf switches than CAs. So is the check that there are routes between all the leaf switches ? And since we assume that opensm routing builds lid matrix based on switch connectivity than if two switches have route between each other then all CAs that are connected to them will have route to each other. I can't parse this sentence. Also, this should have nothing to do with OpenSM as it is SM independent AFAIT. In ibnetdiscover you can see to which switch (LID) each CA is connected. Sure. CAs or HCAs ? CAs What about switch port 0s ? It checks connectivity only between leaf switches (not all switches). I assume that traffic is generated only between CAs and therefore connectivity between other switches (not leaf switches) is not important. It's important for a couple of reasons: first PMA access on switches and secondly it's an IBA requirement although some OpenSM routing protocols ignore this. 
IMO it should be an option (not the default) to add these LIDs in too to the ones checked. Ok, we can add this option once this patch is applied. I have some other specific comments on the patch. Also it may be better to provide the switch LID(s) from which PM is running to reduce number of tested routes. This is in the vein of only checking leaf switch connectivity but is not the IBA general requirement. -- Hal Eli and runs ibtracert between them. When using various routing algorithms (e.g. up-down), With which routing algorithms has this been tried ? I assume that from complexity perspective, the routing algorithms calculate routes only between leaf switches and not between all CAs. Then it adds one hop for all CAs connected to the leaf switches. It depends on the routing algorithm (some violate this) but the basic IBA requirement is: C14-62.1.4: From every endport within the subnet, the SM shall provide at least one reversible path to every other endport. -- Hal I've tested it with up-down but it really doesn't matter which routing algorithm you are using. It just checks the routes between leaf switches (and if the routing algorithm behaves as above, it means that it checks all CAs connectivity). -- Hal if fabric topology is not suitable there will be no routes between some nodes. It reports when the route exists between source and destination LIDs. Signed-off-by: Doron Shoham dor...@voltaire.com snip...
Re: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts
On Mon, Sep 14, 2009 at 2:02 PM, Ira Weiny wei...@llnl.gov wrote: On Fri, 11 Sep 2009 09:32:39 +0530 Keshetti Mahesh keshetti.mah...@gmail.com wrote: My badness. I have not used 'iblinkinfo' before. So, I guess there is no need for the above script. Apart from that, I feel there should be a program/script which will first scan the fabric to find the maximum common supported width/speed and then report the warning messages of the links/ports which are configured with active width/speed less than the found value. Does any tool already exist which does the same ? Not that I know of. ibportstate does this but is on a per port basis. This could be readily scripted (ad hoc or in tree) for this purpose. -- Hal snip... While I could see the usefulness of such a tool in some environments I have gone down the path of making the OFED diags more generic and then writing some wrappers for our local needs. Currently I have a script which runs iblinkinfo with the -l option and then returns total number of links at SDR, DDR, QDR as well as the number of links at 1, 4, or 12X. I then leave it up to the sys admin to know if their cluster is homogeneous or heterogeneous and how many links should be at what speeds. They can then use iblinkinfo to identify which links are incorrect for their particular installation. Ira - Keshetti Mahesh On Thu, Sep 10, 2009 at 9:32 PM, Ira Weiny wei...@llnl.gov wrote: Also, iblinkinfo will report links which it finds capable of either faster or wider operation. iblinkinfo checks both ends of the link as Hal mentions. It reports this with output like. Switch 0x0005ad092106 Cisco Switch SFS7000D: ... 78[ ] ==( 4X 2.5 Gbps Active/ LinkUp)== 8 12[ ] MT47396 Infiniscale-III Mellanox Technologies ( Could be 5.0 Gbps) ... Also the portstatus console command in OpenSM will report links which are running at reduced speed or width. Although this does not check the remote port. 
OpenSM $ help portstatus portstatus [ca|switch|router] summarize port status [ca|switch|router] -- limit the results to the node type specified OpenSM $ portstatus ALL port status: 115 port(s) scanned on 9 nodes in 26 us 85 down 30 active 32 at 4X 22 at 2.5 Gbps 8 at 5.0 Gbps 2 at 10.0 Gbps Possible issues: 2 disabled 0x0008f10400411b18 5 (ISR9024D Voltaire) 0x0005ad092106 13 (Cisco Switch SFS7000D) 6 with reduced speed 0x0008f10500200220 33 (Voltaire 4036 - 36 QDR ports switch) 0x0008f10500200220 19 (Voltaire 4036 - 36 QDR ports switch) 0x0005ad092106 21 (Cisco Switch SFS7000D) 0x0005ad092106 20 (Cisco Switch SFS7000D) 0x0005ad092106 9 (Cisco Switch SFS7000D) 0x0005ad092106 8 (Cisco Switch SFS7000D) Ira On Thu, 10 Sep 2009 09:23:35 -0400 Hal Rosenstock hal.rosenst...@gmail.com wrote: On Thu, Sep 10, 2009 at 9:02 AM, Keshetti Mahesh keshetti.mah...@gmail.com wrote: Added 'ibcheckspeed' and 'ibcheckportspeed': Similar to 'ibcheckwidth/ibcheckportwidth' in functionality and implementation. Reports error/warning messages if the LinkSpeedActive is configured as 2.5 Gbps when the LinkSpeedSupported is more than 2.5 Gbps. ibportstate checks for more than this in terms of speed (and width) anomalies. Would it be better for these scripts to use that tool now ? Alternatively, the additional speed/width anomaly checks could be implemented in these scripts but it does involve checking the peer port so there's a little more to it. -- Hal Signed-off-by: Keshetti Mahesh keshetti.mah...@gmail.com --- infiniband-diags/scripts/ibcheckportspeed.in | 146 ++ infiniband-diags/scripts/ibcheckportwidth.in |2 +- infiniband-diags/scripts/ibcheckspeed.in | 135 3 files changed, 282 insertions(+), 1 deletions(-) create mode 100644 infiniband-diags/scripts/ibcheckportspeed.in create mode 100644 infiniband-diags/scripts/ibcheckspeed.in snip... 
-- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 wei...@llnl.gov -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 wei...@llnl.gov
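The kind of wrapper described in this message, which totals links per speed and width and leaves interpretation to the administrator, could look roughly like the Python sketch below. The LINKS list is invented sample data standing in for values gathered from a tool such as iblinkinfo; no real output is parsed, and the below_fastest helper uses the fastest active speed seen as a stand-in for the "maximum common supported" value suggested in the thread, which is an assumption.

```python
from collections import Counter

# Hypothetical (width, speed_gbps) per link; real values would come from
# a fabric scan, which is not shown here.
LINKS = [("4X", 2.5), ("4X", 5.0), ("4X", 5.0), ("4X", 10.0), ("1X", 2.5)]


def tally(links):
    """Count links per width and per speed."""
    widths = Counter(width for width, _ in links)
    speeds = Counter(speed for _, speed in links)
    return widths, speeds


def below_fastest(links):
    """ASSUMPTION: flag links slower than the fastest active speed seen."""
    fastest = max(speed for _, speed in links)
    return [link for link in links if link[1] < fastest]


widths, speeds = tally(LINKS)
print(dict(widths), dict(speeds))
print(below_fastest(LINKS))
```

On a deliberately heterogeneous fabric most of the flagged links would be false alarms, which is the reason given above for reporting totals and leaving the judgement to the administrator.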
[ofa-general] [PATCH] opensm/opensm.8.in: Cosmetic formatting change
Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com --- diff --git a/opensm/man/opensm.8.in b/opensm/man/opensm.8.in index 5ad7631..fcdc168 100644 --- a/opensm/man/opensm.8.in +++ b/opensm/man/opensm.8.in @@ -257,7 +257,6 @@ This option provides the means to define a set of ports equalization algorithm. .TP \fB\-w\fR, \fB\-\-hop_weights_file\fR path to file - This option provides weighting factors per port representing a hop cost in computing the lid matrix. The file consists of lines containing a switch port GUID (specified as a 64 bit hex number, with leading 0x), output port number, @@ -265,7 +264,6 @@ and weighting factor. Any port not listed in the file defaults to a weighting factor of 1. Lines starting with # are comments. Weights affect only the output route from the port, so many useful configurations will require weights to be specified in pairs. - .TP \fB\-x\fR, \fB\-\-honor_guid2lid\fR This option forces OpenSM to honor the guid2lid file,
Re: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts
On Fri, Sep 11, 2009 at 2:13 AM, Barry Mavin barry.ma...@recital.com wrote:

> When I start the subnet manager on redhat 5.3 with:
>
> # service opensm restart
>
> I get the following messages in the log file.
>
> Sep 11 11:41:46 252576 [4EF83940] 0x01 - __osm_mcmr_rcv_join_mgrp: ERR 1B12: __validate_more_comp_fields, __validate_port_caps, or JoinState = 0 failed from port 0x0002c90300047b91 (ibas1 HCA-1), sending IB_SA_MAD_STATUS_REQ_INVALID
> Sep 11 11:41:46 252688 [4EF83940] 0x01 - __osm_mcmr_rcv_join_mgrp: ERR 1B12: __validate_more_comp_fields, __validate_port_caps, or JoinState = 0 failed from port 0x0002c90300044c61 (ibds2 HCA-1), sending IB_SA_MAD_STATUS_REQ_INVALID
> Sep 11 11:41:46 252731 [4EF83940] 0x01 - __osm_mcmr_rcv_join_mgrp: ERR 1B12: __validate_more_comp_fields, __validate_port_caps, or JoinState = 0 failed from port 0x0002c90300047b7d (ibds1 HCA-1), sending IB_SA_MAD_STATUS_REQ_INVALID
> Sep 11 11:41:46 252929 [43170940] 0x01 - __osm_mcmr_rcv_join_mgrp: ERR 1B12: __validate_more_comp_fields, __validate_port_caps, or JoinState = 0 failed from port 0x0002c90300047b75 (ibas2 HCA-1), sending IB_SA_MAD_STATUS_REQ_INVALID
>
> What is the cause of these?

Those ports are unable to join some multicast group likely due to rate or MTU mismatch with the group. What are their rates/MTUs ? See opensm man page on partition configuration for the default partition for information on how to change the MTU/rate.

-- Hal

> ---
> Regards
> Barry Mavin
> Recital Corporation
> Chairman and CEO
> Website: http://www.recital.com
> MSN Messenger: barry_ma...@msn.com
> Skype: BarryMavin
> Direct line worldwide: +1 9785224139
>
> From: Keshetti Mahesh keshetti.mah...@gmail.com
> Date: Fri, 11 Sep 2009 11:16:44 +0530
> To: Barry Mavin barry.ma...@recital.com
> Cc: OFED mailing list linux-r...@vger.kernel.org, OFED mailing list general@lists.openfabrics.org
> Subject: Re: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts
>
> > # ibtracert 10.10.10.1 10.10.10.3
>
> ibtracert only supports source/destination addresses to be specified in LID/GUID format. See man page of ibtracert.
>
> - Keshetti Mahesh
> --
> To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ofa-general] question about partitioning IB networks
On Fri, Sep 11, 2009 at 11:18 AM, Sasha Khapyorsky sas...@voltaire.com wrote:

> On 09:17 Thu 03 Sep , Hal Rosenstock wrote:
>
> > > Also it says the default partition will be created “unconditionally even when partition configuration file does not exist or cannot be accessed.” Will it also be created if the partition configuration file exists but does not have a default partition defined?
> >
> > No.
>
> AFAIR OpenSM will create the default partition even before partitions config file parsing, when the file exists it will be:
>
> Default=0x7fff: ALL=limited, SELF=full;
>
> (no IPoIB there). And this can be overwritten by configuration.

Indeed it does (since the default partition is always required). The man page (and partition doc) should be updated to clarify this.

-- Hal

> Sasha
Re: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts
On Thu, Sep 10, 2009 at 7:56 AM, Doron Shoham dor...@voltaire.com wrote:

> ibcheckroutes validates route between all hosts in the fabric.
> This script finds all leaf switches (switches that are connected to HCAs)

CAs or HCAs ? What about switch port 0s ?

> and runs ibtracert between them.
> When using various routing algorithms (e.g. up-down),

With which routing algorithms has this been tried ?

-- Hal

> if fabric topology is not suitable there will be no routes between some nodes.
> It reports when the route exists between source and destination LIDs.
>
> Signed-off-by: Doron Shoham dor...@voltaire.com

snip...
[ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts
On Thu, Sep 10, 2009 at 9:02 AM, Keshetti Mahesh keshetti.mah...@gmail.com wrote:

> Added 'ibcheckspeed' and 'ibcheckportspeed': Similar to 'ibcheckwidth/ibcheckportwidth' in functionality and implementation. Reports error/warning messages if the LinkSpeedActive is configured as 2.5 Gbps when the LinkSpeedSupported is more than 2.5 Gbps.

ibportstate checks for more than this in terms of speed (and width) anomalies. Would it be better for these scripts to use that tool now ? Alternatively, the additional speed/width anomaly checks could be implemented in these scripts but it does involve checking the peer port so there's a little more to it.

-- Hal

> Signed-off-by: Keshetti Mahesh keshetti.mah...@gmail.com
> ---
> infiniband-diags/scripts/ibcheckportspeed.in | 146 ++
> infiniband-diags/scripts/ibcheckportwidth.in |2 +-
> infiniband-diags/scripts/ibcheckspeed.in | 135
> 3 files changed, 282 insertions(+), 1 deletions(-)
> create mode 100644 infiniband-diags/scripts/ibcheckportspeed.in
> create mode 100644 infiniband-diags/scripts/ibcheckspeed.in

snip...
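
For readers who want the rule spelled out, the check described for 'ibcheckportspeed' could be sketched in C roughly as below. This is only an illustration of the comparison, not the script itself (the posted scripts are shell). The helper name speed_is_reduced is hypothetical, and the sketch assumes the PortInfo bitmask encoding in which 1 means 2.5 Gbps, 2 means 5.0 Gbps and 4 means 10.0 Gbps.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed PortInfo link speed encoding (bitmask): 1 = 2.5, 2 = 5.0, 4 = 10.0 Gbps. */
#define SPEED_2_5_GBPS 0x1u

/*
 * Hypothetical helper: true when the link is active at 2.5 Gbps although
 * LinkSpeedSupported advertises at least one faster speed.
 */
static bool speed_is_reduced(uint8_t link_speed_active, uint8_t link_speed_supported)
{
	bool active_is_slowest = (link_speed_active == SPEED_2_5_GBPS);
	bool faster_supported = (link_speed_supported & ~SPEED_2_5_GBPS & 0xFu) != 0;

	return active_is_slowest && faster_supported;
}
```

As Hal notes, a fuller check would also look at the peer port, since a link can legitimately run at 2.5 Gbps when the other end supports nothing faster; the sketch above deliberately leaves that out.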
[ofa-general] [PATCHv2] opensm/osm_inform.c: For traps 64-67, use GID from DataDetails in log message
Issuer GID is uninteresting for SM generated notices

Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
Changes since v1:
Unified OSM_LOG call for traps 64-67
Also, added log level check

diff --git a/opensm/opensm/osm_inform.c b/opensm/opensm/osm_inform.c
index 990f1e0..6e1a2b5 100644
--- a/opensm/opensm/osm_inform.c
+++ b/opensm/opensm/osm_inform.c
@@ -312,7 +312,7 @@ static ib_api_status_t send_report(IN osm_infr_t * p_infr_rec, /* the informinfo
 	/* it is better to use LIDs since the GIDs might not be there for SMI traps */
 	OSM_LOG(p_log, OSM_LOG_DEBUG, "Forwarding Notice Event from LID:%u"
-		" to InformInfo LID: %u TID:0x%X\n",
+		" to InformInfo LID:%u TID:0x%X\n",
 		cl_ntoh16(p_ntc->issuer_lid),
 		cl_ntoh16(p_infr_rec->report_addr.dest_lid),
 		trap_fwd_trans_id);
@@ -545,6 +545,7 @@ ib_api_status_t osm_report_notice(IN osm_log_t * p_log, IN osm_subn_t * p_subn,
 	cl_list_t infr_to_remove_list;
 	osm_infr_t *p_infr_rec;
 	osm_infr_t *p_next_infr_rec;
+	ib_gid_t *p_gid;
 
 	OSM_LOG_ENTER(p_log);
 
@@ -559,8 +560,18 @@ ib_api_status_t osm_report_notice(IN osm_log_t * p_log, IN osm_subn_t * p_subn,
 		return (IB_ERROR);
 	}
 
+	if (!osm_log_is_active(p_log, OSM_LOG_INFO))
+		goto skip_log;
+
 	/* an official Event information log */
-	if (ib_notice_is_generic(p_ntc))
+	if (ib_notice_is_generic(p_ntc)) {
+		if ((p_ntc->g_or_v.generic.trap_num == CL_HTON16(64)) ||
+		    (p_ntc->g_or_v.generic.trap_num == CL_HTON16(65)) ||
+		    (p_ntc->g_or_v.generic.trap_num == CL_HTON16(66)) ||
+		    (p_ntc->g_or_v.generic.trap_num == CL_HTON16(67)))
+			p_gid = (ib_gid_t *)p_ntc->data_details.ntc_64_67.gid.raw;
+		else
+			p_gid = (ib_gid_t *)p_ntc->issuer_gid.raw;
 		OSM_LOG(p_log, OSM_LOG_INFO,
 			"Reporting Generic Notice type:%u num:%u (%s)"
 			" from LID:%u GID:%s\n",
@@ -568,9 +579,8 @@ ib_api_status_t osm_report_notice(IN osm_log_t * p_log, IN osm_subn_t * p_subn,
 			cl_ntoh16(p_ntc->g_or_v.generic.trap_num),
 			ib_get_trap_str(p_ntc->g_or_v.generic.trap_num),
 			cl_ntoh16(p_ntc->issuer_lid),
-			inet_ntop(AF_INET6, p_ntc->issuer_gid.raw, gid_str,
-				  sizeof gid_str));
-	else
+			inet_ntop(AF_INET6, p_gid->raw, gid_str, sizeof gid_str));
+	} else
 		OSM_LOG(p_log, OSM_LOG_INFO,
 			"Reporting Vendor Notice type:%u vend:%u dev:%u"
 			" from LID:%u GID:%s\n",
@@ -581,6 +591,7 @@ ib_api_status_t osm_report_notice(IN osm_log_t * p_log, IN osm_subn_t * p_subn,
 			inet_ntop(AF_INET6, p_ntc->issuer_gid.raw, gid_str,
 				  sizeof gid_str));
 
+skip_log:
 	/* Create a list that will hold all the infr records that should
 	   be removed due to violation. o13-17.1.2 */
 	cl_list_construct(&infr_to_remove_list);
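
The GID selection this patch introduces can be illustrated in isolation. The sketch below is a simplified stand-in, not OpenSM code: the struct and function names are hypothetical, the trap number is kept in host byte order, and the two GIDs are flattened into one small struct instead of the real notice layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the notice fields used in the patch. */
struct gid {
	uint8_t raw[16];
};

struct notice {
	uint16_t trap_num;	/* host byte order in this sketch */
	struct gid issuer_gid;	/* GID of the notice issuer */
	struct gid details_gid;	/* GID carried in DataDetails for traps 64-67 */
};

/* Traps 64 through 67 carry the GID of interest in DataDetails. */
static bool trap_uses_details_gid(uint16_t trap_num)
{
	return trap_num >= 64 && trap_num <= 67;
}

/* Pick the GID that is worth printing in the log line. */
static const struct gid *gid_to_log(const struct notice *ntc)
{
	return trap_uses_details_gid(ntc->trap_num) ? &ntc->details_gid
						    : &ntc->issuer_gid;
}
```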
[ofa-general] [PATCH] infiniband-diags/ibportstate: Support changing of link width
Also, update man page

Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/infiniband-diags/man/ibportstate.8 b/infiniband-diags/man/ibportstate.8
index 9b5e618..b64c18d 100644
--- a/infiniband-diags/man/ibportstate.8
+++ b/infiniband-diags/man/ibportstate.8
@@ -1,4 +1,4 @@
-.TH IBPORTSTATE 8 "October 19, 2006" "OpenIB" "OpenIB Diagnostics"
+.TH IBPORTSTATE 8 "September 8, 2009" "OpenIB" "OpenIB Diagnostics"
 
 .SH NAME
 ibportstate \- handle port (physical) state and link speed of an InfiniBand port
@@ -23,16 +23,19 @@ also allows the link speed enabled on any IB port to be adjusted.
 .TP
 op
 Port operations allowed
- supported ops: enable, disable, reset, speed, query
+ supported ops: enable, disable, reset, speed, width, query
  Default is query
 .PP
  ops enable, disable, and reset are only allowed on switch ports
  (An error is indicated if attempted on CA or router ports)
- speed op is allowed on any port
+ speed and width ops are allowed on any port
  speed values are legal values for PortInfo:LinkSpeedEnabled
  (An error is indicated if PortInfo:LinkSpeedSupported does not support
  this setting)
- (NOTE: Speed changes are not effected until the port goes through
+ width valyes are legal values for PortInfo:LinkWidthEnabled
+ (An error is indicated if PortInfo:LinkWidthSupported does not support
+ this setting)
+ (NOTE: Speed and width changes are not effected until the port goes through
  link renegotiation)
  query also validates port characteristics (link width and speed)
  based on the peer port. This checking is done when the port
@@ -108,8 +111,10 @@ ibportstate -D 0 1 # (query) by direct route
 ibportstate 3 1 reset # by lid
 .PP
 ibportstate 3 1 speed 1# by lid
+.PP
+ibportstate 3 1 width 1# by lid
 
 .SH AUTHOR
 .TP
 Hal Rosenstock
-.RI h...@voltaire.com
+.RI hal.rosenst...@gmail.com
diff --git a/infiniband-diags/src/ibportstate.c b/infiniband-diags/src/ibportstate.c
index 76e74f7..d20961f 100644
--- a/infiniband-diags/src/ibportstate.c
+++ b/infiniband-diags/src/ibportstate.c
@@ -204,6 +204,7 @@ int main(int argc, char **argv)
 	int err;
 	int port_op = 0;	/* default to query */
 	int speed = 15;
+	int new_width = 255;
 	int is_switch = 1;
 	int state, physstate, lwe, lws, lwa, lse, lss, lsa;
 	int peerlocalportnum, peerlwe, peerlws, peerlwa, peerlse, peerlss,
@@ -216,13 +217,14 @@ int main(int argc, char **argv)
 	int selfport = 0;
 
 	char usage_args[] = "<dest dr_path|lid|guid> <portnum> [<op>]\n"
-	    "\nSupported ops: enable, disable, reset, speed, query";
+	    "\nSupported ops: enable, disable, reset, speed, width, query";
 	const char *usage_examples[] = {
 		"3 1 disable\t\t\t# by lid",
 		"-G 0x2C9000100D051 1 enable\t# by guid",
 		"-D 0 1\t\t\t# (query) by direct route",
 		"3 1 reset\t\t\t# by lid",
 		"3 1 speed 1\t\t\t# by lid",
+		"3 1 width 1\t\t\t# by lid",
 		NULL
 	};
 
@@ -263,6 +265,15 @@ int main(int argc, char **argv)
 			speed = strtoul(argv[3], 0, 0);
 			if (speed > 15)
 				IBERROR("invalid speed value %d", speed);
+		} else if (!strcmp(argv[2], "width")) {
+			if (argc < 4)
+				IBERROR
+				    ("width requires an additional parameter");
+			port_op = 5;
+			/* Parse width value */
+			new_width = strtoul(argv[3], 0, 0);
+			if (new_width > 255)
+				IBERROR("invalid width value %d", new_width);
 		}
 	}
 
@@ -298,6 +309,11 @@ int main(int argc, char **argv)
 				      speed);
 			mad_set_field(data, 0, IB_PORT_STATE_F, 0);
 			mad_set_field(data, 0, IB_PORT_PHYS_STATE_F, 0);
+		} else if (port_op == 5) {	/* Set width */
+			mad_set_field(data, 0, IB_PORT_LINK_WIDTH_ENABLED_F,
+				      new_width);
+			mad_set_field(data, 0, IB_PORT_STATE_F, 0);
+			mad_set_field(data, 0, IB_PORT_PHYS_STATE_F, 0);
 		}
 
 		err = set_port_info(portid, data, portnum, port_op);
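
The man page text in this patch says an error is indicated when PortInfo:LinkWidthSupported does not support the requested width, while the C hunk shown here only range-checks the value against 255. A stricter client-side check might look like the sketch below. It is an assumption-laden illustration, not part of the patch: the function name is hypothetical, and it assumes the usual PortInfo encoding in which width values are bitmasks (1 = 1X, 2 = 4X, 4 = 8X, 8 = 12X), 0 means no change and 255 means "set to LinkWidthSupported".

```c
#include <stdbool.h>
#include <stdint.h>

#define WIDTH_NO_CHANGE     0u		/* assumed: leaves LinkWidthEnabled untouched */
#define WIDTH_SET_SUPPORTED 255u	/* assumed: resets to LinkWidthSupported */

/* Hypothetical helper: may `requested` be written to LinkWidthEnabled? */
static bool width_request_is_valid(unsigned requested, uint8_t link_width_supported)
{
	if (requested > 255u)
		return false;
	if (requested == WIDTH_NO_CHANGE || requested == WIDTH_SET_SUPPORTED)
		return true;
	/* every requested width bit must also be set in the supported mask */
	return (requested & ~(unsigned)link_width_supported) == 0;
}
```

With a port that supports 1X and 4X (LinkWidthSupported of 3), a request of 1, 2 or 3 would pass and a request of 8 (12X) would be rejected before any MAD is sent.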
[ofa-general] [PATCH] opensm doc: Indicated limited (rather than partial) partition membership
Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/opensm/doc/partition-config.txt b/opensm/doc/partition-config.txt
index ead3f76..f855268 100644
--- a/opensm/doc/partition-config.txt
+++ b/opensm/doc/partition-config.txt
@@ -10,7 +10,7 @@ when partition configuration file does not exist or cannot be accessed.
 
 The default partition has P_Key value 0x7fff. OpenSM's port will have
 full membership in default partition. All other end ports will have
-partial membership.
+limited membership.
 
 
 File Format
diff --git a/opensm/man/opensm.8.in b/opensm/man/opensm.8.in
index b23a973..5ad7631 100644
--- a/opensm/man/opensm.8.in
+++ b/opensm/man/opensm.8.in
@@ -1,4 +1,4 @@
-.TH OPENSM 8 "May 28, 2009" "OpenIB" "OpenIB Management"
+.TH OPENSM 8 "September 3, 2009" "OpenIB" "OpenIB Management"
 .SH NAME
 opensm \- InfiniBand subnet manager and administration (SM/SA)
 
@@ -428,7 +428,7 @@ when partition configuration file does not exist or cannot be accessed.
 
 The default partition has P_Key value 0x7fff. OpenSM\'s port will have
 full membership in default partition. All other end ports will have
-partial membership.
+limited membership.
 
 File Format
Re: [ofa-general] question about partitioning IB networks
On Mon, Aug 31, 2009 at 3:29 PM, Meyer, Donald J donald.j.me...@intel.com wrote:

> I am trying to partition my IB network but I don’t seem to be able to understand the opensm man page.
>
> First it says “The default partition has P_Key value 0x7fff. OpenSM's port will have full membership in default partition. All other end ports will have partial membership.” but I don’t see the difference defined between full and partial membership anywhere. Is it possible the reference was to full and limited membership instead?

Yes, partial == limited. I've just sent a patch to change that word in the man page and doc.

> Does this partition have to exist on all CA’s so the SM can “talk” to them?

Yes, this is an IBA requirement.

> Also it says the default partition will be created “unconditionally even when partition configuration file does not exist or cannot be accessed.” Will it also be created if the partition configuration file exists but does not have a default partition defined?

No.

> Second, I see where CA’s can be members of multiple partitions (have multiple P_keys). If a CA is in multiple partitions (has multiple P_Keys assigned to it), which partition does it “send” on when the CA has packets to send if more than one partition can reach the destination CA?

That's up to the application/ULP to set the proper PKey index. The application/ULP needs to ensure the destination is reachable via a common PKey. It does that via some sort of PathRecord request to the SA.

> Also do switches (or any non CA’s) have to have P_Keys assigned for any reason?

Yes, but with OpenSM they do not need configuration. OpenSM detects which switches are leaf switches with peer CA ports and sets up their partition tables appropriately.

> Just as a sanity check, my interpretation so far is that my network should have a partition configuration file similar to the following. Can anyone tell me if I have this correct? In this example configuration, I am trying to create two partitions. One with rack one and two, the other with rack three and four:
>
> #Default partition (for SM control of the CA’s)
> Default=0x7fff,ipoib,rate=7:ALL=limited;
> Default=0x7fff,ipoib,rate=7:ALL,SELF=full;
> #rack1
> rack1=0x111,ipoib,rate=7,defmember=full:GUID_list;
> #rack2
> rack2=0x111,ipoib,rate=7,defmember=full:GUID_list;
> #rack3
> rack3=0x112,ipoib,rate=7,defmember=full:GUID_list;
> #rack4
> rack4=0x112,ipoib,rate=7,defmember=full:GUID_list;

I've never done it this way but it does look like the partition create code will detect the duplicated partitions (0x111 and 0x112) and merge ports from rack2 with rack1 and rack4 with rack3.

-- Hal

> *Thanks,*
> *Don Meyer*
> *Senior Network/System Engineer/Programmer*
> US+ (253) 371-9532 iNet 8-371-9532
> **Other names and brands may be claimed as the property of others*
Re: [ofa-general] question about partitioning IB networks
On Thu, Sep 3, 2009 at 9:43 AM, Yevgeny Kliteynik klit...@dev.mellanox.co.il wrote:

Hal Rosenstock wrote:

On Mon, Aug 31, 2009 at 3:29 PM, Meyer, Donald J donald.j.me...@intel.com wrote:

... ...

Just as a sanity check, my interpretation so far is that my network should have a partition configuration file similar to the following. Can anyone tell me if I have this correct? In this example configuration, I am trying to create two partitions. One with rack one and two, the other with rack three and four:

#Default partition (for SM control of the CA’s)
Default=0x7fff,ipoib,rate=7:ALL=limited;
Default=0x7fff,ipoib,rate=7:ALL,SELF=full;
#rack1
rack1=0x111,ipoib,rate=7,defmember=full:GUID_list;
#rack2
rack2=0x111,ipoib,rate=7,defmember=full:GUID_list;
#rack3
rack3=0x112,ipoib,rate=7,defmember=full:GUID_list;
#rack4
rack4=0x112,ipoib,rate=7,defmember=full:GUID_list;

I've never done it this way but it does look like the partition create code will detect the duplicated partitions (0x111 and 0x112) and merge ports from rack2 with rack1 and rack4 with rack3.

It will. Note that partition names are meaningless in terms of IB management. Basically they are used just for logging. The only real partition ID is its pkey. The low 15 bits of the pkey (that is, the pkey without the membership bit) denote the partition.

-- Hal

-- Yevgeny
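
To make the bit layout concrete: a P_Key is 16 bits wide, the top bit is the membership bit, and the remaining low 15 bits identify the partition. A minimal sketch follows; the macro and function names are illustrative and are not taken from OpenSM.

```c
#include <stdbool.h>
#include <stdint.h>

#define PKEY_MEMBERSHIP_BIT 0x8000u	/* top bit: 1 = full member, 0 = limited member */
#define PKEY_BASE_MASK      0x7fffu	/* low 15 bits: the partition itself */

/* Partition identifier with the membership bit stripped. */
static uint16_t pkey_base(uint16_t pkey)
{
	return (uint16_t)(pkey & PKEY_BASE_MASK);
}

static bool pkey_is_full_member(uint16_t pkey)
{
	return (pkey & PKEY_MEMBERSHIP_BIT) != 0;
}

/* Two P_Keys name the same partition when their low 15 bits agree. */
static bool pkey_same_partition(uint16_t a, uint16_t b)
{
	return pkey_base(a) == pkey_base(b);
}
```

Under this layout the default partition 0x7fff appears as 0xffff in the table of a full member and as 0x7fff in the table of a limited member, and rack1 and rack2 in the example above (both 0x111) are one partition regardless of their names.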
Re: [ofa-general] question about partitioning IB networks
Don,

On Thu, Sep 3, 2009 at 1:12 PM, Meyer, Donald J donald.j.me...@intel.com wrote:

> Hal,
>
> If you would like to use my example configuration in the man page (the one with realistic GUID’s) please feel free to do so. The GUID’s are all imaginary but realistic.
>
> Are you sure the default partition should be “Default=0x7fff,ipoib,rate=7:ALL,SELF=full;” and not “Default=0x7fff:SELF=full,ALL=limited;”? The second version forces both known and unknown CA’s to be unable to reach any CA but the sm except via their own partition.

I thought that was what you wanted. I thought you only wanted the CAs to be able to talk with each other on the designated non default partitions.

> It also seems to me that the first version bypasses partitioning by allowing CA’s to use the default partition to reach other CA’s not in the same partition.

You mean that since they are all full members of the default partition they can talk to each other on that partition despite that not being allowed on some other partition. If so, yes.

> Also, if you would like, I would be happy to work on a version of the man page where I would try to possibly explain a bit more and have more complete examples.

Sure; if you want you are welcome to post patches to the list for review, comment, etc.

-- Hal

> *Thanks,*
> *Don Meyer*
> *Senior Network/System Engineer/Programmer*
> US+ (253) 371-9532 iNet 8-371-9532
> **Other names and brands may be claimed as the property of others*
>
> *From:* Hal Rosenstock [mailto:hal.rosenst...@gmail.com]
> *Sent:* Thursday, September 03, 2009 6:46 AM
> *To:* klit...@dev.mellanox.co.il
> *Cc:* Meyer, Donald J; general@lists.openfabrics.org
> *Subject:* Re: [ofa-general] question about partitioning IB networks

snip...
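
The behaviour discussed in this exchange, where limited members of the default partition can reach the SM but not each other, follows from the P_Key matching rule: two ports can communicate on a partition only if the low 15 bits of their P_Keys match and at least one of the two is a full member. A small sketch of that rule, with a hypothetical function name:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: can two ports holding these P_Keys talk to each other?
 * The partitions must match (low 15 bits) and at least one side must be a
 * full member (top bit set). Two limited members never match.
 */
static bool pkeys_can_communicate(uint16_t a, uint16_t b)
{
	bool same_partition = ((a ^ b) & 0x7fffu) == 0;
	bool one_is_full = ((a | b) & 0x8000u) != 0;

	return same_partition && one_is_full;
}
```

With "SELF=full, ALL=limited" the SM port holds 0xffff and every other port holds 0x7fff, so each CA matches the SM but no CA matches another CA on the default partition. With "ALL,SELF=full" every port holds 0xffff and all of them match, which is the bypass Don describes.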
[ofa-general] [PATCH] libibmad/dump.c: In mad_dump_portcapmask, decode new capabilities
Per published MgtWG errata
RefID 4484 - vendor specific MADs table support
RefID 4626 - reverse path PKey support in PathRecord responses
RefID 4635 - multicast FDB top support
RefID 4644 - hierarchy support

Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/libibmad/src/dump.c b/libibmad/src/dump.c
index d97d359..1b287c0 100644
--- a/libibmad/src/dump.c
+++ b/libibmad/src/dump.c
@@ -2,6 +2,7 @@
  * Copyright (c) 2004-2008 Voltaire Inc. All rights reserved.
  * Copyright (c) 2007 Xsigo Systems Inc. All rights reserved.
  * Copyright (c) 2009 Mellanox Technologies LTD. All rights reserved.
+ * Copyright (c) 2009 HNR Consulting. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses. You may choose to be licensed under the terms of the GNU
@@ -519,8 +520,14 @@ void mad_dump_portcapmask(char *buf, int bufsz, void *val, int valsz)
 	if (mask & (1 << 27))
 		s += sprintf(s,
 			     "\t\t\t\tIsLinkSpeedWidthPairsTableSupported\n");
+	if (mask & (1 << 28))
+		s += sprintf(s, "\t\t\t\tIsVendorSpecificMadsTableSupported\n");
+	if (mask & (1 << 29))
+		s += sprintf(s, "\t\t\t\tIsiMcastPkeyTrapSuppressionSupported\n");
 	if (mask & (1 << 30))
 		s += sprintf(s, "\t\t\t\tIsMulticastFDBTopSupported\n");
+	if (mask & (1 << 31))
+		s += sprintf(s, "\t\t\t\tIsHierarchyInfoSupported\n");
 
 	if (s != buf)
 		*(--s) = 0;
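
As a standalone illustration of the decoding this patch adds, the sketch below walks a small table of the four high CapabilityMask bits and appends the name of each bit that is set. It is not libibmad code: the table, the function name and the bounded snprintf handling are assumptions made for the example, and the bit 29 name is written here without the stray "i" that appears in the posted patch.

```c
#include <stdint.h>
#include <stdio.h>

struct cap_bit {
	unsigned bit;
	const char *name;
};

/* Bits 28-31 of PortInfo:CapabilityMask, named as in the patch above. */
static const struct cap_bit new_cap_bits[] = {
	{ 28, "IsVendorSpecificMadsTableSupported" },
	{ 29, "IsMcastPkeyTrapSuppressionSupported" },
	{ 30, "IsMulticastFDBTopSupported" },
	{ 31, "IsHierarchyInfoSupported" },
};

/*
 * Hypothetical helper: writes one name per line for every set bit.
 * `mask` is in host byte order. Returns the number of bytes written,
 * not counting the terminating NUL.
 */
static size_t describe_new_caps(uint32_t mask, char *buf, size_t bufsz)
{
	size_t used = 0;

	if (bufsz == 0)
		return 0;
	buf[0] = '\0';
	for (size_t i = 0; i < sizeof(new_cap_bits) / sizeof(new_cap_bits[0]); i++) {
		int n;

		if (!(mask & (UINT32_C(1) << new_cap_bits[i].bit)))
			continue;
		n = snprintf(buf + used, bufsz - used, "%s\n", new_cap_bits[i].name);
		if (n < 0 || (size_t)n >= bufsz - used) {
			buf[used] = '\0';	/* drop a partially written name */
			break;
		}
		used += (size_t)n;
	}
	return used;
}
```

The sketch shifts an unsigned constant rather than a plain int so that testing bit 31 does not shift into the sign bit.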
[ofa-general] [PATCH] opensm: Add support for MulticastFDBTop
Add support for SwitchInfo:MulticastFDBTop
Added by MgtWG errata #4505-4508

Also, per MgtWG RefID #4640, MulticastFDBTop value of 0xbfff means no entries

In osm_mcast_mgr.c:mcast_mgr_set_mftables call new routine mcast_mgr_set_mfttop to set MulticastFDBTop in SwitchInfo based on max_block_in_use when switch port 0 indicates IsMulticastFDBTop is supported.

Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c
index d7c5ce1..3671e08 100644
--- a/opensm/opensm/osm_mcast_mgr.c
+++ b/opensm/opensm/osm_mcast_mgr.c
@@ -1066,6 +1066,83 @@ Exit:
 
 /**
  **/
+static void mcast_mgr_set_mfttop(IN osm_sm_t * sm, IN osm_switch_t * p_sw)
+{
+	osm_node_t *p_node;
+	osm_dr_path_t *p_path;
+	osm_physp_t *p_physp;
+	osm_mcast_tbl_t *p_tbl;
+	osm_madw_context_t context;
+	ib_api_status_t status;
+	ib_switch_info_t si;
+	boolean_t set_swinfo_require = FALSE;
+	uint16_t mcast_top;
+	uint8_t life_state;
+
+	OSM_LOG_ENTER(sm->p_log);
+
+	CL_ASSERT(p_sw);
+
+	p_node = p_sw->p_node;
+
+	CL_ASSERT(p_node);
+
+	p_physp = osm_node_get_physp_ptr(p_node, 0);
+	p_path = osm_physp_get_dr_path_ptr(p_physp);
+	p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw);
+
+	if (p_physp->port_info.capability_mask & IB_PORT_CAP_HAS_MCAST_FDB_TOP) {
+		/*
+		   Set the top of the multicast forwarding table.
+		 */
+		si = p_sw->switch_info;
+		if (p_tbl->max_block_in_use == -1)
+			mcast_top = cl_hton16(IB_LID_MCAST_START_HO - 1);
+		else
+			mcast_top = cl_hton16(IB_LID_MCAST_START_HO +
+					      (p_tbl->max_block_in_use + 1) * IB_MCAST_BLOCK_SIZE - 1);
+		if (mcast_top != si.mcast_top) {
+			set_swinfo_require = TRUE;
+			si.mcast_top = mcast_top;
+		}
+
+		/* check to see if the change state bit is on. If it is - then
+		   we need to clear it. */
+		if (ib_switch_info_get_state_change(&si))
+			life_state = ((sm->p_subn->opt.packet_life_time << 3)
+				      | (si.life_state & IB_SWITCH_PSC)) & 0xfc;
+		else
+			life_state = (sm->p_subn->opt.packet_life_time << 3) & 0xf8;
+
+		if (life_state != si.life_state ||
+		    ib_switch_info_get_state_change(&si)) {
+			set_swinfo_require = TRUE;
+			si.life_state = life_state;
+		}
+
+		if (set_swinfo_require) {
+			OSM_LOG(sm->p_log, OSM_LOG_DEBUG,
+				"Setting switch MFT top to MLID 0x%x\n",
+				cl_ntoh16(si.mcast_top));
+
+			context.si_context.light_sweep = FALSE;
+			context.si_context.node_guid = osm_node_get_node_guid(p_node);
+			context.si_context.set_method = TRUE;
+
+			status = osm_req_set(sm, p_path, (uint8_t *) & si,
+					     sizeof(si), IB_MAD_ATTR_SWITCH_INFO,
+					     0, CL_DISP_MSGID_NONE, &context);
+
+			if (status != IB_SUCCESS)
+				OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A1B: "
+					"Sending SwitchInfo attribute failed (%s)\n",
+					ib_get_err_str(status));
+		}
+	}
+}
+
+/**
+ **/
 static int mcast_mgr_set_mftables(osm_sm_t * sm)
 {
 	cl_qmap_t *p_sw_tbl = &sm->p_subn->sw_guid_tbl;
@@ -1081,6 +1158,7 @@ static int mcast_mgr_set_mftables(osm_sm_t * sm)
 		p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw);
 		if (osm_mcast_tbl_get_max_block_in_use(p_tbl) > max_block)
 			max_block = osm_mcast_tbl_get_max_block_in_use(p_tbl);
+		mcast_mgr_set_mfttop(sm, p_sw);
 		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
 	}
 
diff --git a/opensm/opensm/osm_sa_class_port_info.c b/opensm/opensm/osm_sa_class_port_info.c
index d2ab96a..fb58fe5 100644
--- a/opensm/opensm/osm_sa_class_port_info.c
+++ b/opensm/opensm/osm_sa_class_port_info.c
@@ -1,6 +1,6 @@
 /*
  * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved.
- * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
+ * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation
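
The MulticastFDBTop arithmetic in mcast_mgr_set_mfttop can be checked on its own. The sketch below reproduces only the computation, in host byte order, with local stand-ins for the OpenSM constants: it assumes the multicast LID range starts at 0xC000 and that a multicast forwarding table block covers 32 LIDs. The function name is hypothetical.

```c
#include <stdint.h>

#define MCAST_LID_START  0xC000u	/* first multicast LID (IB_LID_MCAST_START_HO in the patch) */
#define MCAST_BLOCK_SIZE 32u		/* LIDs per MFT block (assumed value of IB_MCAST_BLOCK_SIZE) */

/*
 * Hypothetical helper: highest MLID covered by the blocks in use.
 * max_block_in_use is -1 when no block is in use, which yields 0xbfff,
 * the "no entries" value mentioned in the commit message.
 */
static uint16_t mft_top_from_max_block(int max_block_in_use)
{
	if (max_block_in_use < 0)
		return (uint16_t)(MCAST_LID_START - 1u);
	return (uint16_t)(MCAST_LID_START +
			  ((unsigned)max_block_in_use + 1u) * MCAST_BLOCK_SIZE - 1u);
}
```

For example, a switch using only block 0 would get a top of 0xc01f, and one using blocks 0 and 1 would get 0xc03f.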
[ofa-general] [PATCH] libibmad/mad.h: Add a couple of SM class attribute IDs
VendorSpecificMadsTable added by MgtWG errata RefID 4482

Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/libibmad/include/infiniband/mad.h b/libibmad/include/infiniband/mad.h
index 5f3b52b..94b64cf 100644
--- a/libibmad/include/infiniband/mad.h
+++ b/libibmad/include/infiniband/mad.h
@@ -133,6 +133,8 @@ enum SMI_ATTR_ID {
 	IB_ATTR_VL_ARBITRATION = 0x18,
 	IB_ATTR_LINEARFORWTBL = 0x19,
 	IB_ATTR_MULTICASTFORWTBL = 0x1b,
+	IB_ATTR_LINKSPEEDWIDTHPAIRSTBL = 0x1c,
+	IB_ATTR_VENDORMADSTBL = 0x1d,
 	IB_ATTR_SMINFO = 0x20,
 
 	IB_ATTR_LAST
[ofa-general] [PATCH] opensm: Add infrastructure support for more newly allocated PortInfo CapabilityMask bits
Per published MgtWG errata:
RefID 4484 - vendor specific MADs
RefID 4575 - multicast PKey trap suppression
RefID 4641 - hierarchy info

Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/opensm/include/iba/ib_types.h b/opensm/include/iba/ib_types.h
index c9d81cb..25ed35f 100644
--- a/opensm/include/iba/ib_types.h
+++ b/opensm/include/iba/ib_types.h
@@ -4490,10 +4490,10 @@ typedef struct _ib_port_info {
 #define IB_PORT_CAP_HAS_CLIENT_REREG (CL_HTON32(0x02000000))
 #define IB_PORT_CAP_HAS_OTHER_LOCAL_CHANGES_NTC (CL_HTON32(0x04000000))
 #define IB_PORT_CAP_HAS_LINK_SPEED_WIDTH_PAIRS_TBL (CL_HTON32(0x08000000))
-#define IB_PORT_CAP_RESV28 (CL_HTON32(0x10000000))
-#define IB_PORT_CAP_RESV29 (CL_HTON32(0x20000000))
+#define IB_PORT_CAP_HAS_VEND_MADS (CL_HTON32(0x10000000))
+#define IB_PORT_CAP_HAS_MCAST_PKEY_TRAP_SUPPRESS (CL_HTON32(0x20000000))
 #define IB_PORT_CAP_HAS_MCAST_FDB_TOP (CL_HTON32(0x40000000))
-#define IB_PORT_CAP_RESV31 (CL_HTON32(0x80000000))
+#define IB_PORT_CAP_HAS_HIER_INFO (CL_HTON32(0x80000000))
 
 /****f* IBA Base: Types/ib_port_info_get_port_state
 * NAME
diff --git a/opensm/opensm/osm_helper.c b/opensm/opensm/osm_helper.c
index 341d778..4b4e320 100644
--- a/opensm/opensm/osm_helper.c
+++ b/opensm/opensm/osm_helper.c
@@ -752,15 +752,15 @@ static void dbg_get_capabilities_str(IN char *p_buf, IN const uint32_t buf_size,
 				&total_len) != IB_SUCCESS)
 			return;
 	}
-	if (p_pi->capability_mask & IB_PORT_CAP_RESV28) {
+	if (p_pi->capability_mask & IB_PORT_CAP_HAS_VEND_MADS) {
 		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
-				"IB_PORT_CAP_RESV28\n",
+				"IB_PORT_CAP_HAS_VEND_MADS\n",
 				&total_len) != IB_SUCCESS)
 			return;
 	}
-	if (p_pi->capability_mask & IB_PORT_CAP_RESV29) {
+	if (p_pi->capability_mask & IB_PORT_CAP_HAS_MCAST_PKEY_TRAP_SUPPRESS) {
 		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
-				"IB_PORT_CAP_RESV29\n",
+				"IB_PORT_CAP_HAS_MCAST_PKEY_TRAP_SUPPRESS\n",
 				&total_len) != IB_SUCCESS)
 			return;
 	}
@@ -770,9 +770,9 @@ static void dbg_get_capabilities_str(IN char *p_buf, IN const uint32_t buf_size,
 				&total_len) != IB_SUCCESS)
 			return;
 	}
-	if (p_pi->capability_mask & IB_PORT_CAP_RESV31) {
+	if (p_pi->capability_mask & IB_PORT_CAP_HAS_HIER_INFO) {
 		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
-				"IB_PORT_CAP_RESV31\n",
+				"IB_PORT_CAP_HAS_HIER_INFO\n",
 				&total_len) != IB_SUCCESS)
 			return;
 	}
[ofa-general] [PATCH] opensm/osm_base.h: Add new SA ClassPortInfo:CapabilityMask2 bit allocations
Per published MgtWG errata:
RefID 4626 - reverse path PKey support in PathRecord responses
RefID 4635 - multicast FDB top support
RefID 4644 - hierarchy support

Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/opensm/include/opensm/osm_base.h b/opensm/include/opensm/osm_base.h
index 0537002..06223ce 100644
--- a/opensm/include/opensm/osm_base.h
+++ b/opensm/include/opensm/osm_base.h
@@ -1,6 +1,6 @@
 /*
  * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved.
- * Copyright (c) 2002-2006 Mellanox Technologies LTD. All rights reserved.
+ * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  * Copyright (c) 2009 Sun Microsystems, Inc. All rights reserved.
  *
@@ -776,6 +776,41 @@ typedef enum _osm_thread_state {
 #define OSM_CAP2_IS_QOS_SUPPORTED (1 << 1)
 /***/
+/****d* OpenSM: Base/OSM_CAP2_IS_REVERSE_PATH_PKEY_SUPPPORTED
+* Name
+*	OSM_CAP2_IS_REVERSE_PATH_PKEY_SUPPPORTED
+*
+* DESCRIPTION
+*	Reverse path PKeys indicate in PathRecord responses
+*
+* SYNOPSIS
+*/
+#define OSM_CAP2_IS_REVERSE_PATH_PKEY_SUPPPORTED (1 << 2)
+/***/
+
+/****d* OpenSM: Base/OSM_CAP2_IS_MCAST_TOP_SUPPORTED
+* Name
+*	OSM_CAP2_IS_MCAST_TOP_SUPPORTED
+*
+* DESCRIPTION
+*	SwitchInfo.MulticastFDBTop is supported
+*
+* SYNOPSIS
+*/
+#define OSM_CAP2_IS_MCAST_TOP_SUPPORTED (1 << 3)
+/***/
+
+/****d* OpenSM: Base/OSM_CAP2_IS_HIERARCHY_SUPPORTED
+* Name
+*
+* DESCRIPTION
+*	Hierarchy info suppported
+*
+* SYNOPSIS
+*/
+#define OSM_CAP2_IS_HIERARCHY_SUPPORTED (1 << 4)
+/***/
+
 /****d* OpenSM: Base/osm_signal_t
 * NAME
 *	osm_signal_t
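
For readers following the bit allocations, here is a minimal sketch of how such CapabilityMask2 flags might be tested. The macro names and bit positions are taken from the patch (identifier spelling kept as posted); `cap2_describe` and the assumption that the mask is already in host byte order are illustrative only and are not part of OpenSM.

```c
#include <stdint.h>
#include <stdio.h>

/* Bit allocations as they appear in the patch (spelling kept as posted). */
#define OSM_CAP2_IS_QOS_SUPPORTED (1 << 1)
#define OSM_CAP2_IS_REVERSE_PATH_PKEY_SUPPPORTED (1 << 2)
#define OSM_CAP2_IS_MCAST_TOP_SUPPORTED (1 << 3)
#define OSM_CAP2_IS_HIERARCHY_SUPPORTED (1 << 4)

/*
 * Hypothetical helper: print the name of every known bit set in a
 * host-order CapabilityMask2 value and return how many were found.
 */
static int cap2_describe(uint32_t cap_mask2)
{
	static const struct {
		uint32_t bit;
		const char *name;
	} known[] = {
		{ OSM_CAP2_IS_QOS_SUPPORTED, "QoS" },
		{ OSM_CAP2_IS_REVERSE_PATH_PKEY_SUPPPORTED, "reverse path PKey" },
		{ OSM_CAP2_IS_MCAST_TOP_SUPPORTED, "MulticastFDBTop" },
		{ OSM_CAP2_IS_HIERARCHY_SUPPORTED, "hierarchy info" },
	};
	int found = 0;
	size_t i;

	for (i = 0; i < sizeof(known) / sizeof(known[0]); i++) {
		if (cap_mask2 & known[i].bit) {
			printf("%s supported\n", known[i].name);
			found++;
		}
	}
	return found;
}
```

Bit 0 is left out of the table on purpose because the patch does not show its definition.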
[ofa-general] [PATCHv2] opensm: Parallelize (Stripe) MFT sets across switches
Similar to previous patch to Parallelize (Stripe) LFT sets across switches.

Currently, MADs are pipelined to a single switch first which effectively serializes these requests. This patch pipelines the MFT set MADs across switches first (before cycling to the next MFT block) so that multiple switches can be responding concurrently.

Speedup is dependent on number of MFT blocks in use (number of MLIDs) which is dependent on the number of multicast groups.

Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
Changes since v1:
Fixed loop which stripes MFT block across switches
Changed routine name from mcast_mgr_set_tbl to mcast_mgr_set_mft_block and added block_num and position parameters
Consolidate code into mcast_mgr_set_mftables

diff --git a/opensm/include/opensm/osm_switch.h b/opensm/include/opensm/osm_switch.h
index 7ce28c5..e281842 100644
--- a/opensm/include/opensm/osm_switch.h
+++ b/opensm/include/opensm/osm_switch.h
@@ -1,6 +1,6 @@
 /*
  * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved.
- * Copyright (c) 2002-2008 Mellanox Technologies LTD. All rights reserved.
+ * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
  * This software is available to you under a choice of one of two
@@ -103,6 +103,8 @@ typedef struct osm_switch {
 	uint8_t *lft;
 	uint8_t *new_lft;
 	osm_mcast_tbl_t mcast_tbl;
+	uint32_t mft_block_num;
+	uint32_t mft_position;
 	unsigned endport_links;
 	unsigned need_update;
 	void *priv;
diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c
index 4dbbaa0..708d837 100644
--- a/opensm/opensm/osm_mcast_mgr.c
+++ b/opensm/opensm/osm_mcast_mgr.c
@@ -1,6 +1,6 @@
 /*
  * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved.
- * Copyright (c) 2002-2006 Mellanox Technologies LTD. All rights reserved.
+ * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved.
  *
@@ -321,16 +321,14 @@ static osm_switch_t *mcast_mgr_find_root_switch(osm_sm_t * sm,
 /**
  **/
-static int mcast_mgr_set_tbl(osm_sm_t * sm, IN osm_switch_t * p_sw)
+static int mcast_mgr_set_mft_block(osm_sm_t * sm, IN osm_switch_t * p_sw,
+				   uint32_t block_num, uint32_t position)
 {
 	osm_node_t *p_node;
 	osm_dr_path_t *p_path;
-	osm_madw_context_t mad_context;
+	osm_madw_context_t context;
 	ib_api_status_t status;
-	uint32_t block_id_ho = 0;
-	int16_t block_num = 0;
-	uint32_t position = 0;
-	uint32_t max_position;
+	uint32_t block_id_ho;
 	osm_mcast_tbl_t *p_tbl;
 	ib_net16_t block[IB_MCAST_BLOCK_SIZE];
 	int ret = 0;
@@ -353,23 +351,25 @@ static int mcast_mgr_set_tbl(osm_sm_t * sm, IN osm_switch_t * p_sw)
 	   configuration.
 	 */
-	mad_context.mft_context.node_guid = osm_node_get_node_guid(p_node);
-	mad_context.mft_context.set_method = TRUE;
+	context.mft_context.node_guid = osm_node_get_node_guid(p_node);
+	context.mft_context.set_method = TRUE;
 	p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw);
-	max_position = p_tbl->max_position;
-	while (osm_mcast_tbl_get_block(p_tbl, block_num,
-				       (uint8_t) position, block)) {
-		OSM_LOG(sm->p_log, OSM_LOG_DEBUG,
-			"Writing MFT block 0x%X\n", block_id_ho);
+	if (osm_mcast_tbl_get_block(p_tbl, block_num,
+				    (uint8_t) position, block)) {
 		block_id_ho = block_num + (position << 28);
+		OSM_LOG(sm->p_log, OSM_LOG_DEBUG,
+			"Writing MFT block %u position %u to switch 0x%" PRIx64 "\n",
+			block_num, position,
+			cl_ntoh64(context.lft_context.node_guid));
+
 		status = osm_req_set(sm, p_path, (void *)block, sizeof(block),
 				     IB_MAD_ATTR_MCAST_FWD_TBL,
 				     cl_hton32(block_id_ho),
 				     CL_DISP_MSGID_NONE,
-				     &mad_context);
+				     &context);
 		if (status != IB_SUCCESS) {
 			OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A02: "
@@ -377,11 +377,6 @@ static int mcast_mgr_set_tbl(osm_sm_t * sm, IN osm_switch_t * p_sw)
 				ib_get_err_str(status));
 			ret = -1;
 		}
-
-		if (++position > max_position) {
-			position = 0;
-			block_num++;
-		}
 	}
 	OSM_LOG_EXIT(sm->p_log
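
The ordering change described in the commit message can be sketched as follows. This is a simplified model rather than the OpenSM code: `struct request`, `order_per_switch`, `order_striped` and the fixed switch and block counts are hypothetical, and the MFT "position" dimension is ignored. Only the loop order is the point: the striped variant issues block N to every switch before moving on to block N+1.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SWITCHES 3
#define NUM_BLOCKS 2

/* One "set MFT block" request, recorded in the order it would be sent. */
struct request {
	int sw;
	int block;
};

/* Serialized order: every block for switch 0, then every block for switch 1, ... */
static size_t order_per_switch(struct request *out)
{
	size_t n = 0;
	int sw, block;

	assert(out != NULL);
	for (sw = 0; sw < NUM_SWITCHES; sw++) {
		for (block = 0; block < NUM_BLOCKS; block++) {
			out[n].sw = sw;
			out[n].block = block;
			n++;
		}
	}
	return n;
}

/* Striped order: block 0 to every switch, then block 1 to every switch, ... */
static size_t order_striped(struct request *out)
{
	size_t n = 0;
	int sw, block;

	assert(out != NULL);
	for (block = 0; block < NUM_BLOCKS; block++) {
		for (sw = 0; sw < NUM_SWITCHES; sw++) {
			out[n].sw = sw;
			out[n].block = block;
			n++;
		}
	}
	return n;
}
```

With the striped order, consecutive requests target different switches, which is what lets several switches work on their responses at the same time.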
Re: [ofa-general] Re: [PATCH] infiniband-diags/ibroute: Add support for MulticastFDBTop
On 8/31/09, Sasha Khapyorsky sas...@voltaire.com wrote:
> On 12:35 Sun 30 Aug , Hal Rosenstock wrote:
> >
> > Doesn't the loop:
> > for (block = startblock; block <= lastblock; block++)
> > terminate without any blocks read ? So it shows no entries.
>
> Sorry, I still don't understand. Let's suppose that top = 0xbfff, cap =
> 1024, startlid = 0xc000, endlid = 0xc030 and dump_all = 0. What will
> prevent MFT entries printing? This will ignore a value of 'top' or I'm
> missing something?

Wouldn't endlid be set to top for this case (since top < endlid) ? It ignores endlid and not top in this case.

-- Hal

> > Do you mean to print no entries ?
>
> No, of course not that :)
>
> Sasha
[ofa-general] [PATCH] osmtest: Add SA get PathRecord stress test
Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com --- diff --git a/opensm/man/osmtest.8 b/opensm/man/osmtest.8 index fa0cd52..f0d6323 100644 --- a/opensm/man/osmtest.8 +++ b/opensm/man/osmtest.8 @@ -1,4 +1,4 @@ -.TH OSMTEST 8 August 11, 2008 OpenIB OpenIB Management +.TH OSMTEST 8 August 31, 2009 OpenIB OpenIB Management .SH NAME osmtest \- InfiniBand subnet manager and administration (SM/SA) test program @@ -108,9 +108,10 @@ Stress test options are as follows: OPTDescription ---- - -s1 - Single-MAD response SA queries + -s1 - Single-MAD (RMPP) response SA queries -s2 - Multi-MAD (RMPP) response SA queries -s3 - Multi-MAD (RMPP) Path Record SA queries + -s4 - Single-MAD (non RMPP) get Path Record SA queries Without -s, stress testing is not performed .TP diff --git a/opensm/osmtest/include/osmtest_base.h b/opensm/osmtest/include/osmtest_base.h index 7c33da3..cda3a31 100644 --- a/opensm/osmtest/include/osmtest_base.h +++ b/opensm/osmtest/include/osmtest_base.h @@ -56,11 +56,12 @@ #define STRESS_SMALL_RMPP_THR 10 /* -Take long times when quering big clusters (over 40 nodes) , an average of : 0.25 sec for query +Take long times when querying big clusters (over 40 nodes), an average of : 0.25 sec for query each query receives 1000 records */ #define STRESS_LARGE_RMPP_THR 4000 #define STRESS_LARGE_PR_RMPP_THR 2 +#define STRESS_GET_PR 10 extern const char *const p_file; diff --git a/opensm/osmtest/main.c b/opensm/osmtest/main.c index bb2d6bc..4bb9f82 100644 --- a/opensm/osmtest/main.c +++ b/opensm/osmtest/main.c @@ -143,9 +143,10 @@ void show_usage() Stress test options are as follows:\n OPTDescription\n ----\n --s1 - Single-MAD response SA queries\n +-s1 - Single-MAD (RMPP) response SA queries\n -s2 - Multi-MAD (RMPP) response SA queries\n -s3 - Multi-MAD (RMPP) Path Record SA queries\n +-s4 - Single-MAD (non RMPP) get Path Record SA queries\n Without -s, stress testing is not performed\n\n); printf(-M\n --Multicast_Mode\n @@ -499,6 +500,9 @@ int main(int argc, 
char *argv[]) case 3: printf(Large Path Record SA queries\n); break; + case 4: + printf(SA Get Path Record queries\n); + break; default: printf(Unknown value %u (ignored)\n, opt.stress); diff --git a/opensm/osmtest/osmtest.c b/opensm/osmtest/osmtest.c index 986a8d2..8357d90 100644 --- a/opensm/osmtest/osmtest.c +++ b/opensm/osmtest/osmtest.c @@ -2882,6 +2882,151 @@ Exit: /** **/ +ib_api_status_t +osmtest_stress_path_recs_by_lid(IN osmtest_t * const p_osmt, + IN int mode, + OUT uint32_t * const p_num_recs, + OUT uint32_t * const p_num_queries) +{ + osmtest_req_context_t context; + ib_path_rec_t *p_rec; + cl_status_t status; + ib_net16_t dlid, slid; + int num_recs, i; + + OSM_LOG_ENTER(p_osmt-log); + + memset(context, 0, sizeof(context)); + + slid = cl_ntoh16(p_osmt-local_port.lid); + if (!mode) + dlid = cl_ntoh16(p_osmt-local_port.sm_lid); + else + dlid = cl_ntoh16(p_osmt-local_port.lid); + + /* +* Do a blocking query for the PathRecord. +*/ + status = osmtest_get_path_rec_by_lid_pair(p_osmt, slid, dlid, context); + if (status != IB_SUCCESS) { + OSM_LOG(p_osmt-log, OSM_LOG_ERROR, ERR 000A: + osmtest_get_path_rec_by_lid_pair failed (%s)\n, + ib_get_err_str(status)); + goto Exit; + } + + /* +* Populate the database with the received records. +*/ + num_recs = context.result.result_cnt; + *p_num_recs += num_recs; + ++*p_num_queries; + + if (osm_log_is_active(p_osmt-log, OSM_LOG_VERBOSE)) { + OSM_LOG(p_osmt-log, OSM_LOG_VERBOSE, + Received %u records\n, num_recs); + + for (i = 0; i num_recs; i++) { + p_rec = osmv_get_query_path_rec(context.result.p_result_madw, 0); + osm_dump_path_record(p_osmt-log, p_rec, OSM_LOG_VERBOSE); + } + } + +Exit: + /* +* Return the IB query MAD to the pool as necessary
[ofa-general] Re: osm_link_mgr.c:link_mgr_get_smsl question
Hi Sasha,

On 8/29/09, Sasha Khapyorsky sas...@voltaire.com wrote:
> Hi Hal,
>
> On 14:38 Fri 07 Aug , Hal Rosenstock wrote:
> > osm_link_mgr.c:link_mgr_get_smsl has the following:
> >
> > /* Find osm_port of the source = p_physp */
> > slid = osm_physp_get_base_lid(p_physp);
> > p_src_port =
> >     cl_ptr_vector_get(&sm->p_subn->port_lid_tbl, cl_ntoh16(slid));
> >
> > /* Call lash to find proper SL */
> > sl = osm_get_lash_sl(p_osm, p_src_port, p_sm_port);
> >
> > It may be that this code is invoked prior to the LID being assigned
>
> How is it possible? In the code I can see that link_mgr_process() is
> always executed after lid_mgr run.

When nodes use gPXE, the LID is not passed from the gPXE to the Linux environment.

> > so getting the p_src_port based on the LID yields NULL and then calling
> > osm_get_lash_sl causes a seg fault.
> >
> > I can see two ways to fix this:
> > 1. Replace with port GUID search
> > 2. Have osm_get_lash_sl handle NULL for p_src_port
> >
> > Maybe you see other ways to deal with this. Do you have a preferred
> > approach ?
>
> Hmm, SMSL will be irrelevant for a port where LID was not assigned,
> right?

Of course.

> If so then it is probably just enough to add in link_mgr_get_smsl():
>
> if (!p_src_port)
> 	return;

OK.

-- Hal

> But it would be really better to understand an error source before
> deciding about proper solution.
>
> Sasha
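
The guard agreed on in this exchange can be sketched in isolation. Everything below is a stand-in and not OpenSM code: `struct port`, `lid_tbl`, `port_by_lid`, `get_smsl` and `DEFAULT_SL` are hypothetical names, and returning a default SL (rather than a bare `return;`) is an assumption made so that the sketch has a well-defined result.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_LID 8
#define DEFAULT_SL 0	/* assumption: SL to report when no lookup is possible */

/* Hypothetical stand-in for a port object carrying the SL that LASH chose. */
struct port {
	uint8_t lash_sl;
};

/* Stand-in for the LID-indexed port table; unassigned LIDs map to NULL. */
static struct port *lid_tbl[MAX_LID + 1];

static struct port *port_by_lid(uint16_t lid)
{
	return lid <= MAX_LID ? lid_tbl[lid] : NULL;
}

/* Look up the source port by LID and bail out early if there is none. */
static uint8_t get_smsl(uint16_t slid)
{
	struct port *p_src_port = port_by_lid(slid);

	if (!p_src_port)	/* LID not assigned (yet): SMSL is irrelevant */
		return DEFAULT_SL;

	return p_src_port->lash_sl;
}
```

The early return keeps the NULL pointer from ever reaching the code that dereferences it, which is the failure mode reported for the gPXE case.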
Re: [ofa-general] Re: [PATCH] opensm/ib_types.h: Add CounterSelect2 field to PortCounters attribute
On 8/30/09, Sasha Khapyorsky sas...@voltaire.com wrote:
> On 11:54 Wed 26 Aug , Hal Rosenstock wrote:
> > Per MgtWG RefID #4527
> >
> > Also, cosmetic commentary change
> >
> > Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
>
> Applied. Thanks.
>
> Next time could you add more descriptive change log to your patches -
> RefID #4527 by itself doesn't say a lot (and RefID texts is available
> only in member area of IBTA site).

There is a public version now.

-- Hal

> Sasha
Re: [ofa-general] Re: [PATCH] opensm: Add infrastructure support for MulticastFDBTop
On 8/30/09, Sasha Khapyorsky sas...@voltaire.com wrote:
> On 10:04 Wed 26 Aug , Hal Rosenstock wrote:
> > @@ -5899,6 +5899,8 @@ typedef struct _ib_switch_info {
> >  	ib_net16_t lids_per_port;
> >  	ib_net16_t enforce_cap;
> >  	uint8_t flags;
> > +	uint8_t resvd;
> > +	ib_net16_t mcast_top;
> >  } PACK_SUFFIX ib_switch_info_t;
> >  #include <complib/cl_packoff.h>
> >  //
> > @@ -5908,7 +5910,7 @@ typedef struct _ib_switch_info_record {
> >  	ib_net16_t lid;
> >  	uint16_t resv0;
> >  	ib_switch_info_t switch_info;
> > -	uint8_t pad[3];
> > +	uint8_t pad[1];
>
> Why should be pad[1] here? In struct switch_info you are adding three
> bytes (resvd - 1 and mcast_top - 2), no?

Good catch. It was due to an initial version which didn't have the 16 bit MFTTop alignment. Do you want a v2 patch for this ?

-- Hal

> Sasha
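
The padding arithmetic raised in the review can be checked with a small sketch. The field list of the pre-patch SwitchInfo structure below is an assumption (reconstructed from the fields printed by osm_dump_switch_info), the type names are hypothetical, and `#pragma pack` stands in for OpenSM's PACK_SUFFIX and cl_packon.h/cl_packoff.h mechanism. Under those assumptions, adding three bytes to SwitchInfo means the three pad bytes in the record are no longer needed.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t ib_net16_t;

#pragma pack(push, 1)
/* SwitchInfo before the patch (assumed field list): 17 bytes. */
typedef struct {
	ib_net16_t lin_cap;
	ib_net16_t rand_cap;
	ib_net16_t mcast_cap;
	ib_net16_t lin_top;
	uint8_t def_port;
	uint8_t def_mcast_pri_port;
	uint8_t def_mcast_not_port;
	uint8_t life_state;
	ib_net16_t lids_per_port;
	ib_net16_t enforce_cap;
	uint8_t flags;
} switch_info_old_t;

/* After the patch: one reserved byte plus the 16-bit MulticastFDBTop. */
typedef struct {
	ib_net16_t lin_cap;
	ib_net16_t rand_cap;
	ib_net16_t mcast_cap;
	ib_net16_t lin_top;
	uint8_t def_port;
	uint8_t def_mcast_pri_port;
	uint8_t def_mcast_not_port;
	uint8_t life_state;
	ib_net16_t lids_per_port;
	ib_net16_t enforce_cap;
	uint8_t flags;
	uint8_t resvd;
	ib_net16_t mcast_top;
} switch_info_new_t;

/* SA record wrappers: the old one needs pad[3], the new one needs none. */
typedef struct {
	ib_net16_t lid;
	uint16_t resv0;
	switch_info_old_t switch_info;
	uint8_t pad[3];
} record_old_t;

typedef struct {
	ib_net16_t lid;
	uint16_t resv0;
	switch_info_new_t switch_info;
} record_new_t;
#pragma pack(pop)

/* Compile-time checks of the arithmetic discussed in the thread. */
static_assert(sizeof(switch_info_new_t) == sizeof(switch_info_old_t) + 3,
	      "resvd (1) + mcast_top (2) add three bytes");
static_assert(sizeof(record_new_t) == sizeof(record_old_t),
	      "record size is unchanged once the pad is dropped");
```

Keeping `pad[1]` in the new record would make it one byte larger than the old one, which is the inconsistency pointed out above.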
[ofa-general] [PATCHv2] opensm: Add infrastructure support for MulticastFDBTop
Add support for SwitchInfo:MulticastFDBTop
Added by MgtWG errata #4505-4508

Add OpenSM infrastructure support to ib_types.h and osm_helper.c

Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
Changes since v1:
Removed erroneous pad byte left remaining in ib_switch_info_record_t

diff --git a/opensm/include/iba/ib_types.h b/opensm/include/iba/ib_types.h
index fe3f051..9e38a6d 100644
--- a/opensm/include/iba/ib_types.h
+++ b/opensm/include/iba/ib_types.h
@@ -1,6 +1,6 @@
 /*
  * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved.
- * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
+ * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  * Copyright (c) 2009 HNR Consulting. All rights reserved.
  *
@@ -4492,7 +4492,7 @@ typedef struct _ib_port_info {
 #define IB_PORT_CAP_HAS_LINK_SPEED_WIDTH_PAIRS_TBL (CL_HTON32(0x08000000))
 #define IB_PORT_CAP_RESV28 (CL_HTON32(0x10000000))
 #define IB_PORT_CAP_RESV29 (CL_HTON32(0x20000000))
-#define IB_PORT_CAP_RESV30 (CL_HTON32(0x40000000))
+#define IB_PORT_CAP_HAS_MCAST_FDB_TOP (CL_HTON32(0x40000000))
 #define IB_PORT_CAP_RESV31 (CL_HTON32(0x80000000))

 /****f* IBA Base: Types/ib_port_info_get_port_state
@@ -5899,6 +5899,8 @@ typedef struct _ib_switch_info {
 	ib_net16_t lids_per_port;
 	ib_net16_t enforce_cap;
 	uint8_t flags;
+	uint8_t resvd;
+	ib_net16_t mcast_top;
 } PACK_SUFFIX ib_switch_info_t;
 #include <complib/cl_packoff.h>
 //
@@ -5908,7 +5910,6 @@ typedef struct _ib_switch_info_record {
 	ib_net16_t lid;
 	uint16_t resv0;
 	ib_switch_info_t switch_info;
-	uint8_t pad[3];
 } PACK_SUFFIX ib_switch_info_record_t;
 #include <complib/cl_packoff.h>

diff --git a/opensm/opensm/osm_helper.c b/opensm/opensm/osm_helper.c
index 3692474..b5a29c2 100644
--- a/opensm/opensm/osm_helper.c
+++ b/opensm/opensm/osm_helper.c
@@ -1,6 +1,6 @@
 /*
  * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved.
- * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
+ * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  * Copyright (c) 2009 HNR Consulting. All rights reserved.
  * Copyright (c) 2009 Sun Microsystems, Inc. All rights reserved.
@@ -764,9 +764,9 @@ static void dbg_get_capabilities_str(IN char *p_buf, IN const uint32_t buf_size,
 				&total_len) != IB_SUCCESS)
 			return;
 	}
-	if (p_pi->capability_mask & IB_PORT_CAP_RESV30) {
+	if (p_pi->capability_mask & IB_PORT_CAP_HAS_MCAST_FDB_TOP) {
 		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
-				"IB_PORT_CAP_RESV30\n",
+				"IB_PORT_CAP_HAS_MCAST_FDB_TOP\n",
 				&total_len) != IB_SUCCESS)
 			return;
 	}
@@ -1512,7 +1512,8 @@ void osm_dump_switch_info(IN osm_log_t * p_log,
 			"\t\t\t\tlife_state..0x%X\n"
 			"\t\t\t\tlids_per_port...%u\n"
 			"\t\t\t\tpartition_enf_cap...0x%X\n"
-			"\t\t\t\tflags...0x%X\n",
+			"\t\t\t\tflags...0x%X\n"
+			"\t\t\t\tmcast_top...0x%X\n",
 			cl_ntoh16(p_si->lin_cap),
 			cl_ntoh16(p_si->rand_cap),
 			cl_ntoh16(p_si->mcast_cap),
@@ -1522,7 +1523,8 @@ void osm_dump_switch_info(IN osm_log_t * p_log,
 			p_si->def_mcast_not_port,
 			p_si->life_state,
 			cl_ntoh16(p_si->lids_per_port),
-			cl_ntoh16(p_si->enforce_cap), p_si->flags);
+			cl_ntoh16(p_si->enforce_cap), p_si->flags,
+			cl_ntoh16(p_si->mcast_top));
 	}
 }
[ofa-general] [PATCHv2] infiniband-diags/ibroute: Add support for MulticastFDBTop
Add support for SwitchInfo:MulticastFDBTop
Added by MgtWG errata #4505-4508 and 4640

If MulticastFDBTop set to other than 0, only fetch MulticastForwardingTable blocks up through MulticastFDBTop rather than MulticastFDBCap

If MulticastFDBTop set to 0xbfff, this means no entries (per 4640)

Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
Changes since v1:
Fixed top range check

diff --git a/infiniband-diags/src/ibroute.c b/infiniband-diags/src/ibroute.c
index 106c934..1112b87 100644
--- a/infiniband-diags/src/ibroute.c
+++ b/infiniband-diags/src/ibroute.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2004-2008 Voltaire Inc. All rights reserved.
+ * Copyright (c) 2009 Mellanox Technologies LTD. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses. You may choose to be licensed under the terms of the GNU
@@ -140,16 +141,24 @@ char *dump_multicast_tables(ib_portid_t * portid, unsigned startlid,
 	char *s;
 	uint64_t nodeguid;
 	uint32_t mod;
-	unsigned block, i, j, e, nports, cap, chunks, startblock, lastblock;
+	unsigned block, i, j, e, nports, cap, chunks, startblock, lastblock,
+	    top;
 	int n = 0;

 	if ((s = check_switch(portid, &nports, &nodeguid, sw, nd)))
 		return s;

 	mad_decode_field(sw, IB_SW_MCAST_FDB_CAP_F, &cap);
+	mad_decode_field(sw, IB_SW_MCAST_FDB_TOP_F, &top);

 	if (!endlid || endlid > IB_MIN_MCAST_LID + cap - 1)
 		endlid = IB_MIN_MCAST_LID + cap - 1;
+	if (!dump_all && top && top < endlid) {
+		if (top < IB_MIN_MCAST_LID - 1 || top > IB_MIN_MCAST_LID + cap - 1)
+			IBWARN("illegal top mlid %x", top);
+		else
+			endlid = top;
+	}

 	if (!startlid)
 		startlid = IB_MIN_MCAST_LID;
@@ -187,7 +196,8 @@ char *dump_multicast_tables(ib_portid_t * portid, unsigned startlid,
 		printf(" MLid\n");
 	}
 	if (ibverbose)
-		printf("Switch multicast mlid capability is %d\n", cap);
+		printf("Switch multicast mlid capability is %d top is 0x%x\n",
+		       cap, top);

 	chunks = ALIGN(nports + 1, 16) / 16;
Re: [ofa-general] Re: [PATCH] infiniband-diags/ibroute: Add support for MulticastFDBTop
On 8/30/09, Sasha Khapyorsky sas...@voltaire.com wrote:
> On 08:42 Sun 30 Aug , Hal Rosenstock wrote:
> >
> > This is handled by the block loop inside of dump_multicast_tables.
>
> Where? I don't see this. Should not it to show nothing (no entries) when
> top = 0xbfff and dump_all is not set?

Doesn't the loop:
for (block = startblock; block <= lastblock; block++)
terminate without any blocks read ? So it shows no entries.

Do you mean to print no entries ?

-- Hal

> Sasha
[ofa-general] [PATCHv3] infiniband-diags/ibroute: Add support for MulticastFDBTop
Add support for SwitchInfo:MulticastFDBTop
Added by MgtWG errata #4505-4508 and 4640

If MulticastFDBTop set to other than 0, only fetch MulticastForwardingTable blocks up through MulticastFDBTop rather than MulticastFDBCap

If MulticastFDBTop set to 0xbfff, this means no entries (per 4640)

Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
Changes since v2:
Removed redundant clause in top range check

Changes since v1:
Fixed top range check

diff --git a/infiniband-diags/src/ibroute.c b/infiniband-diags/src/ibroute.c
index 106c934..00df1ec 100644
--- a/infiniband-diags/src/ibroute.c
+++ b/infiniband-diags/src/ibroute.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2004-2008 Voltaire Inc. All rights reserved.
+ * Copyright (c) 2009 Mellanox Technologies LTD. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses. You may choose to be licensed under the terms of the GNU
@@ -140,16 +141,24 @@ char *dump_multicast_tables(ib_portid_t * portid, unsigned startlid,
 	char *s;
 	uint64_t nodeguid;
 	uint32_t mod;
-	unsigned block, i, j, e, nports, cap, chunks, startblock, lastblock;
+	unsigned block, i, j, e, nports, cap, chunks, startblock, lastblock,
+	    top;
 	int n = 0;

 	if ((s = check_switch(portid, &nports, &nodeguid, sw, nd)))
 		return s;

 	mad_decode_field(sw, IB_SW_MCAST_FDB_CAP_F, &cap);
+	mad_decode_field(sw, IB_SW_MCAST_FDB_TOP_F, &top);

 	if (!endlid || endlid > IB_MIN_MCAST_LID + cap - 1)
 		endlid = IB_MIN_MCAST_LID + cap - 1;
+	if (!dump_all && top && top < endlid) {
+		if (top < IB_MIN_MCAST_LID - 1)
+			IBWARN("illegal top mlid %x", top);
+		else
+			endlid = top;
+	}

 	if (!startlid)
 		startlid = IB_MIN_MCAST_LID;
@@ -187,7 +196,8 @@ char *dump_multicast_tables(ib_portid_t * portid, unsigned startlid,
 		printf(" MLid\n");
 	}
 	if (ibverbose)
-		printf("Switch multicast mlid capability is %d\n", cap);
+		printf("Switch multicast mlid capability is %d top is 0x%x\n",
+		       cap, top);

 	chunks = ALIGN(nports + 1, 16) / 16;
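
The end-of-range clamp in this version can be exercised on its own. The sketch below lifts the logic into a hypothetical function `clamp_endlid`; the comparison operators are my reading of the hunk, `IB_MIN_MCAST_LID` is assumed to be 0xc000, and `fprintf` stands in for IBWARN. It is not the ibroute code itself.

```c
#include <assert.h>
#include <stdio.h>

#define IB_MIN_MCAST_LID 0xc000	/* assumed value of the first multicast LID */

/*
 * Hypothetical extraction of the range logic: return the last MLID to
 * dump, given the requested endlid (0 = unset), the switch's
 * MulticastFDBCap and MulticastFDBTop, and the dump_all flag.
 */
static unsigned clamp_endlid(unsigned endlid, unsigned cap, unsigned top,
			     int dump_all)
{
	assert(cap > 0);

	/* Never go past the table capacity. */
	if (!endlid || endlid > IB_MIN_MCAST_LID + cap - 1)
		endlid = IB_MIN_MCAST_LID + cap - 1;

	/* A non-zero top below endlid shortens the range further. */
	if (!dump_all && top && top < endlid) {
		if (top < IB_MIN_MCAST_LID - 1)
			fprintf(stderr, "illegal top mlid %x\n", top);
		else
			endlid = top;
	}
	return endlid;
}
```

With the values from the review thread (top = 0xbfff, cap = 1024, endlid = 0xc030, dump_all = 0) the function returns 0xbfff, which is below a startlid of 0xc000, so a loop running from the start block to the last block would have nothing to read.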
[ofa-general] [PATCH] opensm: Reduce heap consumption by unicast routing tables (LFTs)
Heap memory consumption by the unicast and multicast routing tables can be reduced. Using valgrind --tool=massif (for heap profiling), there are a couple of places that consume most of the heap memory:

->38.75% (11,206,656B) 0x43267E: osm_switch_new (osm_switch.c:134)
->12.89% (3,728,256B) 0x40F8C9: osm_mcast_tbl_init (osm_mcast_tbl.c:96)

osm_switch_new (osm_switch.c:108):
p_sw->lft = malloc(IB_LID_UCAST_END_HO + 1);

From ib_types.h:
#define IB_LID_UCAST_END_HO 0xBFFF

The LFT can be allocated in smaller chunks. If there is a LID that exceeds the current LFT size, the LFT is reallocated with an increased size. This reduces performance and increases memory fragmentation, so this tradeoff is made optional based on new build and config options (see below).

Using a 4K chunk as the minimal LFT block reduces the memory used by the LFTs by a factor of 12 (48K entries versus 4K entries). For a larger (than 4K) fabric, 4K is added each time the existing LFT size is insufficient.

So it looks like for a cluster of 2-4K with an LMC of 0, about 40% (!!!) of the heap memory can be saved:
- 39% used by LFTs, each with 48K entries
- SM can allocate 4K entries instead.

There is a new build option to specify whether to include the FT heap optimization code or not. It defaults to off and does not include the new code (basically just the code that exists today).

A new config option specifies whether to optimize FT allocation and defaults to off. Another new config option specifies the LFT allocation chunk and defaults to 4K. These chunks are used as the initial minimum allocation and increased in increments of the chunk using realloc. LFTs are only increased in size and are never reduced in size. If a realloc for an LFT fails, it results in an exit.

A similar subsequent change will do this for MFTs.
Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/opensm/config/osmvsel.m4 b/opensm/config/osmvsel.m4
index c24930b..1c7c8a2 100644
--- a/opensm/config/osmvsel.m4
+++ b/opensm/config/osmvsel.m4
@@ -232,6 +232,25 @@
 fi
 # --- END OPENIB_OSM_PERF_MGR_SEL ---
 ]) dnl OPENIB_OSM_PERF_MGR_SEL
+dnl Check if they want the FT heap optimization
+AC_DEFUN([OPENIB_OSM_FT_OPTIMIZE_HEAP_SEL], [
+# --- BEGIN OPENIB_OSM_FT_OPTIMIZE_HEAP_SEL ---
+
+dnl enable the FT heap optimization
+AC_ARG_ENABLE(ft-heap-optimize,
+[ --enable-ft-heap-optimize Enable FT heap optimization (default no)],
+	[case $enableval in
+	yes) ft_heap_optimize=yes ;;
+	no) ft_heap_optimize=no ;;
+	esac],
+	ft_heap_optimize=no)
+if test $ft_heap_optimize = yes; then
+	AC_DEFINE(ENABLE_OSM_FT_HEAP_OPTIMIZATION,
+		1,
+		[Define as 1 if you want to enable the FT heap optimization])
+fi
+# --- END OPENIB_OSM_FT_OPTIMIZE_HEAP_SEL ---
+]) dnl OPENIB_OSM_FT_OPTIMIZE_HEAP_SEL

 dnl Check if they want the event plugin
 AC_DEFUN([OPENIB_OSM_DEFAULT_EVENT_PLUGIN_SEL], [
diff --git a/opensm/configure.in b/opensm/configure.in
index 8a6b4c0..9b5ec00 100644
--- a/opensm/configure.in
+++ b/opensm/configure.in
@@ -87,6 +87,9 @@ OPENIB_OSM_CONSOLE_SOCKET_SEL
 dnl select performance manager or not
 OPENIB_OSM_PERF_MGR_SEL

+dnl select FT heap optimization or not
+OPENIB_OSM_FT_OPTIMIZE_HEAP_SEL
+
 dnl resolve sysconfdir config dir.
 conf_dir_tmp1=`eval echo ${sysconfdir} | sed 's/^NONE/$ac_default_prefix/'`
 SYS_CONFIG_DIR=`eval echo $conf_dir_tmp1`
diff --git a/opensm/include/opensm/osm_base.h b/opensm/include/opensm/osm_base.h
index 0537002..89b125c 100644
--- a/opensm/include/opensm/osm_base.h
+++ b/opensm/include/opensm/osm_base.h
@@ -1,6 +1,6 @@
 /*
  * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved.
- * Copyright (c) 2002-2006 Mellanox Technologies LTD. All rights reserved.
+ * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  * Copyright (c) 2009 Sun Microsystems, Inc. All rights reserved.
  *
@@ -449,6 +449,18 @@ BEGIN_C_DECLS
 */
 #define OSM_DEFAULT_SMP_MAX_ON_WIRE 4
 /***/
+/****d* OpenSM: Base/OSM_DEFAULT_LFT_CHUNKS
+* NAME
+*	OSM_DEFAULT_LFT_CHUNKS
+*
+* DESCRIPTION
+*	Specifies the default number of 64 entry (byte) chunks in LFT
+*	related memory (re)allocation. Default is 64 (4K bytes).
+*
+* SYNOPSIS
+*/
+#define OSM_DEFAULT_LFT_CHUNKS 64
+/***/
 /****d* OpenSM: Base/OSM_SM_DEFAULT_QP0_RCV_SIZE
 * NAME
 *	OSM_SM_DEFAULT_QP0_RCV_SIZE
diff --git a/opensm/include/opensm/osm_subnet.h b/opensm/include/opensm/osm_subnet.h
index 6c20de8..be90ce4 100644
--- a/opensm/include/opensm/osm_subnet.h
+++ b/opensm/include/opensm/osm_subnet.h
@@ -1,6 +1,6 @@
 /*
  * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved.
- * Copyright (c) 2002-2008 Mellanox Technologies LTD. All rights reserved.
+ * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved.
  * Copyright
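
The chunked allocation scheme described in the commit message can be sketched as below. This is an illustration under stated assumptions and not the patch's implementation, which is cut off above: `struct lft`, `lft_reserve`, `LFT_CHUNK`, `LFT_MAX` and the 0xFF fill value are hypothetical names and choices, and the sketch returns -1 on allocation failure where the description says OpenSM would exit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define LFT_CHUNK 4096u		/* 64 chunks of 64 entries, the described default */
#define LFT_MAX 0xC000u		/* IB_LID_UCAST_END_HO + 1 entries */
#define NO_PATH 0xFF		/* assumption: filler for unused entries */

struct lft {
	uint8_t *tbl;	/* one output port number per LID */
	size_t size;	/* current number of entries, a multiple of LFT_CHUNK */
};

/*
 * Make sure the table has an entry for `lid`. The table grows in whole
 * chunks and never shrinks. Returns 0 on success, -1 on a bad LID or
 * allocation failure.
 */
static int lft_reserve(struct lft *l, uint16_t lid)
{
	size_t need, new_size;
	uint8_t *p;

	assert(l != NULL);
	if (lid >= LFT_MAX)
		return -1;

	need = (size_t)lid + 1;
	if (need <= l->size)
		return 0;	/* already large enough */

	/* Round up to the next multiple of the chunk size. */
	new_size = (need + LFT_CHUNK - 1) / LFT_CHUNK * LFT_CHUNK;
	p = realloc(l->tbl, new_size);
	if (!p)
		return -1;

	/* Mark the newly added entries as having no route. */
	memset(p + l->size, NO_PATH, new_size - l->size);
	l->tbl = p;
	l->size = new_size;
	return 0;
}
```

A fabric whose highest LID is below 4096 stays at a single 4K chunk per switch; reaching the top unicast LID 0xBFFF takes twelve chunks, which is where the factor of 12 in the description comes from.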
Re: [ofa-general] [PATCH] opensm/osm_ucast_mgr.c: simplify fwd tables setup flow
On 8/29/09, Sasha Khapyorsky sas...@voltaire.com wrote:
> On 12:03 Fri 28 Aug , Hal Rosenstock wrote:
> >
> > lash_core: ERR 4D02: Lane requirements (9) exceed available lanes (8)
> > with starting lane (0)
> > ucast_mgr_route: lash: cannot build fwd tables.
> > osm_ucast_mgr_process: minhop tables configured on all switches
> > ERR 331D: LFT of switch 0xguid is not up to date.
> >
> > Prior to this change, the LFTs were pushed for this fallback case (and
> > no ERR 331D occurred).
>
> Nice catch. Such addition is needed to make a fallback to work properly:
>
> diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
> index b7e3893..39d825c 100644
> --- a/opensm/opensm/osm_ucast_mgr.c
> +++ b/opensm/opensm/osm_ucast_mgr.c
> @@ -1007,6 +1007,7 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
>  		/* If configured routing algorithm failed, use default MinHop */
>  		osm_ucast_mgr_build_lid_matrices(p_mgr);
>  		ucast_mgr_build_lfts(p_mgr);
> +		osm_ucast_mgr_set_fwd_tables(p_mgr);

Shouldn't this be osm_ucast_mgr_set_fwd_table ?

-- Hal

>  		p_osm->routing_engine_used = OSM_ROUTING_ENGINE_TYPE_MINHOP;
>  	}
>
> Sasha
[ofa-general] [PATCH] opensm/osm_helper.c: Add SM priority changed into trap 144 description
Per MgtWG RefID #4503

Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/opensm/opensm/osm_helper.c b/opensm/opensm/osm_helper.c
index 3692474..1b83a9e 100644
--- a/opensm/opensm/osm_helper.c
+++ b/opensm/opensm/osm_helper.c
@@ -531,7 +531,7 @@ const char *ib_get_trap_str(ib_net16_t trap_num)
 		return "Flow Control Update watchdog timer expired";
 	case 144:
 		return
-		    "CapabilityMask, NodeDescription, Link [Width|Speed] Enabled changed";
+		    "CapabilityMask, NodeDescription, Link [Width|Speed] Enabled, SM priority changed";
 	case 145:
 		return "System Image GUID changed";
 	case 256:
Re: [ofa-general] [PATCH] opensm/osm_ucast_mgr.c: simplify fwd tables setup flow
On 8/28/09, Sasha Khapyorsky sas...@voltaire.com wrote:
> Simplify (and unify) forwarding tables setup decision flow.

Seems to work for all engines but I got a failure for a test case where lash fell back to min hop:

lash_core: ERR 4D02: Lane requirements (9) exceed available lanes (8) with starting lane (0)
ucast_mgr_route: lash: cannot build fwd tables.
osm_ucast_mgr_process: minhop tables configured on all switches
ERR 331D: LFT of switch 0xguid is not up to date.

Prior to this change, the LFTs were pushed for this fallback case (and no ERR 331D occurred).

-- Hal

> Signed-off-by: Sasha Khapyorsky sas...@voltaire.com
> ---
>  opensm/opensm/osm_ucast_mgr.c |    7 +------
>  1 files changed, 1 insertions(+), 6 deletions(-)
>
> diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
> index 629f628..8ba78f8 100644
> --- a/opensm/opensm/osm_ucast_mgr.c
> +++ b/opensm/opensm/osm_ucast_mgr.c
> @@ -463,8 +463,6 @@ static void ucast_mgr_process_tbl(IN cl_map_item_t * p_map_item,
>  		}
>  	}
>
> -	set_fwd_tbl_top(p_mgr, p_sw);
> -
>  	if (p_mgr->p_subn->opt.lmc)
>  		free_ports_priv(p_mgr);
>
> @@ -977,8 +975,6 @@ static int ucast_mgr_build_lfts(osm_ucast_mgr_t * p_mgr)
>  	cl_qmap_apply_func(&p_mgr->p_subn->sw_guid_tbl, ucast_mgr_process_tbl,
>  			   p_mgr);
>
> -	ucast_mgr_pipeline_fwd_tbl(p_mgr);
> -
>  	cl_qlist_remove_all(&p_mgr->port_order_list);
>
>  	return 0;
> @@ -1025,8 +1021,7 @@ static int ucast_mgr_route(struct osm_routing_engine *r, osm_opensm_t * osm)
>
>  	osm->routing_engine_used = osm_routing_engine_type(r->name);
>
> -	if (r->ucast_build_fwd_tables)
> -		osm_ucast_mgr_set_fwd_table(&osm->sm.ucast_mgr);
> +	osm_ucast_mgr_set_fwd_table(&osm->sm.ucast_mgr);
>
>  	return 0;
>  }
> --
> 1.6.4
Re: [ofa-general] [PATCH] opensm/osm_ucast_mgr: better lft setup
On 8/28/09, Sasha Khapyorsky sas...@voltaire.com wrote: The function set_next_lft_block() is called in loop with block number incremented, inside it loops by itself in looking for changed block, caller will call this function with original block number incremented so this internal loop could be repeated again and again. This patch cleans this ineffectiveness. Also rename it to set_lft_block() since block number is treated as parameters and *not* next block is processed and merges some code. Signed-off-by: Sasha Khapyorsky sas...@voltaire.com Acked-by: Hal Rosenstock hal.rosenst...@gmail.com --- opensm/include/opensm/osm_ucast_mgr.h |1 + opensm/opensm/osm_ucast_mgr.c | 126 +++-- 2 files changed, 43 insertions(+), 84 deletions(-) diff --git a/opensm/include/opensm/osm_ucast_mgr.h b/opensm/include/opensm/osm_ucast_mgr.h index 4ef045c..78a88f0 100644 --- a/opensm/include/opensm/osm_ucast_mgr.h +++ b/opensm/include/opensm/osm_ucast_mgr.h @@ -95,6 +95,7 @@ typedef struct osm_ucast_mgr { osm_subn_t *p_subn; osm_log_t *p_log; cl_plock_t *p_lock; + uint16_t max_lid; cl_qlist_t port_order_list; boolean_t is_dor; boolean_t some_hop_count_set; diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c index 8ba78f8..a111c10 100644 --- a/opensm/opensm/osm_ucast_mgr.c +++ b/opensm/opensm/osm_ucast_mgr.c @@ -336,6 +336,9 @@ static int set_fwd_tbl_top(IN osm_ucast_mgr_t * p_mgr, IN osm_switch_t * p_sw) CL_ASSERT(p_node); + if (p_mgr-max_lid p_sw-max_lid_ho) + p_mgr-max_lid = p_sw-max_lid_ho; + p_path = osm_physp_get_dr_path_ptr(osm_node_get_physp_ptr(p_node, 0)); /* @@ -478,65 +481,13 @@ static void ucast_mgr_process_top(IN cl_map_item_t * p_map_item, set_fwd_tbl_top(p_mgr, p_sw); } -static boolean_t set_next_lft_block(IN osm_switch_t * p_sw, IN osm_sm_t * p_sm, - IN uint8_t * p_block, - IN osm_dr_path_t * p_path, - IN uint16_t block_id_ho, - IN osm_madw_context_t * p_context) -{ - ib_api_status_t status; - boolean_t sts; - - OSM_LOG_ENTER(p_sm-p_log); - - for (; 
-(sts = osm_switch_get_lft_block(p_sw, block_id_ho, p_block)); -block_id_ho++) { - if (!p_sw-need_update !p_sm-p_subn-need_update - !memcmp(p_block, - p_sw-new_lft + block_id_ho * IB_SMP_DATA_SIZE, - IB_SMP_DATA_SIZE)) - continue; - - OSM_LOG(p_sm-p_log, OSM_LOG_DEBUG, - Writing FT block %u to switch 0x% PRIx64 \n, - block_id_ho, - cl_ntoh64(p_context-lft_context.node_guid)); - - status = osm_req_set(p_sm, p_path, -p_sw-new_lft + -block_id_ho * IB_SMP_DATA_SIZE, -IB_SMP_DATA_SIZE, IB_MAD_ATTR_LIN_FWD_TBL, -cl_hton32(block_id_ho), -CL_DISP_MSGID_NONE, p_context); - - if (status != IB_SUCCESS) - OSM_LOG(p_sm-p_log, OSM_LOG_ERROR, ERR 3A05: - Sending linear fwd. tbl. block failed (%s)\n, - ib_get_err_str(status)); - break; - } - - OSM_LOG_EXIT(p_sm-p_log); - return sts; -} - -static boolean_t pipeline_next_lft_block(IN osm_switch_t *p_sw, -IN osm_ucast_mgr_t *p_mgr, -IN uint16_t block_id_ho) +static int set_lft_block(IN osm_switch_t *p_sw, IN osm_ucast_mgr_t *p_mgr, +IN uint16_t block_id_ho) { - osm_dr_path_t *p_path; - osm_madw_context_t context; uint8_t block[IB_SMP_DATA_SIZE]; - boolean_t status; - - OSM_LOG_ENTER(p_mgr-p_log); - - CL_ASSERT(p_sw p_sw-p_node); - - OSM_LOG(p_mgr-p_log, OSM_LOG_DEBUG, - Processing switch 0x% PRIx64 \n, - cl_ntoh64(osm_node_get_node_guid(p_sw-p_node))); + osm_madw_context_t context; + osm_dr_path_t *p_path; + ib_api_status_t status; /* Send linear forwarding table blocks to the switch @@ -547,8 +498,7 @@ static boolean_t pipeline_next_lft_block(IN osm_switch_t *p_sw, /* any routing should provide the new_lft */ CL_ASSERT(p_mgr-p_subn-opt.use_ucast_cache p_mgr-cache_valid !p_sw-need_update); - status = FALSE; - goto Exit; + return -1; } p_path
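The single-pass idea behind this patch can be sketched as follows. This is only an illustration under stated assumptions, not the OpenSM code: `struct lft_pair`, `lft_block_changed()` and `count_blocks_to_send()` are hypothetical names, and the only value borrowed from the IB headers is the 64-byte block size (IB_SMP_DATA_SIZE). The point is that each block is compared exactly once, with the block number supplied by the caller rather than searched for again inside the helper.

```c
#include <stdint.h>
#include <string.h>

#define LFT_BLOCK_SIZE 64 /* same value as IB_SMP_DATA_SIZE */

/* Hypothetical stand-in for one switch: the table currently programmed
 * in hardware and the table the routing engine just computed. */
struct lft_pair {
	const uint8_t *cur_lft;
	const uint8_t *new_lft;
	uint16_t max_lid; /* highest LID covered by both tables */
};

/* Return 1 if the given block differs (or force is set), 0 otherwise. */
static int lft_block_changed(const struct lft_pair *t, uint16_t block,
			     int force)
{
	size_t off = (size_t)block * LFT_BLOCK_SIZE;

	if (force)
		return 1;
	return memcmp(t->cur_lft + off, t->new_lft + off,
		      LFT_BLOCK_SIZE) != 0;
}

/* Visit every block exactly once; a real caller would issue a
 * Set(LinearForwardingTable) for each block counted here. */
static unsigned count_blocks_to_send(const struct lft_pair *t, int force)
{
	unsigned n_blocks = t->max_lid / LFT_BLOCK_SIZE + 1;
	unsigned block, to_send = 0;

	for (block = 0; block < n_blocks; block++)
		if (lft_block_changed(t, (uint16_t)block, force))
			to_send++;
	return to_send;
}
```

With two 128-byte tables that differ in a single entry, only the block containing that entry is counted; setting `force` (standing in for the need_update flags) counts every block.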
[ofa-general] Re: [ewg] [PATCH] IB/ehca: Construct MAD redirect replies from request MAD
On 8/27/09, Joachim Fenkes fen...@de.ibm.com wrote:
> Hal Rosenstock hal.rosenst...@gmail.com wrote on 26.08.2009 17:15:03:
>
> > Thanks for doing this. It looks sane to me. The only issue I recall
> > that appears to be remaining is a better setting of
> > ClassPortInfo:RespTimeValue rather than hardcoding. Perhaps using the
> > value from PortInfo is the way to go (ideally it would be that value
> > from the port to which the requester is being redirected, but that
> > might not be so easy to get from this port).
>
> I don't think that effort will be necessary or even legal. The requestor
> will react to the redirection with another Get(ClassPortInfo) to the
> redirection target, which will reply with its own RespTimeValue, so our
> driver should speak for itself.

I overreached with my comment on how this works.

> Since we don't know when our MAD processing and sending of the response
> is going to be scheduled (we're not running on real-time constraints
> here), we play it safe and return 18, which amounts to roughly a second.
> Make sense?

I don't think it should be hard coded. IMO it would be better to default to 18 and make it adjustable somehow (via a (dynamic) module parameter?).

-- Hal

> Regards
> Joachim
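For reference, the "roughly a second" figure follows from the way the IB specification encodes response times: the time is 4.096 microseconds multiplied by 2 to the power of RespTimeValue. A minimal sketch of that arithmetic (the function name is made up for illustration):

```c
#include <stdint.h>

/* Response time in seconds for a given RespTimeValue:
 * 4.096 microseconds * 2^value. The field is 5 bits wide,
 * so value is expected to be in the range 0..31. */
static double resp_time_seconds(unsigned value)
{
	return 4.096e-6 * (double)(UINT64_C(1) << value);
}
```

For a value of 18 this gives 4.096e-6 * 262144, i.e. about 1.07 seconds; each increment of the value doubles the time.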
[ofa-general] [PATCH] opensm: Add infrastructure support for MulticastFDBTop
Add support for SwitchInfo:MulticastFDBTop Added by MgtWG errata #4505-4508 Add OpenSM infrastructure support to ib_types.h and osm_helper.c Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com --- diff --git a/opensm/include/iba/ib_types.h b/opensm/include/iba/ib_types.h index fe3f051..e1e2bdb 100644 --- a/opensm/include/iba/ib_types.h +++ b/opensm/include/iba/ib_types.h @@ -1,6 +1,6 @@ /* * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. + * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * Copyright (c) 2009 HNR Consulting. All rights reserved. * @@ -4492,7 +4492,7 @@ typedef struct _ib_port_info { #define IB_PORT_CAP_HAS_LINK_SPEED_WIDTH_PAIRS_TBL (CL_HTON32(0x0800)) #define IB_PORT_CAP_RESV28(CL_HTON32(0x1000)) #define IB_PORT_CAP_RESV29(CL_HTON32(0x2000)) -#define IB_PORT_CAP_RESV30(CL_HTON32(0x4000)) +#define IB_PORT_CAP_HAS_MCAST_FDB_TOP (CL_HTON32(0x4000)) #define IB_PORT_CAP_RESV31(CL_HTON32(0x8000)) /f* IBA Base: Types/ib_port_info_get_port_state @@ -5899,6 +5899,8 @@ typedef struct _ib_switch_info { ib_net16_t lids_per_port; ib_net16_t enforce_cap; uint8_t flags; + uint8_t resvd; + ib_net16_t mcast_top; } PACK_SUFFIX ib_switch_info_t; #include complib/cl_packoff.h // @@ -5908,7 +5910,7 @@ typedef struct _ib_switch_info_record { ib_net16_t lid; uint16_t resv0; ib_switch_info_t switch_info; - uint8_t pad[3]; + uint8_t pad[1]; } PACK_SUFFIX ib_switch_info_record_t; #include complib/cl_packoff.h diff --git a/opensm/opensm/osm_helper.c b/opensm/opensm/osm_helper.c index 23392a4..b8a6523 100644 --- a/opensm/opensm/osm_helper.c +++ b/opensm/opensm/osm_helper.c @@ -1,6 +1,6 @@ /* * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. + * Copyright (c) 2002-2009 Mellanox Technologies LTD. 
All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * Copyright (c) 2009 HNR Consulting. All rights reserved. * Copyright (c) 2009 Sun Microsystems, Inc. All rights reserved. @@ -766,9 +766,9 @@ static void dbg_get_capabilities_str(IN char *p_buf, IN const uint32_t buf_size, total_len) != IB_SUCCESS) return; } - if (p_pi-capability_mask IB_PORT_CAP_RESV30) { + if (p_pi-capability_mask IB_PORT_CAP_HAS_MCAST_FDB_TOP) { if (dbg_do_line(p_local, buf_size, p_prefix_str, - IB_PORT_CAP_RESV30\n, + IB_PORT_CAP_HAS_MCAST_FDB_TOP\n, total_len) != IB_SUCCESS) return; } @@ -1514,7 +1514,8 @@ void osm_dump_switch_info(IN osm_log_t * p_log, \t\t\t\tlife_state..0x%X\n \t\t\t\tlids_per_port...%u\n \t\t\t\tpartition_enf_cap...0x%X\n - \t\t\t\tflags...0x%X\n, + \t\t\t\tflags...0x%X\n + \t\t\t\tmcast_top...0x%X\n, cl_ntoh16(p_si-lin_cap), cl_ntoh16(p_si-rand_cap), cl_ntoh16(p_si-mcast_cap), @@ -1524,7 +1525,8 @@ void osm_dump_switch_info(IN osm_log_t * p_log, p_si-def_mcast_not_port, p_si-life_state, cl_ntoh16(p_si-lids_per_port), - cl_ntoh16(p_si-enforce_cap), p_si-flags); + cl_ntoh16(p_si-enforce_cap), p_si-flags, + cl_ntoh16(p_si-mcast_top)); } } ___ general mailing list general@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
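As a rough illustration of how a consumer might test the new capability bit: the sketch below assumes the mask is held in network byte order (as OpenSM keeps PortInfo fields) and that the bit is bit 30, which matches the `(1 << 30)` test in the companion libibmad patch. The macro and function names are hypothetical, not part of ib_types.h.

```c
#include <stdint.h>
#include <arpa/inet.h> /* ntohl(), htonl() */

/* Bit 30 of PortInfo:CapabilityMask, in host byte order (assumed name). */
#define CAP_HAS_MCAST_FDB_TOP (UINT32_C(1) << 30)

/* Return 1 if a network-byte-order capability mask advertises
 * IsMulticastFDBTopSupported, 0 otherwise. */
static int port_has_mcast_fdb_top(uint32_t cap_mask_net)
{
	return (ntohl(cap_mask_net) & CAP_HAS_MCAST_FDB_TOP) != 0;
}
```

A caller would only trust SwitchInfo:MulticastFDBTop (the new `mcast_top` field) when this check succeeds.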
[ofa-general] [PATCH] infiniband-diags/ibroute: Add support for MulticastFDBTop
Add support for SwitchInfo:MulticastFDBTop Added by MgtWG errata #4505-4508 and #4640 If MulticastFDBTop is set to other than 0, only fetch MulticastForwardingTable blocks up through MulticastFDBTop rather than MulticastFDBCap If MulticastFDBTop is set to 0xbfff, this means no entries (per #4640) Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com --- diff --git a/infiniband-diags/src/ibroute.c b/infiniband-diags/src/ibroute.c index 106c934..f3ebe56 100644 --- a/infiniband-diags/src/ibroute.c +++ b/infiniband-diags/src/ibroute.c @@ -1,5 +1,6 @@ /* * Copyright (c) 2004-2008 Voltaire Inc. All rights reserved. + * Copyright (c) 2009 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU @@ -140,16 +141,24 @@ char *dump_multicast_tables(ib_portid_t * portid, unsigned startlid, char *s; uint64_t nodeguid; uint32_t mod; - unsigned block, i, j, e, nports, cap, chunks, startblock, lastblock; + unsigned block, i, j, e, nports, cap, top, chunks, +startblock, lastblock; int n = 0; if ((s = check_switch(portid, nports, nodeguid, sw, nd))) return s; mad_decode_field(sw, IB_SW_MCAST_FDB_CAP_F, cap); + mad_decode_field(sw, IB_SW_MCAST_FDB_TOP_F, top); if (!endlid || endlid IB_MIN_MCAST_LID + cap - 1) endlid = IB_MIN_MCAST_LID + cap - 1; + if (!dump_all top top endlid) { + if (top IB_MIN_MCAST_LID - 1 || top == 0x) + IBWARN(illegal top mlid %x, top); + else + endlid = top; + } if (!startlid) startlid = IB_MIN_MCAST_LID; @@ -187,7 +196,8 @@ char *dump_multicast_tables(ib_portid_t * portid, unsigned startlid, printf( MLid\n); } if (ibverbose) - printf(Switch multicast mlid capability is %d\n, cap); + printf(Switch multicast mlid capability is %d top is %d\n, + cap, top); chunks = ALIGN(nports + 1, 16) / 16; ___ general mailing list general@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit 
http://openib.org/mailman/listinfo/openib-general
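The range clamping described in the commit message can be sketched roughly as below. This is a simplified stand-in rather than the ibroute code: `last_mlid_to_dump()` is a hypothetical helper, the illegal-value warning is reduced to "ignore the top", and the second illegal-value comparison from the patch is left out because its constant did not survive in the archived text. The value 0xC000 is the first multicast LID (IB_MIN_MCAST_LID).

```c
#define MIN_MCAST_LID 0xC000u /* first multicast LID */

/* Return the last MLID to fetch, given the switch's MulticastFDBCap,
 * its MulticastFDBTop (0 means "not set"), and the user's end LID
 * (0 means "not given"). */
static unsigned last_mlid_to_dump(unsigned cap, unsigned top,
				  unsigned endlid, int dump_all)
{
	unsigned cap_end = MIN_MCAST_LID + cap - 1;

	if (!endlid || endlid > cap_end)
		endlid = cap_end;

	/* A top below 0xbfff is treated as illegal and ignored here. */
	if (!dump_all && top && top < endlid && top >= MIN_MCAST_LID - 1)
		endlid = top;

	return endlid;
}
```

With a top of 0xbfff the returned end LID falls below the first multicast LID, so a loop running from 0xC000 to the end LID fetches nothing, which is the "no entries" case from erratum #4640.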
[ofa-general] [PATCH] libibmad: Add support for MulticastFDBTop
Add support for SwitchInfo:MulticastFDBTop and PortInfo:CapabilityMask.IsMulticastFDBTopSupported Added by MgtWG errata #4505-4508 Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com --- diff --git a/libibmad/include/infiniband/mad.h b/libibmad/include/infiniband/mad.h index 3093fbd..5f3b52b 100644 --- a/libibmad/include/infiniband/mad.h +++ b/libibmad/include/infiniband/mad.h @@ -1,6 +1,7 @@ /* * Copyright (c) 2004-2007 Voltaire Inc. All rights reserved. * Copyright (c) 2009 HNR Consulting. All rights reserved. + * Copyright (c) 2009 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU @@ -400,6 +401,7 @@ enum MAD_FIELDS { IB_SW_FILTER_RAW_INB_F, IB_SW_FILTER_RAW_OUTB_F, IB_SW_ENHANCED_PORT0_F, + IB_SW_MCAST_FDB_TOP_F, IB_SW_LAST_F, /* diff --git a/libibmad/src/dump.c b/libibmad/src/dump.c index 051c708..d97d359 100644 --- a/libibmad/src/dump.c +++ b/libibmad/src/dump.c @@ -1,6 +1,7 @@ /* * Copyright (c) 2004-2008 Voltaire Inc. All rights reserved. * Copyright (c) 2007 Xsigo Systems Inc. All rights reserved. + * Copyright (c) 2009 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU @@ -518,6 +519,8 @@ void mad_dump_portcapmask(char *buf, int bufsz, void *val, int valsz) if (mask (1 27)) s += sprintf(s, \t\t\t\tIsLinkSpeedWidthPairsTableSupported\n); + if (mask (1 30)) + s += sprintf(s, \t\t\t\tIsMulticastFDBTopSupported\n); if (s != buf) *(--s) = 0; diff --git a/libibmad/src/fields.c b/libibmad/src/fields.c index c8e4e79..5f30116 100644 --- a/libibmad/src/fields.c +++ b/libibmad/src/fields.c @@ -1,6 +1,7 @@ /* * Copyright (c) 2004-2007 Voltaire Inc. All rights reserved. * Copyright (c) 2009 HNR Consulting. All rights reserved. + * Copyright (c) 2009 Mellanox Technologies LTD. All rights reserved. 
* * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU @@ -206,6 +207,7 @@ static const ib_field_t ib_mad_f[] = { {BITSOFFS(130, 1), FilterRawInbound, mad_dump_uint}, {BITSOFFS(131, 1), FilterRawOutbound, mad_dump_uint}, {BITSOFFS(132, 1), EnhancedPort0, mad_dump_uint}, + {BITSOFFS(144, 16), MulticastFDBTop, mad_dump_hex}, {0, 0}, /* IB_SW_LAST_F */ /* ___ general mailing list general@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
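The new table entry places MulticastFDBTop at bit offset 144 with a width of 16 bits, i.e. bytes 18 and 19 of the SwitchInfo payload. A byte-aligned sketch of reading such a field is shown below; it is not libibmad's `mad_decode_field()` implementation, and `get_be16_field()` is a hypothetical helper that only handles fields starting on a byte boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 16-bit big-endian field that starts at the given bit offset
 * of an attribute buffer. Only byte-aligned offsets are supported. */
static uint16_t get_be16_field(const uint8_t *buf, unsigned bit_offset)
{
	unsigned byte = bit_offset / 8;

	assert(bit_offset % 8 == 0);
	return (uint16_t)(((unsigned)buf[byte] << 8) | buf[byte + 1]);
}
```

For a SwitchInfo buffer `sw`, `get_be16_field(sw, 144)` would yield the same value that the patched library exposes through IB_SW_MCAST_FDB_TOP_F, assuming the wire layout described above.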
Re: [ofa-general] Combined DR path with empty DR path, what is the expected behavior?
On 8/25/09, Ira Weiny wei...@llnl.gov wrote: On Tue, 25 Aug 2009 19:15:19 -0400 Hal Rosenstock hal.rosenst...@gmail.com wrote: On 8/24/09, Ira Weiny wei...@llnl.gov wrote: If I send a combined DR path with a start lid but an empty (0 length) DR path. Hop Count 0 ? Yes What is the expected behavior? Not sure what you mean by expected here. Are you referring to expectation based on the spec ? yes I know this could be specified with LID routing, but I don't see anywhere in the specification which says this is an error. I don't think it should be an error (certainly not for the form you are using LID routed part followed by a DR part) but a null DR part is a little funny/odd. Yea I know. It turns out that the new iblinkinfo issues queries like this when it is removes recurses back from the last DR portion of the combined route path. It only showed up as an error when using the -S guid option of iblinkinfo with this new switch I have. Works fine with the old switches. I do however seem to have 2 different implementations on 2 different switches. For example: I have Switch A (Lid 1) and Switch B (Lid 7). I attempt to query PortInfo of Port 1 of each switch using the LID followed by an empty DR path. 17:55:22 ./smpquery -c portinfo 1 0 1 ibwarn: [21005] mad_rpc: _do_madrpc failed; dport (Lid 1) ./smpquery: iberror: failed: operation portinfo: port info query failed Is this a timeout ? yes 16:26:25 ./smpquery -e -c portinfo 1 0 1 ibwarn: [27150] _do_madrpc: retry 1 (timeout 1000 ms) ibwarn: [27150] _do_madrpc: retry 2 (timeout 1000 ms) ibwarn: [27150] _do_madrpc: timeout after 3 retries, 3000 ms ibwarn: [27150] mad_rpc: _do_madrpc failed; dport (Lid 1) ./smpquery: iberror: failed: operation portinfo: port info query failed 17:55:31 ./smpquery -c portinfo 7 0 1 # Port info: Lid 7 port 1 Mkey:0x GidPrefix:...0x ... normal output snipped Detecting this special case in libibmad and turning the packet into a LID routed one Ugh... Is this special case really needed ? 
I don't think the underlying issue is understood sufficiently yet. Well I just did it to prove that what I was doing would work with a simple lid routed packet. Like I said it might be that this portid which is being specified to libibmad by libibnetdisc is not valid. If that is true then libibnetdisc should detect when the DR path is empty and go back to LID routed requests. That is a valid fix in my mind. Sure; there's no real need for combined route when the DR path is empty but it should work (at least with switches). succeeds but I wonder if this is an error in the SMI? Switch SMI ? Is this a proprietary implementation ? Yes I see the bug with 2 different vendors switches. One is managed and the other is not. My old switches (3 different vendors) do not show this behavior. (Just to be clear I now I have 5 switches in my 5 node cluster! ;-) I also notice this is an error on the HCA I am running from (lid 2). Is this HCA node OpenIB based ? yes If I recall correctly, there is something in the spec that makes combined routing not be allowed on HCA (and router) ports so this seems correct. I can dig this out if really needed. 17:57:42 ./smpquery -c portinfo 2 0 1 ibwarn: [21008] mad_rpc: _do_madrpc failed; dport (Lid 2) ./smpquery: iberror: failed: operation portinfo: port info query failed Is this also a timeout ? yes Also, does the result differ based on where you source these from matter (locally v. remotely)? Same result local and remote. Running with a simple DR path works, You're referring to the same DR path here that fails in the combined route examples above, right ? No. the example below is a DR path with Hop Count == 0 but without the initial LID routing. I guess because this is the loopback case mentioned on page 805. Yes but that's the high level requirement rather than the SMI rules which make that work. 17:58:16 ./smpquery -D portinfo 0 1 # Port info: DR path slid 65535; dlid 65535; 0 port 1 Mkey:0x GidPrefix:...0x2007 ... 
snip It guess that the comment Since each part may be empty, there are eight combinations, although only four are really useful: on line 36 Page 805 can be interpreted to mean that only those 4 combinations need to be supported. Is this true? Not all 4 combinations are supported/known to work. When this was added for ibportstate, the only combined routing form that was important was LID routed part followed by a DR part. When you say known to work you mean implemented with the diags? Or known to work in all hardware
[ofa-general] Re: [ewg] [PATCH] IB/ehca: Construct MAD redirect replies from request MAD
On 8/26/09, Joachim Fenkes fen...@de.ibm.com wrote: The old code used a lot of hardcoded values, which might not be valid in all environments (especially routed fabrics or partitioned subnets). Copy as much information as possible from the incoming request to prevent that. Signed-off-by: Joachim Fenkes fen...@de.ibm.com --- Hal, Jason -- here's the change I promised. Looks okay to you? Roland -- if Hal and Jason don't object, please queue this up for the next kernel. Thanks! Thanks for doing this. It looks sane to me. The only issue I recall that appears to be remaining is a better setting of ClassPortInfo:RespTimeValue rather than hardcoding. Perhaps using the value from PortInfo is the way to go (ideally it would be that value from the port to which the the requester is being redirected to but that might not be so easy to get from this port (I guess that could be SA Get PortInfoRecord for that port but that is a larger change and it likely to be same as local port issuing the redirect response). 
-- Hal Regards, Joachim drivers/infiniband/hw/ehca/ehca_sqp.c | 47 1 files changed, 41 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_sqp.c b/drivers/infiniband/hw/ehca/ehca_sqp.c index c568b28..8c1213f 100644 --- a/drivers/infiniband/hw/ehca/ehca_sqp.c +++ b/drivers/infiniband/hw/ehca/ehca_sqp.c @@ -125,14 +125,30 @@ struct ib_perf { u8 data[192]; } __attribute__ ((packed)); +/* TC/SL/FL packed into 32 bits, as in ClassPortInfo */ +struct tcslfl { + u32 tc:8; + u32 sl:4; + u32 fl:20; +} __attribute__ ((packed)); + +/* IP Version/TC/FL packed into 32 bits, as in GRH */ +struct vertcfl { + u32 ver:4; + u32 tc:8; + u32 fl:20; +} __attribute__ ((packed)); static int ehca_process_perf(struct ib_device *ibdev, u8 port_num, +struct ib_wc *in_wc, struct ib_grh *in_grh, struct ib_mad *in_mad, struct ib_mad *out_mad) { struct ib_perf *in_perf = (struct ib_perf *)in_mad; struct ib_perf *out_perf = (struct ib_perf *)out_mad; struct ib_class_port_info *poi = (struct ib_class_port_info *)out_perf-data; + struct tcslfl *tcslfl = + (struct tcslfl *)poi-redirect_tcslfl; struct ehca_shca *shca = container_of(ibdev, struct ehca_shca, ib_device); struct ehca_sport *sport = shca-sport[port_num - 1]; @@ -158,10 +174,29 @@ static int ehca_process_perf(struct ib_device *ibdev, u8 port_num, poi-base_version = 1; poi-class_version = 1; poi-resp_time_value = 18; - poi-redirect_lid = sport-saved_attr.lid; - poi-redirect_qp = sport-pma_qp_nr; + + /* copy local routing information from WC where applicable */ + tcslfl-sl = in_wc-sl; + poi-redirect_lid = + sport-saved_attr.lid | in_wc-dlid_path_bits; + poi-redirect_qp = sport-pma_qp_nr; poi-redirect_qkey = IB_QP1_QKEY; - poi-redirect_pkey = IB_DEFAULT_PKEY_FULL; + + ehca_query_pkey(ibdev, port_num, in_wc-pkey_index, + poi-redirect_pkey); + + /* if request was globally routed, copy route info */ + if (in_grh) { + struct vertcfl *vertcfl = + (struct vertcfl *)in_grh-version_tclass_flow; + memcpy(poi-redirect_gid, 
in_grh-dgid.raw, + sizeof(poi-redirect_gid)); + tcslfl-tc= vertcfl-tc; + tcslfl-fl= vertcfl-fl; + } else + /* else only fill in default GID */ + ehca_query_gid(ibdev, port_num, 0, + (union ib_gid *)poi-redirect_gid); ehca_dbg(ibdev, ehca_pma_lid=%x ehca_pma_qp=%x, sport-saved_attr.lid, sport-pma_qp_nr); @@ -183,8 +218,7 @@ perf_reply: int ehca_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num, struct ib_wc *in_wc, struct ib_grh *in_grh, -struct ib_mad *in_mad, -struct ib_mad *out_mad) +struct ib_mad *in_mad, struct ib_mad *out_mad) { int ret; @@ -196,7 +230,8 @@ int ehca_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num, return IB_MAD_RESULT_SUCCESS; ehca_dbg(ibdev, port_num=%x src_qp=%x, port_num, in_wc-src_qp); - ret = ehca_process_perf(ibdev, port_num, in_mad, out_mad); + ret = ehca_process_perf(ibdev, port_num, in_wc, in_grh, + in_mad, out_mad); return ret; } -- 1.6.0.4
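The patch overlays C bitfields on the 32-bit TC/SL/FL word. Since bitfield layout is implementation-defined, a shift-and-mask version may be easier to reason about. The sketch below is an alternative illustration, not the ehca driver code; the helper names are made up, and the layout assumed is the ClassPortInfo one named in the patch comment (8-bit traffic class, 4-bit service level, 20-bit flow label, most significant first).

```c
#include <stdint.h>

/* Pack TC (8 bits), SL (4 bits) and FL (20 bits) into one 32-bit
 * value in host byte order, TC in the most significant byte. */
static uint32_t pack_tcslfl(uint8_t tc, uint8_t sl, uint32_t fl)
{
	return ((uint32_t)tc << 24) |
	       ((uint32_t)(sl & 0xF) << 20) |
	       (fl & UINT32_C(0xFFFFF));
}

/* Store a 32-bit value in big-endian (wire) order. */
static void put_be32(uint8_t out[4], uint32_t v)
{
	out[0] = (uint8_t)(v >> 24);
	out[1] = (uint8_t)(v >> 16);
	out[2] = (uint8_t)(v >> 8);
	out[3] = (uint8_t)v;
}
```

For example, TC 0x12, SL 0x3 and FL 0x45678 pack to 0x12345678, and `put_be32()` then writes the bytes 12 34 56 78 regardless of host endianness.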
Re: [ofa-general] ofed 1.3.2 opensmd failover
On 8/25/09, PN pok...@gmail.com wrote:
> Hi,
>
> I can think of a situation in which all servers have dual port IB cards
> and need failover of OpenSM to achieve HA. As far as I know, OpenSM can
> only bind to 1 port at a time,

Yes.

> so do I need to start 2 OpenSM in server A and 2 OpenSM in server B?

That would be one valid configuration. I'm assuming all ports are connected to the same subnet.

> Will they use the same guid2lid file?

Depends on how the OpenSM configuration is done.

> Do I need to set something in the config file or will they
> automatically communicate with each other?

What communication are you referring to? They all need to share the same subnet prefix.

> Do I need to run sldd.sh manually or will it automatically sync with
> other OpenSM?

You can manually copy the guid2lid file around to the appropriate places, or use sldd.sh. I think sldd.sh can either be run manually or made to run automatically, but I'm not familiar with the details.

-- Hal

> Thanks a lot.
>
> Regards,
> PN
>
> 2009/8/26 Hal Rosenstock hal.rosenst...@gmail.com
>
> > On 8/25/09, kovlen...@interia.pl kovlen...@interia.pl wrote:
> > > Hi all,
> > >
> > > Quick question - is there a need to run anything except opensmd
> > > daemons to provide failover capability on ib network in ofed 1.3?
> >
> > In terms of SM failover, modulo bugs fixed relative to this feature
> > since OFED 1.3 (there are a couple of things here which may affect
> > your environment if I recall correctly), you only need to run more
> > than 1 SM for this (one will become master, the other standby).
> >
> > > I'm aware that when the master manager dies the standby one comes
> > > in and manages the network, but that does not necessarily mean
> > > that LIDs are preserved, especially for nodes joining in. I used
> > > to run sldd.sh for distributing the LID list on ofed 1.2.5, but
> > > while this script seems to be in place no one mentions the
> > > necessity for it. So subnet manager failover is provided by
> > > running standby opensm. And how is LID preservation provided?
> >
> > If you want LIDs to be preserved, the guid2lid file needs to be
> > sync'd (copied from the master SM, once it's fully assembled, to the
> > node which is running the standby SM). That's what the sldd.sh
> > script does.
> >
> > -- Hal
> >
> > > Regards,
> > > Zdenek Kovlensky
>
> --
> Best Regards,
> PN Lai
> HPC Specialist
> Galactic Computng Corp.
> Tel: 86-755-26733939 ext 826
> Mobile: 86-13823161729
> Fax: 86-755-26733780
> URL: http://www.galactic.com.hk
Re: [ofa-general] Problems with OpenSM from ofed 1.4.1 and MESH topology.
Hi Rafael, On 8/25/09, Rafael David Tinoco rafael.tin...@sun.com wrote: Hello Hal, Bellow... Hal Rosenstock wrote: On 8/24/09, Rafael David Tinoco rafael.tin...@sun.com wrote: Hello, I'm installing an HPC cluster using 2 Sun Blades 6048 with QNEMs (2 asics each, 8 qnems). They are configured in a MESH topology. I'm using Centos 5.3, OFED 1.4.1 and kernel 2.6.18-128.el5. I'm booting PXE from IB, my initrd image is bringing the ib0 interface, getting the squashfs image and mounting with aufs. The problem is.. When booting more then 60 nodes, I start to get above errors on subnet manager. And the problem seems to be intermitent, because each time it gives errors on different path. Any ideas ? Aug 24 15:36:19 713836 [48D7D940] 0x02 - osm_report_notice: Reporting Generic Notice type:3 num:64 (GID in service) from LID:1 GID:fe80::5080:200:8d:9931 Aug 24 15:36:19 713838 [48D7D940] 0x02 - __osm_state_mgr_report_new_ports: Discovered new port with GUID:0x5080028d9381 LID range [78,78] of node:b03n06 HCA-1 Aug 24 15:36:19 713840 [48D7D940] 0x02 - osm_report_notice: Reporting Generic Notice type:3 num:64 (GID in service) from LID:1 GID:fe80::5080:200:8d:9931 Aug 24 15:36:19 713842 [48D7D940] 0x02 - __osm_state_mgr_report_new_ports: Discovered new port with GUID:0x5080028d4689 LID range [76,76] of node:b03n04 HCA-1 Aug 24 15:36:19 713845 [48D7D940] 0x02 - osm_report_notice: Reporting Generic Notice type:3 num:64 (GID in service) from LID:1 GID:fe80::5080:200:8d:9931 Aug 24 15:36:19 713847 [48D7D940] 0x02 - __osm_state_mgr_report_new_ports: Discovered new port with GUID:0x5080028e5191 LID range [82,82] of node:b03n11 HCA-1 Aug 24 15:36:19 713849 [48D7D940] 0x02 - osm_report_notice: Reporting Generic Notice type:3 num:64 (GID in service) from LID:1 GID:fe80::5080:200:8d:9931 Aug 24 15:36:19 713866 [48D7D940] 0x02 - __osm_state_mgr_report_new_ports: Discovered new port with GUID:0x5080028d94c9 LID range [80,80] of node:b03n08 HCA-1 Aug 24 15:36:19 713869 [48D7D940] 0x02 - 
osm_report_notice: Reporting Generic Notice type:3 num:64 (GID in service) from LID:1 GID:fe80::5080:200:8d:9931 Aug 24 15:36:19 713871 [48D7D940] 0x02 - __osm_state_mgr_report_new_ports: Discovered new port with GUID:0x5080028daedd LID range [83,83] of node:b03n12 HCA-1 Aug 24 15:36:19 714782 [48D7D940] 0x02 - SUBNET UP Aug 24 15:36:19 714805 [48D7D940] 0x01 - __osm_state_mgr_light_sweep_start: ERR 3315: Unknown remote side for node 0x0021283a85260040(Sun Blade 6048 InfiniBand QDR Switched NEM I4A) port 19. Adding to light sweep sampling list Aug 24 15:36:19 714812 [48D7D940] 0x01 - Directed Path Dump of 4 hop path: Path = 0,1,15,15,15 Aug 24 15:36:19 714822 [48D7D940] 0x01 - __osm_state_mgr_light_sweep_start: ERR 3315: Unknown remote side for node 0x0021283a85260040(Sun Blade 6048 InfiniBand QDR Switched NEM I4A) port 21. Adding to light sweep sampling list Aug 24 15:36:19 714827 [48D7D940] 0x01 - Directed Path Dump of 4 hop path: Path = 0,1,15,15,15 Aug 24 15:36:19 714831 [48D7D940] 0x01 - __osm_state_mgr_light_sweep_start: ERR 3315: Unknown remote side for node 0x0021283a85260040(Sun Blade 6048 InfiniBand QDR Switched NEM I4A) port 25. 
Adding to light sweep sampling list Aug 24 15:36:19 714835 [48D7D940] 0x01 - Directed Path Dump of 4 hop path: Path = 0,1,15,15,15 Aug 24 15:36:20 514302 [4977E940] 0x01 - umad_receiver: ERR 5409: send completed with error (method=0x1 attr=0x15 trans_id=0x4700036595) -- dropping Aug 24 15:36:20 514321 [4977E940] 0x01 - umad_receiver: ERR 5411: DR SMP Hop Ptr: 0x0 Aug 24 15:36:20 514328 [4977E940] 0x01 - Received SMP on a 5 hop path: Initial path = 0,0,0,0,0,0 Return path = 0,0,0,0,0,0 Aug 24 15:36:20 514333 [4977E940] 0x01 - __osm_sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT) Aug 24 15:36:20 514352 [4977E940] 0x01 - SMP dump: base_ver0x1 mgmt_class..0x81 class_ver...0x1 method..0x1 (SubnGet) D bit...0x0 status..0x0 hop_ptr.0x0 hop_count...0x5 trans_id0x36595 attr_id.0x15 (PortInfo) resv0x0 attr_mod0x0 m_key...0x dr_slid.65535 dr_dlid.65535 Initial path: 0,1,15,15,15,19 Return path: 0,0,0,0,0,0 Reserved: [0][0][0][0][0][0][0] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ofa-general] [PATCH] opensm/ib_types.h: Add CounterSelect2 field to PortCounters attribute
Per MgtWG RefID #4527

Also, cosmetic commentary change

Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com
---
diff --git a/opensm/include/iba/ib_types.h b/opensm/include/iba/ib_types.h
index fe3f051..42ec794 100644
--- a/opensm/include/iba/ib_types.h
+++ b/opensm/include/iba/ib_types.h
@@ -4377,8 +4377,8 @@ ib_node_info_get_vendor_id(IN const ib_node_info_t * const p_ni)
 
 #include <complib/cl_packon.h>
 typedef struct _ib_node_desc {
-	// Node String is an array of UTF-8 character that
-	// describes the node in text format
+	// Node String is an array of UTF-8 characters
+	// that describe the node in text format
 	// Note that this string is NOT NULL TERMINATED!
 	uint8_t description[IB_NODE_DESCRIPTION_SIZE];
 } PACK_SUFFIX ib_node_desc_t;
@@ -7737,9 +7737,9 @@ typedef struct _ib_port_counters {
 	ib_net16_t xmit_discards;
 	uint8_t xmit_constraint_err;
 	uint8_t rcv_constraint_err;
-	uint8_t res1;
+	uint8_t counter_select2;
 	uint8_t link_int_buffer_overrun;
-	ib_net16_t res2;
+	ib_net16_t resv;
 	ib_net16_t vl15_dropped;
 	ib_net32_t xmit_data;
 	ib_net32_t rcv_data;
[ofa-general] [PATCH] infiniband-diags/perfquery.c: Indicate whether PortXmitWait counter is supported
Indicate extended v. (normal) port counters in output

Also, some cosmetic formatting changes and commentary typo fixed

Signed-off-by: Hal Rosenstock <hal.rosenst...@gmail.com>
---
diff --git a/infiniband-diags/src/perfquery.c b/infiniband-diags/src/perfquery.c
index 39ae2f6..0fd083e 100644
--- a/infiniband-diags/src/perfquery.c
+++ b/infiniband-diags/src/perfquery.c
@@ -1,6 +1,7 @@
 /*
  * Copyright (c) 2004-2008 Voltaire Inc.  All rights reserved.
  * Copyright (c) 2007 Xsigo Systems Inc.  All rights reserved.
+ * Copyright (c) 2009 HNR Consulting.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -277,8 +278,8 @@ static void output_aggregate_perfcounters_ext(ib_portid_t * portid)

 	mad_dump_perfcounters_ext(buf, sizeof buf, pc, sizeof pc);

-	printf("# Port counters: %s port %d\n%s", portid2str(portid), ALL_PORTS,
-	       buf);
+	printf("# Port extended counters: %s port %d\n%s", portid2str(portid),
+	       ALL_PORTS, buf);
 }

 static void dump_perfcounters(int extended, int timeout, uint16_t cap_mask,
@@ -291,7 +292,8 @@ static void dump_perfcounters(int extended, int timeout, uint16_t cap_mask,
 				   IB_GSI_PORT_COUNTERS, srcport))
 			IBERROR("perfquery");
 		if (!(cap_mask & 0x1000)) {
-			/* if PortCounters:PortXmitWait not suppported clear this counter */
+			/* if PortCounters:PortXmitWait not supported clear this counter */
+			IBWARN("PortXmitWait not indicated so ignore this counter");
 			perf_count.xmtwait = 0;
 			mad_encode_field(pc, IB_PC_XMT_WAIT_F,
 					 &perf_count.xmtwait);
@@ -316,9 +318,14 @@ static void dump_perfcounters(int extended, int timeout, uint16_t cap_mask,
 					       sizeof pc);
 	}

-	if (!aggregate)
-		printf("# Port counters: %s port %d\n%s", portid2str(portid),
-		       port, buf);
+	if (!aggregate) {
+		if (extended)
+			printf("# Port extended counters: %s port %d\n%s",
+			       portid2str(portid), port, buf);
+		else
+			printf("# Port counters: %s port %d\n%s",
+			       portid2str(portid), port, buf);
+	}
 }

 static void reset_counters(int extended, int timeout, int mask,
@@ -421,9 +428,8 @@ static int process_opt(void *context, int ch, char *optarg)

 int main(int argc, char **argv)
 {
-	int mgmt_classes[4] = { IB_SMI_CLASS, IB_SMI_DIRECT_CLASS, IB_SA_CLASS,
-		IB_PERFORMANCE_CLASS
-	};
+	int mgmt_classes[4] = {IB_SMI_CLASS, IB_SMI_DIRECT_CLASS, IB_SA_CLASS,
+			       IB_PERFORMANCE_CLASS};
 	ib_portid_t portid = { 0 };
 	int mask = 0x;
 	uint16_t cap_mask;
@@ -553,7 +559,6 @@ int main(int argc, char **argv)
 		goto done;

 do_reset:
-
 	if (argc <= 2 && !extended && (cap_mask & 0x1000))
 		mask |= (1 << 16);	/* reset portxmitwait */

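The capability check that the patch relies on can be sketched in isolation. This is a minimal illustration, not code from perfquery.c: the macro and function names are hypothetical, and it assumes the operators stripped from the archived patch text read `cap_mask & 0x1000`, `argc <= 2 && !extended && ...` and `1 << 16`.

```c
#include <stdint.h>

/* Values taken from the patch; the macro names are hypothetical. */
#define CAP_MASK_XMIT_WAIT  0x1000u     /* cap_mask bit: PortXmitWait supported */
#define RESET_SEL_XMIT_WAIT (1u << 16)  /* reset-mask bit for PortXmitWait */

/* Nonzero when the capability mask indicates PortXmitWait support. */
int xmit_wait_supported(uint16_t cap_mask)
{
	return (cap_mask & CAP_MASK_XMIT_WAIT) != 0;
}

/* Value to report for PortXmitWait: the raw counter if it is supported,
 * otherwise 0, mirroring the "clear this counter" branch in the patch. */
uint32_t reported_xmit_wait(uint16_t cap_mask, uint32_t raw)
{
	return xmit_wait_supported(cap_mask) ? raw : 0;
}

/* Mirrors the do_reset condition (as assumed above): the PortXmitWait
 * reset bit is added only for non-extended counters, when argc <= 2,
 * and when the counter is supported. */
uint32_t reset_mask(uint32_t mask, int argc, int extended, uint16_t cap_mask)
{
	if (argc <= 2 && !extended && xmit_wait_supported(cap_mask))
		mask |= RESET_SEL_XMIT_WAIT;
	return mask;
}
```

With these helpers, a port whose cap_mask lacks bit 0x1000 reports PortXmitWait as 0 and never has bit 16 added to its reset mask.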
Re: [ofa-general] ofed 1.3.2 opensmd failover
On 8/25/09, kovlen...@interia.pl <kovlen...@interia.pl> wrote:
> Hi all,
>
> Quick question - is there a need to run anything except opensmd daemons
> to provide failover capability on an IB network in OFED 1.3?

In terms of SM failover, modulo bugs fixed relative to this feature since OFED 1.3 (there are a couple of things here which may affect your environment, if I recall correctly), you only need to run more than one SM for this (one will become master, the other standby).

> I'm aware that when the master manager dies the standby one comes in and
> manages the network, but that does not necessarily mean that LIDs are
> preserved, especially for nodes joining in. I used to run sldd.sh for
> distributing the LID list on OFED 1.2.5, but while this script seems to
> be in place, no one mentions the necessity for it. So subnet manager
> failover is provided by running a standby opensm. And how is LID
> preservation provided?

If you want LIDs to be preserved, the guid2lid file needs to be sync'd (copied from the master SM, once it's fully assembled, to the node which is running the standby SM). That's what the sldd.sh script does.

-- Hal

> Regards,
> Zdenek Kovlensky