On Sat, 25 Apr 2009 18:54:41 +0300
Sasha Khapyorsky <[email protected]> wrote:

> On 13:31 Thu 23 Apr     , Ira Weiny wrote:
> > diff --git a/infiniband-diags/src/ibqueryerrors.c 
> > b/infiniband-diags/src/ibqueryerrors.c
> > new file mode 100644
> > index 0000000..9d96190
> > +#include <inttypes.h>

[snip]

> > +
> > +#include <infiniband/complib/cl_nodenamemap.h>
> 
> AFAIR WinOF doesn't like such inclusion and uses
> 
> #include <complib/filename.h>
> 
> instead. I'm changing.

Ok, sorry.

> 
> > +#include <infiniband/ibnetdisc.h>
> > +#include <infiniband/mad.h>
> > +
> > +#include "ibdiag_common.h"
> > +
> > +char *argv0 = "ibqueryerrors";
> 
> argv0 variable is not needed if you are using ibdiag_common stuff.
> Removing.

cool, that was left over sorry.

> 
> > +static FILE *f;
> 
> I don't see where 'f' is used (except 'f = stdout;' below). Removing.
> 
> [snip...]
> 
> > +static int process_opt(void *context, int ch, char *optarg)
> > +{
> > +   switch (ch) {
> > +   case 's':
> > +           calculate_suppressed_fields(optarg);
> > +           break;
> > +   case 'c':
> > +           /* Right now this is the only "common" error */
> > +           add_suppressed(IB_PC_ERR_SWITCH_REL_F);
> > +           break;
> > +   case 1:
> > +           node_name_map_file = strdup(optarg);
> > +           break;
> > +   case 2:
> > +           data_counters++;
> > +           break;
> > +   case 3:
> > +           all_nodes++;
> > +           break;
> > +   case 'S':
> > +           switch_guid_str = strdup(optarg);
> 
> Why should optarg be strdup()ed?

Well it does not have to be strdup'ed however, switch_guid_str needs to be set
for the call to ib_resolve_portid_str_via below.

...
   if (ib_resolve_portid_str_via(&portid, switch_guid_str, IB_DEST_GUID,
...

The removal of this line causes the '-S' option to segfault.  Patch to pq/ibn4
is below.


[snip]

> > +
> > +   if (switch_guid) {
> > +           /* limit the scan the fabric around the target */
> > +           ib_portid_t portid = {0};
> > +
> > +           if (ib_resolve_portid_str_via(&portid, switch_guid_str, 
> > IB_DEST_GUID,
> > +                                   ibd_sm_id, ibmad_port) < 0) {
> > +                   fprintf(stderr, "can't resolve destination port %s 
> > %p\n",
> > +                           switch_guid_str, ibd_sm_id);
> > +                   rc = 1;
> > +                   goto close_port;
> > +           }
> > +
> > +           if ((fabric = ibnd_discover_fabric(ibmad_port, ibd_timeout, 
> > &portid, 1)) == NULL) {
> > +                   fprintf(stderr, "discover failed\n");
> > +                   rc = 1;
> > +                   goto close_port;
> > +           }
> > +   } else {
> > +           if ((fabric = ibnd_discover_fabric(ibmad_port, ibd_timeout, 
> > NULL, -1)) == NULL) {
> > +                   fprintf(stderr, "discover failed\n");
> > +                   rc = 1;
> > +                   goto close_port;
> > +           }
> 
> Above you are using IBERROR(), here is fprintf(stderr, ...). Could it be
> consistent? (if yes - it is subsequent patch).

I used fprintf here to allow the goto to close the port, rather than let
IBERROR exit out.  We already discussed this on the list but this gives a
better example to users that they are to close the port.  I can change it if
you like.

> 
> > +   }
> > +
> > +   report_suppressed();
> > +
> > +   if (switch_guid) {
> > +           ibnd_node_t *node = ibnd_find_node_guid(fabric, switch_guid);
> > +           print_node(node, NULL);
> > +   } else if (dr_path) {
> > +           ibnd_node_t *node = ibnd_find_node_dr(fabric, dr_path);
> > +           print_node(node, NULL);
> 
> When GUID or DR Path are specified we don't need to discover whole
> fabric, but can try to resolve LID using SA or querying PortInfo.
> 
> Although when in GUID is specified and SA is not responsive there is
> probably no other choice than discover.
> 

:-( good point.  Discovering only part of the fabric was a huge speed
improvement but if the resolve does not succeed I should do a full discover.

I will work up a separate patch.  Right now you are correct if the SA is
unresponsive the "-S" option will fail.  iblinkinfo does the full scan every
time.  But that slows down the query for a single switch to the same O(n)
query that a full system scan requires.  I would rather have that query be
O(1).  So I implemented ibqueryerrors in this manner with the intent of going
back and "fixing" iblinkinfo.  I think having a fall back on a full system
scan is a good idea.  Patch for both tools will follow...  :-D

Ira

From: Ira Weiny <[email protected]>
Date: Mon, 27 Apr 2009 14:47:08 -0700
Subject: [PATCH] switch_guid_str is required for the string resolve function.


Signed-off-by: Ira Weiny <[email protected]>
---
 infiniband-diags/src/ibqueryerrors.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/infiniband-diags/src/ibqueryerrors.c 
b/infiniband-diags/src/ibqueryerrors.c
index 52bd036..09861be 100644
--- a/infiniband-diags/src/ibqueryerrors.c
+++ b/infiniband-diags/src/ibqueryerrors.c
@@ -364,6 +364,7 @@ static int process_opt(void *context, int ch, char *optarg)
                all_nodes++;
                break;
        case 'S':
+               switch_guid_str = optarg;
                switch_guid = strtoull(optarg, 0, 0);
                break;
        case 'D':
-- 
1.5.4.5

_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Reply via email to