Re: [ewg] perfquery error
Hi Hal,

On 12:37 Mon 21 Jul, Hal Rosenstock wrote:
> > or directory)
> > [EMAIL PROTECTED] ~]# perfquery
> > ibpanic: [3926] madrpc_init: can't open UMAD port ((null):0): (No such file or directory)
> > [EMAIL PROTECTED] ~]#
>
> This sounds like an additional aspect of the libibumad API issue
> introduced when some functionality related to this was eliminated.
> The previous fix related to OpenSM appears to not be the complete
> solution to these mixed configurations.

I think you are referring to changes which were done in the OpenSM ibumad
vendor layer, specifically commits
5cf395cb107ef76091d554cd4b42456dc53b38a2 and
36cfd8e6967d8a7aea74b9a180fac0275ef549dd, which had nothing to do with
the libibumad API.

Regarding the issue itself: Steve, does something like this:

diff --git a/libibumad/src/umad.c b/libibumad/src/umad.c
index dcc7275..9010307 100644
--- a/libibumad/src/umad.c
+++ b/libibumad/src/umad.c
@@ -499,6 +499,19 @@ umad_done(void)
 	return 0;
 }
 
+static unsigned is_ib_type(char *ca_name)
+{
+	char dir_name[256];
+	unsigned type;
+
+	snprintf(dir_name, sizeof(dir_name), "%s/%s", SYS_INFINIBAND, ca_name);
+
+	if (sys_read_uint(dir_name, SYS_NODE_TYPE, &type) < 0)
+		return 0;
+
+	return type >= 1 && type <= 3 ? 1 : 0;
+}
+
 int
 umad_get_cas_names(char cas[][UMAD_CA_NAME_LEN], int max)
 {
@@ -512,7 +525,7 @@ umad_get_cas_names(char cas[][UMAD_CA_NAME_LEN], int max)
 		for (i = 0; i < n; i++) {
 			if (strcmp(namelist[i]->d_name, ".") &&
 			    strcmp(namelist[i]->d_name, "..")) {
-				if (j < max)
+				if (j < max && is_ib_type(namelist[i]->d_name))
 					strncpy(cas[j++], namelist[i]->d_name,
 						UMAD_CA_NAME_LEN);
 			}

help?

Sasha
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
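The check proposed in the patch can be tried outside libibumad with a small standalone sketch. It assumes the sysfs node_type file holds text such as "1: CA" or "4: RNIC" and that the leading number is what matters. The names parse_node_type, node_type_is_ib and ca_is_ib are hypothetical helpers for illustration, not libibumad functions.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse the leading number of a node_type string such as "1: CA".
 * Returns 0 on success, -1 if the text does not start with a number. */
int parse_node_type(const char *text, unsigned *type)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(text, &end, 10);
	if (end == text || errno != 0)
		return -1;
	*type = (unsigned)val;
	return 0;
}

/* Same range test as the patch: 1 = CA, 2 = switch, 3 = router are IB.
 * Anything else (for example 4 = RNIC) is treated as non-IB. */
int node_type_is_ib(unsigned type)
{
	return type >= 1 && type <= 3;
}

/* Read <sysfs_dir>/<ca_name>/node_type and report whether the device
 * is an IB device. Any read or parse failure counts as "not IB".
 * sysfs_dir is a parameter here (an assumption for testability); on a
 * real system it would be something like /sys/class/infiniband. */
int ca_is_ib(const char *sysfs_dir, const char *ca_name)
{
	char path[512], buf[64];
	unsigned type;
	FILE *f;

	snprintf(path, sizeof(path), "%s/%s/node_type", sysfs_dir, ca_name);
	f = fopen(path, "r");
	if (!f)
		return 0;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return 0;
	}
	fclose(f);

	if (parse_node_type(buf, &type) < 0)
		return 0;
	return node_type_is_ib(type);
}
```

Treating a failed read as "not IB" mirrors the patch, which returns 0 when sys_read_uint fails, so a device without a readable node type is skipped instead of aborting the enumeration.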
Re: [ewg] perfquery error
Sasha,

On Wed, Jul 23, 2008 at 5:22 PM, Sasha Khapyorsky [EMAIL PROTECTED] wrote:
> Hi Hal,
>
> On 12:37 Mon 21 Jul, Hal Rosenstock wrote:
> > > or directory)
> > > [EMAIL PROTECTED] ~]# perfquery
> > > ibpanic: [3926] madrpc_init: can't open UMAD port ((null):0): (No such file or directory)
> > > [EMAIL PROTECTED] ~]#
> >
> > This sounds like an additional aspect of the libibumad API issue
> > introduced when some functionality related to this was eliminated.
> > The previous fix related to OpenSM appears to not be the complete
> > solution to these mixed configurations.
>
> I think you are referring to changes which were done in the OpenSM ibumad
> vendor layer, specifically commits
> 5cf395cb107ef76091d554cd4b42456dc53b38a2 and
> 36cfd8e6967d8a7aea74b9a180fac0275ef549dd, which had nothing to do with
> the libibumad API.

I was referring to the changes which eliminated index 0 being the
default interface. Was the elimination of the default interface not an
API change to libibumad?

-- Hal

> Regarding the issue itself: Steve, does something like this:
>
> diff --git a/libibumad/src/umad.c b/libibumad/src/umad.c
> index dcc7275..9010307 100644
> --- a/libibumad/src/umad.c
> +++ b/libibumad/src/umad.c
> @@ -499,6 +499,19 @@ umad_done(void)
>  	return 0;
>  }
>  
> +static unsigned is_ib_type(char *ca_name)
> +{
> +	char dir_name[256];
> +	unsigned type;
> +
> +	snprintf(dir_name, sizeof(dir_name), "%s/%s", SYS_INFINIBAND, ca_name);
> +
> +	if (sys_read_uint(dir_name, SYS_NODE_TYPE, &type) < 0)
> +		return 0;
> +
> +	return type >= 1 && type <= 3 ? 1 : 0;
> +}
> +
>  int
>  umad_get_cas_names(char cas[][UMAD_CA_NAME_LEN], int max)
>  {
> @@ -512,7 +525,7 @@ umad_get_cas_names(char cas[][UMAD_CA_NAME_LEN], int max)
>  		for (i = 0; i < n; i++) {
>  			if (strcmp(namelist[i]->d_name, ".") &&
>  			    strcmp(namelist[i]->d_name, "..")) {
> -				if (j < max)
> +				if (j < max && is_ib_type(namelist[i]->d_name))
>  					strncpy(cas[j++], namelist[i]->d_name,
>  						UMAD_CA_NAME_LEN);
>  			}
>
> help?
>
> Sasha
Re: [ewg] perfquery error
Hal Rosenstock wrote:
> Steve,
>
> On Sat, Jul 19, 2008 at 8:07 AM, Steve Wise [EMAIL PROTECTED] wrote:
> > Hal,
> >
> > perfquery barfs when an iwarp device is in the mix. I think it needs
> > to skip over devices that are not IB.
> >
> > [EMAIL PROTECTED] ~]# perfquery
> > ibpanic: [5790] madrpc_init: can't open UMAD port ((null):0): (No such file or directory)
> > [EMAIL PROTECTED] ~]#
>
> What is the machine configuration in terms of RDMA devices? Is there
> just an iWARP NIC in that machine or are there also IB CA(s)?
>
> Is it correct to assume this is the latest
> perfquery/libibmad/libibumad?
>
> I know this used to work in a mixed configuration but there was a
> change to a umad API which eliminated some functionality which might
> cause this to break (similar to a previous issue (that was resolved)
> with OpenSM in a mixed iWARP/IB configuration).
>
> BTW, Sasha is the maintainer for these management tools.

I can reproduce this on a system with 1 cxgb3 rnic and 1 mthca hca. If I
specify the mthca device explicitly it works. If I specify the rnic
device or no devices it fails (probably because cxgb3 is the first
device to query):

[EMAIL PROTECTED] ~]# perfquery -C mthca0
# Port counters: Lid 1 port 1
PortSelect:..1
CounterSelect:...0x
SymbolErrors:0
LinkRecovers:0
LinkDowned:..0
RcvErrors:...0
RcvRemotePhysErrors:.0
RcvSwRelayErrors:0
XmtDiscards:.1
XmtConstraintErrors:.0
RcvConstraintErrors:.0
LinkIntegrityErrors:.0
ExcBufOverrunErrors:.0
VL15Dropped:.0
XmtData:.32387
RcvData:.31697
XmtPkts:.447
RcvPkts:.434
[EMAIL PROTECTED] ~]# perfquery -C cxgb30
ibpanic: [3925] madrpc_init: can't open UMAD port (cxgb30:0): (No such file or directory)
[EMAIL PROTECTED] ~]# perfquery
ibpanic: [3926] madrpc_init: can't open UMAD port ((null):0): (No such file or directory)
[EMAIL PROTECTED] ~]#
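Steve's guess that cxgb3 is "the first device to query" can be illustrated with a short sketch. It assumes device names are enumerated in alphabetical order and that the default device is simply the first entry; neither assumption is confirmed in this thread. The struct rdma_dev type and the default_device function are hypothetical, and the node type values (1 = CA, 4 = RNIC) are assigned by hand for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one RDMA device. */
struct rdma_dev {
	const char *name;
	unsigned node_type;	/* 1 = CA, 2 = switch, 3 = router, 4 = RNIC */
};

/* qsort comparator: order devices by name. */
static int cmp_dev(const void *a, const void *b)
{
	const struct rdma_dev *da = a;
	const struct rdma_dev *db = b;

	return strcmp(da->name, db->name);
}

/* Sort the devices by name and return the name of the first one.
 * With ib_only set, devices whose node type is outside 1..3 are
 * skipped. Returns NULL when no device qualifies. */
const char *default_device(struct rdma_dev *devs, size_t n, int ib_only)
{
	size_t i;

	qsort(devs, n, sizeof(*devs), cmp_dev);
	for (i = 0; i < n; i++) {
		if (!ib_only ||
		    (devs[i].node_type >= 1 && devs[i].node_type <= 3))
			return devs[i].name;
	}
	return NULL;
}
```

With the two devices from the report, "cxgb30" sorts ahead of "mthca0", so an unfiltered default lands on the rnic, while the filtered default lands on the hca.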
Re: [ewg] perfquery error
On Mon, Jul 21, 2008 at 12:33 PM, Steve Wise [EMAIL PROTECTED] wrote:
> Hal Rosenstock wrote:
> > Steve,
> >
> > On Sat, Jul 19, 2008 at 8:07 AM, Steve Wise [EMAIL PROTECTED] wrote:
> > > Hal,
> > >
> > > perfquery barfs when an iwarp device is in the mix. I think it needs
> > > to skip over devices that are not IB.
> > >
> > > [EMAIL PROTECTED] ~]# perfquery
> > > ibpanic: [5790] madrpc_init: can't open UMAD port ((null):0): (No such file or directory)
> > > [EMAIL PROTECTED] ~]#
> >
> > What is the machine configuration in terms of RDMA devices? Is there
> > just an iWARP NIC in that machine or are there also IB CA(s)?
> >
> > Is it correct to assume this is the latest
> > perfquery/libibmad/libibumad?
> >
> > I know this used to work in a mixed configuration but there was a
> > change to a umad API which eliminated some functionality which might
> > cause this to break (similar to a previous issue (that was resolved)
> > with OpenSM in a mixed iWARP/IB configuration).
> >
> > BTW, Sasha is the maintainer for these management tools.
>
> I can reproduce this on a system with 1 cxgb3 rnic and 1 mthca hca. If I
> specify the mthca device explicitly it works. If I specify the rnic
> device or no devices it fails (probably because cxgb3 is the first
> device to query):
>
> [EMAIL PROTECTED] ~]# perfquery -C mthca0
> # Port counters: Lid 1 port 1
> PortSelect:..1
> CounterSelect:...0x
> SymbolErrors:0
> LinkRecovers:0
> LinkDowned:..0
> RcvErrors:...0
> RcvRemotePhysErrors:.0
> RcvSwRelayErrors:0
> XmtDiscards:.1
> XmtConstraintErrors:.0
> RcvConstraintErrors:.0
> LinkIntegrityErrors:.0
> ExcBufOverrunErrors:.0
> VL15Dropped:.0
> XmtData:.32387
> RcvData:.31697
> XmtPkts:.447
> RcvPkts:.434
> [EMAIL PROTECTED] ~]# perfquery -C cxgb30
> ibpanic: [3925] madrpc_init: can't open UMAD port (cxgb30:0): (No such file or directory)
> [EMAIL PROTECTED] ~]# perfquery
> ibpanic: [3926] madrpc_init: can't open UMAD port ((null):0): (No such file or directory)
> [EMAIL PROTECTED] ~]#

This sounds like an additional aspect of the libibumad API issue
introduced when some functionality related to this was eliminated.
The previous fix related to OpenSM appears to not be the complete
solution to these mixed configurations.

-- Hal
[ewg] perfquery error
Hal,

perfquery barfs when an iwarp device is in the mix. I think it needs to
skip over devices that are not IB.

[EMAIL PROTECTED] ~]# perfquery
ibpanic: [5790] madrpc_init: can't open UMAD port ((null):0): (No such file or directory)
[EMAIL PROTECTED] ~]#