We had a customer report that perfquery was crashing on their nodes when trying
to query ports on a switch. When I examined the core dump, it was clear that
libibmad was dereferencing a null pointer from one of the mad_set_* functions:

#0  0x0000000000000000 in ?? ()
#1  0x00002ae4e13e7536 in mad_set_field () from /usr/lib64/libibmad.so.5
#2  0x00002ae4e13e7656 in mad_field_name () from /usr/lib64/libibmad.so.5
#3  0x0000000000401662 in mad_dump_perfcounters_rcv_sl ()
#4  0x00000000004024c9 in mad_dump_perfcounters_rcv_sl ()
#5  0x00002ae4e18168b4 in __libc_start_main () from /lib64/libc.so.6
#6  0x0000000000401189 in mad_dump_perfcounters_rcv_sl ()
#7  0x00007fffe5570ce8 in ?? ()
#8  0x0000000000000000 in ?? ()

It appears that mad_set_field() was hitting a NULL entry in the table of MAD
attributes (ib_mad_f). Such entries are used to separate the different groups
of MAD attributes in the table.

Reviewing the code, I noted that the mad_set_* and mad_get_* functions already 
have some error checking to avoid going completely off the end of the table, 
but they do not detect the case where the selected field is unset. This patch 
corrects the problem.
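
To illustrate the kind of check described above, here is a minimal sketch. It
is not the contents of fields.patch and does not use the real libibmad
structures: the field_desc struct, the F_* enum values, field_table,
lookup_field() and set_field() are all hypothetical names, and the sketch
assumes byte-aligned fields of at most 32 bits. It shows a table with a zeroed
separator entry, and a lookup that rejects both out-of-range indexes and unset
entries.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical field descriptor; NOT the real libibmad layout. */
struct field_desc {
    int bitoffs;        /* offset of the field in bits */
    int bitlen;         /* length of the field in bits; 0 means "unset" */
    const char *name;   /* NULL means "unset" */
};

/* Hypothetical field indexes, with a separator between two groups. */
enum {
    F_NO_FIELD = 0,
    F_PORT_SELECT,
    F_COUNTER_SELECT,
    F_GROUP_SEPARATOR,  /* deliberately left unset in the table */
    F_XMT_DATA,
    F_LAST
};

/* Entries that are not listed here are zero-initialized. */
static const struct field_desc field_table[F_LAST] = {
    [F_PORT_SELECT]    = { 8,  8,  "PortSelect" },
    [F_COUNTER_SELECT] = { 16, 16, "CounterSelect" },
    [F_XMT_DATA]       = { 32, 32, "XmtData" },
};

/* Return the descriptor for a field, or NULL if it cannot be used. */
static const struct field_desc *lookup_field(int field)
{
    /* Existing style of check: do not run off either end of the table. */
    if (field <= F_NO_FIELD || field >= F_LAST)
        return NULL;

    /* Additional check: reject separator (unset) entries. */
    if (field_table[field].name == NULL || field_table[field].bitlen == 0)
        return NULL;

    return &field_table[field];
}

/* Store val big-endian into buf; returns 0 on success, -1 on a bad field. */
static int set_field(uint8_t *buf, size_t buflen, int field, uint32_t val)
{
    const struct field_desc *f = lookup_field(field);
    size_t start, nbytes, i;

    if (f == NULL)
        return -1;

    start = (size_t)f->bitoffs / 8;
    nbytes = (size_t)f->bitlen / 8;
    if (start + nbytes > buflen)
        return -1;

    for (i = 0; i < nbytes; i++)
        buf[start + i] = (uint8_t)(val >> (8 * (nbytes - 1 - i)));
    return 0;
}
```

With this sketch, a call such as set_field(buf, sizeof buf, F_GROUP_SEPARATOR, 1)
returns -1 instead of acting on an unset table entry, while calls for populated
fields behave as before.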

Signed-off-by: Michael Heinz <[email protected]>

Attachment: fields.patch

_______________________________________________
ewg mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg