Hal Rosenstock wrote:
On Wed, 2007-12-19 at 11:58 -0800, [EMAIL PROTECTED] wrote:
We're seeing a regression in smpquery from alpha2 to rc1.
For example, with alpha2 I get:
grommit:~ # smpquery -G nodeinfo 0x66a01a000737c
# Node info: Lid 3
BaseVers:........................1
ClassVers:.......................1
NodeType:........................Channel Adapter
NumPorts:........................2
SystemGuid:......................0x00066a009800737c
Guid:............................0x00066a009800737c
PortGuid:........................0x00066a01a000737c
PartCap:.........................64
DevId:...........................0x6278
Revision:........................0x000000a0
LocalPort:.......................2
VendorId:........................0x00066a
grommit:~ #

And with rc1, I get:
grommit:~ # smpquery -G nodeinfo 0x66a01a000737c
ibwarn: [5650] ib_path_query: sa call path_query failed
smpquery: iberror: failed: can't resolve destination port 0x66a01a000737c
grommit:~ #
But using a LID works fine:
grommit:~ # smpquery nodeinfo 3
# Node info: Lid 3
BaseVers:........................1
ClassVers:.......................1
NodeType:........................Channel Adapter
NumPorts:........................2
SystemGuid:......................0x00066a009800737c
Guid:............................0x00066a009800737c
PortGuid:........................0x00066a01a000737c
PartCap:.........................64
DevId:...........................0x6278
Revision:........................0x000000a0
LocalPort:.......................2
VendorId:........................0x00066a
grommit:~ #
Strangest of all, running it under strace also works:
grommit:~ # strace smpquery -G nodeinfo 0x66a01a000737c > /tmp/smpquery.out .....
grommit:~ # cat /tmp/smpquery.out
# Node info: Lid 3
BaseVers:........................1
ClassVers:.......................1
NodeType:........................Channel Adapter
NumPorts:........................2
SystemGuid:......................0x00066a009800737c
Guid:............................0x00066a009800737c
PortGuid:........................0x00066a01a000737c
PartCap:.........................64
DevId:...........................0x6278
Revision:........................0x000000a0
LocalPort:.......................2
VendorId:........................0x00066a
grommit:~ #

Some weird race condition...

Anyone else seeing the same?

-G requires an SA path record lookup, so this could be an issue with that
lookup timing out in some cases (assuming the port is active and the SM is
operational).

I'm seeing the same problem.
Sometimes the query works, and sometimes it doesn't.
I also see that when the query fails, OpenSM doesn't get a PathRecord query at all.
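
To put a number on "sometimes", the query can be repeated in a loop and the failures counted. The following is only a rough sketch, not something taken from the diags package: the helper name failure_rate is made up for illustration, and it assumes that a failed smpquery either exits non-zero or prints an "iberror" line on stderr, as in the rc1 output above.

```python
import subprocess


def failure_rate(cmd, runs=20, run=subprocess.run):
    """Run `cmd` `runs` times and return (failures, runs).

    A run counts as failed if the exit status is non-zero or if
    "iberror" shows up on stderr (assumption based on the rc1 output
    quoted in this thread). `run` can be replaced for testing.
    """
    failures = 0
    for _ in range(runs):
        result = run(cmd, capture_output=True, text=True)
        if result.returncode != 0 or "iberror" in (result.stderr or ""):
            failures += 1
    return failures, runs


# Example (needs smpquery and an active port, so it is left commented out):
# by_guid = failure_rate(["smpquery", "-G", "nodeinfo", "0x66a01a000737c"])
# by_lid = failure_rate(["smpquery", "nodeinfo", "3"])
# print("by GUID: %d/%d failed, by LID: %d/%d failed" % (by_guid + by_lid))
```

Running the same loop by GUID and by LID should show whether the failures are confined to the path that needs the SA lookup.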

Hal, can you elaborate on that "timing out in some cases" issue?

Adding Jack for the libibmad issue:

I see that the ib_path_query() in libibmad/sa.c sometimes fails
when calling safe_sa_call().


-- Yevgeny

-- Hal
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

