We've recently noticed that the Node Description for a node can
differ between the output of smpquery and saquery.  For example:

# smpquery NodeDesc 427
Node Description:.................sierra1932 qib0

# saquery NodeRecord 427 | grep NodeDesc
                NodeDescription.........QLogic Infiniband HCA

Restarting OpenSM is our current workaround for this.
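
To spot affected nodes without eyeballing the output, the comparison
can be scripted.  The following is a minimal sketch, not a tested tool:
the helper names (parse_node_desc, descriptions_match, check_lid) are
made up for illustration, and the parsing assumes the two output
formats look exactly like the samples above.

```python
import re
import subprocess

# Matches both "Node Description:......<desc>" (smpquery) and
# "NodeDescription......<desc>" (saquery), as seen in the samples above.
_DESC_RE = re.compile(r"\s*Node ?Description:?\.+(.*)$")


def parse_node_desc(text):
    """Return the node description found in the given output, or None."""
    for line in text.splitlines():
        m = _DESC_RE.match(line)
        if m:
            return m.group(1).strip()
    return None


def descriptions_match(smp_out, sa_out):
    """True only if both outputs contain a description and they agree."""
    smp_desc = parse_node_desc(smp_out)
    sa_desc = parse_node_desc(sa_out)
    return smp_desc is not None and smp_desc == sa_desc


def check_lid(lid):
    """Run both queries for one LID and report whether they agree.

    The command lines are copied from the example above; adjust them
    if your versions of the tools take different arguments.
    """
    smp = subprocess.run(["smpquery", "NodeDesc", str(lid)],
                         capture_output=True, text=True, check=True).stdout
    sa = subprocess.run(["saquery", "NodeRecord", str(lid)],
                        capture_output=True, text=True, check=True).stdout
    return descriptions_match(smp, sa)
```

Calling check_lid(427) on the node shown above would return False until
OpenSM picks up the new description.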

We've noticed it occurring more often on our larger clusters than on our
smaller clusters, which has led to some speculation about the cause.

The speculation is that when a node comes up, there is a window of time
in which the HCA is up and can be scanned by OpenSM, but does not yet
have its node descriptor set (in RHEL it appears to be set via
/etc/init.d/rdma).  During this window, OpenSM reads and stores the
undesired node descriptor (in the above case, "QLogic Infiniband HCA").

When the node descriptor is changed, a trap should be sent to OpenSM
indicating the change.  Normally OpenSM gets the trap and reads the new
node descriptor.

On our large clusters all nodes are typically brought up at the same
time, so there are probably a ton of node descriptor change traps
happening at the exact same time.  We speculate a number of these are
dropped/lost, and subsequently OpenSM never realizes that the node
descriptor has changed.

I don't know if the speculation sounds reasonable or not.  Regardless,
we're not sure of the best fix.

A trivial fix would be to just make OpenSM re-read the node descriptor
of each HCA, perhaps during a heavy sweep.  But I don't know if this is
optimal, since it will introduce more MADs on the wire.  However, if the
present solution is to restart OpenSM, we figure this can't be any worse.
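
To make the theory and the trade-off concrete, here is a toy model in
Python.  It is not OpenSM code and does not reflect how OpenSM actually
queues traps.  The loss model (only the first trap_queue_size traps of a
burst survive) and all of the names are assumptions for illustration.

```python
DEFAULT_DESC = "QLogic Infiniband HCA"


class ToySM:
    """Toy subnet manager that caches one description per LID."""

    def __init__(self, trap_queue_size):
        self.cache = {}
        self.trap_queue_size = trap_queue_size  # hypothetical capacity

    def initial_scan(self, fabric):
        # Snapshot whatever description each node has right now.
        self.cache = dict(fabric)

    def receive_trap_burst(self, lids, fabric):
        # Assumed loss model: traps beyond the queue size are dropped,
        # so those nodes are never re-read.
        for lid in lids[:self.trap_queue_size]:
            self.cache[lid] = fabric[lid]

    def heavy_sweep_rescan(self, fabric):
        # Proposed fix: re-read every description during a heavy sweep.
        # Returns the number of extra reads (one per node).
        for lid in fabric:
            self.cache[lid] = fabric[lid]
        return len(fabric)

    def stale_lids(self, fabric):
        return sorted(lid for lid in fabric
                      if self.cache.get(lid) != fabric[lid])


def simulate(num_nodes, trap_queue_size):
    """Return (stale before rescan, extra reads, stale after rescan)."""
    fabric = {lid: DEFAULT_DESC for lid in range(1, num_nodes + 1)}
    sm = ToySM(trap_queue_size)
    sm.initial_scan(fabric)            # SM scans before descriptions are set
    for lid in fabric:                 # init scripts then set the real names
        fabric[lid] = f"node{lid} qib0"
    sm.receive_trap_burst(list(fabric), fabric)  # all traps arrive at once
    stale_before = len(sm.stale_lids(fabric))
    extra_reads = sm.heavy_sweep_rescan(fabric)
    stale_after = len(sm.stale_lids(fabric))
    return stale_before, extra_reads, stale_after
```

Under these assumptions, simulate(8, 100) leaves nothing stale, while
simulate(1000, 100) leaves 900 nodes stale until the rescan, at the cost
of 1000 extra reads per heavy sweep.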

Just wondering what people's thoughts are, or if there's another obvious
solution we're not seeing.

Al

-- 
Albert Chu
[email protected]
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory

