Hi Fabian
>
> I think this is a decent idea. My only reservations are that it would require
> everyone to learn the OSM Vendor Layer API. It might also not allow testing
> nuances in the access layer APIs, which might be useful.
[EZ] This is true, but the API is simple. The MAD flow API is:
bind - get a handle for sending MADs of a specific class and register callbacks
send - send a MAD
get_mad - get a MAD buffer
put_mad - return the buffer to the driver
The rest can be found in the OpenSM repository, in osm_vendor_api.h.
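
To make the call order concrete, here is a minimal sketch of the bind / get_mad / send / put_mad flow. It is a toy loopback mock written in Python, not the OpenSM vendor layer: the class MockVendor, the method signatures, the class value 0x81 and the 256-byte buffer size are all assumptions chosen for illustration. The real signatures are the ones in osm_vendor_api.h.

```python
class MockVendor:
    """Toy loopback stand-in for a MAD transport (hypothetical, not the OSM API)."""

    def __init__(self):
        self._callbacks = {}      # handle -> receive callback
        self._next_handle = 0
        self.outstanding = 0      # buffers handed out and not yet returned

    def bind(self, mgmt_class, on_receive):
        """Register a receive callback for one management class; return a handle."""
        handle = (self._next_handle, mgmt_class)
        self._next_handle += 1
        self._callbacks[handle] = on_receive
        return handle

    def get_mad(self, handle, size=256):
        """Hand out a zeroed MAD buffer (size is an assumed default)."""
        self.outstanding += 1
        return bytearray(size)

    def send(self, handle, mad):
        """Loopback: deliver the MAD straight to the callback bound to the handle."""
        self._callbacks[handle](handle, mad)

    def put_mad(self, handle, mad):
        """Return a buffer obtained from get_mad."""
        self.outstanding -= 1


def demo():
    """Walk through the flow once and return what the callback saw."""
    received = []
    vendor = MockVendor()
    # 0x81 is just an example class value for this sketch.
    handle = vendor.bind(0x81, lambda h, mad: received.append(bytes(mad[:2])))
    mad = vendor.get_mad(handle)
    mad[0:2] = b"\x01\x81"
    vendor.send(handle, mad)     # the callback fires here, no polling needed
    vendor.put_mad(handle, mad)
    return received, vendor.outstanding
```

The point of the sketch is only the shape of the flow: the caller never polls or blocks on a read, it registers a callback at bind time and returns every buffer it took.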
>
> So I think it would be useful to have the test run over each low-level MAD API,
> as well as over the OSM Vendor Layer. I'm a bit wary of adding extra layers
> between the tests and the access layer - it just creates more areas where things
> can go wrong. That said, I'm not dead set on this and could be convinced
> otherwise, but I just don't know enough about the OSM Vendor Layer at the moment
> and don't have many cycles to learn it.
[EZ] I agree. Code testing should be done in all layers. But writing cluster debug tools is easier with a higher abstraction layer (callbacks vs. polling or blocking reads).
>
>
> By system names, you mean node descriptions?
[EZ] If the user provides a file describing the topology in terms of systems, the code uses the names given in that file in its reports.
For example, assume you have a cluster built from a 288-port switch and 288 HCAs.
The topology description could then be:
IBSW288 mySwitch
Leaf1/P1 -> HCA Rack1-Node1 P1
Leaf1/P2 -> HCA Rack1-Node2 P1
...
Leaf1/P12 -> HCA Rack2-Node3 P1
Leaf2/P1 -> HCA anyNameYouWant P2
....
Any error report can then be given in terms of these names, for example:
Error with cable from mySwitch/Leaf2/P1 to anyNameYouWant/P2
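
As a rough illustration of how such a file could drive the reports, here is a small Python sketch that reads the format shown in the example and builds a cable error message from the user-supplied names. The function names parse_topology and cable_error are hypothetical, and the parsing rules (one header line with system type and name, then "port -> type name port" lines, with dotted lines treated as elisions) are assumptions inferred from the example, not the actual tool's grammar.

```python
import re

# Matches link lines such as "Leaf1/P1 -> HCA Rack1-Node1 P1".
LINK_RE = re.compile(r"^(\S+)\s*->\s*(\S+)\s+(\S+)\s+(\S+)$")


def parse_topology(text):
    """Return (system_type, system_name, links) from a topology description.

    links maps a local port such as "Leaf1/P1" to (peer_type, peer_name, peer_port).
    """
    system_type = system_name = None
    links = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or set(line) == {"."}:
            continue  # skip blank lines and "..." elisions
        match = LINK_RE.match(line)
        if match:
            local_port, peer_type, peer_name, peer_port = match.groups()
            links[local_port] = (peer_type, peer_name, peer_port)
        else:
            # Assumed header form: "<system type> <system name>"
            system_type, system_name = line.split()
    return system_type, system_name, links


def cable_error(system_name, links, local_port):
    """Format a cable error using the names from the topology file."""
    _, peer_name, peer_port = links[local_port]
    return (f"Error with cable from {system_name}/{local_port} "
            f"to {peer_name}/{peer_port}")


if __name__ == "__main__":
    description = """IBSW288 mySwitch
Leaf1/P1 -> HCA Rack1-Node1 P1
Leaf1/P2 -> HCA Rack1-Node2 P1
"""
    _, name, links = parse_topology(description)
    print(cable_error(name, links, "Leaf1/P2"))
```

Keeping the link table keyed by the local port name means a report only needs the port where the problem was seen; the peer name and port come from the file.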
