If I send a combined DR path with a start LID but an empty (0-length) DR path, what is the expected behavior?
I know this could be specified with LID routing, but I don't see anywhere in the specification that says this is an error. I do, however, seem to have two different implementations on two different switches.

For example: I have Switch A (LID 1) and Switch B (LID 7). I attempt to query PortInfo of port 1 of each switch using the LID followed by an empty DR path.

17:55:22 > ./smpquery -c portinfo 1 0 1
ibwarn: [21005] mad_rpc: _do_madrpc failed; dport (Lid 1)
./smpquery: iberror: failed: operation portinfo: port info query failed

17:55:31 > ./smpquery -c portinfo 7 0 1
# Port info: Lid 7 port 1
Mkey:............................0x0000000000000000
GidPrefix:.......................0x0000000000000000
... <normal output snipped>

Detecting this special case in libibmad and turning the packet into a LID-routed one succeeds, but I wonder if this is an error in the SMI?

I also notice this is an error on the HCA I am running from (LID 2).

17:57:42 > ./smpquery -c portinfo 2 0 1
ibwarn: [21008] mad_rpc: _do_madrpc failed; dport (Lid 2)
./smpquery: iberror: failed: operation portinfo: port info query failed

Running with a simple DR path works, I guess because this is the loopback case mentioned on page 805.

17:58:16 > ./smpquery -D portinfo 0 1
# Port info: DR path slid 65535; dlid 65535; 0 port 1
Mkey:............................0x0000000000000000
GidPrefix:.......................0x2007000000000000
... <snip>

I guess that the comment "Since each part may be empty, there are eight combinations, although only four are really useful:" on line 36, page 805, can be interpreted to mean that only those four combinations need to be supported. Is this true?

On the other hand, I think that strictly this should be supported. Item 4 of C14-9 (line 24, page 810) requires the SMI to handle the packet if the HopPointer equals HopCount + 1, which it does in my case (HopCount == 0, HopPointer == 1). Then, after processing, the SMI should return the packet as specified in C14-13 item 3 on line 9, page 812. Am I wrong?
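To make the two pieces of logic in question concrete, here is a minimal Python model. It is only a sketch under stated assumptions, not libibmad code: `CombinedPath`, `smi_delivers_locally`, and `choose_routing` are hypothetical names, a LID of 0 is assumed to mean "no start LID given", and the hop list is assumed to exclude the leading 0 entry of the DR path. The first function encodes only the C14-9 item 4 condition as I read it; the second shows the kind of special-case detection described above, where a start LID with an empty DR path is sent as a plain LID-routed SMP.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CombinedPath:
    """Hypothetical stand-in for a combined DR address (not a libibmad type)."""
    lid: int = 0                                   # assumption: 0 means no start LID
    hops: List[int] = field(default_factory=list)  # DR hops, leading 0 excluded


def smi_delivers_locally(hop_count: int, hop_pointer: int) -> bool:
    """Condition cited from C14-9 item 4: HopPointer == HopCount + 1."""
    return hop_pointer == hop_count + 1


def choose_routing(path: CombinedPath) -> str:
    """Pick how to send the SMP; labels are illustrative only."""
    if path.lid and not path.hops:
        # Start LID with an empty DR path: fall back to LID routing.
        return "lid"
    if path.lid and path.hops:
        # Start LID followed by real DR hops: the normal combined case.
        return "combined"
    # No start LID: pure directed route (an empty hop list is the loopback case).
    return "directed"
```

With this model, `smi_delivers_locally(0, 1)` is true for the empty-path case discussed above, and `choose_routing(CombinedPath(lid=7))` selects the LID-routed fallback.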
In the end it does not matter, as I have to make the software work for all the hardware I have, so I will change the software. However, I wonder where exactly the spec falls on this, because I think it will influence where the fix resides. If the spec does not allow this, then I think it is fine to have libibmad return an error, since the user specified an invalid combined DR path. However, if this should be legal, I think libibmad should work around the bad hardware out there.

Thoughts?

Ira

--
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
