Hi Shahar,
> Your analysis is not completely accurate. The SM configures the
> subnet using direct MADs only, and it builds a spanning tree of direct
> routes. What I want to say is that it doesn't matter why exactly a
> port is unreachable. Once a port cannot be reached, you can
> retry the entire heavy sweep process, but if the problem repeats itself
> (X times) on the same port, you have no alternative other than to disable
> it.
The point is that the real "bad" ports are not the ones that drop 100% of packets
(those will simply show a "DOWN" state and vanish).
The real bad ports are the ones that pass < 25% (as we use a retry of 4) of the packets that go through them.
When such a port happens to be on a switch, it will normally cause other ports to appear "bad" - NOT ITSELF!
The reason is that the number of packets sent through a switch port (not a leaf switch port) is much larger than the number of packets that deal with the discovery of the port itself. All the packets to the ports "behind" the switch port go through that port. So there is a much higher chance for ALL the packets that go to an end-port to be dropped than for ALL the packets that go through the switch port to be dropped.
So if you implement the feature the way it was proposed, you will end up disconnecting end-ports and not the real bad port.
Why is this bad? Because in a tree topology the end-ports always have an alternate path to the SM. If you could find the real flaky port, you could still communicate with all the end-ports.
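To put rough numbers on the traffic-volume argument above, here is a minimal sketch. It assumes every attempt crossing the flaky port is lost independently with the same probability, and the transaction counts (3 versus 200) are made-up illustrative values, not measurements from OpenSM.

```python
def p_transaction_fails(p_pass, attempts=4):
    """Probability that one MAD transaction is lost on every attempt."""
    return (1.0 - p_pass) ** attempts


def p_any_failure(p_pass, n_transactions, attempts=4):
    """Probability that at least one of n independent transactions fails."""
    return 1.0 - (1.0 - p_transaction_fails(p_pass, attempts)) ** n_transactions


# Assumed values: the flaky port passes half of the packets.
p_pass = 0.5
own = p_any_failure(p_pass, n_transactions=3)       # few packets discover the port itself
behind = p_any_failure(p_pass, n_transactions=200)  # many packets go to ports behind it

print(f"failure seen while discovering the port itself: {own:.3f}")
print(f"failure seen on some port behind it:            {behind:.6f}")
```

Under these assumptions the failure almost certainly shows up on a port behind the flaky one, which is why a per-port "no response" counter tends to blame the wrong port.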
So how do we find the bad port/cable that causes other ports to appear bad?
We have had many long internal discussions on this topic. The algorithm is not fully developed yet, but several things are clear:
1. One needs to track the number of successful and bad packets flowing through each port, so that a failure rate can be obtained for each port.
2. Topology based analysis should be used to find the common point that is first to have a high drop rate on the directed route tree.
3. Alternate directed routes might be used to invalidate "suspicious" ports.
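As a rough illustration of points 1 and 2 (not the algorithm itself, which is not fully developed), the sketch below counts good and lost transactions per port along each directed route and then picks the suspect closest to the SM. The names RouteStats, record and first_suspect, the port labels, and the threshold values are all hypothetical and are not OpenSM APIs.

```python
from collections import defaultdict


class RouteStats:
    """Per-port success/loss counters for directed-route transactions."""

    def __init__(self):
        self.sent = defaultdict(int)   # transactions routed through each port
        self.lost = defaultdict(int)   # of those, how many got no response
        self.depth = {}                # smallest hop index at which a port was seen

    def record(self, route, ok, count=1):
        """Account `count` transactions over `route` (ports listed from the SM outwards)."""
        for hop, port in enumerate(route):
            self.sent[port] += count
            if not ok:
                self.lost[port] += count
            self.depth[port] = min(hop, self.depth.get(port, hop))

    def failure_rate(self, port):
        sent = self.sent.get(port, 0)
        return self.lost.get(port, 0) / sent if sent else 0.0

    def first_suspect(self, threshold=0.5, min_samples=10):
        """Port closest to the SM whose failure rate reaches the threshold, or None."""
        candidates = [p for p, n in self.sent.items()
                      if n >= min_samples and self.failure_rate(p) >= threshold]
        return min(candidates,
                   key=lambda p: (self.depth[p], -self.failure_rate(p)),
                   default=None)


stats = RouteStats()
# Hypothetical fabric: sw1/p1 reaches sw2 over a flaky link; sw1/p2 reaches a healthy HCA.
for route, good, bad in [
    (["sw1/p1"], 4, 6),             # sw2 itself
    (["sw1/p1", "sw2/p1"], 3, 7),   # hcaA behind sw2
    (["sw1/p1", "sw2/p2"], 4, 6),   # hcaB behind sw2
    (["sw1/p2"], 10, 0),            # hcaC, does not cross the flaky link
]:
    stats.record(route, ok=True, count=good)
    stats.record(route, ok=False, count=bad)

print(stats.first_suspect())  # prints "sw1/p1"
```

In this made-up data the end-port links sw2/p1 and sw2/p2 also exceed the threshold, but the common point nearest the SM is reported instead. Point 3 (probing suspects over alternate directed routes) is not covered by this sketch.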
In any case, I was not proposing relying on traps. I was suggesting using the "healthy" bit on physical ports as the way to carry the information about "bad" ports (once we correctly find them) into the rest of the algorithms used by the SM.
Regarding the need to "disconnect" a bad HCA "end-port": I still have not seen any log showing OpenSM going through infinite "polling" of bad ports. Knowing the code, I cannot believe this is possible. So unless you have a log that shows this phenomenon (and not another one), please do not chase this path.
One last word: I would highly recommend using the management simulator to set arbitrary (random) bad packet drops and test any algorithm you might think of.
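The management simulator's own interface is not shown here. Purely as a toy stand-in for the kind of experiment meant, the sketch below injects random drops on chosen ports and reports how often a transaction to each target is lost on all attempts. The function name simulate, the port labels, and the drop probability are hypothetical.

```python
import random


def simulate(routes, drop_prob, transactions=200, attempts=4, seed=1):
    """Return {target: fraction of transactions lost on every attempt}."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    lost_fraction = {}
    for target, route in routes.items():
        lost = 0
        for _ in range(transactions):
            # An attempt succeeds only if no port on the route drops it.
            delivered = any(
                all(rng.random() >= drop_prob.get(port, 0.0) for port in route)
                for attempt in range(attempts)
            )
            if not delivered:
                lost += 1
        lost_fraction[target] = lost / transactions
    return lost_fraction


# Hypothetical fabric: only sw1/p1 is flaky and drops 80% of what crosses it.
routes = {
    "sw2": ["sw1/p1"],
    "hcaA": ["sw1/p1", "sw2/p1"],
    "hcaC": ["sw1/p2"],
}
result = simulate(routes, drop_prob={"sw1/p1": 0.8}, transactions=2000)
for target, fraction in result.items():
    print(f"{target}: {fraction:.3f} of transactions lost")
```

With these assumed numbers, targets reached through sw1/p1 lose roughly 0.8^4 (about 41%) of their transactions, while hcaC loses none, so any candidate attribution algorithm can be checked against a known bad port.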
EZ
Eitan Zahavi
Design Technology Director
Mellanox Technologies LTD
Tel:+972-4-9097208
Fax:+972-4-9593245
P.O. Box 586 Yokneam 20692 ISRAEL
> -----Original Message-----
> From: shaharf [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, April 13, 2005 5:03 PM
> To: Eitan Zahavi; Hal Rosenstock
> Cc: [email protected]
> Subject: RE: [openib-general] SM Bad Port Handling
>
> Eitan,
>
> Your analysis is not completely accurate. The SM configures the
> subnet using direct MADs only, and it builds a spanning tree of direct
> routes. What I want to say is that it doesn't matter why exactly a
> port is unreachable. Once a port cannot be reached, you can
> retry the entire heavy sweep process, but if the problem repeats itself
> (X times) on the same port, you have no alternative other than to disable
> it. If the SM had an alternative method of building direct paths,
> then such an alternative path could be attempted. Currently it is not
> relevant. Speaking of "statistical analysis", what are the odds that a
> port will behave well when it is queried directly, but start to lose
> packets when a direct route is routed through it, and behave
> consistently during all retries? Again, even if this is the case (and, as
> an understatement, I am not sure how frequent it is), the port behind it is
> unreachable and therefore "bad".
>
> The current unhealthy port mechanism is not redundant to this "bad" port
> mechanism because it does not handle the same case. Both mechanisms are
> required. Whether they can share the same status bit is really an
> implementation issue.
>
> Relying on traps is very problematic in some cases, particularly in
> initial bring up sweep when the SM lid is not even configured (remember
> VTEC?).
>
> Shahar
>
>
> ________________________________________
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]] On Behalf Of Eitan Zahavi
> Sent: Wednesday, April 13, 2005 11:21 AM
> To: Hal Rosenstock; Eitan Zahavi
> Cc: [email protected]
> Subject: RE: [openib-general] SM Bad Port Handling
>
> I probably did not make my point very clear:
> It is bad (not to say wrong) to disqualify a port and mark it as a bad
> port if it did not respond to queries.
> The cause of the issue might be a flaky link on the directed route to
> the port.
> If the SM were able to find that flaky link port it would avoid
> marking the wrong ports. Moreover, the port that was almost marked as
> bad by the simplistic algorithm you propose will be discovered and
> operational, as there are many other paths to reach it - walking around
> the real bad port!
> Eitan Zahavi
> Design Technology Director
> Mellanox Technologies LTD
> Tel:+972-4-9097208
> Fax:+972-4-9593245
> P.O. Box 586 Yokneam 20692 ISRAEL
>
> > -----Original Message-----
> > From: Hal Rosenstock [mailto:[EMAIL PROTECTED]]
> > Sent: Wednesday, April 13, 2005 12:00 PM
> > To: Eitan Zahavi
> > Cc: [email protected]
> > Subject: RE: [openib-general] SM Bad Port Handling
> >
> > On Wed, 2005-04-13 at 01:28, Eitan Zahavi wrote:
> > > [EZ] This is true. Currently there is only one cause for the
> > > un-healthy bits to be set - which are exactly as you point - these
> > > traps. The point I was trying to make was that this bit is the
> > > mechanism for flagging a port status is bad.
> > >
> > > What I did recommend was to write a "statistical" analysis of
> > > Directed Route packet drop - such that we can find the ports with a
> > > high drop rate and mark them as un-healthy. If you mark every port
> > > that does not respond to a MAD as un-healthy you can suffer from
> > > flaky links somewhere on the route to that port. Only analysis of
> > > the number of good packets vs. dropped packets can lead you to the
> > > right bad port.
> >
> > The original proposal on this said the following:
> >
> > "The OpenSM will implement a configurable policy (some number of
> > consecutive lack of responses to SM requests). At the point of
> > exhaustion of the timeout/retry strategy, that port will be marked as
> > "bad" by OpenSM."
> >
> > Any idea on what might make a good default threshold (for consecutive
> > retries) ? Do you think there is no sufficient default ?
> >
> > If a link is flaky and MADs can't get through, should it be used for
> > non MAD traffic ?
> >
> > Also note that the proposal also said:
> >
> > "Also, there could also be a periodic "ping" at a slower rate to check
> > if the "bad" ports revive."
> >
> > In terms of analysis of good v. errored and dropped packets (along the
> > path to that node), there are OpenIB diagnostic tools to help with
> > this.
> >
> > -- Hal
