> Quoting Sean Hefty <[EMAIL PROTECTED]>:
> Subject: Re: [ofa-general] Re: Re: IPoIB-CM UC mode
>
> >Hmm, I don't see how REQ gives you data on existing connection. Further,
> >this would need a spec extension to define private data format then?
> >LAP trick works out of the box ...
>
> LAP keep-alives requires the apps to implement the keep alive timers and
> detection, but sends the messages out-of-band. Why not send the
> messages in-band?

Sure, this can be done. But that would need ULP support, in this case an
IPoIB protocol extension. Further, if the remote is up, it's nice to get a
CM message saying "connection was lost" directly rather than just a timeout.
What real advantages are there for doing this "in-band", as you say?

> Would it make more sense to implement the entire
> keep-alive solution in the CM?

I think it doesn't matter much. Let's keep it where it's needed: if more
UC applications surface, we can rethink this decision and factor the code
out.

> >I actually think a single working solution is enough.
> >No need to explore all of them :).
>
> I'm not saying implement all of them, just make sure that we have the
> best solution. I can't think of one that I like better than using LAP,
> but it feels like the CM protocol / MADs are being hijacked. For
> example, if there's only one path between two nodes, LAP doesn't really
> make any sense, but it ends up being used. Should we instead look at
> adding new CM messages for just this purpose?

Sure, I agree, this would be nice. But I expect it will take a while to
get the standardization rolling. So I think we'll start with the LAP hack
and add support for the new CM message when/if it's there.

> >>For
> >>example, event registration could be used to detect that a remote node
> >>has gone down.
> >>We could use per node keep alive messages, rather than
> >>per connection messages.
> >
> >No, these won't address cases such as DREQ timeout after remote
> >decides to close connection, without reboot.
>
> Per node keep alive messages could. It depends on what data is carried
> in the message (e.g. all currently connected QPs to the node in
> question). I mentioned this because it may be more efficient under some
> circumstances.

Yes. And with multiple connections per node, all the more so. The CM
message format does not seem like a good fit for this, though: maybe some
new kind of MAD?
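
To make the per-node idea a bit more concrete, here is a rough sketch in C
of what the receiver of such a message might do. Everything in it is
hypothetical: struct node_keepalive is not an existing MAD or CM message
format, and find_stale_connections() is a made-up helper, not part of any
IB stack. It only assumes the message lists the QPNs the sender still
considers connected, as suggested above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical payload of a per-node keep-alive message: the QPNs that
 * the sending node still considers connected to us. This is NOT an
 * existing MAD layout, just an illustration. */
struct node_keepalive {
	size_t num_qpn;
	const uint32_t *qpn;
};

/* Return nonzero if qpn is listed in the keep-alive payload. */
static int keepalive_lists_qpn(const struct node_keepalive *ka, uint32_t qpn)
{
	size_t i;

	for (i = 0; i < ka->num_qpn; i++)
		if (ka->qpn[i] == qpn)
			return 1;
	return 0;
}

/* conn_qpn[] holds the remote QPN of each connection we believe is still
 * up to that node. Every entry the peer no longer lists is copied into
 * stale[] (which must have room for n entries). Returns the number of
 * stale connections found. */
size_t find_stale_connections(const struct node_keepalive *ka,
			      const uint32_t *conn_qpn, size_t n,
			      uint32_t *stale)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (!keepalive_lists_qpn(ka, conn_qpn[i]))
			stale[count++] = conn_qpn[i];
	return count;
}
```

A connection the remote closed without us seeing the DREQ would show up in
stale[], and no message at all within the timeout would mean the node
itself is gone. The linear scan is only there to keep the sketch short.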

--
MST
