On Wed, 28 May 2008, Hal Rosenstock wrote:
> On Wed, 2008-05-28 at 09:24 -0400, Talpey, Thomas wrote:
> > At 09:03 AM 5/28/2008, Hal Rosenstock wrote:
> > >On Wed, 2008-05-28 at 08:56 -0400, Talpey, Thomas wrote:
> > >> At 08:39 AM 5/28/2008, Hal Rosenstock wrote:
> > >> >Tom,
> > >> >
> > >> >On Wed, 2008-05-28 at 08:06 -0400, Talpey, Thomas wrote:
> > >> >> Is it possible to manually configure two Infiniband ports to operate
> > >> >> with one another in back-to-back mode, without running OpenSM
> > >> >> on one of them?
> > >> >
> > >> >This is possible but something would need to do at least some subset of
> > >> >what the SM does depending on the precise requirements and the limits
> > >> >placed on the environment supported without a "full blown" SM.
> > >>
> > >> Okay ... but IMO the only thing we need is a LID. Or at least, in my
> > >> experience all I've needed is a LID.
> > >
> > >The port also needs to be walked from init to active which takes
> > >coordination at both ends of the b2b link.
> >
> > Yep. But, it has all it needs with a LID, right? No messages need to be
> > exchanged, for instance.
>
> It's more than a LID and messages do need to be exchanged (mini SM ->
> SMA) to walk the port from INIT to ACTIVE. This needs to be coordinated
> on both sides of the link so they move in rough concert.
>
> > >> In a previous effort, we simply stole the low octet of an IP address, so
> > >> we'd "ifconfig ib0 1.2.3.X" and it would jam lid=X into the interface.
> > >> Worked great.
> > >> If necessary, we would set a manual arp entry (using iproute) to avoid
> > >> having to broadcast.
> > >
> > >That could be done if that is what is desired and can be relied upon
> > >(that ib0 is configured and we only care about the first port).
> > >
> > >Is it just ARP support that is needed ?
> >
> > Well, ARP is the precursor to establishing an IP send and a TCP connection,
> > which we need to do also.
>
> I was just asking about other broadcast/multicast needs. Sounds like
> this is not the case.
>
> > But, if the resulting ipaddr-hwaddr mapping is
> > installed, then ARP is unnecessary and the IP layer can send without using
> > it.
> >
> > When we did this before, we'd install a "permanent" ARP entry, in a two-line
> > shell script. Roughly, for peers configuring lids X and Y, it would do
> >
> > peer X:
> > ifconfig ib0 1.2.3.X
> > ip neigh add 1.2.3.Y nud permanent lladdr a.b.c.d.e.f....Y (i.e. Y's
> > guid)
> >
> > peer Y:
> > ifconfig ib0 1.2.3.Y
> > ip neigh add 1.2.3.X nud permanent lladdr a.b.c.d.e.f....X
> >
> > And we'd be up and running for both IP and RDMA connections. We fixed a
> > bug in the old iproute2 command to allow the long IB link addresses.
> >
> > I'm thinking that using IPOIB to drive this kind of manual setup is one way
> > to approach it. It certainly would be simple, and worked for us before there
> > was an OFA stack.
>
> This would still work.
>
> > Maybe I'm getting ahead of myself though, still wondering if there's a way
> > to do it with what we have.
>
> The closest thing is OpenSM run once mode but I think you've been
> describing a b2b mini SM command which wouldn't be hard to implement.

Unrelated to NFS/RDMA, I wrote a small kernel module that used MADs to
assign a LID, and then transitioned the port to ARMED and ACTIVE. This
worked for enabling IB communication, but not IPoIB. In retrospect, I
probably could have implemented the same functionality in userspace.

> -- Hal
>
> > Tom.
> >
> >
> > >
> > >> >> We have done this on other IB implementations by manually assigning
> > >> >> LIDs, but I discover that the "lid" entry below
> > >> >> /sys/class/infiniband/<device>
> > >> >> is not writable, at least for mthca.
> > >> >
> > >> >This can be done via MADs so user_mad kernel module would be needed to
> > >> >do this.
> > >>
> > >> Okay, all kernel modules can be assumed to be in place. How do we tell it
> > >> to manage the LID, with a shell command?
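On the sysfs side: the "lid" and "state" entries sit per port, under
/sys/class/infiniband/<device>/ports/<port>/, and they are read-only, so
they are only good for checking what a mini SM (or my module) ended up
doing. A rough sketch of such a check follows. The function names are
made up for illustration, and it assumes the state file reads like
"4: ACTIVE", which is what I see on my systems.

```shell
#!/bin/sh
# Sketch only: read-only inspection of an IB port through sysfs.
# Function names are hypothetical, not part of any existing tool.

# Print the state name from a sysfs "state" line such as "4: ACTIVE".
parse_port_state() {
    # Strip everything up to and including the first ": ".
    printf '%s\n' "${1#*: }"
}

# Print the LID and state for a device/port pair, e.g. "show_port mthca0 1".
show_port() {
    dev=$1
    port=${2:-1}    # assumption: default to the first port
    base=/sys/class/infiniband/$dev/ports/$port
    if [ ! -r "$base/state" ]; then
        echo "no such port: $dev/$port" >&2
        return 1
    fi
    state=$(cat "$base/state")
    lid=$(cat "$base/lid")
    echo "$dev port $port: lid=$lid state=$(parse_port_state "$state")"
}
```

Running "show_port mthca0 1" on both ends after the LID assignment would
show whether each side got as far as ACTIVE.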
> > >
> > >A new "command" would be needed.
> > >
> > >-- Hal
> > >
> > >> >> Also, I expect that the ipoib driver will
> > >> >> be unable to join the broadcast group, so will be unwilling to
> > >> >> come up fully.
> > >> >
> > >> >Is IPoIB a requirement ?
> > >>
> > >> I think so, for two reasons. One, principle of least surprise - the user
> > >> will expect to be able to ping, telnet etc if it has connectivity. Two,
> > >> for NFS/RDMA we require TCP and UDP connections in order to perform the
> > >> mount and do locking and recovery. We could do those over a parallel
> > >> ethernet connection, but that's kind of not the point.
> > >>
> > >> >
> > >> >> With ethernet, and maybe iWARP, just a simple ifconfig can do this.
> > >> >> So why not IB?
> > >> >
> > >> >The simple answer is that it is the nature of IB management (being
> > >> >different than ethernet).
> > >>
> > >> Which, IMO, we need to boil down to simplest-possible, for at least some
> > >> workable configuration.
> > >>
> > >> Thanks for the ideas!
> > >>
> > >> Tom.
> > >>
> > >> >
> > >> >-- Hal
> > >> >
> > >> >> If you're wondering, my goal is to give NFS/RDMA users a way to avoid
> > >> >> having to install the many userspace modules needed to do this,
> > >> >> including libibverbs, opensm, etc. There's a lot to get wrong, and
> > >> >> things go missing. Seeking an "easy" way to get started with just the
> > >> >> kernel and some shell commands.
> > >> >>
> > >> >> Tom.
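
For what it's worth, the two-line script Tom describes could be wrapped
up roughly as below. This is only a sketch under assumptions: the
function name, the 1.2.3 prefix default and the ib0 default are
placeholders, the peer's link-layer address is passed in rather than
derived (the thread elides it), and I added "dev" to the ip neigh line
because, as far as I know, iproute2 wants it. It only prints the
commands, so nothing is changed until the output is fed to a shell.

```shell
#!/bin/sh
# Sketch only: print the per-peer setup commands for a back-to-back link.
# b2b_commands LOCAL_OCTET PEER_OCTET PEER_LLADDR [PREFIX] [IFACE]
# All names and defaults here are hypothetical placeholders.
b2b_commands() {
    local_octet=$1      # low octet of our address (X in the thread)
    peer_octet=$2       # low octet of the peer's address (Y in the thread)
    peer_lladdr=$3      # peer's IPoIB link-layer address, supplied by hand
    prefix=${4:-1.2.3}  # assumption: example prefix taken from the thread
    iface=${5:-ib0}     # assumption: first IPoIB interface

    # Configure the local address.
    echo "ifconfig $iface $prefix.$local_octet"
    # Install a permanent neighbour entry so no ARP broadcast is needed.
    echo "ip neigh add $prefix.$peer_octet lladdr $peer_lladdr nud permanent dev $iface"
}
```

On peer X that would be "b2b_commands X Y <Y's lladdr> | sh", and the
mirror image on peer Y. Whether ipoib then comes up without joining the
broadcast group is still the open question; it did not in my case.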
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
