Sean Hefty wrote:
Consider NFS and NFS-RDMA. The NFS gurus struggled with this very issue and concluded that the RDMA service needs to be on a separate port. Thus they are proposing a new netid/port number for doing RDMA mounts vs TCP/UDP mounts. IMO that is the correct way to go: RDMA services are different than TCP services. They use a different protocol on top of TCP and thus shouldn't be handled on the same TCP port. So, applications that want to service Sockets and RDMA services concurrently would do so by listening on different ports...

This is a good point, and a different view from what I've been taking. I was looking at it more like trying to provide the same service over UDP and TCP, where you use the same port number. I just can't come up with any solution that works for iWarp, and sharing the port space seems like the only way to fix things.
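
To illustrate the UDP/TCP analogy: on a typical sockets stack the TCP and UDP port spaces are independent, so the same port number can be bound in both at the same time. The following is only a rough Python sketch, assuming a loopback interface is available; the function name is made up for this example and nothing here touches the rdma_cm.

```python
import socket

def bind_same_port_tcp_and_udp(host="127.0.0.1", attempts=5):
    """Bind one TCP and one UDP socket to the same port number.

    Returns the shared port number, or None if no attempt succeeded.
    Hypothetical helper for illustration only.
    """
    for _ in range(attempts):
        tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            # Let the stack pick a free TCP port.
            tcp.bind((host, 0))
            port = tcp.getsockname()[1]
            try:
                # Same number, different port space.
                udp.bind((host, port))
            except OSError:
                # Some other process already holds this UDP port; retry.
                continue
            return port
        finally:
            tcp.close()
            udp.close()
    return None
```

A separate RDMA port space would behave like the UDP socket here: the same number could be in use in both spaces without either side noticing.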

The iWARP protocols don't include a UDP based service, so it is not needed. But if you're calling it a UDP port space, maybe it should be the host's port space?

I think it should match what's done for TCP. IMO, there should be a connectionless RDMA service, along with multicast, over UDP/IP/Ethernet. :)


I think the winner would really be a reliable connectionless RDMA service with mcast.

Yes. The only exported interfaces into the host port allocation code require a socket struct. I didn't want to try to tackle exporting the port allocation services at a lower level. Even at the bottom level, I think it still assumes a socket struct...

I looked at this too at one point, and gave up as well. I don't know what other assumptions are made in the stack as a result of this. For example, if an app binds to an IP and port, and the IP address is removed and re-added, is the port still valid/reserved?


I just tried this and I believe the application is still listening/bound even though the address is no longer valid for the host:

[EMAIL PROTECTED] ~]# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:E0:81:33:67:D1
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:29

[EMAIL PROTECTED] ~]# netserver -L 192.168.69.135 -p 2222 -4
Starting netserver at port 2222
set_up_server could not establish a listen endpoint for port 2222 with family AF_INET
[EMAIL PROTECTED] ~]# ifconfig eth1 192.168.69.135 up
[EMAIL PROTECTED] ~]# netserver -L 192.168.69.135 -p 2222 -4
Starting netserver at port 2222
Starting netserver at hostname 192.168.69.135 port 2222 and family AF_INET
[EMAIL PROTECTED] ~]# netstat -an|grep 2222
tcp 0 0 192.168.69.135:2222 0.0.0.0:* LISTEN
[EMAIL PROTECTED] ~]# ifconfig eth1 0.0.0.0
[EMAIL PROTECTED] ~]# netstat -an|grep 2222
tcp 0 0 192.168.69.135:2222 0.0.0.0:* LISTEN
[EMAIL PROTECTED] ~]# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:E0:81:33:67:D1
          inet6 addr: fe80::2e0:81ff:fe33:67d1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:176 (176.0 b)
          Interrupt:29

[EMAIL PROTECTED] ~]#
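
The first netserver failure above is presumably bind() being refused because the address was not configured yet, although netserver's message does not say so. A rough Python sketch of the same bind-then-listen step, which reports the errno name instead of a generic message; `try_listen` is a made-up helper, and 192.0.2.1 in the usage note is just a documentation address that is assumed not to be configured locally.

```python
import errno
import socket

def try_listen(addr, port):
    """Try to create a TCP listen endpoint on addr:port.

    Returns "LISTEN" on success, otherwise the symbolic errno name.
    Hypothetical helper for illustration only.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((addr, port))
        s.listen(1)
        return "LISTEN"
    except OSError as e:
        # Map the numeric errno to its name, e.g. EADDRNOTAVAIL.
        return errno.errorcode.get(e.errno, str(e.errno))
    finally:
        s.close()
```

Calling `try_listen("192.0.2.1", 2222)` on a host without that address would normally be expected to report EADDRNOTAVAIL, unless something like ip_nonlocal_bind is enabled. It does not reproduce the second half of the experiment, since removing the address afterwards needs root.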


For iWarp, is using a struct socket essentially any different than transitioning an existing socket to RDMA mode?

In the RFC patch I posted, the socket is _just_ to allow binding to a port/addr. It's not used for anything else. From the native stack's perspective, it's a TCP socket in the CLOSED state (but bound) I guess.
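
A userspace analogy, not the kernel code from the patch: a socket that is bound but never put into LISTEN still holds its port in the host's TCP port space, so a second bind to the same address and port is refused. A minimal Python sketch, assuming SO_REUSEADDR is left unset; both function names are hypothetical.

```python
import errno
import socket

def reserve_port(host="127.0.0.1"):
    """Bind a TCP socket without calling listen(); return (socket, port).

    The socket stays in the CLOSED state but keeps the port reserved
    for as long as it remains open.
    """
    holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    holder.bind((host, 0))  # port 0: let the stack choose a free port
    return holder, holder.getsockname()[1]

def port_is_taken(host, port):
    """Return True if a fresh TCP bind to host:port fails with EADDRINUSE."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, port))
        return False
    except OSError as e:
        if e.errno == errno.EADDRINUSE:
            return True
        raise
    finally:
        probe.close()
```

With `holder, port = reserve_port()`, a call to `port_is_taken("127.0.0.1", port)` should come back true until `holder` is closed, which is the only property the rdma_cm would be relying on.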

You're just requiring it to be in a specific state. Are there problems around doing this? How much harder (technically, as opposed to politically) would it be to take this change a step farther and offload an active connection?

By active, do you mean in the ESTABLISHED state?


I left it all in to show the minimal changes needed to implement the functionality, and to keep the patch simple for initial consumption. But yes, the rdma-cm really doesn't need to track the port stuff for TCP since the host stack does.

Okay - for final patches, I think we want to remove the rdma_cm specific port spaces, along with changing the API to clarify that it uses the same port space as TCP/UDP.

What do you mean by changing the API? Adding a new port space enum?


I haven't looked in detail at the SDP code, but I would think it should want the TCP port space and not its own anyway, but I'm not sure. What is the point of the SDP port space anyway?

The rdma_cm needs to adjust its protocol for SDP over IB. I'm not too concerned with SDP, since it's not upstream yet, but I don't want to break it beyond repair either. The rdma_cm may not need to manage the SDP port space at all, and instead rely on SDP to ensure that it provides unique port numbers by itself.

- Sean
