Jason,

There are also good reasons why the RoCE standard left the syntax of address 
handles unchanged.
First, it keeps the Verbs unchanged. Even if you are using rdmacm to make 
connections, you still have to inspect address handles when "connecting" to UD 
QPs or joining multicast addresses.
In addition, each incoming packet generates a CQE, whose L2 fields also need to 
be inspected.

Second, making Ethernet L2 fields explicit has implications beyond the address 
handle and CQE formats. Specifically, many of the IBTA-defined MADs would have 
to be modified as well.
The most evident example is the CM protocol, which has L2 fields in its 
payloads.

Third, RoCE is not IB; it's all about making RDMA user-friendly to Ethernet 
users.
Most importantly, we don't want to change the way Ethernet networks are managed.
This means that admins configure their normal network interfaces, define VLAN 
sub-interfaces, assign IP addresses (or use DHCP), and then work with RoCE 
using IP-mapped addresses, which reference the same IP addresses they use for 
their Ethernet interfaces.
So, regarding our VLAN discussion:
- RoCE gids are L3 addresses, which are not (necessarily) of link-local scope; 
people will mostly use IP-mapped gids of global scope.
- These gids will map to an IP address, which then can resolve to an outgoing 
vlan device exactly as in Ethernet.
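
To make the two points above concrete, here is a minimal sketch of what an 
IP-mapped gid could look like. It assumes "IP-mapped" means the standard 
IPv4-mapped IPv6 layout (::ffff:a.b.c.d); this thread does not spell out the 
exact encoding, so treat the layout as an assumption. The helper names and the 
sample address 192.0.2.10 are hypothetical.

```python
import ipaddress

LINK_LOCAL = ipaddress.IPv6Network("fe80::/10")

def ipv4_to_mapped_gid(ipv4: str) -> ipaddress.IPv6Address:
    """Return the 128-bit IPv4-mapped IPv6 form (::ffff:a.b.c.d).

    Assumption: an "IP-mapped gid" uses this standard layout.
    """
    v4 = ipaddress.IPv4Address(ipv4)
    # 80 zero bits, then 16 one bits, then the 32-bit IPv4 address.
    return ipaddress.IPv6Address((0xFFFF << 32) | int(v4))

def is_link_local_gid(gid: ipaddress.IPv6Address) -> bool:
    """True if the gid falls in the link-local prefix fe80::/10."""
    return gid in LINK_LOCAL

gid = ipv4_to_mapped_gid("192.0.2.10")
print(gid.packed.hex())         # the 16 raw gid bytes, as hex
print(is_link_local_gid(gid))   # a mapped gid is not link-local
print(is_link_local_gid(ipaddress.IPv6Address("fe80::1")))
```

Under that assumption, a mapped gid is outside fe80::/10, so the IP address 
it carries can be resolved through the normal routing tables to an outgoing 
(possibly vlan) interface.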

We have a specification, we have an implementation, and we have a clean way of 
passing RoCE L2 information to user-space via address handles.
I don't see any substantial reason to change the basic approach.

Regards,
--Liran


-----Original Message-----
From: Jason Gunthorpe [mailto:[email protected]] 
Sent: Friday, June 25, 2010 6:58 PM
To: Liran Liss
Cc: Hefty, Sean; Roland Dreier; Aleksey Senin; linux-rdma; [email protected]; 
[email protected]; [email protected]; Tziporet Koren; [email protected]
Subject: Re: When IBoE will be merged to upstream?

On Fri, Jun 25, 2010 at 11:04:28AM +0300, Liran Liss wrote:

> VLANs are part of L2 in Ethernet -- when you resolve a destination
> L3 address to an L2 address, you get the outgoing interface, which 
> also determines the VLAN.  I think this approach has an advantage over 
> an RDMA device per VLAN in that you keep the standard OS VLAN 
> management (vconfig).

Except that in RoCE all L3 addresses are link local GIDs, which must be scoped 
to an interface and cannot be resolved by routing to a specific interface. 
vconfig creates child ethernet devices, I think you have no choice but to do 
the same for RDMA. The GID, when it is resolved, must be scoped to the RDMA 
device it is going to be bound to, which in turn must be bound to a VLAN.

(BTW, Sean, did AF_IB's sockaddr include a scoping field, and did you figure 
out some way to make that work?)

> I wouldn't judge the RoCE spec so quickly --- it guarantees that rdma 
> application binaries could run on any network.  What do you gain by 
> exposing Eth-specific L2 params in the address handle?

Well, 1) invariably that is how the hardware must work, and verbs is about 
exposing that interface to userspace 2) You don't suddenly make AH setup 
require network traffic, and potentially large time delays 3) it keeps the 
whole RoCE architecture far more consistent with IB.

You can pose the same question for IB, why doesn't AH resolution resolve the 
GID? There are lots of good answers :)

Also bear in mind that APM is entirely possible over RoCE and doing that will 
require a finer touch for managing the data in the AH's.

What do you get by doing all this extra work? I say nothing at all. Users won't 
even be able to tell the difference as long as they use rdmacm to set up the 
connections.

Jason
