> On Jun 16, 2015, at 5:05 PM, Liran Liss <[email protected]> wrote:
> 
>> From: Doug Ledford [mailto:[email protected]]
> 
>>> No. RoCE is an open standard from the IBTA with the exact same RDMA
>>> protocol semantics as InfiniBand and a clear set of compliance rules without
>>> which an implementation can't claim to be such. A RoCE device *is* an IB CA
>>> with an Ethernet link.
>>> In contrast, OPA is a proprietary protocol. We don't know what primitives
>>> are supported, and whether the semantics of supported primitives are the
>>> same as in InfiniBand.
>> 
>> Intel has stated on this list that they intend for RDMA apps to run on
>> OPA transparently.  That pretty much implies the list of primitives and
>> everything else that they must support.  However, time will tell if they
>> succeeded or not.
>> 
> 
> I am sorry, but that's not good enough.
> When I see an IB device, I know exactly what to expect. I can't say anything 
> regarding an OPA device.
> 
> It might be that today the semantics are "close enough".
> But in the future, both feature sets and semantics may diverge considerably.
> What are you going to do then?
> 
> In addition, today, the host admin knows that 2 IB CA nodes will always 
> interoperate. If you share the node type with OPA, everything breaks down. 
> There is no way of knowing which devices work with which.

You’ve not done yourself any favors with this argument.  You’ve actually 
stretched yourself into the land of hyperbole and FUD in order to make it.  
Do you not see that “2 IB CA nodes will always interoperate” is not true as 
soon as you consider differing link layer types?  For example, an mlx4_en 
device will not interoperate with a qib device, yet they are both IB_CA node 
types.  Equating an OPA device that reports node type IB_CA and link layer 
OPA with everything breaking down is pure and utter rubbish.  And with that, 
we are done with this discussion.  I’ve detailed what my litmus test will be, 
and I’m sticking with exactly that.

In the case of iWARP and usNIC, there are significant differences from an IB_CA 
that can force a program to alter its intended transfer mechanism 
significantly (for instance, usNIC is UD only and iWARP can’t do atomics or 
immediate data, so any transfer engine design that uses either of those is 
out of the question).  On the other hand, everything that uses IB_CA supports 
the various primitives and varies only in its addressing/management.  If OPA 
stays true to that (and it certainly does so far by supporting the same verbs 
as qib), then IB_CA/link_layer OPA is perfectly acceptable, and in fact 
preferred, because it will produce the minimum amount of change in user space 
applications before they can support the OPA devices.
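
To make the constraint above concrete, here is a rough sketch in C of how a 
transfer engine design might be screened against the limitations just 
mentioned.  The enum and function names are hypothetical and are not the 
libibverbs definitions, and only the limits named in this message are 
modelled (usNIC is UD only; iWARP has no atomics or immediate data).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; not the libibverbs enums. */
enum node_type { NODE_IB_CA, NODE_RNIC, NODE_USNIC };

/* Features a transfer engine design might rely on (bit flags). */
enum engine_needs {
    NEEDS_CONNECTED = 1 << 0, /* any transport other than UD */
    NEEDS_ATOMICS   = 1 << 1, /* atomic operations */
    NEEDS_IMMEDIATE = 1 << 2  /* immediate data */
};

/*
 * Returns true if a design with the given needs is not ruled out by the
 * limitations named in the message. Nothing else is modelled here.
 */
static bool design_is_viable(enum node_type type, unsigned needs)
{
    switch (type) {
    case NODE_USNIC:
        /* usNIC is UD only. */
        return (needs & NEEDS_CONNECTED) == 0;
    case NODE_RNIC:
        /* iWARP can't do atomics or immediate data. */
        return (needs & (NEEDS_ATOMICS | NEEDS_IMMEDIATE)) == 0;
    case NODE_IB_CA:
        /* IB_CA devices support the various primitives. */
        return true;
    }
    return false;
}

/* Self-check of the rules above. */
static void check_design_rules(void)
{
    unsigned everything = NEEDS_CONNECTED | NEEDS_ATOMICS | NEEDS_IMMEDIATE;

    assert(design_is_viable(NODE_IB_CA, everything));
    assert(!design_is_viable(NODE_RNIC, NEEDS_ATOMICS));
    assert(design_is_viable(NODE_RNIC, NEEDS_CONNECTED));
    assert(!design_is_viable(NODE_USNIC, NEEDS_CONNECTED));
}
```

Calling check_design_rules() from a test program exercises the three cases.  
The point is only that the IB_CA branch never has to reject a design, which 
is the property OPA would need to preserve.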

>> So this will be my litmus test.  Currently, an app that supports all of
>> the RDMA types looks like this:
>> 
>> if (node_type == RNIC)
>>      do iwarpy stuff
>> else if (node_type == USNIC)
>>      do USNIC stuff
>> else if (node_type == IB_CA)
>>      do IB verbs stuff
>>      if (link_layer == Ethernet)
>>              do RoCE addressing/management
>>      else
>>              do IB addressing/management
>> 
>> 
>> 
>> If, in the end, apps that are modified to support OPA end up looking
>> like this:
>> 
>> if (node_type == RNIC)
>>      do iwarpy stuff
>> else if (node_type == USNIC)
>>      do USNIC stuff
>> else if (node_type == IB_CA || node_type == OPA_CA)
>>      do IB verbs stuff
>>      if (node_type == OPA_CA)
>>              do OPA addressing/management
>>      else if (link_layer == Ethernet)
>>              do RoCE addressing/management
>>      else
>>              do IB addressing/management
>> 
>> where you can plainly see that the exact same goal can be accomplished
>> whether you have an OPA node_type or an IB_CA node_type + OPA
>> link_layer, then I will be fine with either a new node_type or a new
>> link_layer.  They will be functionally equivalent as far as I'm concerned.
>> 
> 
> It is true that for some applications, your abstraction might work 
> transparently.
> But for other applications, your "do IB verbs stuff" (and not just the 
> addressing/management) will either break today or break tomorrow.

FUD.  Come to me when you have a concrete issue and not hand-wavy 
scaremongering.

> This is bad both for IB and for OPA.

No, it’s not.

> Why on earth are we putting ourselves into a position which could easily be 
> avoided in the first place?
> 
> The solution is simple:
> - As an API, Verbs will support IB/ROCE, iWARP, USNIC, and OPA

There is *zero* functional difference between node_type == OPA and node_type == 
IB_CA with link_layer == OPA. An application has *exactly* what it needs to do 
everything you have mentioned.  It changes the test the application makes, but 
not what the application does.
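
As a rough illustration of that equivalence, the two dispatch schemes can be 
written side by side in C.  Every identifier below is hypothetical and made 
up for this sketch (in particular the OPA node type and OPA link layer 
values are assumptions, not taken from any real header); the structure 
simply mirrors the pseudocode from the litmus test.

```c
#include <assert.h>

/* Hypothetical enums for illustration; the OPA values are assumptions. */
enum rdma_node { RDMA_NODE_IB_CA, RDMA_NODE_RNIC, RDMA_NODE_USNIC,
                 RDMA_NODE_OPA_CA };
enum rdma_link { RDMA_LINK_IB, RDMA_LINK_ETHERNET, RDMA_LINK_OPA };

/* The code path an application ends up taking. */
enum rdma_path { PATH_IWARP, PATH_USNIC, PATH_VERBS_IB, PATH_VERBS_ROCE,
                 PATH_VERBS_OPA };

/* Scheme A: OPA is reported as its own node type. */
static enum rdma_path pick_by_node_type(enum rdma_node node,
                                        enum rdma_link link)
{
    if (node == RDMA_NODE_RNIC)
        return PATH_IWARP;
    if (node == RDMA_NODE_USNIC)
        return PATH_USNIC;
    /* IB_CA or OPA_CA: same verbs, different addressing/management. */
    if (node == RDMA_NODE_OPA_CA)
        return PATH_VERBS_OPA;
    if (link == RDMA_LINK_ETHERNET)
        return PATH_VERBS_ROCE;
    return PATH_VERBS_IB;
}

/* Scheme B: OPA is reported as IB_CA with an OPA link layer. */
static enum rdma_path pick_by_link_layer(enum rdma_node node,
                                         enum rdma_link link)
{
    if (node == RDMA_NODE_RNIC)
        return PATH_IWARP;
    if (node == RDMA_NODE_USNIC)
        return PATH_USNIC;
    /* IB_CA: same verbs, addressing/management chosen by link layer. */
    if (link == RDMA_LINK_OPA)
        return PATH_VERBS_OPA;
    if (link == RDMA_LINK_ETHERNET)
        return PATH_VERBS_ROCE;
    return PATH_VERBS_IB;
}

/* The same devices, described either way, land on the same code path. */
static void check_schemes_agree(void)
{
    assert(pick_by_node_type(RDMA_NODE_OPA_CA, RDMA_LINK_OPA) ==
           pick_by_link_layer(RDMA_NODE_IB_CA, RDMA_LINK_OPA));
    assert(pick_by_node_type(RDMA_NODE_IB_CA, RDMA_LINK_ETHERNET) ==
           pick_by_link_layer(RDMA_NODE_IB_CA, RDMA_LINK_ETHERNET));
    assert(pick_by_node_type(RDMA_NODE_IB_CA, RDMA_LINK_IB) ==
           pick_by_link_layer(RDMA_NODE_IB_CA, RDMA_LINK_IB));
}
```

Only the condition that selects the OPA branch differs between the two 
functions; the set of paths and what each path does is unchanged.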

> - The node type and link type refer to specific technologies

Yes, and Intel has made it clear that they are copying IB Verbs as a 
technology.  It is common that the one being copied be pissed off by the 
copying, but it is also common that they can’t do a damn thing about it.

> -- Most applications indeed don't care and don't check either of these 
> properties anyway
> -- Those that do, do it for a good reason; don't break them
> - Management helpers will do a good job to keep the code maintainable and 
> efficient even if OPA and IB have different node types
> 
> Win-win situation...
> --Liran

—
Doug Ledford <[email protected]>
        GPG Key ID: 0E572FDD





