mad: Add final OPA MAD processing

Doug Ledford Fri, 12 Jun 2015 07:23:44 -0700

On 06/11/2015 02:27 PM, Liran Liss wrote:
>> From: Doug Ledford [mailto:dledf...@redhat.com]
> 
>>>>> OPA cannot impersonate IB; OPA node and link types have to be
>>>>> designated as such.  In terms of MAD processing flows, both
>>>>> explicit (as in the handle_opa_smi() call below) and implicit code
>>>>> paths (which share IB flows - there are several cases) must make
>>>>> this distinction.
>>>>
>>>> As far as the kernel is concerned, the individual capability bits
>>>> are much more important.  I would actually like to do away with the
>>>> node_type variable from struct ib_device eventually.  As for user
>>>> space,
> 
> We agreed on the concept of capability bits for the sake of simplifying
> code sharing.  That is OK.
> 
> But the node_type stands for more than just an abstract RDMA device:
> In IB, it designates an instance of an industry-standard, well-defined
> device type: its possible link types, transport, semantics, management,
> everything.
> It *should* be exposed to user-space so apps that know and care what they
> are running on could continue to work.


I'm sorry, but your argument here is not very convincing at all.  And
it's somewhat hypocritical.  When RoCE was first introduced, the *exact*
same argument could have been used to argue that RoCE should require a
new node_type.  Except then, because RoCE was your own, you argued for,
and got, an expansion of the IB node_type definition that now included a
relevant link_layer attribute that apps never needed to care about
before.  However, now you are a victim of your own success.  You set the
standard then: if a new device can properly emulate an IB Verbs/IB
Link Layer device in terms of A) supported primitives (iWARP and usNIC
both fail here, which is why they have their own node_types) and B)
the queue pair creation process, modulo link-layer-specific addressing
attributes, then that device qualifies to use the IB_CA node_type and
needs only a link_layer attribute to differentiate it.

The new OPA stuff appears to be following *exactly* the same development
model/path that RoCE did.  When RoCE was introduced, all the apps that
really cared about low-level addressing on the link layer had to be
modified to encompass the new link type.  This is simply link_layer
number three for apps to care about.
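
As a rough illustration of that model (a sketch only: every demo_* name
below is a hypothetical stand-in, not a kernel or libibverbs definition),
an app that cares about link-level addressing ends up branching on the
link layer of an IB_CA-type device, and a third link layer is just one
more case in that branch:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not real kernel/verbs enums. */
enum demo_node_type { DEMO_NODE_IB_CA, DEMO_NODE_RNIC, DEMO_NODE_USNIC };
enum demo_link_layer { DEMO_LINK_IB, DEMO_LINK_ETHERNET, DEMO_LINK_OPA };
enum demo_addr_kind {
    DEMO_ADDR_NONE,     /* not an IB_CA-style device; different code path */
    DEMO_ADDR_IB,
    DEMO_ADDR_ETHERNET,
    DEMO_ADDR_OPA
};

struct demo_port {
    enum demo_node_type node_type;
    enum demo_link_layer link_layer;
};

/*
 * Pick the link-level addressing scheme for a port.  Devices that do not
 * use the IB_CA node type are handled elsewhere, so they get
 * DEMO_ADDR_NONE here.
 */
enum demo_addr_kind demo_pick_addressing(const struct demo_port *port)
{
    assert(port != NULL);

    if (port->node_type != DEMO_NODE_IB_CA)
        return DEMO_ADDR_NONE;

    switch (port->link_layer) {
    case DEMO_LINK_IB:
        return DEMO_ADDR_IB;
    case DEMO_LINK_ETHERNET:
        return DEMO_ADDR_ETHERNET;  /* the case added for RoCE */
    case DEMO_LINK_OPA:
        return DEMO_ADDR_OPA;       /* link_layer number three */
    }
    return DEMO_ADDR_NONE;
}
```

Apps that never looked at link-level addressing would not need to touch
this branch at all under these assumptions.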

> The place for abstraction is in the rdmacm/CMA, which serves applications
> that just want some RDMA functionality regardless of the underlying
> technology.
> 
>>>
>>> All SMI code has different behavior if it is running on a switch or
>>> HCA, so testing for 'switchyness' is very appropriate here.
>>
>> Sure...
>>
>>> cap_is_switch_smi would be a nice refinement to let us drop nodetype.
>>
>> Exactly, we need a bit added to the immutable data bits, and a new cap_
>> helper, and then nodetype is ready to be retired.  Add a bit, drop a
>> u8 ;-)
>>
> 
> This is indeed a viable solution.
> 
>>> I don't have a problem with sharing the IBA constant names for MAD
>>> structures (like RDMA_NODE_IB_SWITCH) between IB and OPA code. They
>>> already share the structure layouts/etc.
>>>
> 
> The node type is reflected to user-space, which, as I mentioned above, is
> important.
> Abusing this enumeration is misleading, even in the kernel.
> Jason's proposal for a 'cap_is_switch_smi' is more readable, and directly
> in line with the explicit capability approach that we discussed.
> 

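For reference, the "add a bit, drop a u8" idea quoted above could look
roughly like the sketch below.  This is only an illustration under
assumptions: the demo_* names, the flag values, and the struct layout are
all hypothetical and are not the kernel's actual definitions.  The point
is that the SMI code would test one per-port capability bit instead of
comparing node_type against a switch constant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits; values chosen only for this sketch. */
#define DEMO_CAP_IB_MAD     (1u << 0)
#define DEMO_CAP_SWITCH_SMI (1u << 1)  /* the new "switchyness" bit */

/* Hypothetical per-port data that is fixed when the device registers. */
struct demo_port_immutable {
    uint32_t core_cap_flags;
};

struct demo_device {
    const struct demo_port_immutable *port_immutable;
    uint8_t num_ports;
};

/*
 * True if SMI handling on this port should follow switch behavior.
 * port_index is a 0-based index into port_immutable in this sketch.
 */
bool demo_cap_is_switch_smi(const struct demo_device *dev, uint8_t port_index)
{
    assert(dev != NULL && port_index < dev->num_ports);
    return (dev->port_immutable[port_index].core_cap_flags &
            DEMO_CAP_SWITCH_SMI) != 0;
}
```

With a helper along these lines, a caller would write something like
"if (demo_cap_is_switch_smi(dev, port))" where it currently compares
node_type, and the u8 would have no remaining users on that path.
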

Re: [PATCH 14/14] IB/mad: Add final OPA MAD processing
