> > The general format of the request and response will be the same:
> >
> >   | netlink header |
> >   |  Data header   |
> >   |      Data      |
> >
> > The data header contains information about the type of
> > request/response, the status (for response), the type (format) of the
> > data, the total length of the data header + data, and a flags field about
> > the request/response or data.
> 
> I assume we can stack multiple data records?
> 
> So a response can have the required number of path records?

Yes, you can. The type indicates the data format of each individual record.
The length field, together with a flag still to be defined (e.g. a
multi-record indicator), determines how many records are returned.
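
To make the record counting concrete, here is a rough sketch of what a
receiver could do for a fixed-size record type. The layout of struct
ib_nl_data_hdr, the IB_NL_DATA_FLAG_MULTI_REC flag and the helper
ib_nl_count_records() are all assumptions made for illustration; only the
list of things the data header carries comes from the proposal. The
uint*_t types stand in for __u8/__u16/__u32 so that it builds in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical data header. The proposal only says it carries the
 * request/response type, a status, the data type, the total length and
 * flags; field names, widths and order here are assumptions.
 */
struct ib_nl_data_hdr {
        uint8_t  opcode;    /* type of request/response */
        uint8_t  status;    /* meaningful in responses only */
        uint16_t type;      /* data format, e.g. IB_NL_DATA_TYPE_PATH_RECORD */
        uint16_t flags;
        uint16_t reserved;
        uint32_t length;    /* data header + data, in bytes */
};

static_assert(sizeof(struct ib_nl_data_hdr) == 12, "unexpected padding");

/* Hypothetical flag: more than one record may follow the data header. */
#define IB_NL_DATA_FLAG_MULTI_REC 0x0001

/*
 * Return the number of records that follow the data header, or -1 if the
 * length is inconsistent. rec_size is the fixed record size for hdr->type.
 */
static int ib_nl_count_records(const struct ib_nl_data_hdr *hdr,
                               size_t rec_size)
{
        size_t payload;

        if (rec_size == 0 || hdr->length < sizeof(*hdr))
                return -1;

        payload = hdr->length - sizeof(*hdr);
        if (payload % rec_size != 0)
                return -1;      /* trailing partial record */

        /* Without the multi-record flag at most one record is allowed. */
        if (!(hdr->flags & IB_NL_DATA_FLAG_MULTI_REC) && payload > rec_size)
                return -1;

        return (int)(payload / rec_size);
}
```

With six equally sized path records behind the header and the flag set,
ib_nl_count_records() would return 6, which is the APM case asked about
below.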

> 
> There is growing interest in APM as well, please ensure that all 6 APM
> records can be returned to any query:
>  - Primary GMP Path
>  - Primary Forward Path
>  - Primary Return Path
>  - Alternate GMP Path
>  - Alternate Forward Path
>  - Alternate Return Path
> 

Putting a struct ib_path_rec_data in each record should accomplish this.
Again, a type should be defined for this format. Alternatively, we could
define a mixed type where each data record has a subheader (subheader + data
== data section):

#define IB_NL_DATA_TYPE_MIXED                   0x0008

struct ib_nl_data_sub_hdr {
        __u16   type;
        __u16   flags;
        __u32   length;
};

--------------------
|  netlink header  |
--------------------
|   Data header    |
--------------------
|  data subhdr 1   |
--------------------
|   data rec 1     |
--------------------
|  data subhdr 2   |
--------------------
|   data rec 2     |
--------------------
|       ....       |
--------------------
|  data subhdr N   |
--------------------
|   data rec N     |
--------------------
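
As a rough sketch of how a receiver might walk such a mixed data section:
the code below assumes that the subheader's length field counts only the
record bytes that follow it, and that records are packed back to back with
no padding. Neither point is settled by the proposal. The callback type
ib_nl_rec_cb and the helper ib_nl_walk_mixed() are hypothetical names, and
uint*_t stands in for the __u16/__u32 fields so that it builds in user
space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct ib_nl_data_sub_hdr {
        uint16_t type;
        uint16_t flags;
        uint32_t length;    /* assumed: bytes of record data after this subheader */
};

static_assert(sizeof(struct ib_nl_data_sub_hdr) == 8, "unexpected padding");

/* Hypothetical per-record callback; a non-zero return aborts the walk. */
typedef int (*ib_nl_rec_cb)(uint16_t type, const void *rec, uint32_t len,
                            void *ctx);

/*
 * Walk the data section that follows the data header. Returns the number
 * of records visited, or -1 if a subheader or record is truncated or the
 * callback fails. cb may be NULL to validate and count only.
 */
static int ib_nl_walk_mixed(const void *data, size_t data_len,
                            ib_nl_rec_cb cb, void *ctx)
{
        const unsigned char *p = data;
        size_t off = 0;
        int n = 0;

        while (off < data_len) {
                struct ib_nl_data_sub_hdr sub;

                if (data_len - off < sizeof(sub))
                        return -1;      /* truncated subheader */
                /* Copy out so an unaligned subheader is read safely. */
                memcpy(&sub, p + off, sizeof(sub));
                off += sizeof(sub);

                if (sub.length > data_len - off)
                        return -1;      /* record runs past the buffer */
                if (cb && cb(sub.type, p + off, sub.length, ctx) != 0)
                        return -1;

                off += sub.length;
                n++;
        }
        return n;
}
```

A response carrying all six APM paths would then simply be six
subheader/record pairs, and the walk would report six records regardless
of whether every record has the same size.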


> [Somewhere I have an experimental patch that globally enables one-shot
> APM for RDMA-CM users, it isn't a big step]
> 
> Please at least consider how we could use the netlink interface to maintain
> APM when alternate paths trigger and new path data needs to be loaded.
> 
> Please consider how we could use this netlink interface to alter existing
> alternate paths on established QPs.
> 
> (Consider, means just think through how the protocol would work, not
> implement)
> 
> Can you please provide some quick examples of exactly what the exchange
> will look like:
>  - IPoIB UD mode connecting to a peer based on a ND response
>  - IPoIB RC mode connecting to a peer based on a ND response

I am not familiar with IPoIB and am not sure what information exchange is
needed here other than for joining multicast groups. An MCMemberRecord could
be obtained from a user application (SA proxy) in the same way as a
pathrecord: send a query to the user application and get the MCMemberRecord
back. If the user application supports setting this attribute, it can be set
through a similar request/response exchange.

Can anyone help with the details?

>  - RDMA CM connecting RC from a src IP to a dst IP

A request from rdma_cm to resolve the src/dst IP could be sent to a user
application (e.g. ibacm), which sends the pathrecord back as the response.
rdma_cm could then use the returned info to establish connections. Again, I
am not familiar with the rdma_cm details.

Any experts out there? I know Sean is out today.

> 
> > #define IB_NL_DATA_TYPE_INVALID         0x0000
> > #define IB_NL_DATA_TYPE_NAME            0x0001
> > #define IB_NL_DATA_TYPE_ADDRESS_IP      0x0002
> > #define IB_NL_DATA_TYPE_ADDRESS_IP6     0x0003
> > #define IB_NL_DATA_TYPE_PATH_RECORD     0x0004
> > #define IB_NL_DATA_TYPE_USER_PATH_REC   0x0005
> > #define IB_NL_DATA_TYPE_MAD             0x0006
> 
> We definitely want to include policy information:
>  - What IPoIB netdev is this associated with, if any
>  - IP TOS bits, tclass, flowlabel
>  - Requesting kernel agent
>  - Src/Dst IP
> 
> I see this as a way to delegate path lookup to user space, so that userspace
> can inject policy.

As shown above, we can use subheaders (one per data section) to aggregate
different kinds of data, including policy information, into one
request/response.
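
For illustration only, a request carrying src/dst IP plus policy items
could be assembled by appending one subheader + data pair per item. In the
sketch below, IB_NL_DATA_TYPE_TOS and the helper ib_nl_put_rec() are
hypothetical names that are not part of the proposal, and the subheader
length is assumed to count only the data bytes that follow it. uint*_t
stands in for __u16/__u32 so that it builds in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IB_NL_DATA_TYPE_ADDRESS_IP      0x0002  /* from the quoted list */
#define IB_NL_DATA_TYPE_TOS             0x0100  /* hypothetical policy type */

struct ib_nl_data_sub_hdr {
        uint16_t type;
        uint16_t flags;
        uint32_t length;    /* assumed: bytes of data after this subheader */
};

static_assert(sizeof(struct ib_nl_data_sub_hdr) == 8, "unexpected padding");

/*
 * Append one subheader + data pair at *off in buf (capacity cap bytes).
 * On success advance *off and return 0; return -1 and leave *off
 * untouched if the pair does not fit.
 */
static int ib_nl_put_rec(void *buf, size_t cap, size_t *off,
                         uint16_t type, const void *rec, uint32_t len)
{
        struct ib_nl_data_sub_hdr sub = {
                .type = type, .flags = 0, .length = len,
        };
        unsigned char *p = buf;

        if (*off > cap || cap - *off < sizeof(sub) ||
            cap - *off - sizeof(sub) < len)
                return -1;

        memcpy(p + *off, &sub, sizeof(sub));
        if (len)
                memcpy(p + *off + sizeof(sub), rec, len);
        *off += sizeof(sub) + len;
        return 0;
}
```

A caller would invoke ib_nl_put_rec() once for the source address, once
for the destination address and once per policy item, then set the data
header length to the header size plus the final offset. How source and
destination are told apart (record order or a subheader flag) is left
open here.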

Kaike

> 
> Jason