Note to PSARC admin folks: I may need manual
intervention to get this into the agenda.
(As far as I can tell, the tools don't support
a fasttrack using an existing case with
one-pager already in place.)

Thanks,
-ted

++++
This information is Copyright 2009 Sun Microsystems
1. Introduction
     1.1. Project/Component Working Name:
         Open Fabrics User Verbs (OFUV) primary kernel components
     1.2. Name of Document Author/Supplier:
         Author:  Brendan Doyle
     1.3. Date of This Document:
         6 November, 2009
4. Technical Description

Open Fabrics User Verbs (OFUV) primary kernel components
========================================================

Table of Contents
-----------------
         I.   Introduction
         II.  Summary of Interfaces
                 A. Exported OFED RDMA CM APIs
                 B. Imported Contracted Project Private IBTF APIs
                 C. New Imported IBTF API
         III. Description of Interfaces
                 A. OFED RDMA CM APIs
                 B. Contract for private IBTF APIs
                 C. Updated IBTF API
         IV.  Summary of changes by man page


I. Introduction
---------------
In Linux, the most popular InfiniBand (IB) OS-bypass framework is the
Open Fabrics User Verbs (OFUV) framework from the Open Fabrics
Enterprise Distribution (OFED). The OFUV API is modeled to a large
degree on the OFED kernel programming interface (KPI) used by OFED
Linux kernel modules. Many calls that originate in userland eventually
join the kernel calls in common code further down the OFED stack.

In Solaris, OFUV is being ported over in two parts: kernel and
userland. The kernel part is more fully described in the one-pager of
this case. The userland portion is a companion project which delivers
the open source libraries. Because of the similarity of the OFUV API
to the OFED KPI and their largely common back-end, there is an
opportunity to port selected portions of OFED KPI along the way to
accelerate development of certain key applications.

In particular, our business objective is to port what is necessary to
satisfy some of the requirements of Oracle's Reliable Datagram Sockets
v3 (RDSv3, used in Exadata 2) and the Lustre Solaris port. Both
applications were originally written for Linux OFED and benefit
considerably from a port of the OFED facility known as the "RDMA-CM",
which provides a KPI to manage IB connections. This KPI would be a
compatibility alternative to the connection management KPI that
already exists in IBTF (the Solaris IB framework).

The kernel OFUV project is being delivered in phases. This phase
provides the RDMA-CM KPI for the kernel applications mentioned above
and lays the foundation for later phases. The actual OS-bypass
functionality is not enabled in this phase, but code is architected
with this goal in mind and as a result ends up being distributed in a
number of driver modules that match the architecture of the final
phase of the project. A later fasttrack will describe the interfaces
enabled and used to support the OS-bypass functionality.

In summary, this fasttrack describes the RDMA-CM interfaces delivered
in this first phase of the OFUV kernel project.


References:

    o Open Fabrics User Verbs (OFUV) - primary kernel components
      PSARC/2009/421 one-pager
      http://sac.sfbay/PSARC/2009/421/20090731_brendan.doyle

    o IBTF: InfiniBand Transport Framework
      PSARC/2002/132

    o RDS - Reliable Datagram Service
      PSARC/2006/356

    o Kernel RDMA CM API Architecture and use:
      materials directory: ofuv_rdma_arch.txt

    o Solaris Open Fabrics User Verbs Architecture Document:
      materials directory: solaris_ofuv_arch.pdf

    o OFUV Implementation Details:
      materials directory: OFUVImplementationDetails.pdf


II. Summary of Interfaces
-------------------------
This case asserts a micro/patch binding.

A. Exported OFED RDMA CM APIs   - ON Consolidation Private
         rdma_accept()                   - OFED Defined
         rdma_bind_addr()                - OFED Defined
         rdma_cm_event_handler()         - OFED Defined
         rdma_connect()                  - OFED Defined
         rdma_create_id()                - OFED Defined
         rdma_create_qp()                - OFED Defined
         rdma_destroy_id()               - OFED Defined
         rdma_destroy_qp()               - OFED Defined
         rdma_disconnect()               - OFED Defined
         rdma_init_qp_attr()             - OFED Defined
         rdma_join_multicast()           - OFED Defined
         rdma_leave_multicast()          - OFED Defined
         rdma_listen()                   - OFED Defined
         rdma_reject()                   - OFED Defined
         rdma_resolve_addr()             - OFED Defined
         rdma_resolve_route()            - OFED Defined

         ib_get_ibt_channel_hdl()        - Solaris Extension
         ib_get_ibt_hca_hdl()            - Solaris Extension

B. Imported Contracted Project Private IBTF APIs
         These two calls are private IBTF interfaces used in this
         project by contract:

         ibt_ofuvcm_get_req_data()
         ibt_ofuvcm_proceed()

         Additionally, a Contracted Project Private IBTF interface flag
         is added to the ibt_open_rc_channel(9F) function as follows:

         IBT_OCHAN_OFUV
           Indicates this channel is for an Open Fabrics User Verbs (OFUV)
           consumer. IBTF does not flush the QP associated with the
           channel when a DREQ is received for OFUV channels.

C. New Imported IBTF API
         IBT_GENERIC_MISC        - ON Consolidation Private
                                   (IBTF Transport Interface)
                                   add a new value to the ibt_clnt_class_t
                                   arg of ibt_attach(9F)


III. Description of Interfaces
------------------------------
A. OFED RDMA CM APIs

    This project provides the OFED kernel RDMA CM interfaces defined in
    the rdma_cm.h OFED header, with a number of Solaris-specific
    extensions required in order to interface into IBTF (these map from
    the "CM ID" concept to the related IBTF handles).

    The 'sol_ofs' kernel module exports the OFED RDMA CM interfaces to
    kernel consumers, and translates the OFED APIs into the equivalent
    Solaris IBTF APIs. See the provided man pages (in the
    materials/man_pages directory) for details on each API.
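
    As an illustration of how a kernel consumer typically drives the
    RDMA CM interfaces listed in section II.A, the sketch below models
    a common active-side (connecting) ordering in OFED consumers:
    create the CM ID, resolve the address, resolve the route, create
    the QP, connect, and wait for the established event. It is a toy
    user-space model only. Every toy_* name is hypothetical and is not
    part of sol_ofs, OFED, or IBTF; the comments name the real calls
    and events each step stands for. The man pages in the case
    materials remain the authoritative description of each call.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of a common active-side RDMA CM ordering.
 * Hypothetical names throughout; nothing here is a real
 * sol_ofs, OFED, or IBTF definition.
 */
enum toy_step {
    TOY_CREATE_ID = 0,      /* stands for rdma_create_id() */
    TOY_RESOLVE_ADDR,       /* stands for rdma_resolve_addr() */
    TOY_EV_ADDR_RESOLVED,   /* RDMA_CM_EVENT_ADDR_RESOLVED delivered */
    TOY_RESOLVE_ROUTE,      /* stands for rdma_resolve_route() */
    TOY_EV_ROUTE_RESOLVED,  /* RDMA_CM_EVENT_ROUTE_RESOLVED delivered */
    TOY_CREATE_QP,          /* stands for rdma_create_qp() */
    TOY_CONNECT,            /* stands for rdma_connect() */
    TOY_EV_ESTABLISHED,     /* RDMA_CM_EVENT_ESTABLISHED delivered */
    TOY_STEP_COUNT
};

/* Toy stand-in for a CM ID: it only remembers the next expected step. */
struct toy_cm_id {
    int next;
};

/* Reset the toy CM ID so that it expects TOY_CREATE_ID first. */
void toy_init(struct toy_cm_id *id)
{
    assert(id != NULL);
    id->next = TOY_CREATE_ID;
}

/*
 * Accept a step only if it is the next one in the modeled order.
 * Returns 0 when the step is accepted, -1 when it is out of order
 * or when the sequence has already completed.
 */
int toy_advance(struct toy_cm_id *id, enum toy_step step)
{
    assert(id != NULL);
    if (id->next >= TOY_STEP_COUNT || (int)step != id->next)
        return -1;
    id->next++;
    return 0;
}

/* Nonzero once every step up to the established event was accepted. */
int toy_is_established(const struct toy_cm_id *id)
{
    assert(id != NULL);
    return id->next == TOY_STEP_COUNT;
}
```

    In the real interfaces the "event" steps are not calls made by the
    consumer; they arrive through the consumer's
    rdma_cm_event_handler() callback. The toy model flattens calls and
    events into one sequence purely to show the ordering.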

B. Contract for private IBTF APIs

    See the case directory for the contract (contract-01.txt) to use
    the project private IBTF APIs.

C. Updated IBTF API

    To support this framework, a new client class (IBT_GENERIC_MISC) is
    added to the list of supported IBTF client classes (ibt_clnt_class_t).

    This change is documented in the revised man pages for ibt_attach.9f
    and ibt_clnt_modinfo_t.9s in the materials/man_pages directory.


IV. Summary of changes by man page
----------------------------------
The OFED manual pages for the RDMA CM kernel APIs are taken from OFED
and converted to Solaris conventions. A few new man pages for the
Solaris-specific extensions are also provided. Modified versions of
existing man pages have change bars.

All new or changed man pages can be found in the case
materials/man_pages directory.


  Man page                       Disposition     Reasons for change
(sorted by section and name)                    (subsection of III)
------------------------------------------------------------------
sol_ofs(7D)                     new             A
sol_uverbs(7D)                  new             A
sol_ucma(7D)                    new             A

rdmacm(9)                       new             A

rdma_cm_event_handler(9E)       new             A

ib_get_ibt_channel_hdl(9F)      new             A
ib_get_ibt_hca_hdl(9F)          new             A
ibt_attach(9F)                  changed         C
rdma_accept(9F)                 new             A
rdma_bind_addr(9F)              new             A
rdma_connect(9F)                new             A
rdma_create_id(9F)              new             A
rdma_create_qp(9F)              new             A
rdma_destroy_id(9F)             new             A
rdma_destroy_qp(9F)             new             A
rdma_disconnect(9F)             new             A
rdma_init_qp_attr(9F)           new             A
rdma_join_multicast(9F)         new             A
rdma_leave_multicast(9F)        new             A
rdma_listen(9F)                 new             A
rdma_reject(9F)                 new             A
rdma_resolve_addr(9F)           new             A
rdma_resolve_route(9F)          new             A

ibt_clnt_modinfo_t(9S)          changed         C


6. Resources and Schedule
     6.4. Steering Committee requested information
        6.4.1. Consolidation C-team Name:
                ON
     6.5. ARC review type: FastTrack
     6.6. ARC Exposure: open

-- 
Ted H. Kim
Sun Microsystems, Inc.                  ted.kim at sun.com
222 North Sepulveda Blvd., 10th Floor   (310) 341-1116
El Segundo, CA  90245                   (310) 341-1120 FAX
