Thanks to Alexia for taking the notes.
8/20/2024
* Participants:
Alexia Ingerson (Intel)
Adam Goldman (Intel)
Alex McKinley (Intel)
Ben Lynam (Cornelis)
Charles Shereda (Cornelis)
Howard Pritchard (LANL)
Ian Ziemba (HPE)
Jerome Soumagne
Jianxin Xiong (Intel)
John Byrne (HPE)
Juee Desai (Intel)
Ken Raffenetti (ANL)
Nikhil Nanal (Intel)
Peinan Zhang (Intel)
Rajalaxmi (Intel)
Shi Jin (AWS)
Stephen Oost (Intel)
Steve Welch (HPE)
Zach Dworkin (Intel)
* Notes:
2.0 timeline:
- 1.22 released recently.
- 2.0 is a little delayed behind the original schedule (original was July/August).
- New schedule is alpha in late August - no RC.
- GA also pushed back - RC at end of November, final release mid-December.
* 2.0 pending issues:
* Add option for not supporting any-source receives
- Some providers might want to optimize for supporting only FI_DIRECTED_RECV. Some applications only use directed recv, which might allow providers to optimize.
- Could add a mode bit, FI_NO_ANY_SOURCE, to disable any-source receives.
* Support FI_MULTI_RECV for tagged messages
- Current FI_MULTI_RECV is only defined for FI_MSG, not FI_TAGGED.
- Could be useful for tagged messages. Could expand FI_MULTI_RECV to include tagged, or add an extra capability, FI_TMULTI_RECV, to be more specific.
- Leaning towards adding a capability bit so as not to break providers that currently support FI_MULTI_RECV for FI_MSG only.
- HPE agrees a capability bit would be preferred.
* Add hints input and caps output to collective join
- Currently, before doing any collective operation, collective join allows you to join a collective group.
- Could be useful to have information about the type of collective to help the switch optimize the configuration - could be helpful to add hints as input to join to help collective optimization. Also add caps as output to let the application know which collectives are supported.
- Could add fi_collective_join2 or alter the fi_collective_join call directly. In the man page, the collective implementation is defined as "experimental", which allows us to modify the API without having to be backwards compatible.
- Q: What would be an example?
- A: Hints would include which collectives you want to call (i.e. allreduce, gather, etc.). The capability returned indicates what the provider can support (which can be more than what was requested), but collectives can be disabled if they weren't requested.
- The hint needed is really just the type of collective, not the size. This topic needs more discussion because it targets hardware-specific needs.
* Separate FI_DIRECTED_RECV capability for message and tagged messages
- Sometimes an application may need FI_DIRECTED_RECV for only FI_MSG or only FI_TAGGED.
- Proposal is to add a FI_DIRECTED_TRECV capability bit.
- The original FI_DIRECTED_RECV covers both; only the new capability is restricted.
* Only allow binding EPs to one CQ
- Got a lot of feedback that separating CQs is helpful. This proposed change will be dropped from 2.0. Objections?
- Concern: makes it more difficult to map an application CQ to a hardware CQ. Currently have a request from customers to create a 1:1 relationship between an OFI CQ and an IBV CQ. Having a single application CQ for sends and receives makes the code messy in regard to hardware mapping of resources.
- Allowing separate CQs won't affect that case. The concern has more to do with the difficulty of supporting one CQ for multiple uses.
* Allow different inject sizes for FI_MSG and FI_TAGGED
- Change already added - resolved
* Redefine FI_HMEM interface
- FI_HMEM is only an on/off capability bit, but there are a lot of more specific capabilities (how to copy, async/sync, dmabuf registration).
- Issue with CUDA calls conflicting with NCCL. Psm3 uses the driver API, not the runtime API, and wasn't able to reproduce the issue.
- In 2.19, CUDA switched APIs from the driver API to the virtual API - broke AWS customers.
- A lot of these issues seem to be CUDA specific; maybe we don't want to expose issues targeted more at CUDA (for example, which API to use), but it could be good to define attributes to query.
- The FI_HMEM interface uses the same interface for device/host/managed memory and allows applications to pass in any type of memory. However, some uses are restricted by what type of memory is used (e.g. RDMA or IPC cannot support host or managed memory).
- The FI_HMEM_DEVICE_ONLY flag exists to communicate to the provider that the memory is not managed or host memory and can be used through RDMA or IPC protocols.
* Logging API
- No further details yet; needs more investigation.
* Next meeting - AWS will present on HMEM capabilities
* Summary:
Discussion centered around the 2.0 release schedule and pending issues/discussions. 2.0 is a little delayed (originally July/August). New schedule is alpha in late August - no RC. GA is also pushed back - RC at end of November. Final release is targeted for mid-December.
Went over the following issues:
* Add option for not supporting any-source receives - add mode bit FI_NO_ANY_SOURCE to disable receiving from any source, allowing providers to optimize for directed recv.
* Support FI_MULTI_RECV for tagged messages - add a capability, FI_MULTI_TRECV, for tagged multi-receive, so as not to break providers that advertise FI_MULTI_RECV and only support multi recv for regular messaging.
* Add hints input and caps output to collective join - add input to join to allow applications to specify which collectives they need, and add output for the provider to indicate which collectives are enabled. This allows a provider to optimize the configuration.
* Separate FI_DIRECTED_RECV capability for message and tagged messages - add a FI_DIRECTED_TRECV capability to specify that directed recv is only needed for the tagged interface. Existing FI_DIRECTED_RECV remains untouched and indicates support for both message and tagged interfaces.
* Only allow binding EPs to one CQ - got a lot of feedback that separating CQs is helpful. This proposed change will be dropped from 2.0.
* Allow different inject sizes for FI_MSG and FI_TAGGED - this was added and is upstream.
* Redefine FI_HMEM interface - FI_HMEM is only an on/off capability bit, but there are a lot of more specific capabilities. AWS will give a presentation at the next OFIWG meeting to discuss HMEM capabilities.
* Logging API - no further details yet; needs more investigation.
_______________________________________________
ofiwg mailing list
[email protected]
https://lists.openfabrics.org/mailman/listinfo/ofiwg