Thanks to Alexia for taking the notes.
8/20/2024
* Participants:
Alexia Ingerson (Intel)
Adam Goldman (Intel)
Alex McKinley (Intel)
Ben Lynam (Cornelis)
Charles Shereda (Cornelis)
Howard Pritchard (LANL)
Ian Ziemba (HPE)
Jerome Soumagne
Jianxin Xiong (Intel)
John Byrne (HPE)
Juee Desai (Intel)
Ken Raffenetti (ANL)
Nikhil Nanal (Intel)
Peinan Zhang (Intel)
Rajalaxmi (Intel)
Shi Jin (AWS)
Stephen Oost (Intel)
Steve Welch (HPE)
Zach Dworkin (Intel)
* Notes:
2.0 timeline:
- 1.22 released recently.
- 2.0 is a little delayed behind the original schedule (original was July/August).
- New schedule is alpha in late August - no RC.
- GA also pushed back - RC at end of November, final release mid-December.
* 2.0 pending issues:
* Add option for not supporting any-source receives
- Some providers might want to optimize for supporting only FI_DIRECTED_RECV. Some applications only use directed recv, which might allow providers to optimize.
- Could add a mode bit, FI_NO_ANY_SOURCE, to disable any-source receives.
* Support FI_MULTI_RECV for tagged messages
- Current FI_MULTI_RECV is only defined for FI_MSG, not FI_TAGGED.
- Could be useful for tagged messages. Could expand FI_MULTI_RECV to include tagged, or add an extra capability, FI_TMULTI_RECV, to be more specific.
- Leaning towards adding a capability bit so as not to break providers that currently support FI_MULTI_RECV for FI_MSG only.
- HPE agrees a capability bit would be preferred.
* Add hints input and caps output to collective join
- Currently, before doing any collective operation, collective join allows you to join a collective group.
- Could be useful to have information about the type of collective to help the switch optimize the configuration - could be helpful to add hints as input to join to help collective optimization. Also add caps as output to let the application know which collectives are supported.
- Could add fi_collective_join2 or alter the fi_collective_join call directly. In the man page, the collective implementation is defined as "experimental", which allows us to modify the API without having to be backwards compatible.
- Q: What would be an example?
- A: Hints would include which collectives you want to call (i.e. allreduce, gather, etc.). The capability returned indicates what the provider can support (which can be more than what was requested), but collectives can be disabled if they weren't requested.
- The hint needed is really just the type of collective, not the size. This topic needs more discussion because it targets hardware-specific needs.
* Separate FI_DIRECTED_RECV capability for message and tagged messages
- Sometimes an application may need FI_DIRECTED_RECV for only FI_MSG or only FI_TAGGED.
- Proposal is to add a FI_DIRECTED_TRECV capability bit.
- The original FI_DIRECTED_RECV covers both; only the new capability is restricted.
* Only allow binding EPs to one CQ
- Got a lot of feedback that separating CQs is helpful. This proposed change will be dropped from 2.0. Objections?
- Concern: makes it more difficult to map an application CQ to a hardware CQ. Currently have a request from customers to create a 1:1 relationship between an OFI CQ and an IBV CQ. Having a single application CQ for sends and receives makes the code messy in regard to hardware mapping of resources.
- Allowing separate CQs won't affect that case. The concern has more to do with the difficulty of supporting one CQ for multiple uses.
* Allow different inject sizes for FI_MSG and FI_TAGGED
- Change already added - resolved
* Redefine FI_HMEM interface
- FI_HMEM is only an on/off capability bit, but there are a lot of more specific capabilities (how to copy, async/sync, dmabuf registration).
- Issue with CUDA calls conflicting with NCCL. Psm3 uses the driver API, not the runtime API, and wasn't able to reproduce the issue.
- In 2.19, CUDA switched APIs from the driver API to the virtual API - broke AWS customers.
- A lot of these issues seem to be CUDA specific; maybe we don't want to expose issues targeted more at CUDA (for example, which API to use), but it could be good to define attributes to query.
- The FI_HMEM interface uses the same interface for device/host/managed memory and allows applications to pass in any type of memory. However, some uses are restricted by what type of memory is used (e.g. RDMA or IPC cannot support host or managed memory).
- The FI_HMEM_DEVICE_ONLY flag exists to communicate to the provider that the memory is not managed or host memory and can be used through RDMA or IPC protocols.
* Logging API
- No further details yet; needs more investigation.
* Next meeting - AWS will present on HMEM capabilities
* Summary:
Discussion centered around the 2.0 release schedule and pending issues/discussions. 2.0 is a little delayed (originally July/August). New schedule is alpha in late August - no RC. GA is also pushed back - RC at end of November. Final release is targeted for mid-December.
Went over the following issues:
* Add option for not supporting any-source receives - add mode bit FI_NO_ANY_SOURCE to disable receiving from any source, allowing providers to optimize for directed recv.
* Support FI_MULTI_RECV for tagged messages - add a capability, FI_MULTI_TRECV, for tagged multi-receive, so as not to break providers that advertise FI_MULTI_RECV and only support multi recv for regular messaging.
* Add hints input and caps output to collective join - add input to join to allow applications to specify which collectives they need, and add output for the provider to indicate which collectives are enabled. This allows a provider to optimize the configuration.
* Separate FI_DIRECTED_RECV capability for message and tagged messages - add a FI_DIRECTED_TRECV capability to specify that directed recv is only needed for the tagged interface. Existing FI_DIRECTED_RECV remains untouched and indicates support for both message and tagged interfaces.
* Only allow binding EPs to one CQ - got a lot of feedback that separating CQs is helpful. This proposed change will be dropped from 2.0.
* Allow different inject sizes for FI_MSG and FI_TAGGED - this was added and is upstream.
* Redefine FI_HMEM interface - FI_HMEM is only an on/off capability bit, but there are a lot of more specific capabilities. AWS will give a presentation at the next OFIWG meeting to discuss HMEM capabilities.
* Logging API - no further details yet; needs more investigation.
_______________________________________________
ofiwg mailing list
[email protected]
https://lists.openfabrics.org/mailman/listinfo/ofiwg