Thanks. I couldn't find a statement to that effect and every time I read about 
FI_THREAD_DOMAIN, I just kept assuming a single domain with multiple endpoints 
and thinking what a horrible idea that would be. Since uncontended locks are 
cheap, FI_THREAD_SAFE is certainly the simplest way to go, assuming the only 
frequently used locks are on the endpoint and completion paths. If you have an 
MR cache and are actively using it, though, its locking gets annoying as things 
scale up.
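
To make the contrast concrete, here is a rough pthreads model, not libfabric 
code. The names model_ep, model_cache, cache_worker and run_cache_model are 
hypothetical, and the counters only stand in for real work. Each thread takes a 
private endpoint lock, which never blocks, and then a single shared cache lock, 
which can. The model counts how often the shared lock was already held.

```c
#include <assert.h>
#include <pthread.h>

#define MODEL_MAX_THREADS 64

/* Hypothetical stand-ins; no libfabric objects are involved. */
struct model_ep {
    pthread_mutex_t lock;   /* per-endpoint lock, used by one thread only */
    long posted;
};

struct model_cache {
    pthread_mutex_t lock;   /* single lock shared by every thread */
    long lookups;
    long contended;         /* times a thread found the lock already held */
};

struct cache_worker {
    struct model_ep ep;
    struct model_cache *cache;
    long iters;
};

static void *cache_worker_main(void *p)
{
    struct cache_worker *w = p;

    for (long i = 0; i < w->iters; i++) {
        /* Endpoint path: the lock is private to this thread. */
        pthread_mutex_lock(&w->ep.lock);
        w->ep.posted++;
        pthread_mutex_unlock(&w->ep.lock);

        /* Cache path: try first so a busy lock can be counted, then block. */
        int busy = pthread_mutex_trylock(&w->cache->lock) != 0;
        if (busy)
            pthread_mutex_lock(&w->cache->lock);
        w->cache->lookups++;
        w->cache->contended += busy;
        pthread_mutex_unlock(&w->cache->lock);
    }
    return NULL;
}

/* Returns 0 on success and stores the totals through the out pointers. */
int run_cache_model(int nthreads, long iters,
                    long *lookups, long *contended, long *posted)
{
    struct cache_worker workers[MODEL_MAX_THREADS];
    pthread_t tids[MODEL_MAX_THREADS];
    struct model_cache cache = { .lookups = 0, .contended = 0 };
    int started = 0;

    assert(lookups && contended && posted);
    if (nthreads < 1 || nthreads > MODEL_MAX_THREADS || iters < 0)
        return -1;

    pthread_mutex_init(&cache.lock, NULL);
    for (int t = 0; t < nthreads; t++) {
        pthread_mutex_init(&workers[t].ep.lock, NULL);
        workers[t].ep.posted = 0;
        workers[t].cache = &cache;
        workers[t].iters = iters;
        if (pthread_create(&tids[t], NULL, cache_worker_main, &workers[t])) {
            pthread_mutex_destroy(&workers[t].ep.lock);
            break;
        }
        started++;
    }

    *posted = 0;
    for (int t = 0; t < started; t++) {
        pthread_join(tids[t], NULL);
        *posted += workers[t].ep.posted;
        pthread_mutex_destroy(&workers[t].ep.lock);
    }
    *lookups = cache.lookups;
    *contended = cache.contended;
    pthread_mutex_destroy(&cache.lock);
    return started == nthreads ? 0 : -1;
}
```

Calling run_cache_model(4, 5000, &lookups, &contended, &posted) always yields 
20000 lookups and 20000 posts. The contended count varies from run to run and 
with the thread count, which is the part that matters for scaling.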

John

From: Xiong, Jianxin <[email protected]>
Sent: Wednesday, June 5, 2024 3:35 PM
To: Byrne, John (Labs) <[email protected]>; [email protected]
Subject: RE: Clarification of definition of FI_THREAD_DOMAIN

Your understanding is correct.  The recommendation is based on how feasible it 
is to have a lockless implementation in the providers. That also matches what 
middleware like MPI does today.

Does FI_THREAD_COMPLETION fit the multi-threaded RMA use case better? It is 
recommended for scalable endpoints because that is when the provider is more 
likely to support this threading model.  But using it with a regular endpoint 
is totally fine if available.

For simplicity at the user end, maybe just go with FI_THREAD_SAFE.

-Jianxin

From: ofiwg <[email protected]> On Behalf Of Byrne, John (Labs)
Sent: Wednesday, June 5, 2024 10:10 AM
To: [email protected]
Subject: [ofiwg] Clarification of definition of FI_THREAD_DOMAIN

FI_THREAD_DOMAIN
A domain serialization model requires applications to serialize access to all 
objects belonging to a domain.

My immediate take on this definition is that if I am multi-threading I have to 
have a single lock that I use to access any object belonging to a fi_domain 
instance, which seems like a terrible idea for multi-threading. However, in 
Jianxin's 2.0 API update at the workshop 
https://www.openfabrics.org/wp-content/uploads/2024-workshop/2024-workshop-presentations/session-1.pdf,
 it says: "Recommend FI_THREAD_DOMAIN for multi-thread app with regular 
endpoint."  If my interpretation of the meaning of FI_THREAD_DOMAIN is correct, 
then the only way this makes sense to me is for the expectation to be that a 
unique fi_domain instance and endpoint be created for each thread. Is this 
correct or is there something I'm misunderstanding? If it is correct, then 
there are some painful implications for multi-threading RMA.
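
The two readings of the definition can be compared in a small pthreads model 
that makes no libfabric calls. This is only a sketch: model_domain, worker_arg 
and run_model are hypothetical names, and the ops counter stands in for any 
access to an object belonging to the domain. With per_thread == 0, every thread 
shares one domain and takes one application-level lock around each access. With 
per_thread != 0, each thread owns its own domain and never locks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a fi_domain plus the objects opened on it. */
struct model_domain {
    pthread_mutex_t lock;   /* application-level lock, used only when shared */
    long ops;               /* simulated accesses to objects in this domain */
};

struct worker_arg {
    struct model_domain *dom;
    int shared;             /* nonzero: serialize every access with dom->lock */
    long iters;
};

static void *worker(void *p)
{
    struct worker_arg *a = p;

    assert(a->dom != NULL);
    for (long i = 0; i < a->iters; i++) {
        if (a->shared)
            pthread_mutex_lock(&a->dom->lock);
        a->dom->ops++;      /* stands in for posting or polling */
        if (a->shared)
            pthread_mutex_unlock(&a->dom->lock);
    }
    return NULL;
}

/* Returns the total number of simulated accesses, or -1 on failure. */
long run_model(int nthreads, long iters, int per_thread)
{
    if (nthreads < 1 || iters < 0)
        return -1;

    int ndom = per_thread ? nthreads : 1;
    struct model_domain *doms = calloc(ndom, sizeof(*doms));
    struct worker_arg *args = calloc(nthreads, sizeof(*args));
    pthread_t *tids = calloc(nthreads, sizeof(*tids));
    long total = -1;
    int started = 0;

    if (doms && args && tids) {
        for (int d = 0; d < ndom; d++)
            pthread_mutex_init(&doms[d].lock, NULL);
        for (int t = 0; t < nthreads; t++) {
            args[t].dom = &doms[per_thread ? t : 0];
            args[t].shared = !per_thread;
            args[t].iters = iters;
            if (pthread_create(&tids[t], NULL, worker, &args[t]) != 0)
                break;
            started++;
        }
        for (int t = 0; t < started; t++)
            pthread_join(tids[t], NULL);
        if (started == nthreads) {
            total = 0;
            for (int d = 0; d < ndom; d++)
                total += doms[d].ops;
        }
        for (int d = 0; d < ndom; d++)
            pthread_mutex_destroy(&doms[d].lock);
    }
    free(doms);
    free(args);
    free(tids);
    return total;
}
```

Both run_model(4, 10000, 0) and run_model(4, 10000, 1) count 40000 accesses. 
The difference is that the first funnels all four threads through one lock, 
while the second needs no lock but requires a separate domain per thread.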

Thanks,

John Byrne


_______________________________________________
ofiwg mailing list
[email protected]
https://lists.openfabrics.org/mailman/listinfo/ofiwg
