Which application, or which MPI, is inserting duplicate addresses? I don't see 
how MPI could be doing this. At least the MPI implementations I'm familiar with 
use PMI1, PMI2, or PMIx to exchange addresses at job startup into a distributed 
key-value store, and then after a barrier each MPI rank initializes its AV with 
all these unique addresses. For a duplicate address to occur, multiple MPI 
ranks would have to get the *same* local address from the OFI provider - how 
would that happen?

Some providers, like bgq, can stuff all the fabric address information within 
the 64 bits of fi_addr_t, which basically makes the fi_av_insert() call a no-op 
in FI_AV_MAP mode. So if this duplicate address problem happened on bgq it 
would still "just work" from the provider's perspective. Now MPI (or whatever 
is using the provider) might get messed up because of it, but the fabric 
communication operations would still work.
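
To make that concrete, here is a toy model of a stateless, map-style insert. It
is only a sketch: the 32/32-bit field split and the names pack_addr,
unpack_addr, and av_insert_map are made up for illustration and are not the
bgq provider's actual encoding.

```python
# Toy model: the whole fabric address fits in a 64-bit handle, so "inserting"
# an address stores nothing and simply computes the handle from the address.
# Hypothetical layout: node id in the high 32 bits, endpoint id in the low 32.
MASK32 = 0xFFFFFFFF


def pack_addr(node_id, ep_id):
    """Encode a (node, endpoint) pair into a single 64-bit integer handle."""
    if not (0 <= node_id <= MASK32 and 0 <= ep_id <= MASK32):
        raise ValueError("node_id and ep_id must each fit in 32 bits")
    return (node_id << 32) | ep_id


def unpack_addr(handle):
    """Recover the (node, endpoint) pair from a handle."""
    return handle >> 32, handle & MASK32


def av_insert_map(addrs):
    """Stateless insert: no table is kept, so there is nothing to collide with."""
    return [pack_addr(node_id, ep_id) for node_id, ep_id in addrs]


# The first and third addresses are duplicates; they get identical handles.
handles = av_insert_map([(7, 1), (9, 4), (7, 1)])
```

In this model a duplicate insertion just returns the same handle twice, which
is why the provider itself has nothing to object to.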

Mike

-----Original Message-----
From: ofiwg [mailto:[email protected]] On Behalf Of Hefty, Sean
Sent: Tuesday, March 20, 2018 11:54 AM
To: [email protected]
Subject: [ofiwg] inserting duplicate addresses into an AV

MPI is hitting an issue that is the result of inserting the same address 
into an AV more than once.  There is no defined behavior for what a provider 
should do in this case.  At least one provider allows the duplicate insertion, 
and at least one fails the call... and neither works with MPI when this occurs.  
:/

There are a couple of problems trying to define this.  In the case of the 
provider that fails the call, the failure is detected when attempting to insert 
the same address into a hash table.  However, not all providers are easily able 
to detect duplicates.  Forcing them to do so _may_ require the provider to 
perform a linear search over the AV looking for a duplicate for every address 
that is inserted.  At scale, this is a significant overhead.
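
The cost difference can be sketched with a toy AV kept as a plain list. This is
an illustration only; insert_linear and insert_hashed are hypothetical helpers,
not any provider's code, and addresses are modeled as opaque byte strings.

```python
def insert_linear(table, addr):
    """Duplicate check by scanning every stored entry: O(n) work per insert."""
    for index, existing in enumerate(table):
        if existing == addr:
            return index, True          # found a duplicate at this index
    table.append(addr)
    return len(table) - 1, False


def insert_hashed(table, index_of, addr):
    """Duplicate check via a side hash index: O(1) expected, but extra memory."""
    if addr in index_of:
        return index_of[addr], True     # found a duplicate at this index
    table.append(addr)
    index_of[addr] = len(table) - 1
    return index_of[addr], False


addrs = [b"node0:ep1", b"node1:ep1", b"node0:ep1"]

linear_table = []
linear_results = [insert_linear(linear_table, a) for a in addrs]

hashed_table, hashed_index = [], {}
hashed_results = [insert_hashed(hashed_table, hashed_index, a) for a in addrs]
```

Both variants detect the same duplicate. The linear version does work
proportional to the current AV size on every insert, while the hashed version
pays with an additional index that a provider may not otherwise need.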

Even if the decision is made to force detecting duplicates (maybe even making 
this an AV option), there's the question of how a provider should respond.  
Should it insert the address twice, creating a new fi_addr for it; discard the 
duplicate and return the existing fi_addr; or generate an error?  And does it 
matter whether FI_AV_TABLE or FI_AV_MAP is used?
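
The three possible responses can be compared side by side in a toy model. This
is a sketch under assumptions: ToyAV and DupPolicy are hypothetical names, the
returned handle is simply a table index, and nothing here reflects an existing
libfabric option.

```python
from enum import Enum


class DupPolicy(Enum):
    ALLOW = "allow"   # insert again and hand back a new handle
    REUSE = "reuse"   # discard the duplicate and return the existing handle
    ERROR = "error"   # fail the insertion


class ToyAV:
    """Table-style toy AV: the handle is the entry's index in the table."""

    def __init__(self, policy):
        self.policy = policy
        self.entries = []
        self.first_index = {}   # address -> index of its first insertion

    def insert(self, addr):
        seen = self.first_index.get(addr)
        if seen is not None:
            if self.policy is DupPolicy.REUSE:
                return seen
            if self.policy is DupPolicy.ERROR:
                raise ValueError(f"duplicate address: {addr!r}")
        # New address, or a duplicate under ALLOW: append a fresh entry.
        self.entries.append(addr)
        self.first_index.setdefault(addr, len(self.entries) - 1)
        return len(self.entries) - 1


def insert_all(policy, addrs):
    """Insert every address under the given policy and collect the handles."""
    av = ToyAV(policy)
    return [av.insert(a) for a in addrs]
```

With three insertions where the third repeats the first, ALLOW yields handles
0, 1, 2 and REUSE yields 0, 1, 0. Under REUSE the handle no longer tracks the
insertion order, which may matter to a caller that assumes handle i belongs to
the i-th address it inserted.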

We need to know what applications need here, and how difficult it will be for 
providers to detect duplicates.  It is apparently non-trivial for the apps to 
avoid duplicate insertions.

- Sean 

_______________________________________________
ofiwg mailing list
[email protected]
http://lists.openfabrics.org/mailman/listinfo/ofiwg
