We've been struggling a bit lately with the problem of resolving multiple names 
for the same host. Part of the problem has been the need to minimize DNS 
resolves, as systems were taking far too long to perform them, resulting in very 
long startup times. I've done my best to minimize them while still getting 
hostnames to resolve properly, even when people/systems insist on creating name 
confusion.

Historically, we disabled all DNS resolves when running under a managed 
allocation. We simply assumed that the RM would be consistent in its naming, 
and required users to always use the RM-provided host names for any -host or 
hostfile entries.

However, we defaulted to performing DNS resolves in non-managed situations, 
giving the user an MCA parameter to disable them if the DNS system was too 
slow. This unfortunately has been causing problems as people new to the project 
forget about the param and start seeing very long startup times.

Accordingly, we no longer default to performing DNS resolves in non-managed 
scenarios, though the user can request that we do so if they run into hostname 
confusion issues. We still disable DNS resolves completely for managed 
allocations.
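
As a rough sketch of the policy described above (this is not the actual PRRTE 
code; the function and argument names are made up purely for illustration), the 
decision boils down to something like this:

```python
def should_resolve_hostnames(managed_allocation: bool, user_requested: bool) -> bool:
    """Illustrative only: mirrors the policy described in this mail."""
    if managed_allocation:
        # Under a resource manager we trust the RM-provided names and
        # never perform DNS resolves, regardless of what the user asks for.
        return False
    # Non-managed: resolves are off unless the user explicitly requests them.
    return user_requested
```

In other words, the only case where a DNS resolve happens is a non-managed run 
where the user has asked for it.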

This avoids penalizing the majority of users, who neither create multiple names 
for the same piece of hardware nor have systems that generate them, and shifts 
the penalty onto those who do. It seemed more appropriate that those who want to 
screw around with host names should pay the price, rather than inconveniencing 
everyone else out-of-the-box.

So if you need to resolve hostnames, set PRTE_MCA_prte_do_not_resolve=0 in your 
environment (or use the equivalent in the PRRTE default MCA param file or on 
the cmd line). Otherwise, you should be fine.
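
For anyone driving launches from a script, here is a minimal sketch of the 
environment-variable route. The only thing taken from above is the variable 
name and the value 0; the helper name is hypothetical, and the launcher command 
in the comment is just a placeholder for whatever you normally run:

```python
import os

def env_with_resolves_enabled(base_env=None):
    """Return a copy of the environment with hostname resolves requested."""
    # Copy so the caller's (or the process's) environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    # Value 0 means "do NOT skip resolves", i.e. resolves are turned back on.
    env["PRTE_MCA_prte_do_not_resolve"] = "0"
    return env

# Hypothetical usage (the command line is a placeholder):
#   import subprocess
#   subprocess.run(["prterun", "./my_app"], env=env_with_resolves_enabled(), check=True)
```

If you don't need resolves, leave the variable unset and you get the new 
default behavior.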
Ralph

