Hi Rémi!

Thanks very much for your reply. Switching on PMI_DEBUG shows that most of the time is spent after the last call to

In: PMI_KVS_Get(hostname[95])

There are two calls that take a few seconds right afterwards:

In: PMI_KVS_Get(hostname[95])
In: PMI_KVS_Get_key_length_max
In: PMI_KVS_Get_value_length_max

Altogether it takes maybe 5-10 s to get here. These are followed by a large number of

In: PMI_Get_rank

and

In: PMI_Get_size

calls until the process is killed after about 30 s.

Decreasing PMI_TIME from 500 to smaller values (all the way down to 50) changes the number of PMI_Get_size calls showing up in the logs: it gets slightly faster, so PMI can finish more of the PMI_Get_rank calls and proceed with the PMI_Get_size calls, but it never finishes the initialisation before the timeout.

Out of curiosity, how can I choose to use pmi2? I first compiled mvapich2-2.2b with "--with-pm=none --with-pmi=slurm". Will "--with-pm=none --with-pmi=pmi2" work?

Thanks again,

Dom

> On 23/03/2016, at 12:58 PM, Rémi Palancher <[email protected]> wrote:
>
> On 23/03/2016 08:54, Dominikus Heinzeller wrote:
>>
>> Hi all,
>>
>> I am having a problem with spawning a large number of threads on a
>> node. My server consists of 4 sockets x 12 cores per socket x 2 threads
>> per core = 96 procs
>>
>> [...]
>> Any help or suggestion what I could do?
>
> It looks like you're using PMI1; there must be something wrong in PMI
> initialization. 96 tasks on one node is not large, so there's no
> reason to spend more than 5 seconds on this... If needed, you can profile PMI
> calls by setting the PMI_DEBUG environment variable to 1, to find out where it
> takes time.
>
> You can also set the PMI_TIME environment variable to a value
> <500, and see if there's any difference.
>
> Best,
> Rémi
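
For anyone comparing runs with different PMI_TIME values, the PMI_DEBUG trace described at the top of this message could be tallied with a short script rather than by eye. This is only a sketch: it assumes every trace entry has the form "In: PMI_<name>", optionally followed by arguments, as in the excerpts above. The sample lines and the function name tally_pmi_calls are illustrative and not part of Slurm or MVAPICH2.

```python
import re
from collections import Counter

# Assumption: each PMI_DEBUG trace entry contains "In: PMI_<name>", optionally
# followed by arguments in parentheses, as in the excerpts quoted in this thread.
PMI_CALL = re.compile(r"In: (PMI_\w+)")


def tally_pmi_calls(lines):
    """Count how often each PMI call name appears in the given log lines."""
    counts = Counter()
    for line in lines:
        # findall also handles several entries that ended up on one line.
        counts.update(PMI_CALL.findall(line))
    return counts


# Illustrative sample only; real input would be the captured debug output.
sample = [
    "In: PMI_KVS_Get(hostname[95])",
    "In: PMI_KVS_Get_key_length_max",
    "In: PMI_KVS_Get_value_length_max",
    "In: PMI_Get_rank",
    "In: PMI_Get_rank",
    "In: PMI_Get_size",
]

counts = tally_pmi_calls(sample)
for name, n in counts.most_common():
    print(f"{n:6d}  {name}")
```

Running the same tally on the logs from two PMI_TIME settings would show directly how many more PMI_Get_rank and PMI_Get_size calls complete before the timeout.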
