This looks like a bug. The source of the variable is reported as
MCA_BASE_VAR_SOURCE_FILE both for variables read from mca-param.conf and for
those read from the INI file (opal/mca/btl/openib/mca-btl-openib-device-params.ini).
But in the function mca_btl_openib_tune_endpoint(), where this error is
triggered, only
I still see an issue with the openib receive queue settings.
Interestingly, it seems to work if I pass the setting with the mpirun
command, e.g.
  mpirun --mca btl_openib_receive_queues S,12288,128,64,32:S,65536,128,64,32 --npernode 1 -np 2 ./lat
but if I add it to the
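
For readers unfamiliar with the value passed above: it is a colon-separated
list of queue specifications, each made up of a type letter followed by
comma-separated numbers. The sketch below only splits such a string into its
parts and makes no claim about what each number means. The helper name
split_receive_queues is hypothetical and is not part of Open MPI.

```python
# Illustrative sketch only: split a btl_openib_receive_queues value into
# its colon-separated queue specifications. The helper is hypothetical and
# does not interpret the meaning of the individual numeric fields.

def split_receive_queues(value):
    """Return a list of (queue_type, [numeric fields]) tuples."""
    queues = []
    for spec in value.split(":"):
        fields = spec.split(",")
        queue_type = fields[0]                  # leading type letter, e.g. "S"
        params = [int(f) for f in fields[1:]]   # remaining fields as integers
        queues.append((queue_type, params))
    return queues


# The value from the mpirun command line in the message above.
value = "S,12288,128,64,32:S,65536,128,64,32"
for queue_type, params in split_receive_queues(value):
    print(queue_type, params)
```

Splitting the value this way makes it easy to compare what is given on the
command line with what is set in a parameter file.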
I applied the patch manually and it does in fact seem to resolve the issue,
thanks! I must have done the git clone right before this patch was
committed two days ago, so I just missed it (I am redoing the clone right now as well).
Thanks
Edgar
On 12/26/2014 9:06 AM, Gilles Gouaillardet wrote:
Edgar,
Hmmm... this actually isn’t quite correct, as we aren’t guaranteed to know our
binding that early in the procedure (see orte/mca/ess/base/ess_base_fns.c, the
orte_ess_base_proc_binding function). I think I see the right fix, so I’ll
update this a bit later.
> On Dec 25, 2014, at 10:40 PM,
Edgar,
First, make sure your master includes
https://github.com/open-mpi/ompi/commit/05af80b3025dbb95bdd4280087450791291d7219
If this is not enough, try with --mca coll ^ml
Hope this helps
Gilles.
Edgar Gabriel wrote:
>I have some problems running jobs with ompi-master
I have some problems running jobs with ompi-master on one of our
clusters (after doing a major software update). Here are scenarios that
work and don't work.
1. Everything still seems to work with 1.8.x series without any issues
2. With master, I can run on a single node without any issues,