Re: [OMPI devel] openib receive queue settings

2014-12-26 Thread Devendar Bureddy
Looks like a bug here. It treats the source of the variable as MCA_BASE_VAR_SOURCE_FILE both for variables read from mca-param.conf and for those read from the INI file (opal/mca/btl/openib/mca-btl-openib-device-params.ini). But, in function mca_btl_openib_tune_endpoint(), where this error is triggered, is only

[OMPI devel] openib receive queue settings

2014-12-26 Thread Edgar Gabriel
I still see an issue with the openib receive queues settings. Interestingly, it seems to work if I pass the setting with the mpirun command, e.g. mpirun --mca btl_openib_receive_queues S,12288,128,64,32:S,65536,128,64,32 --npernode 1 -np 2 ./lat but if I add it to the

Re: [OMPI devel] problem running jobs on ompi-master

2014-12-26 Thread Edgar Gabriel
I applied the patch manually and it did in fact seem to resolve the issue, thanks! I must have done the git clone right before this patch was committed two days back, so I just missed it (redoing it right now as well). Thanks Edgar On 12/26/2014 9:06 AM, Gilles Gouaillardet wrote: Edgar,

Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-618-g9e9261e

2014-12-26 Thread Ralph Castain
Hmmm….this actually isn’t quite correct as we aren’t guaranteed to know our binding that early in the procedure (see orte/mca/ess/base/ess_base_fns.c, the orte_ess_base_proc_binding function). I think I see the right fix, so I’ll update this a bit later. > On Dec 25, 2014, at 10:40 PM,

Re: [OMPI devel] problem running jobs on ompi-master

2014-12-26 Thread Gilles Gouaillardet
Edgar, First, make sure your master includes https://github.com/open-mpi/ompi/commit/05af80b3025dbb95bdd4280087450791291d7219 If this is not enough, try with --mca coll ^ml Hope this helps Gilles. Edgar Gabriel wrote: >I have some problems running jobs with ompi-master

[OMPI devel] problem running jobs on ompi-master

2014-12-26 Thread Edgar Gabriel
I have some problems running jobs with ompi-master on one of our clusters (after doing a major software update). Here are scenarios that work and don't work. 1. Everything still seems to work with 1.8.x series without any issues 2. With master, I can run without any issues single node,