You should be able to apply the patch - I don’t think that section of code differs from what is in the 1.8 repo.
The sha for 1.8.3 can be found on the web site (see right-most column in table):

http://www.open-mpi.org/software/ompi/v1.8/

> On Dec 10, 2014, at 7:35 AM, Eric Chamberland
> <eric.chamberl...@giref.ulaval.ca> wrote:
>
> Hi Nathan,
>
> I pulled your commit d0da29351f9 and tested it against our example.
>
> It now works perfectly. Strangely, I can even unset
> "OMPI_MCA_mpi_yield_when_idle=1" and it doesn't seem to last longer.
>
> Can I apply the patch to a fresh "1.8.3" and it should work?
>
> Other question: how can I retrieve the SHA for 1.8.3? (Should they be tagged
> in the repository? Is it normal if I just see a "dev" tag??)
>
> Thanks,
>
> Eric
>
>
> On 12/09/2014 04:19 PM, Nathan Hjelm wrote:
>>
>> yield when idle is broken on 1.8. Fixing now.
>>
>> -Nathan
>>
>> On Tue, Dec 09, 2014 at 01:02:08PM -0800, Ralph Castain wrote:
>>> Hmmm….well, it looks like we are doing the right thing and running unbound
>>> when oversubscribed like this. I don’t have any brilliant idea why it would
>>> be running so slowly in that situation when compared with 1.6.5 - it could
>>> be that yield-when-idle is borked. I’ll try to dig into that notion a bit.
>>>
>>>
>>>> On Dec 9, 2014, at 10:39 AM, Eric Chamberland
>>>> <eric.chamberl...@giref.ulaval.ca> wrote:
>>>>
>>>> Hi again,
>>>>
>>>> I sorted and "seded" the output.1.00 file:
>>>>
>>>> cat outpout.1.00 |sed 's/default/default value/g'|sed 's/true/1/g' |sed 's/false/0/g'
>>>>
>>>> from:
>>>>
>>>> mpirun --output-filename output -mca mpi_show_mca_params all --report-bindings -np 32 myprog
>>>>
>>>> between a launch with 165 vs 183.
>>>>
>>>> The diff may be interesting but I can't interpret everything that is
>>>> written...
>>>>
>>>> The files are attached...
>>>>
>>>> Thanks,
>>>>
>>>> Eric
>>>>
>>>> On 12/09/2014 01:02 PM, Eric Chamberland wrote:
>>>>> On 12/09/2014 12:24 PM, Ralph Castain wrote:
>>>>>> Can you provide an example cmd line you use to launch one of these
>>>>>> tests using 1.8.3? Some of the options changed between the 1.6 and 1.8
>>>>>> series, and we bind by default in 1.8 - the combination may be causing
>>>>>> you a problem.
>>>>>
>>>>> I very simply launch:
>>>>>
>>>>> "mpirun -np 32 myprog"
>>>>>
>>>>> Maybe the result of "-mca mpi_show_mca_params all" would be insightful?
>>>>>
>>>>> Eric
>>>>>
>>>>>>
>>>>>>
>>>>>>> On Dec 9, 2014, at 9:14 AM, Eric Chamberland
>>>>>>> <eric.chamberl...@giref.ulaval.ca> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> we were used to do oversubscribing just to do code validation in
>>>>>>> nightly automated parallel runs of our code.
>>>>>>>
>>>>>>> I just compiled openmpi 1.8.3 and launched the whole suite of
>>>>>>> sequential/parallel tests and noticed a *major* slowdown in
>>>>>>> oversubscribed parallel tests with 1.8.3 compared to 1.6.5.
>>>>>>>
>>>>>>> For example, on my computer (2 cpu), a validation test of 64
>>>>>>> processes launched with 1.8.3 took 1500 seconds (~29 minutes) to
>>>>>>> execute, while the very same test compiled with 1.6.5 took only 7.4
>>>>>>> seconds!
>>>>>>>
>>>>>>> To have this result with 1.6.5 we had to set the variable
>>>>>>> "OMPI_MCA_mpi_yield_when_idle=1", but it seems to have no effect in
>>>>>>> 1.8.3 when I launch more processes than the number of cores in my
>>>>>>> computer, even if it is still mentioned to work (see
>>>>>>> http://www.open-mpi.org/faq/?category=running#force-aggressive-degraded).
>>>>>>> However, when I launch with fewer processes than the number of cores,
>>>>>>> then it is faster without "OMPI_MCA_mpi_yield_when_idle=1", which is
>>>>>>> the same behavior in 1.6.5.
>>>>>>>
>>>>>>> I tried to launch with a host file like this:
>>>>>>>
>>>>>>> localhost slots=2
>>>>>>>
>>>>>>> but it changed nothing...
>>>>>>>
>>>>>>> What do I do wrong?
>>>>>>>
>>>>>>> Is it possible to retrieve "performances" of 1.6.5 for oversubscription?
>>>>>>>
>>>>>>> Is there a compilation option that I have to enable in 1.8.3?
>>>>>>>
>>>>>>> Here are the config.log and "ompi_info --all" files for both versions
>>>>>>> of mpi:
>>>>>>>
>>>>>>> http://www.giref.ulaval.ca/~ericc/ompi_bug/config.165.log.gz
>>>>>>> http://www.giref.ulaval.ca/~ericc/ompi_bug/config.183.log.gz
>>>>>>> http://www.giref.ulaval.ca/~ericc/ompi_bug/ompi_info.all.165.txt.gz
>>>>>>> http://www.giref.ulaval.ca/~ericc/ompi_bug/ompi_info.all.183.txt.gz
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Eric
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> users mailing list
>>>>>>> us...@open-mpi.org
>>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>>> Link to this post:
>>>>>>> http://www.open-mpi.org/community/lists/users/2014/12/25936.php
>>>>
>>>> <output.1.00.filtre.165.sorted><output.1.00.filtre.183.sorted.seded>
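
For anyone who wants to repeat the parameter comparison described in the quoted thread, the sort-and-sed step could probably also be done in a few lines of Python. This is only a rough sketch and not what Eric actually ran: it swaps the sed pipeline for plain string replacement plus difflib, the function names normalize_183 and diff_params are made up for illustration, and I am assuming the substitutions only need to be applied to the 1.8.3 dump (as the attachment names suggest). The real layout of the mpi_show_mca_params output may well differ.

```python
import difflib


def normalize_183(lines):
    """Apply the same three substitutions as the sed pipeline in the thread.

    Assumption: only the 1.8.3 dump needs rewriting so that it lines up
    with the 1.6.5 wording ("default value", 1/0 instead of true/false).
    """
    out = []
    for line in lines:
        line = line.rstrip("\n")
        line = line.replace("default", "default value")
        line = line.replace("true", "1")
        line = line.replace("false", "0")
        out.append(line)
    return out


def diff_params(lines_165, lines_183):
    """Return only the added/removed lines between the two sorted dumps."""
    old = sorted(line.rstrip("\n") for line in lines_165)
    new = sorted(normalize_183(lines_183))
    diff = list(difflib.unified_diff(old, new, "165", "183", lineterm=""))
    # The first two entries are the "---"/"+++" file headers; skip them,
    # then drop hunk markers and unchanged context lines.
    return [d for d in diff[2:] if d.startswith(("+", "-"))]
```

Calling diff_params with the lines of the two output.1.00 files (read with readlines()) should give a list in which entries starting with "-" exist only in the 1.6.5 dump and entries starting with "+" exist only in the 1.8.3 dump.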