On Oct 15, 2014, at 11:46 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:

> Thank you Ralph and Jeff for the help!
> 
> Glad to hear the segmentation fault is reproducible and will be fixed.
> 
> In any case, one can just avoid the old parameter name
> (rmaps_base_schedule_policy),
> and use instead the new parameter name
> (rmaps_base_mapping_policy)
> without any problem in OMPI 1.8.3.
> 

Fix is in the nightly 1.8 tarball - I'll release a 1.8.4 soon to cover the 
problem.
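
Until a fixed release is installed, the renames quoted further down in this thread can be applied to an existing openmpi-mca-params.conf mechanically. What follows is only a minimal sketch in Python, not an official tool: the helper name `migrate_params` and the sample file contents are hypothetical, and the mapping covers only the four parameters mentioned in this thread. It renames parameters but does not check whether the old values are still valid under the new names, so the result should be reviewed by hand.

```python
import re

# Renames quoted later in this thread (Open MPI 1.6 -> 1.8).
RENAMED = {
    "rmaps_base_schedule_policy": "rmaps_base_mapping_policy",
    "orte_process_binding": "hwloc_base_binding_policy",
    "orte_report_bindings": "hwloc_base_report_bindings",
}

# Parameters reported as removed, with the advice given in this thread.
REMOVED = {
    "opal_paffinity_alone":
        "gone; use hwloc_base_binding_policy = none if you want no binding",
}

# Matches "name = value" lines; comment lines and blank lines do not match.
PARAM_LINE = re.compile(r"^(\s*)([A-Za-z0-9_]+)(\s*=.*)$")


def migrate_params(text):
    """Return (new_text, notes) for the contents of an MCA params file.

    Hypothetical helper: renamed parameters are rewritten in place,
    removed ones are commented out, everything else is left untouched.
    """
    out, notes = [], []
    for lineno, line in enumerate(text.splitlines(), start=1):
        m = PARAM_LINE.match(line)
        name = m.group(2) if m else None
        if name in RENAMED:
            line = m.group(1) + RENAMED[name] + m.group(3)
            notes.append(f"line {lineno}: {name} -> {RENAMED[name]}")
        elif name in REMOVED:
            line = "# " + line
            notes.append(f"line {lineno}: {name} commented out ({REMOVED[name]})")
        out.append(line)
    return "\n".join(out) + "\n", notes


if __name__ == "__main__":
    # Made-up sample contents, for illustration only.
    sample = "rmaps_base_schedule_policy = core\nopal_paffinity_alone = 1\n"
    new_text, notes = migrate_params(sample)
    print(new_text, end="")
    for note in notes:
        print("NOTE:", note)
```

Reading the real file, writing the result back, and keeping a backup copy are left out on purpose, so the sketch cannot overwrite a working configuration.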

> **
> 
> Thanks Ralph for sending the new (OMPI 1.8)
> parameter names for process binding.
> 
> My recollection is that some time ago somebody (Jeff perhaps?)
> posted here a link to a presentation (PDF or PPT) explaining the
> new style of process binding, but I couldn't find it in the
> list archives.
> Maybe the link could be part of the FAQ (if not already there)?

I don't think it is, but I'll try to add it over the next day or so.

> 
> **
> 
> The Open MPI runtime environment is really great.
> However, to take advantage of it one often has to do parameter guessing,
> and to do time consuming tests by trial and error,
> because the main sources of documentation are
> the terse output of ompi_info, and several sparse
> references in the FAQ.
> (Some of them outdated?)
> 
> In addition, the runtime environment has evolved over time,
> which is certainly a good thing.
> However, along with this evolution, several runtime parameters
> changed both name and functionality, new ones were introduced,
> old ones were deprecated, which can be somewhat confusing,
> and can lead to an ineffective use of the runtime environment.
> (In 1.8.3 I was using several deprecated parameters from 1.6.5
> that seem to be silently ignored at runtime.
> I only noticed the problem because that segmentation fault happened.)
> 
> I know asking for thorough documentation is foolish,

Not really - it is something we need to get better about :-(

> but I guess a simple table of runtime parameter names and valid values
> in the FAQ, maybe sorted by their purpose/function, along with a few examples 
> of use, could help a lot.
> Some of this material is now spread across several FAQ entries, but not so
> easy to find/compare.
> That doesn't need to be a comprehensive table, but commonly used
> items like selecting the btl, selecting interfaces,
> dealing with process binding,
> modifying/enriching the stdout/stderr output
> (tagging output, increasing verbosity, etc),
> probably have their place there.

Yeah, we fell down on this one. The changes were announced with each step in 
the 1.7 series, but if you step from 1.6 directly to 1.8, you'll get caught 
flat-footed. We honestly didn't think of that case, and so we mentally assumed 
that "of course people have been following the series - they know what 
happened".

You know what they say about those who "assume" :-/

I'll try to get something into the FAQ about the entire new mapping, ranking, 
and binding system. It is actually VERY powerful, allowing you to specify 
pretty much any placement pattern you can imagine and bind it to whatever level 
you desire. It was developed in response to requests from researchers who 
wanted to explore application performance versus placement strategies - but we 
provided some simplified options to support more common usage patterns.
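
For illustration, the 1.8-style parameters named in this thread can be combined on a single mpirun command line. The sketch below only assembles and prints the argument list. The parameter names are the ones given in this thread, but the helper name `build_mpirun`, the process count, the values `core` and `1`, and the `./connectivity_c` path are assumptions chosen for the example. Check the values accepted by your installation against ompi_info before relying on them.

```python
import shlex


def build_mpirun(nprocs, program, map_by="core", bind_to="core", report=True):
    """Assemble an mpirun argument list using the 1.8 MCA parameter names.

    Hypothetical helper: the default values are illustrative assumptions.
    """
    argv = [
        "mpirun", "-n", str(nprocs),
        "-mca", "rmaps_base_mapping_policy", map_by,   # where processes are placed
        "-mca", "hwloc_base_binding_policy", bind_to,  # what each process is bound to
    ]
    if report:
        # Ask the runtime to report the bindings it applied.
        argv += ["-mca", "hwloc_base_report_bindings", "1"]
    argv.append(program)
    return argv


cmd = build_mpirun(4, "./connectivity_c")
print(shlex.join(cmd))  # requires Python 3.8+ for shlex.join
```

Passing `bind_to="none"` corresponds to the `hwloc_base_binding_policy=none` setting mentioned in this thread for running without any binding.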


> 
> 
> Many thanks,
> Gus Correa
> 
> 
> On 10/15/2014 11:12 AM, Jeff Squyres (jsquyres) wrote:
>> We talked off-list -- fixed this on master and just filed 
>> https://github.com/open-mpi/ompi-release/pull/33 to get this into the v1.8 
>> branch.
>> 
>> 
>> On Oct 14, 2014, at 7:39 PM, Ralph Castain <r...@open-mpi.org> wrote:
>> 
>>> 
>>> On Oct 14, 2014, at 5:32 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:
>>> 
>>>> Dear Open MPI fans and experts
>>>> 
>>>> This is just a note in case other people run into the same problem.
>>>> 
>>>> I just built Open MPI 1.8.3.
>>>> As usual I put my old settings in openmpi-mca-params.conf,
>>>> without further thought.
>>>> Then I compiled the connectivity_c.c program and tried
>>>> to run it with mpiexec.
>>>> That is a routine that never failed before.
>>>> 
>>>> Bummer!
>>>> I've got a segmentation fault right away.
>>> 
>>> Strange  - it works fine from the cmd line:
>>> 
>>> 07:27:04  (v1.8) /home/common/openmpi/ompi-release$ mpirun -n 1 -mca rmaps_base_schedule_policy core hostname
>>> --------------------------------------------------------------------------
>>> A deprecated MCA variable value was specified in the environment or
>>> on the command line.  Deprecated MCA variables should be avoided;
>>> they may disappear in future releases.
>>> 
>>>  Deprecated variable: rmaps_base_schedule_policy
>>>  New variable:        rmaps_base_mapping_policy
>>> --------------------------------------------------------------------------
>>> bend001
>>> 
>>> HOWEVER, I can replicate that behavior when it is in the default params 
>>> file! I don't see the immediate cause of the difference, but will 
>>> investigate.
>>> 
>>>> 
>>>> After some head scratching, checking my environment, etc,
>>>> I thought I might have configured OMPI incorrectly.
>>>> Hence, I tried to get information from ompi_info.
>>>> Oh well, ompi_info also segfaulted!
>>>> 
>>>> It took me a while to realize that the runtime parameter
>>>> configuration file was the culprit.
>>>> 
>>>> When I inserted the runtime parameter settings one by one,
>>>> the segfault came with this one:
>>>> 
>>>> rmaps_base_schedule_policy = core
>>>> 
>>>> Ompi_info (when I got it to work) told me that the parameter above
>>>> is now a deprecated synonym of:
>>>> 
>>>> rmaps_base_mapping_policy = core
>>>> 
>>>> In any case, the old synonym doesn't work and makes ompi_info and
>>>> mpiexec segfault (and I'd guess anything else that requires the OMPI 
>>>> runtime components).
>>>> Only the new parameter name works.
>>> 
>>> That's because the segfault is happening in the printing of the deprecation 
>>> warning.
>>> 
>>>> 
>>>> ***
>>>> 
>>>> I am also missing in the ompi_info output the following
>>>> (OMPI 1.6.5) parameters (not reported by ompi_info --all --all):
>>>> 
>>> 
>>> 1) orte_process_binding  ===> hwloc_base_binding_policy
>>> 
>>> 2) orte_report_bindings   ===> hwloc_base_report_bindings
>>> 
>>> 3) opal_paffinity_alone  ===> gone, use hwloc_base_binding_policy=none if 
>>> you don't want any binding
>>> 
>>>> 
>>>> Are they gone forever?
>>>> 
>>>> Are there replacements for them, with approximately the same functionality?
>>>> 
>>>> Is there a list comparing the new (1.8) vs. old (1.6)
>>>> OMPI runtime parameters, and/or any additional documentation
>>>> about the new style of OMPI 1.8 runtime parameters?
>>> 
>>> Will try to add this to the web site
>>> 
>>>> 
>>>> Since there seems to have been a major revamping of the OMPI
>>>> runtime parameters, that would be a great help.
>>>> 
>>>> Thank you,
>>>> Gus Correa
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>> Link to this post: 
>>>> http://www.open-mpi.org/community/lists/users/2014/10/25497.php
>>> 
>> 
>> 
> 
