Paul,

A SIGSEGV is always a bad idea, even after a comprehensive and user-friendly error message has been displayed:

--------------------------------------------------------------------------
MCA framework parameters can only take a single negation operator
("^"), and it must be at the beginning of the value.  The following
value violates this rule:

    ^tcp,^ib

When used, the negation operator sets the "exclusive" behavior mode,
meaning that it will exclude all specified components (and implicitly
include all others).  If the negation operator is not specified, the
"inclusive" mode is assumed, meaning that all specified components
will be included (and implicitly exclude all others).

For example, "^a,b" specifies the exclusive behavior and means "use
all components *except* a and b", while "c,d" specifies the inclusive
behavior and means "use *only* components c and d."

You cannot mix inclusive and exclusive behavior.
--------------------------------------------------------------------------


That raises the question: what should we do when we run into this case?

- one option is to propagate the error (currently, the functions do not return anything), but then what should the caller do with it?

- another option is to brutally exit(1)

- yet another option is to disregard the incorrect value of the parameter and continue
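
To make the first and third options a bit more concrete, here is a minimal sketch in C. It is only an illustration under my own assumptions: the names check_component_spec, spec_or_default and the SPEC_* codes are hypothetical and are not existing Open MPI functions or constants. The checker reports the bad value through a return code instead of returning nothing, and the wrapper shows one possible caller policy (warn, then behave as if the parameter were unset).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical status codes, NOT Open MPI's actual constants. */
enum spec_status { SPEC_OK = 0, SPEC_ERR_BAD_NEGATION = -1 };

/*
 * Check a component list such as "tcp,ib" or "^tcp,ib".
 * On success, *exclusive is true when the value starts with '^'.
 * A '^' anywhere other than the first character is rejected,
 * which is the rule stated in the help message above.
 */
enum spec_status check_component_spec(const char *value, bool *exclusive)
{
    *exclusive = false;
    if (value == NULL || value[0] == '\0') {
        return SPEC_OK;            /* unset or empty: nothing to validate */
    }
    if (value[0] == '^') {
        *exclusive = true;
        ++value;                   /* skip the single leading negation */
    }
    if (strchr(value, '^') != NULL) {
        *exclusive = false;
        return SPEC_ERR_BAD_NEGATION;  /* e.g. "^tcp,^ib" */
    }
    return SPEC_OK;
}

/*
 * One possible caller policy (the third option): warn about the invalid
 * value and continue as if the parameter had not been set.
 * Returns the value to use, or NULL meaning "treat as unset".
 */
const char *spec_or_default(const char *value, bool *exclusive)
{
    if (check_component_spec(value, exclusive) != SPEC_OK) {
        fprintf(stderr, "ignoring invalid component list \"%s\"\n", value);
        return NULL;
    }
    return value;
}
```

With this shape, a tool like ompi_info could call the checker and decide for itself whether to skip the parameter or stop, rather than dereferencing a result that was never produced. Whether the real code paths can be restructured this way is something I have not checked.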


Any thoughts, anyone?

Cheers,

Gilles

On 11/14/2016 9:28 PM, Paul Kapinos wrote:
Dear developers,
Although the following issue is definitely caused by a misconfiguration of Open MPI, a SIGSEGV in 'ompi_info' isn't a good thing, hence this mail.

Just call:
$ export OMPI_MCA_mtl="^tcp,^ib"
$ ompi_info --param all all --level 9
... and take a look at the core dump of 'ompi_info', like the one below.

(yes we know that "^tcp,^ib" is a bad idea).

Have a nice day,

Paul Kapinos

P.S. Open MPI: 1.10.4 and 2.0.1 have the same behaviour

--------------------------------------------------------------------------
[lnm001:39957] *** Process received signal ***
[lnm001:39957] Signal: Segmentation fault (11)
[lnm001:39957] Signal code: Address not mapped (1)
[lnm001:39957] Failing at address: (nil)
[lnm001:39957] [ 0] /lib64/libpthread.so.0(+0xf100)[0x2b30f1a79100]
[lnm001:39957] [ 1] /opt/MPI/openmpi-1.10.4/linux/intel_16.0.2.181/lib/libopen-pal.so.13(+0x2f11f)[0x2b30f084911f]
[lnm001:39957] [ 2] /opt/MPI/openmpi-1.10.4/linux/intel_16.0.2.181/lib/libopen-pal.so.13(+0x2f265)[0x2b30f0849265]
[lnm001:39957] [ 3] /opt/MPI/openmpi-1.10.4/linux/intel_16.0.2.181/lib/libopen-pal.so.13(opal_info_show_mca_params+0x91)[0x2b30f0849031]
[lnm001:39957] [ 4] /opt/MPI/openmpi-1.10.4/linux/intel_16.0.2.181/lib/libopen-pal.so.13(opal_info_do_params+0x1f4)[0x2b30f0848e84]
[lnm001:39957] [ 5] ompi_info[0x402643]
[lnm001:39957] [ 6] /lib64/libc.so.6(__libc_start_main+0xf5)[0x2b30f1ca7b15]
[lnm001:39957] [ 7] ompi_info[0x4022a9]
[lnm001:39957] *** End of error message ***
zsh: segmentation fault (core dumped) ompi_info --param all all --level 9
--------------------------------------------------------------------------




