Don't see an rc3 yet.

My Solaris-10/SPARC runs fail slightly differently (see below).
It looks sufficiently similar that it MIGHT be the same root cause.
However, lacking an rc3 to test, I figured it would be better to report this
than to ignore it.

The problem is present with both V8+ and V9 ABIs, and with both GNU and Sun
compilers.

-Paul

[niagara1:29881] *** Process received signal ***
[niagara1:29881] Signal: Segmentation Fault (11)
[niagara1:29881] Signal code: Address not mapped (1)
[niagara1:29881] Failing at address: 2
/sandbox/hargrove/OMPI/openmpi-1.8.4rc2-solaris10-sparcT2-gcc346-v8plus/INST/lib/libopen-pal.so.6.2.1:opal_backtrace_print+0x24
/sandbox/hargrove/OMPI/openmpi-1.8.4rc2-solaris10-sparcT2-gcc346-v8plus/INST/lib/libopen-pal.so.6.2.1:0xaa160
/lib/libc.so.1:0xc5364
/lib/libc.so.1:0xb9e64
/lib/libc.so.1:strlen+0x14 [ Signal 11 (SEGV)]
/sandbox/hargrove/OMPI/openmpi-1.8.4rc2-solaris10-sparcT2-gcc346-v8plus/INST/lib/libopen-pal.so.6.2.1:opal_vasprintf+0x20
/sandbox/hargrove/OMPI/openmpi-1.8.4rc2-solaris10-sparcT2-gcc346-v8plus/INST/lib/libopen-pal.so.6.2.1:opal_asprintf+0x30
/sandbox/hargrove/OMPI/openmpi-1.8.4rc2-solaris10-sparcT2-gcc346-v8plus/INST/lib/libopen-pal.so.6.2.1:opal_hwloc_base_get_topo_signature+0x24c
/sandbox/hargrove/OMPI/openmpi-1.8.4rc2-solaris10-sparcT2-gcc346-v8plus/INST/lib/openmpi/mca_ess_hnp.so:0x2d90
/sandbox/hargrove/OMPI/openmpi-1.8.4rc2-solaris10-sparcT2-gcc346-v8plus/INST/lib/libopen-rte.so.7.0.5:orte_init+0x2f8
/sandbox/hargrove/OMPI/openmpi-1.8.4rc2-solaris10-sparcT2-gcc346-v8plus/INST/bin/orterun:orterun+0xaa8
/sandbox/hargrove/OMPI/openmpi-1.8.4rc2-solaris10-sparcT2-gcc346-v8plus/INST/bin/orterun:main+0x14
/sandbox/hargrove/OMPI/openmpi-1.8.4rc2-solaris10-sparcT2-gcc346-v8plus/INST/bin/orterun:_start+0x5c
[niagara1:29881] *** End of error message ***
Segmentation Fault - core dumped
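
Both traces die inside a libc string routine (strlen here, strcmp in the
x86-64 report quoted below) at a near-zero address, which is what an unset
string pointer typically looks like. A minimal sketch in C of that failure
class and a defensive guard follows. The helper names safe_streq and str_or
are hypothetical, chosen only for illustration. They are not Open MPI
functions, and this is not the actual fix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compare two strings without ever handing
 * a NULL pointer to strcmp(). Two missing strings count as equal;
 * a missing string never equals a present one. */
static int safe_streq(const char *a, const char *b)
{
    if (a == NULL || b == NULL) {
        return a == b;
    }
    return strcmp(a, b) == 0;
}

/* Hypothetical helper: substitute a placeholder for a missing string
 * before it reaches a "%s" conversion, where the formatter would
 * otherwise walk the bad pointer (e.g. via strlen). */
static const char *str_or(const char *s, const char *fallback)
{
    assert(fallback != NULL);  /* the placeholder itself must be valid */
    return (s != NULL) ? s : fallback;
}
```

For example, str_or(sig, "(unset)") could be passed wherever a possibly
unset string is formatted with "%s", and safe_streq(sig, expected) used in
place of a bare strcmp when either side may not have been filled in.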

On Thu, Dec 11, 2014 at 3:29 PM, Ralph Castain <r...@open-mpi.org> wrote:

> Ah crud - incomplete commit means we didn't send the topo string. Will
> roll rc3 in a few minutes.
>
> Thanks, Paul
> Ralph
>
> On Dec 11, 2014, at 3:08 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> Testing the 1.8.4rc2 tarball on my x86-64 Solaris-11 systems I am getting
> the following crash for both "-m32" and "-m64" builds:
>
> $ mpirun -mca btl sm,self,openib -np 2 -host pcp-j-19,pcp-j-20 examples/ring_c
> [pcp-j-19:18762] *** Process received signal ***
> [pcp-j-19:18762] Signal: Segmentation Fault (11)
> [pcp-j-19:18762] Signal code: Address not mapped (1)
> [pcp-j-19:18762] Failing at address: 0
> /shared/OMPI/openmpi-1.8.4rc2-solaris11-x64-ib-gcc452/INST/lib/libopen-pal.so.6.2.1'opal_backtrace_print+0x26
> [0xfffffd7ffaf237ba]
> /shared/OMPI/openmpi-1.8.4rc2-solaris11-x64-ib-gcc452/INST/lib/libopen-pal.so.6.2.1'show_stackframe+0x833
> [0xfffffd7ffaf20ba1]
> /lib/amd64/libc.so.1'__sighndlr+0x6 [0xfffffd7fff202cc6]
> /lib/amd64/libc.so.1'call_user_handler+0x2aa [0xfffffd7fff1f648e]
> /lib/amd64/libc.so.1'strcmp+0x1a [0xfffffd7fff170fda] [Signal 11 (SEGV)]
> /shared/OMPI/openmpi-1.8.4rc2-solaris11-x64-ib-gcc452/INST/bin/orted'main+0x90
> [0x4010b7]
> /shared/OMPI/openmpi-1.8.4rc2-solaris11-x64-ib-gcc452/INST/bin/orted'_start+0x6c
> [0x400f2c]
> [pcp-j-19:18762] *** End of error message ***
> bash: line 1: 18762 Segmentation Fault      (core dumped)
> /shared/OMPI/openmpi-1.8.4rc2-solaris11-x64-ib-gcc452/INST/bin/orted -mca
> ess "env" -mca orte_ess_jobid "911343616" -mca orte_ess_vpid 1 -mca
> orte_ess_num_procs "2" -mca orte_hnp_uri "911343616.0;tcp://172.16.0.120,
> 172.18.0.120:50362" --tree-spawn -mca btl "sm,self,openib" -mca plm "rsh"
> -mca shmem_mmap_enable_nfs_warning "0"
>
> Running gdb against a core generated by the 32-bit build gives line
> numbers:
> #0  0xfea1cb45 in strcmp () from /lib/libc.so.1
> #1  0xfeef4900 in orte_daemon (argc=26, argv=0x80479b0)
>     at
> /shared/OMPI/openmpi-1.8.4rc2-solaris11-x86-ib-gcc452/openmpi-1.8.4rc2/orte/orted/orted_main.c:789
> #2  0x08050fb1 in main (argc=26, argv=0x80479b0)
>     at
> /shared/OMPI/openmpi-1.8.4rc2-solaris11-x86-ib-gcc452/openmpi-1.8.4rc2/orte/tools/orted/orted.c:62
>
> -Paul
>
> --
> Paul H. Hargrove                          phhargr...@lbl.gov
> Computer Languages & Systems Software (CLaSS) Group
> Computer Science Department               Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>  _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/12/16514.php
>
>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/12/16515.php
>



-- 
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
