Tarball is now posted

On Feb 10, 2014, at 1:31 PM, Ralph Castain <r...@open-mpi.org> wrote:

> Generating it now - sorry for my lack of response; my OMPI email was down for
> some reason. I can now receive mail again, but I still haven't gotten the
> backlog from the down period.
> 
> 
> On Feb 10, 2014, at 1:23 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
> 
>> Ralph,
>> 
>> If you give me a heads-up when this makes it into a tarball, I will retest 
>> my failing ppc and sparc platforms.
>> 
>> -Paul
>> 
>> 
>> On Mon, Feb 10, 2014 at 1:13 PM, Rolf vandeVaart <rvandeva...@nvidia.com> 
>> wrote:
>> I have tracked this down. There is a missing commit that affects
>> ompi_mpi_init.c, causing it to initialize the BML twice.
>> 
>> Ralph, can you apply r30310 to 1.7?
>> 
>>  
>> 
>> Thanks,
>> 
>> Rolf
>> 
>>  
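
The diagnosis above, then, is that without r30310 the code path in
ompi_mpi_init.c runs the BML initialization twice. As a rough illustration
only (hypothetical names, not the actual r30310 change), the general shape of
a one-shot guard that keeps a framework init from being run twice looks like
this:

    /* hypothetical_bml_guard.c: illustration only, not OMPI source */
    #include <stdbool.h>

    static bool bml_opened = false;

    int hypothetical_bml_init(void)
    {
        if (bml_opened) {
            /* Second call: succeed without re-selecting components, so the
             * module state built by the first call is not clobbered. */
            return 0;
        }
        bml_opened = true;
        /* ... open and select BML/BTL components exactly once here ... */
        return 0;
    }

The real remedy discussed in this thread is simply to bring r30310 over so
the duplicate initialization does not happen in the first place.
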
>> 
>> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Rolf vandeVaart
>> Sent: Monday, February 10, 2014 12:29 PM
>> To: Open MPI Developers
>> Subject: Re: [OMPI devel] 1.7.5 fails on simple test
>> 
>>  
>> 
>> I have seen this same issue, although my core dump is a little bit different.
>> I am running with tcp,self. The first entry in the list of BTLs is garbage,
>> but tcp and self are then in the list. Strange. My core dump is below;
>> line 208 in bml_r2.c is where I get the SEGV.
>> 
>>  
>> 
>> Program terminated with signal 11, Segmentation fault.
>> #0  0x00007fb6dec981d0 in ?? ()
>> Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.107.el6_4.5.x86_64
>> (gdb) where
>> #0  0x00007fb6dec981d0 in ?? ()
>> #1  <signal handler called>
>> #2  0x00007fb6e82fff38 in main_arena () from /lib64/libc.so.6
>> #3  0x00007fb6e4103de2 in mca_bml_r2_add_procs (nprocs=2, procs=0x2061440, reachable=0x7fff80487b40)
>>     at ../../../../../ompi/mca/bml/r2/bml_r2.c:208
>> #4  0x00007fb6df50a751 in mca_pml_ob1_add_procs (procs=0x2060bc0, nprocs=2)
>>     at ../../../../../ompi/mca/pml/ob1/pml_ob1.c:332
>> #5  0x00007fb6e8570dca in ompi_mpi_init (argc=1, argv=0x7fff80488158, requested=0, provided=0x7fff80487cc8)
>>     at ../../ompi/runtime/ompi_mpi_init.c:776
>> #6  0x00007fb6e85a3606 in PMPI_Init (argc=0x7fff80487d8c, argv=0x7fff80487d80) at pinit.c:84
>> #7  0x0000000000401c56 in main (argc=1, argv=0x7fff80488158) at MPI_Isend_ator_c.c:143
>> (gdb)
>> #3  0x00007fb6e4103de2 in mca_bml_r2_add_procs (nprocs=2, procs=0x2061440, reachable=0x7fff80487b40)
>>     at ../../../../../ompi/mca/bml/r2/bml_r2.c:208
>> 208                 rc = btl->btl_add_procs(btl, n_new_procs, new_procs, btl_endpoints, reachable);
>> (gdb) print *btl
>> $1 = {btl_component = 0x7fb6e82ffee8, btl_eager_limit = 140423556234984,
>>   btl_rndv_eager_limit = 140423556235000, btl_max_send_size = 140423556235000,
>>   btl_rdma_pipeline_send_length = 140423556235016, btl_rdma_pipeline_frag_size = 140423556235016,
>>   btl_min_rdma_pipeline_size = 140423556235032, btl_exclusivity = 3895459608,
>>   btl_latency = 32694, btl_bandwidth = 3895459624, btl_flags = 32694,
>>   btl_seg_size = 140423556235048,
>>   btl_add_procs = 0x7fb6e82fff38 <main_arena+184>, btl_del_procs = 0x7fb6e82fff38 <main_arena+184>,
>>   btl_register = 0x7fb6e82fff48 <main_arena+200>, btl_finalize = 0x7fb6e82fff48 <main_arena+200>,
>>   btl_alloc = 0x7fb6e82fff58 <main_arena+216>, btl_free = 0x7fb6e82fff58 <main_arena+216>,
>>   btl_prepare_src = 0x7fb6e82fff68 <main_arena+232>, btl_prepare_dst = 0x7fb6e82fff68 <main_arena+232>,
>>   btl_send = 0x7fb6e82fff78 <main_arena+248>, btl_sendi = 0x7fb6e82fff78 <main_arena+248>,
>>   btl_put = 0x7fb6e82fff88 <main_arena+264>, btl_get = 0x7fb6e82fff88 <main_arena+264>,
>>   btl_dump = 0x7fb6e82fff98 <main_arena+280>, btl_mpool = 0x7fb6e82fff98,
>>   btl_register_error = 0x7fb6e82fffa8 <main_arena+296>, btl_ft_event = 0x7fb6e82fffa8 <main_arena+296>}
>> (gdb)
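
Every function pointer in the dumped btl struct (btl_add_procs, btl_del_procs,
and so on) resolves inside glibc's main_arena, i.e. into heap-allocator
metadata rather than BTL code, which matches the observation above that the
first entry in the BTL list is garbage. A minimal standalone illustration
(hypothetical, not OMPI code) of how a stale or uninitialized entry in a
module list produces exactly this kind of SEGV at the add_procs call:

    /* stale_module_entry.c: illustration only, not OMPI source */
    #include <stdlib.h>

    typedef struct module {
        int (*add_procs)(struct module *m, int nprocs);
    } module_t;

    static int tcp_add_procs(struct module *m, int nprocs)
    {
        (void)m; (void)nprocs;
        return 0;
    }

    int main(void)
    {
        module_t tcp = { tcp_add_procs };
        module_t *btl_list[2];

        btl_list[0] = malloc(sizeof(module_t)); /* never filled in: garbage entry */
        btl_list[1] = &tcp;                     /* valid entry */

        for (int i = 0; i < 2; ++i) {
            /* Calling through the uninitialized entry's function pointer is
             * undefined behavior and typically raises SIGSEGV, just as the
             * btl->btl_add_procs() call does at bml_r2.c:208 above. */
            btl_list[i]->add_procs(btl_list[i], 2);
        }
        return 0;
    }
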
>> 
>>  
>> 
>>  
>> 
>> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Mike Dubman
>> Sent: Monday, February 10, 2014 4:23 AM
>> To: Open MPI Developers
>> Subject: [OMPI devel] 1.7.5 fails on simple test
>> 
>>  
>> 
>>  
>>  
>> $/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/bin/mpirun
>>  -np 8 -mca pml ob1 -mca btl self,tcp 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/examples/hello_usempi
>> [vegas12:12724] *** Process received signal ***
>> [vegas12:12724] Signal: Segmentation fault (11)
>> [vegas12:12724] Signal code:  (128)
>> [vegas12:12724] Failing at address: (nil)
>> [vegas12:12724] [ 0] /lib64/libpthread.so.0[0x3937c0f500]
>> [vegas12:12724] [ 1] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/openmpi/mca_btl_tcp.so(mca_btl_tcp_component_init+0x583)[0x7ffff395f813]
>> [vegas12:12724] [ 2] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(mca_btl_base_select+0x117)[0x7ffff78e14a7]
>> [vegas12:12724] [ 3] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/openmpi/mca_bml_r2.so(mca_bml_r2_component_init+0x12)[0x7ffff3ded6f2]
>> [vegas12:12724] [ 4] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(mca_bml_base_init+0x99)[0x7ffff78e0cc9]
>> [vegas12:12724] [ 5] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/openmpi/mca_pml_ob1.so(+0x51d8)[0x7ffff37481d8]
>> [vegas12:12724] [ 6] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(mca_pml_base_select+0x1e0)[0x7ffff78f31e0]
>> [vegas12:12724] [ 7] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(ompi_mpi_init+0x52b)[0x7ffff78bffdb]
>> [vegas12:12724] [ 8] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(MPI_Init+0x170)[0x7ffff78d4210]
>> [vegas12:12724] [ 9] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi_mpifh.so.2(PMPI_Init_f08+0x25)[0x7ffff7b71c25]
>> [vegas12:12724] [10] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/examples/hello_usempi[0x400c0b]
>> [vegas12:12724] [11] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/examples/hello_usempi[0x400d4a]
>> [vegas12:12724] [12] /lib64/libc.so.6(__libc_start_main+0xfd)[0x393741ecdd]
>> [vegas12:12724] [13] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/examples/hello_usempi[0x400b29]
>> [vegas12:12724] *** End of error message ***
>> [vegas12:12731] *** Process received signal ***
>> [vegas12:12731] Signal: Segmentation fault (11)
>> [vegas12:12731] Signal code:  (128)
>> [vegas12:12731] Failing at address: (nil)
>> [vegas12:12731] [ 0] /lib64/libpthread.so.0[0x3937c0f500]
>> [vegas12:12731] [ 1] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/openmpi/mca_btl_tcp.so(mca_btl_tcp_component_init+0x583)[0x7ffff395f813]
>> [vegas12:12731] [ 2] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(mca_btl_base_select+0x117)[0x7ffff78e14a7]
>> [vegas12:12731] [ 3] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/openmpi/mca_bml_r2.so(mca_bml_r2_component_init+0x12)[0x7ffff3ded6f2]
>> [vegas12:12731] [ 4] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(mca_bml_base_init+0x99)[0x7ffff78e0cc9]
>> [vegas12:12731] [ 5] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/openmpi/mca_pml_ob1.so(+0x51d8)[0x7ffff37481d8]
>> [vegas12:12731] [ 6] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(mca_pml_base_select+0x1e0)[0x7ffff78f31e0]
>> [vegas12:12731] [ 7] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(ompi_mpi_init+0x52b)[0x7ffff78bffdb]
>> [vegas12:12731] [ 8] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(MPI_Init+0x170)[0x7ffff78d4210]
>> [vegas12:12731] [ 9] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi_mpifh.so.2(PMPI_Init_f08+0x25)[0x7ffff7b71c25]
>> [vegas12:12731] [10] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/examples/hello_usempi[0x400c0b]
>> [vegas12:12731] [11] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/examples/hello_usempi[0x400d4a]
>> [vegas12:12731] [12] /lib64/libc.so.6(__libc_start_main+0xfd)[0x393741ecdd]
>> [vegas12:12731] [13] 
>> /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/examples/hello_usempi[0x400b29]
>> [vegas12:12731] *** End of error message ***
>> --------------------------------------------------------------------------
>> mpirun noticed that process rank 0 with PID 12724 on node vegas12 exited on 
>> signal 11 (Segmentation fault).
>> --------------------------------------------------------------------------
>> jenkins@vegas12 ~
>>  
>> 
>>  
>> 
>> 
>> 
>> 
>> -- 
>> Paul H. Hargrove                          phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department     Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
> 
