Hi,

I'm sorry to answer so late, but last week I didn't have Internet
access. In the meantime I've installed openmpi-1.8.2rc3 and I get
the same error.

> That's quite odd that it only happens for Java programs -- it
> should happen for *all* programs, based on the stack trace you've shown.
> 
> Can you print the value of the lds struct where the error occurs?

sunpc1 java 102 /opt/solstudio12.3/bin/amd64/dbx /usr/local/openmpi-1.8.2_64_cc/bin/mpiexec
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.9' in your .dbxrc
Reading mpiexec
Reading ld.so.1
Reading libopen-rte.so.7.0.4
Reading libopen-pal.so.6.2.0
Reading libsendfile.so.1
Reading libpicl.so.1
Reading libkstat.so.1
Reading liblgrp.so.1
Reading libsocket.so.1
Reading libnsl.so.1
Reading librt.so.1
Reading libm.so.2
Reading libthread.so.1
Reading libc.so.1
Reading libdoor.so.1
Reading libaio.so.1
Reading libmd.so.1
(dbx) check -all
access checking - ON
memuse checking - ON
(dbx) run -np 1 java InitFinalizeMain
Running: mpiexec -np 1 java InitFinalizeMain 
(process id 4064)
Reading rtcapihook.so
Reading libdl.so.1
Reading rtcaudit.so
Reading libmapmalloc.so.1
Reading rtcboot.so
Reading librtc.so
RTC: Enabling Error Checking...
RTC: Running program...
Reading disasm.so
Read from uninitialized (rui) on thread 1:
Attempting to read 1 byte at address 0x437387
    which is 15 bytes into a heap block of size 16 bytes at 0x437378
This block was allocated from:
        [1] vasprintf() at 0xfffffd7fdc9b335a 
        [2] asprintf() at 0xfffffd7fdc9b3452 
        [3] opal_output_init() at line 184 in "output.c"
        [4] do_open() at line 548 in "output.c"
        [5] opal_output_open() at line 219 in "output.c"
        [6] opal_malloc_init() at line 68 in "malloc.c"
        [7] opal_init_util() at line 258 in "opal_init.c"
        [8] opal_init() at line 363 in "opal_init.c"

t@1 (l@1) stopped in do_open at line 638 in file "output.c"
  638           info[i].ldi_prefix = strdup(lds->lds_prefix);
(dbx) print lds
lds = 0xfffffd7fe93d1b60
(dbx) print i
i = 0
(dbx) print info[0].ldi_prefix
info[0].ldi_prefix = (nil)
(dbx) print lds->lds_verbose_level
lds->lds_verbose_level = 0
(dbx)  print lds->lds_syslog_priority
lds->lds_syslog_priority = 0
(dbx) print lds->lds_syslog_ident
lds->lds_syslog_ident = (nil)
(dbx) print lds->lds_prefix
lds->lds_prefix = 0x437378 "[sunpc1:04090] "
(dbx) print lds->lds_suffix
lds->lds_suffix = (nil)
(dbx) print lds->lds_is_debugging
lds->lds_is_debugging = 0
(dbx) print lds->lds_want_syslog
lds->lds_want_syslog = 0
(dbx) print lds->lds_want_stdout
lds->lds_want_stdout = 0
(dbx) print lds->lds_want_stderr
lds->lds_want_stderr = 1U
(dbx) print lds->lds_want_file
lds->lds_want_file = 0
(dbx) print lds->lds_want_file_append
lds->lds_want_file_append = 0
(dbx)  print lds->lds_file_suffix
lds->lds_file_suffix = (nil)
(dbx) 
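
For reference, the flagged address 0x437387 is 15 bytes into the 16-byte
block at 0x437378, and the prefix printed above, "[sunpc1:04090] ", is
15 characters long, so offset 15 is where its terminating NUL would sit.
The sketch below only reproduces that arithmetic. It is not the code from
output.c: make_prefix() and nul_offset() are hypothetical helpers, the
format string "[%s:%05d] " is an assumption inferred from the printed
value, and snprintf() plus malloc() stand in for the asprintf() call seen
in the allocation trace.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() under strict -std=c99 */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: builds a "[host:pid] " prefix on the heap.
 * The format string is an assumption inferred from the dbx output. */
char *make_prefix(const char *host, int pid)
{
    /* First pass (C99 semantics): ask how many characters are needed. */
    int len = snprintf(NULL, 0, "[%s:%05d] ", host, pid);
    if (len < 0) {
        return NULL;
    }
    /* One extra byte for the terminating NUL. */
    char *buf = malloc((size_t)len + 1);
    if (buf == NULL) {
        return NULL;
    }
    snprintf(buf, (size_t)len + 1, "[%s:%05d] ", host, pid);
    return buf;
}

/* Offset of the terminating NUL, which is the last byte that
 * strlen() (and therefore strdup()) has to read. */
size_t nul_offset(const char *s)
{
    return strlen(s);
}

/* Prints the block size and NUL offset for the prefix seen in the log. */
int demo(void)
{
    char *prefix = make_prefix("sunpc1", 4090);
    if (prefix == NULL) {
        return 1;
    }
    char *copy = strdup(prefix);  /* same call shape as line 638 */
    if (copy == NULL) {
        free(prefix);
        return 1;
    }
    assert(strcmp(prefix, copy) == 0);
    printf("prefix=\"%s\" bytes=%lu nul_offset=%lu\n", prefix,
           (unsigned long)(strlen(prefix) + 1),
           (unsigned long)nul_offset(prefix));
    free(copy);
    free(prefix);
    return 0;
}
```

Calling demo() from a main() should report a 16-byte string whose NUL sits
at offset 15, the same offset that RTC flags in the report above.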

Does the above information help to track down the error? Do you need
anything else? Thank you very much in advance for any help.


Kind regards

Siegmar






> On Jul 25, 2014, at 2:29 AM, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:
> 
> > Hi,
> > 
> > I have installed openmpi-1.8.2rc2 with Sun C 5.12 on Solaris
> > 10 Sparc and x86_64 and I receive a segmentation fault if I
> > run a small Java program.
> > 
> > rs0 java 105 mpiexec -np 1 java InitFinalizeMain
> > #
> > # A fatal error has been detected by the Java Runtime Environment:
> > #
> > #  SIGSEGV (0xb) at pc=0xffffffff7ea3c830, pid=18363, tid=2
> > #
> > # JRE version: Java(TM) SE Runtime Environment (8.0-b132) (build 1.8.0-b132)
> > # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b70 mixed mode solaris-sparc compressed oops)
> > # Problematic frame:
> > # C  [libc.so.1+0x3c830]  strlen+0x50
> > ...
> > 
> > 
> > I get the following output if I run the program in "dbx".
> > 
> > ...
> > RTC: Running program...
> > Write to unallocated (wua) on thread 1:
> > Attempting to write 1 byte at address 0xffffffff79f04000
> > t@1 (l@1) stopped in _readdir at 0xffffffff56574da0
> > 0xffffffff56574da0: _readdir+0x0064:    call     _PROCEDURE_LINKAGE_TABLE_+0x2380 [PLT] ! 0xffffffff56742a80
> > Current function is find_dyn_components
> >  397                       if (0 != lt_dlforeachfile(dir, save_filename, NULL)) {
> > (dbx) 
> > 
> > 
> > I get the following output if I run the program on Solaris 10
> > x86_64.
> > 
> > ...
> > RTC: Running program...
> > Reading disasm.so
> > Read from uninitialized (rui) on thread 1:
> > Attempting to read 1 byte at address 0x437387
> >    which is 15 bytes into a heap block of size 16 bytes at 0x437378
> > This block was allocated from:
> >        [1] vasprintf() at 0xfffffd7fdc9b335a 
> >        [2] asprintf() at 0xfffffd7fdc9b3452 
> >        [3] opal_output_init() at line 184 in "output.c"
> >        [4] do_open() at line 548 in "output.c"
> >        [5] opal_output_open() at line 219 in "output.c"
> >        [6] opal_malloc_init() at line 68 in "malloc.c"
> >        [7] opal_init_util() at line 258 in "opal_init.c"
> >        [8] opal_init() at line 363 in "opal_init.c"
> > 
> > t@1 (l@1) stopped in do_open at line 638 in file "output.c"
> >  638           info[i].ldi_prefix = strdup(lds->lds_prefix);
> > (dbx) 
> > 
> > 
> > Hopefully the above output helps to fix the errors. Can I provide
> > anything else? Thank you very much for any help in advance.
> > 
> > 
> > Kind regards
> > 
> > Siegmar
> > 
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> > Link to this post: http://www.open-mpi.org/community/lists/users/2014/07/24870.php
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
