Hi,

I'm sorry for the late reply; I didn't have Internet access last week.
In the meantime I've installed openmpi-1.8.2rc3 and I get the same error.
> That's quite odd that it only happens for Java programs -- it
> should happen for *all* programs, based on the stack trace you've shown.
>
> Can you print the value of the lds struct where the error occurs?

sunpc1 java 102 /opt/solstudio12.3/bin/amd64/dbx /usr/local/openmpi-1.8.2_64_cc/bin/mpiexec
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.9' in your .dbxrc
Reading mpiexec
Reading ld.so.1
Reading libopen-rte.so.7.0.4
Reading libopen-pal.so.6.2.0
Reading libsendfile.so.1
Reading libpicl.so.1
Reading libkstat.so.1
Reading liblgrp.so.1
Reading libsocket.so.1
Reading libnsl.so.1
Reading librt.so.1
Reading libm.so.2
Reading libthread.so.1
Reading libc.so.1
Reading libdoor.so.1
Reading libaio.so.1
Reading libmd.so.1
(dbx) check -all
access checking - ON
memuse checking - ON
(dbx) run -np 1 java InitFinalizeMain
Running: mpiexec -np 1 java InitFinalizeMain
(process id 4064)
Reading rtcapihook.so
Reading libdl.so.1
Reading rtcaudit.so
Reading libmapmalloc.so.1
Reading rtcboot.so
Reading librtc.so
RTC: Enabling Error Checking...
RTC: Running program...
Reading disasm.so
Read from uninitialized (rui) on thread 1:
Attempting to read 1 byte at address 0x437387
    which is 15 bytes into a heap block of size 16 bytes at 0x437378
This block was allocated from:
        [1] vasprintf() at 0xfffffd7fdc9b335a
        [2] asprintf() at 0xfffffd7fdc9b3452
        [3] opal_output_init() at line 184 in "output.c"
        [4] do_open() at line 548 in "output.c"
        [5] opal_output_open() at line 219 in "output.c"
        [6] opal_malloc_init() at line 68 in "malloc.c"
        [7] opal_init_util() at line 258 in "opal_init.c"
        [8] opal_init() at line 363 in "opal_init.c"

t@1 (l@1) stopped in do_open at line 638 in file "output.c"
  638           info[i].ldi_prefix = strdup(lds->lds_prefix);
(dbx) print lds
lds = 0xfffffd7fe93d1b60
(dbx) print i
i = 0
(dbx) print info[0].ldi_prefix
info[0].ldi_prefix = (nil)
(dbx) print lds->lds_verbose_level
lds->lds_verbose_level = 0
(dbx) print lds->lds_syslog_priority
lds->lds_syslog_priority = 0
(dbx) print lds->lds_syslog_ident
lds->lds_syslog_ident = (nil)
(dbx) print lds->lds_prefix
lds->lds_prefix = 0x437378 "[sunpc1:04090] "
(dbx) print lds->lds_suffix
lds->lds_suffix = (nil)
(dbx) print lds->lds_is_debugging
lds->lds_is_debugging = 0
(dbx) print lds->lds_want_syslog
lds->lds_want_syslog = 0
(dbx) print lds->lds_want_stdout
lds->lds_want_stdout = 0
(dbx) print lds->lds_want_stderr
lds->lds_want_stderr = 1U
(dbx) print lds->lds_want_file
lds->lds_want_file = 0
(dbx) print lds->lds_want_file_append
lds->lds_want_file_append = 0
(dbx) print lds->lds_file_suffix
lds->lds_file_suffix = (nil)
(dbx)


Is the above information helpful to track down the error? Do you need
anything else? Thank you very much for any help in advance.


Kind regards

Siegmar



> On Jul 25, 2014, at 2:29 AM, Siegmar Gross
> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>
> > Hi,
> >
> > I have installed openmpi-1.8.2rc2 with Sun c 5.12 on Solaris
> > 10 Sparc and x86_64 and I receive a segmentation fault, if I
> > run a small Java program.
> >
> > rs0 java 105 mpiexec -np 1 java InitFinalizeMain
> > #
> > # A fatal error has been detected by the Java Runtime Environment:
> > #
> > #  SIGSEGV (0xb) at pc=0xffffffff7ea3c830, pid=18363, tid=2
> > #
> > # JRE version: Java(TM) SE Runtime Environment (8.0-b132) (build 1.8.0-b132)
> > # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b70 mixed mode solaris-sparc compressed oops)
> > # Problematic frame:
> > # C  [libc.so.1+0x3c830]  strlen+0x50
> > ...
> >
> >
> > I get the following output if I run the program in "dbx".
> >
> > ...
> > RTC: Running program...
> > Write to unallocated (wua) on thread 1:
> > Attempting to write 1 byte at address 0xffffffff79f04000
> > t@1 (l@1) stopped in _readdir at 0xffffffff56574da0
> > 0xffffffff56574da0: _readdir+0x0064:    call     _PROCEDURE_LINKAGE_TABLE_+0x2380 [PLT] ! 0xffffffff56742a80
> > Current function is find_dyn_components
> >   397           if (0 != lt_dlforeachfile(dir, save_filename, NULL)) {
> > (dbx)
> >
> >
> > I get the following output if I run the program on Solaris 10
> > x86_64.
> >
> > ...
> > RTC: Running program...
> > Reading disasm.so
> > Read from uninitialized (rui) on thread 1:
> > Attempting to read 1 byte at address 0x437387
> >     which is 15 bytes into a heap block of size 16 bytes at 0x437378
> > This block was allocated from:
> >         [1] vasprintf() at 0xfffffd7fdc9b335a
> >         [2] asprintf() at 0xfffffd7fdc9b3452
> >         [3] opal_output_init() at line 184 in "output.c"
> >         [4] do_open() at line 548 in "output.c"
> >         [5] opal_output_open() at line 219 in "output.c"
> >         [6] opal_malloc_init() at line 68 in "malloc.c"
> >         [7] opal_init_util() at line 258 in "opal_init.c"
> >         [8] opal_init() at line 363 in "opal_init.c"
> >
> > t@1 (l@1) stopped in do_open at line 638 in file "output.c"
> >   638           info[i].ldi_prefix = strdup(lds->lds_prefix);
> > (dbx)
> >
> >
> > Hopefully the above output helps to fix the errors. Can I provide
> > anything else? Thank you very much for any help in advance.
> >
> >
> > Kind regards
> >
> > Siegmar
> >
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> > Link to this post:
> > http://www.open-mpi.org/community/lists/users/2014/07/24870.php
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
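
P.S. To make the numbers in the dbx report easier to read, here is a minimal standalone sketch of the allocate-then-duplicate pattern that the trace points at. It is not the code from output.c. The helper names make_prefix and copy_prefix are hypothetical, and snprintf plus malloc stand in for asprintf and strdup so that the sketch stays within standard C. The prefix "[sunpc1:04090] " printed above is 15 characters long, so a 16-byte block would hold it plus the terminating NUL, and offset 15 (0x437387 - 0x437378) would be that terminator. Whether the report is a real defect or an artifact of the checker is not established here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from Open MPI): builds a "[host:pid] " prefix
 * on the heap, similar in spirit to what an asprintf() call produces. */
static char *make_prefix(const char *host, long pid)
{
    /* First pass only measures the formatted length; nothing is written. */
    int len = snprintf(NULL, 0, "[%s:%05ld] ", host, pid);
    if (len < 0) {
        return NULL;
    }

    char *buf = malloc((size_t)len + 1);   /* +1 for the terminating NUL */
    if (buf == NULL) {
        return NULL;
    }

    /* Second pass writes the characters and the terminating NUL. */
    snprintf(buf, (size_t)len + 1, "[%s:%05ld] ", host, pid);
    return buf;
}

/* Hypothetical helper: duplicates a string the way a strdup() call would.
 * strlen() reads every byte up to and including the NUL terminator, which
 * is the read at offset 15 for a 15-character prefix. */
static char *copy_prefix(const char *prefix)
{
    size_t n = strlen(prefix) + 1;
    char *copy = malloc(n);
    if (copy != NULL) {
        memcpy(copy, prefix, n);
    }
    return copy;
}
```

Calling make_prefix("sunpc1", 4090) and passing the result to copy_prefix reproduces the sizes from the report: a 16-byte allocation whose last byte is the terminator that the duplicate step has to read. Both returned pointers should be released with free().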