Another thing to try is to load up the core file in gdb and see if that gives you a valid stack trace of where exactly the segv occurred.
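
A rough sketch of that suggestion, not something taken from this thread: the helper below builds a non-interactive gdb invocation that prints a backtrace for every thread in a core file. The function name `gdb_backtrace_argv` and the example core path `core.8919` are hypothetical. It also assumes core dumps were enabled (for example with `ulimit -c unlimited`) before the failing mpirun, and that MOSIX actually leaves a core file behind.

```python
import shutil
import subprocess
import sys


def gdb_backtrace_argv(executable, core_file):
    """Build a gdb command line that prints all thread backtraces and exits.

    Hypothetical helper; --batch and -ex are standard gdb options.
    """
    return [
        "gdb",
        "--batch",                     # run the commands below, then exit
        "-ex", "thread apply all bt",  # backtrace of every thread
        executable,
        core_file,
    ]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example: python core_bt.py ./ft.S.4 core.8919  (both paths are assumptions)
    argv = gdb_backtrace_argv(sys.argv[1], sys.argv[2])
    if shutil.which("gdb") is None:
        sys.exit("gdb not found on PATH")
    subprocess.run(argv, check=False)
```

If the trace shows only raw addresses, the benchmark or the BTL component was probably built without debug symbols (`-g`), so rebuilding with them may be needed before the trace is useful.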
On Apr 25, 2012, at 9:30 AM, Alex Margolin wrote:

> On 04/25/2012 02:57 PM, Ralph Castain wrote:
>> Strange that your code didn't generate any symbols - is that a mosix thing?
>> Have you tried just adding opal_output (so it goes to a special diagnostic
>> output channel) statements in your code to see where the segfault is
>> occurring?
>>
>> It looks like you are getting thru orte_init. You could add -mca
>> grpcomm_base_verbose 5 to see if you are getting in/thru the modex - if so,
>> then you are probably failing in add_procs.
>>
> I guess the symbols are a mosix thing, but it should still show some sort of
> segmentation fault trace, no? maybe only the assembly opcode... It seems that
> the SEGV is detected, rather then caught. This may also be related to mosix -
> I'll check it with the mosix developer.
>
> I added the parameter you suggested and appended the output. Modex seems to
> be working because I use it to exchange the IP and PID, and as you can see at
> the bottom these are received OK. I'll try debug printouts specifically in
> add_procs. Thanks for the advice!
>
> alex@singularity:~/huji/benchmarks/mpi/npb$ mpirun -mca grpcomm_base_verbose
> 5 -mca btl self,mosix -mca btl_base_verbose 100 -n 4 ft.S.4
> [singularity:08915] mca:base:select:(grpcomm) Querying component [bad]
> [singularity:08915] mca:base:select:(grpcomm) Query of component [bad] set
> priority to 10
> [singularity:08915] mca:base:select:(grpcomm) Selected component [bad]
> [singularity:08915] [[37778,0],0] grpcomm:base:receive start comm
> [singularity:08915] [[37778,0],0] grpcomm:bad:xcast sent to job [37778,0] tag
> 1
> [singularity:08915] [[37778,0],0] grpcomm:xcast:recv:send_relay
> [singularity:08915] [[37778,0],0] grpcomm:base:xcast updating nidmap
> [singularity:08915] [[37778,0],0] orte:daemon:send_relay - recipient list is
> empty!
> [singularity:08916] mca:base:select:(grpcomm) Querying component [bad] > [singularity:08916] mca:base:select:(grpcomm) Query of component [bad] set > priority to 10 > [singularity:08916] mca:base:select:(grpcomm) Selected component [bad] > [singularity:08916] [[37778,1],0] grpcomm:base:receive start comm > [singularity:08919] mca:base:select:(grpcomm) Querying component [bad] > [singularity:08919] mca:base:select:(grpcomm) Query of component [bad] set > priority to 10 > [singularity:08919] mca:base:select:(grpcomm) Selected component [bad] > [singularity:08919] [[37778,1],2] grpcomm:base:receive start comm > [singularity:08917] mca:base:select:(grpcomm) Querying component [bad] > [singularity:08917] mca:base:select:(grpcomm) Query of component [bad] set > priority to 10 > [singularity:08917] mca:base:select:(grpcomm) Selected component [bad] > [singularity:08917] [[37778,1],1] grpcomm:base:receive start comm > [singularity:08921] mca:base:select:(grpcomm) Querying component [bad] > [singularity:08921] mca:base:select:(grpcomm) Query of component [bad] set > priority to 10 > [singularity:08921] mca:base:select:(grpcomm) Selected component [bad] > [singularity:08921] [[37778,1],3] grpcomm:base:receive start comm > [singularity:08916] [[37778,1],0] grpcomm:set_proc_attr: setting attribute > MPI_THREAD_LEVEL data size 1 > [singularity:08916] [[37778,1],0] grpcomm:set_proc_attr: setting attribute > OMPI_ARCH data size 11 > [singularity:08919] [[37778,1],2] grpcomm:set_proc_attr: setting attribute > MPI_THREAD_LEVEL data size 1 > [singularity:08919] [[37778,1],2] grpcomm:set_proc_attr: setting attribute > OMPI_ARCH data size 11 > [singularity:08917] [[37778,1],1] grpcomm:set_proc_attr: setting attribute > MPI_THREAD_LEVEL data size 1 > [singularity:08917] [[37778,1],1] grpcomm:set_proc_attr: setting attribute > OMPI_ARCH data size 11 > [singularity:08921] [[37778,1],3] grpcomm:set_proc_attr: setting attribute > MPI_THREAD_LEVEL data size 1 > [singularity:08921] 
[[37778,1],3] grpcomm:set_proc_attr: setting attribute > OMPI_ARCH data size 11 > [singularity:08916] mca: base: components_open: Looking for btl components > [singularity:08916] mca: base: components_open: opening btl components > [singularity:08916] mca: base: components_open: found loaded component mosix > [singularity:08916] mca: base: components_open: component mosix register > function successful > [singularity:08916] mca: base: components_open: component mosix open function > successful > [singularity:08916] mca: base: components_open: found loaded component self > [singularity:08916] mca: base: components_open: component self has no > register function > [singularity:08916] mca: base: components_open: component self open function > successful > [singularity:08919] mca: base: components_open: Looking for btl components > [singularity:08917] mca: base: components_open: Looking for btl components > [singularity:08919] mca: base: components_open: opening btl components > [singularity:08919] mca: base: components_open: found loaded component mosix > [singularity:08919] mca: base: components_open: component mosix register > function successful > [singularity:08919] mca: base: components_open: component mosix open function > successful > [singularity:08919] mca: base: components_open: found loaded component self > [singularity:08919] mca: base: components_open: component self has no > register function > [singularity:08919] mca: base: components_open: component self open function > successful > [singularity:08921] mca: base: components_open: Looking for btl components > [singularity:08917] mca: base: components_open: opening btl components > [singularity:08917] mca: base: components_open: found loaded component mosix > [singularity:08917] mca: base: components_open: component mosix register > function successful > [singularity:08917] mca: base: components_open: component mosix open function > successful > [singularity:08917] mca: base: components_open: found 
loaded component self > [singularity:08917] mca: base: components_open: component self has no > register function > [singularity:08917] mca: base: components_open: component self open function > successful > [singularity:08921] mca: base: components_open: opening btl components > [singularity:08921] mca: base: components_open: found loaded component mosix > [singularity:08921] mca: base: components_open: component mosix register > function successful > [singularity:08921] mca: base: components_open: component mosix open function > successful > [singularity:08921] mca: base: components_open: found loaded component self > [singularity:08921] mca: base: components_open: component self has no > register function > [singularity:08921] mca: base: components_open: component self open function > successful > [singularity:08916] select: initializing btl component mosix > [singularity:08916] [[37778,1],0] grpcomm:set_proc_attr: setting attribute > btl.mosix.1.7 data size 20 > [singularity:08919] select: initializing btl component mosix > [singularity:08916] select: init of component mosix returned success > [singularity:08916] select: initializing btl component self > [singularity:08916] select: init of component self returned success > [singularity:08916] [[37778,1],0] grpcomm:base:modex: performing modex > [singularity:08916] [[37778,1],0] grpcomm:base:pack_modex: reporting 3 entries > [singularity:08916] [[37778,1],0] grpcomm:base:full:modex: executing allgather > [singularity:08916] [[37778,1],0] grpcomm:bad entering allgather > [singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],0] > [singularity:08915] [[37778,0],0] WORKING COLLECTIVE 0 > [singularity:08915] [[37778,0],0] ADDING [[37778,1],WILDCARD] TO PARTICIPANTS > [singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 0 > [singularity:08915] [[37778,0],0] PROGRESSING COLL id 0 > [singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4 > [singularity:08916] [[37778,1],0] grpcomm:bad allgather 
underway > [singularity:08916] [[37778,1],0] grpcomm:base:modex: modex posted > [singularity:08919] [[37778,1],2] grpcomm:set_proc_attr: setting attribute > btl.mosix.1.7 data size 20 > [singularity:08917] select: initializing btl component mosix > [singularity:08917] [[37778,1],1] grpcomm:set_proc_attr: setting attribute > btl.mosix.1.7 data size 20 > [singularity:08921] select: initializing btl component mosix > [singularity:08921] [[37778,1],3] grpcomm:set_proc_attr: setting attribute > btl.mosix.1.7 data size 20 > [singularity:08919] select: init of component mosix returned success > [singularity:08919] select: initializing btl component self > [singularity:08919] select: init of component self returned success > [singularity:08919] [[37778,1],2] grpcomm:base:modex: performing modex > [singularity:08919] [[37778,1],2] grpcomm:base:pack_modex: reporting 3 entries > [singularity:08919] [[37778,1],2] grpcomm:base:full:modex: executing allgather > [singularity:08919] [[37778,1],2] grpcomm:bad entering allgather > [singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],2] > [singularity:08915] [[37778,0],0] WORKING COLLECTIVE 0 > [singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 0 > [singularity:08915] [[37778,0],0] PROGRESSING COLL id 0 > [singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4 > [singularity:08919] [[37778,1],2] grpcomm:bad allgather underway > [singularity:08919] [[37778,1],2] grpcomm:base:modex: modex posted > [singularity:08917] select: init of component mosix returned success > [singularity:08917] select: initializing btl component self > [singularity:08917] select: init of component self returned success > [singularity:08917] [[37778,1],1] grpcomm:base:modex: performing modex > [singularity:08917] [[37778,1],1] grpcomm:base:pack_modex: reporting 3 entries > [singularity:08917] [[37778,1],1] grpcomm:base:full:modex: executing allgather > [singularity:08917] [[37778,1],1] grpcomm:bad entering allgather > 
[singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],1] > [singularity:08915] [[37778,0],0] WORKING COLLECTIVE 0 > [singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 0 > [singularity:08915] [[37778,0],0] PROGRESSING COLL id 0 > [singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4 > [singularity:08917] [[37778,1],1] grpcomm:bad allgather underway > [singularity:08917] [[37778,1],1] grpcomm:base:modex: modex posted > [singularity:08921] select: init of component mosix returned success > [singularity:08921] select: initializing btl component self > [singularity:08921] select: init of component self returned success > [singularity:08921] [[37778,1],3] grpcomm:base:modex: performing modex > [singularity:08921] [[37778,1],3] grpcomm:base:pack_modex: reporting 3 entries > [singularity:08921] [[37778,1],3] grpcomm:base:full:modex: executing allgather > [singularity:08921] [[37778,1],3] grpcomm:bad entering allgather > [singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],3] > [singularity:08915] [[37778,0],0] WORKING COLLECTIVE 0 > [singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 0 > [singularity:08915] [[37778,0],0] PROGRESSING COLL id 0 > [singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4 > [singularity:08915] [[37778,0],0] COLLECTIVE 0 LOCALLY COMPLETE - SENDING TO > GLOBAL COLLECTIVE > [singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: daemon collective > recvd from [[37778,0],0] > [singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: WORKING > COLLECTIVE 0 > [singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: NUM CONTRIBS: 4 > [singularity:08915] [[37778,0],0] grpcomm:bad:xcast sent to job [37778,1] tag > 30 > [singularity:08915] [[37778,0],0] grpcomm:xcast:recv:send_relay > [singularity:08915] [[37778,0],0] orte:daemon:send_relay - recipient list is > empty! 
> [singularity:08921] [[37778,1],3] grpcomm:bad allgather underway > [singularity:08921] [[37778,1],3] grpcomm:base:modex: modex posted > [singularity:08921] [[37778,1],3] grpcomm:base:receive processing collective > return for id 0 > [singularity:08921] [[37778,1],3] CHECKING COLL id 0 > [singularity:08921] [[37778,1],3] STORING MODEX DATA > [singularity:08921] [[37778,1],3] grpcomm:base:store_modex adding modex entry > for proc [[37778,1],0] > [singularity:08921] [[37778,1],3] grpcomm:base:update_modex_entries: adding 3 > entries for proc [[37778,1],0] > [singularity:08921] [[37778,1],3] grpcomm:base:store_modex adding modex entry > for proc [[37778,1],2] > [singularity:08917] [[37778,1],1] grpcomm:base:receive processing collective > return for id 0 > [singularity:08916] [[37778,1],0] grpcomm:base:receive processing collective > return for id 0 > [singularity:08916] [[37778,1],0] CHECKING COLL id 0 > [singularity:08917] [[37778,1],1] CHECKING COLL id 0 > [singularity:08916] [[37778,1],0] STORING MODEX DATA > [singularity:08917] [[37778,1],1] STORING MODEX DATA > [singularity:08921] [[37778,1],3] grpcomm:base:update_modex_entries: adding 3 > entries for proc [[37778,1],2] > [singularity:08916] [[37778,1],0] grpcomm:base:store_modex adding modex entry > for proc [[37778,1],0] > [singularity:08917] [[37778,1],1] grpcomm:base:store_modex adding modex entry > for proc [[37778,1],0] > [singularity:08921] [[37778,1],3] grpcomm:base:store_modex adding modex entry > for proc [[37778,1],1] > [singularity:08916] [[37778,1],0] grpcomm:base:update_modex_entries: adding 3 > entries for proc [[37778,1],0] > [singularity:08917] [[37778,1],1] grpcomm:base:update_modex_entries: adding 3 > entries for proc [[37778,1],0] > [singularity:08921] [[37778,1],3] grpcomm:base:update_modex_entries: adding 3 > entries for proc [[37778,1],1] > [singularity:08916] [[37778,1],0] grpcomm:base:store_modex adding modex entry > for proc [[37778,1],2] > [singularity:08917] [[37778,1],1] 
grpcomm:base:store_modex adding modex entry > for proc [[37778,1],2] > [singularity:08917] [[37778,1],1] grpcomm:base:update_modex_entries: adding 3 > entries for proc [[37778,1],2] > [singularity:08916] [[37778,1],0] grpcomm:base:update_modex_entries: adding 3 > entries for proc [[37778,1],2] > [singularity:08917] [[37778,1],1] grpcomm:base:store_modex adding modex entry > for proc [[37778,1],1] > [singularity:08916] [[37778,1],0] grpcomm:base:store_modex adding modex entry > for proc [[37778,1],1] > [singularity:08917] [[37778,1],1] grpcomm:base:update_modex_entries: adding 3 > entries for proc [[37778,1],1] > [singularity:08916] [[37778,1],0] grpcomm:base:update_modex_entries: adding 3 > entries for proc [[37778,1],1] > [singularity:08917] [[37778,1],1] grpcomm:base:store_modex adding modex entry > for proc [[37778,1],3] > [singularity:08916] [[37778,1],0] grpcomm:base:store_modex adding modex entry > for proc [[37778,1],3] > [singularity:08917] [[37778,1],1] grpcomm:base:update_modex_entries: adding 3 > entries for proc [[37778,1],3] > [singularity:08916] [[37778,1],0] grpcomm:base:update_modex_entries: adding 3 > entries for proc [[37778,1],3] > [singularity:08921] [[37778,1],3] grpcomm:base:store_modex adding modex entry > for proc [[37778,1],3] > [singularity:08921] [[37778,1],3] grpcomm:base:update_modex_entries: adding 3 > entries for proc [[37778,1],3] > [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr > OMPI_ARCH on proc [[37778,1],0] > [singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],3] > [singularity:08915] [[37778,0],0] WORKING COLLECTIVE 1 > [singularity:08915] [[37778,0],0] ADDING [[37778,1],WILDCARD] TO PARTICIPANTS > [singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 1 > [singularity:08915] [[37778,0],0] PROGRESSING COLL id 1 > [singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4 > [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr > OMPI_ARCH on proc 
[[37778,1],0] > [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 11 bytes for > attr OMPI_ARCH on proc [[37778,1],0] > [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr > OMPI_ARCH on proc [[37778,1],1] > [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 11 bytes for > attr OMPI_ARCH on proc [[37778,1],1] > [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr > OMPI_ARCH on proc [[37778,1],2] > [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 11 bytes for > attr OMPI_ARCH on proc [[37778,1],2] > [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr > btl.mosix.1.7 on proc [[37778,1],0] > [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 20 bytes for > attr btl.mosix.1.7 on proc [[37778,1],0] > [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr > btl.mosix.1.7 on proc [[37778,1],1] > [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 20 bytes for > attr btl.mosix.1.7 on proc [[37778,1],1] > [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr > btl.mosix.1.7 on proc [[37778,1],2] > [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 20 bytes for > attr btl.mosix.1.7 on proc [[37778,1],2] > [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr > btl.mosix.1.7 on proc [[37778,1],3] > [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 20 bytes for > attr btl.mosix.1.7 on proc [[37778,1],3] > [singularity:08921] [[37778,1],3] grpcomm:bad entering barrier > [singularity:08921] [[37778,1],3] grpcomm:bad barrier underway > [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr > OMPI_ARCH on proc [[37778,1],1] > [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 11 bytes for > attr OMPI_ARCH on proc [[37778,1],1] > [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr > OMPI_ARCH on proc 
[[37778,1],2] > [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 11 bytes for > attr OMPI_ARCH on proc [[37778,1],2] > [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr > OMPI_ARCH on proc [[37778,1],3] > [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 11 bytes for > attr OMPI_ARCH on proc [[37778,1],3] > [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr > btl.mosix.1.7 on proc [[37778,1],0] > [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 20 bytes for > attr btl.mosix.1.7 on proc [[37778,1],0] > [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr > btl.mosix.1.7 on proc [[37778,1],1] > [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 20 bytes for > attr btl.mosix.1.7 on proc [[37778,1],1] > [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr > btl.mosix.1.7 on proc [[37778,1],2] > [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 11 bytes for > attr OMPI_ARCH on proc [[37778,1],0] > [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr > OMPI_ARCH on proc [[37778,1],2] > [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 11 bytes for > attr OMPI_ARCH on proc [[37778,1],2] > [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr > OMPI_ARCH on proc [[37778,1],3] > [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 11 bytes for > attr OMPI_ARCH on proc [[37778,1],3] > [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr > btl.mosix.1.7 on proc [[37778,1],0] > [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 20 bytes for > attr btl.mosix.1.7 on proc [[37778,1],0] > [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr > btl.mosix.1.7 on proc [[37778,1],1] > [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 20 bytes for > attr btl.mosix.1.7 on proc [[37778,1],1] 
> [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr > btl.mosix.1.7 on proc [[37778,1],2] > [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 20 bytes for > attr btl.mosix.1.7 on proc [[37778,1],2] > [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr > btl.mosix.1.7 on proc [[37778,1],3] > [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 20 bytes for > attr btl.mosix.1.7 on proc [[37778,1],3] > [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 20 bytes for > attr btl.mosix.1.7 on proc [[37778,1],2] > [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr > btl.mosix.1.7 on proc [[37778,1],3] > [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 20 bytes for > attr btl.mosix.1.7 on proc [[37778,1],3] > [singularity:08916] [[37778,1],0] grpcomm:bad entering barrier > [singularity:08917] [[37778,1],1] grpcomm:bad entering barrier > [singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],0] > [singularity:08915] [[37778,0],0] WORKING COLLECTIVE 1 > [singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 1 > [singularity:08915] [[37778,0],0] PROGRESSING COLL id 1 > [singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4 > [singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],1] > [singularity:08915] [[37778,0],0] WORKING COLLECTIVE 1 > [singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 1 > [singularity:08915] [[37778,0],0] PROGRESSING COLL id 1 > [singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4 > [singularity:08917] [[37778,1],1] grpcomm:bad barrier underway > [singularity:08916] [[37778,1],0] grpcomm:bad barrier underway > [singularity:08919] [[37778,1],2] grpcomm:base:receive processing collective > return for id 0 > [singularity:08919] [[37778,1],2] CHECKING COLL id 0 > [singularity:08919] [[37778,1],2] STORING MODEX DATA > [singularity:08919] [[37778,1],2] grpcomm:base:store_modex adding modex 
entry > for proc [[37778,1],0] > [singularity:08919] [[37778,1],2] grpcomm:base:update_modex_entries: adding 3 > entries for proc [[37778,1],0] > [singularity:08919] [[37778,1],2] grpcomm:base:store_modex adding modex entry > for proc [[37778,1],2] > [singularity:08919] [[37778,1],2] grpcomm:base:update_modex_entries: adding 3 > entries for proc [[37778,1],2] > [singularity:08919] [[37778,1],2] grpcomm:base:store_modex adding modex entry > for proc [[37778,1],1] > [singularity:08919] [[37778,1],2] grpcomm:base:update_modex_entries: adding 3 > entries for proc [[37778,1],1] > [singularity:08919] [[37778,1],2] grpcomm:base:store_modex adding modex entry > for proc [[37778,1],3] > [singularity:08919] [[37778,1],2] grpcomm:base:update_modex_entries: adding 3 > entries for proc [[37778,1],3] > [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr > OMPI_ARCH on proc [[37778,1],0] > [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 11 bytes for > attr OMPI_ARCH on proc [[37778,1],0] > [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr > OMPI_ARCH on proc [[37778,1],1] > [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 11 bytes for > attr OMPI_ARCH on proc [[37778,1],1] > [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr > OMPI_ARCH on proc [[37778,1],3] > [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 11 bytes for > attr OMPI_ARCH on proc [[37778,1],3] > [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr > btl.mosix.1.7 on proc [[37778,1],0] > [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 20 bytes for > attr btl.mosix.1.7 on proc [[37778,1],0] > [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr > btl.mosix.1.7 on proc [[37778,1],1] > [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 20 bytes for > attr btl.mosix.1.7 on proc [[37778,1],1] > [singularity:08919] [[37778,1],2] 
grpcomm:get_proc_attr: searching for attr > btl.mosix.1.7 on proc [[37778,1],2] > [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 20 bytes for > attr btl.mosix.1.7 on proc [[37778,1],2] > [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr > btl.mosix.1.7 on proc [[37778,1],3] > [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 20 bytes for > attr btl.mosix.1.7 on proc [[37778,1],3] > [singularity:08919] [[37778,1],2] grpcomm:bad entering barrier > [singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],2] > [singularity:08915] [[37778,0],0] WORKING COLLECTIVE 1 > [singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 1 > [singularity:08915] [[37778,0],0] PROGRESSING COLL id 1 > [singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4 > [singularity:08915] [[37778,0],0] COLLECTIVE 1 LOCALLY COMPLETE - SENDING TO > GLOBAL COLLECTIVE > [singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: daemon collective > recvd from [[37778,0],0] > [singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: WORKING > COLLECTIVE 1 > [singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: NUM CONTRIBS: 4 > [singularity:08915] [[37778,0],0] grpcomm:bad:xcast sent to job [37778,1] tag > 30 > [singularity:08915] [[37778,0],0] grpcomm:xcast:recv:send_relay > [singularity:08915] [[37778,0],0] orte:daemon:send_relay - recipient list is > empty! 
> [singularity:08919] [[37778,1],2] grpcomm:bad barrier underway
> [singularity:08916] [[37778,1],0] grpcomm:base:receive processing collective
> return for id 1
> [singularity:08916] [[37778,1],0] CHECKING COLL id 1
> [singularity:08917] [[37778,1],1] grpcomm:base:receive processing collective
> return for id 1
> [singularity:08921] [[37778,1],3] grpcomm:base:receive processing collective
> return for id 1
> [singularity:08921] [[37778,1],3] CHECKING COLL id 1
> [singularity:08917] [[37778,1],1] CHECKING COLL id 1
> [singularity:08919] [[37778,1],2] grpcomm:base:receive processing collective
> return for id 1
> [singularity:08919] [[37778,1],2] CHECKING COLL id 1
> [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr
> MPI_THREAD_LEVEL on proc [[37778,1],0]
> [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 1 bytes for
> attr MPI_THREAD_LEVEL on proc [[37778,1],0]
> [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr
> MPI_THREAD_LEVEL on proc [[37778,1],1]
> [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 1 bytes for
> attr MPI_THREAD_LEVEL on proc [[37778,1],1]
> [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr
> MPI_THREAD_LEVEL on proc [[37778,1],2]
> [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 1 bytes for
> attr MPI_THREAD_LEVEL on proc [[37778,1],2]
> [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr
> MPI_THREAD_LEVEL on proc [[37778,1],3]
> [singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 1 bytes for
> attr MPI_THREAD_LEVEL on proc [[37778,1],3]
> [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr
> MPI_THREAD_LEVEL on proc [[37778,1],0]
> [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 1 bytes for
> attr MPI_THREAD_LEVEL on proc [[37778,1],0]
> [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr
> MPI_THREAD_LEVEL on proc [[37778,1],1]
> [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 1 bytes for
> attr MPI_THREAD_LEVEL on proc [[37778,1],1]
> [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr
> MPI_THREAD_LEVEL on proc [[37778,1],2]
> [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 1 bytes for
> attr MPI_THREAD_LEVEL on proc [[37778,1],2]
> [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr
> MPI_THREAD_LEVEL on proc [[37778,1],3]
> [singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 1 bytes for
> attr MPI_THREAD_LEVEL on proc [[37778,1],3]
> [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr
> MPI_THREAD_LEVEL on proc [[37778,1],0]
> [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr
> MPI_THREAD_LEVEL on proc [[37778,1],0]
> [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 1 bytes for
> attr MPI_THREAD_LEVEL on proc [[37778,1],0]
> [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr
> MPI_THREAD_LEVEL on proc [[37778,1],1]
> [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 1 bytes for
> attr MPI_THREAD_LEVEL on proc [[37778,1],1]
> [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr
> MPI_THREAD_LEVEL on proc [[37778,1],2]
> [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 1 bytes for
> attr MPI_THREAD_LEVEL on proc [[37778,1],2]
> [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr
> MPI_THREAD_LEVEL on proc [[37778,1],3]
> [singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 1 bytes for
> attr MPI_THREAD_LEVEL on proc [[37778,1],3]
> [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 1 bytes for
> attr MPI_THREAD_LEVEL on proc [[37778,1],0]
> [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr
> MPI_THREAD_LEVEL on proc [[37778,1],1]
> [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 1 bytes for
> attr MPI_THREAD_LEVEL on proc [[37778,1],1]
> [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr
> MPI_THREAD_LEVEL on proc [[37778,1],2]
> [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 1 bytes for
> attr MPI_THREAD_LEVEL on proc [[37778,1],2]
> [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr
> MPI_THREAD_LEVEL on proc [[37778,1],3]
> [singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 1 bytes for
> attr MPI_THREAD_LEVEL on proc [[37778,1],3]
>
>
> NAS Parallel Benchmarks 3.3 -- FT Benchmark
>
> No input file inputft.data. Using compiled defaults
> Size : 64x 64x 64
> Iterations : 6
> Number of processes : 4
> Processor array : 1x 4
> Layout type : 1D
> [singularity:08916] btl: mosix: Establishind TCP link to address 127.0.0.1
> and PID #8917
> [singularity:08917] btl: mosix: Establishind TCP link to address 127.0.0.1
> and PID #8921
> [singularity:08916] btl: mosix: Establishind TCP link to address 127.0.0.1
> and PID #8919
> [singularity:08919] btl: mosix: Establishind TCP link to address 127.0.0.1
> and PID #8921
> [singularity:08921] btl: mosix: Establishind TCP link to address 127.0.0.1
> and PID #8919
> [singularity:08917] btl: mosix: Establishind TCP link to address 127.0.0.1
> and PID #8916
> [singularity:08921] btl: mosix: Establishind TCP link to address 127.0.0.1
> and PID #8917
> [singularity:08915] [[37778,0],0] grpcomm:bad:xcast sent to job [37778,0] tag
> 1
> [singularity:08915] [[37778,0],0] grpcomm:xcast:recv:send_relay
> [singularity:08915] [[37778,0],0] orte:daemon:send_relay - recipient list is
> empty!
> --------------------------------------------------------------------------
> mpirun noticed that process rank 2 with PID 8919 on node singularity exited
> on signal 11 (Segmentation fault).
> --------------------------------------------------------------------------
> [singularity:08915] [[37778,0],0] grpcomm:bad:xcast sent to job [37778,0] tag
> 1
> [singularity:08915] [[37778,0],0] grpcomm:xcast:recv:send_relay
> [singularity:08915] [[37778,0],0] orte:daemon:send_relay - recipient list is
> empty!
> alex@singularity:~/huji/benchmarks/mpi/npb$

--
Jeff Squyres
jsquy...@cisco.com