Re: [OMPI users] MPI_Comm_spawn error messages
Thank you for the diagnosis.

Saadat.

On 7/6/06, Ralph Castain wrote:
> Hi Saadat
>
> That's the problem, then - you need to run comm_spawn applications using
> mpirun, I'm afraid. We plan to fix this in the near future, but for now we
> can only offer that workaround.
>
> Ralph

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
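As a rough illustration of the workaround described in this message, the sketch below builds the `mpirun -np 1 ./foobar` command line in Python and can optionally run it through `subprocess`. The helper names `build_launch_command` and `launch` are hypothetical and not part of Open MPI; `./foobar` is the executable name mentioned in the thread, and the sketch assumes `mpirun` is on the PATH.

```python
import shlex
import subprocess


def build_launch_command(executable, nprocs=1, launcher="mpirun"):
    """Return the argv list that starts `executable` under the launcher.

    Hypothetical helper for illustration; not part of Open MPI.
    """
    if nprocs < 1:
        raise ValueError("nprocs must be at least 1")
    return [launcher, "-np", str(nprocs), executable]


def launch(executable, nprocs=1):
    """Run the program under mpirun and return its exit status."""
    # Raises FileNotFoundError if mpirun is not installed or not on PATH.
    return subprocess.run(build_launch_command(executable, nprocs)).returncode


if __name__ == "__main__":
    # Singleton start (the failing case in this thread): ./foobar
    # Workaround from the thread: start the same binary under mpirun.
    print(shlex.join(build_launch_command("./foobar")))
```

The parent is still started as a single process; the only change is that `mpirun` starts it, so that `MPI_Comm_spawn` is no longer called from a singleton.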
Re: [OMPI users] MPI_Comm_spawn error messages
Ralph:

I am running the application without mpirun, i.e. ./foobar. So, according to your definition of singleton above, I am calling comm_spawn from a singleton.

Thanks.
Saadat.

On 7/6/06, Ralph Castain wrote:
> Thanks Saadat
>
> Could you clarify how you are running this application? We have a known
> problem with comm_spawn from a singleton (i.e., if you just did a.out
> instead of mpirun -np 1 a.out) - the errors look somewhat like what you are
> showing here, hence our curiosity.
>
> Thanks
> Ralph
Re: [OMPI users] MPI_Comm_spawn error messages
Ralph:

I am using Fedora Core 4 (Linux turkana 2.6.12-1.1390_FC4smp #1 SMP Tue Jul 5 20:21:11 EDT 2005 i686 athlon i386 GNU/Linux). The machine is a dual-processor Athlon-based machine. No cluster resource manager, just an rsh/ssh-based setup.

Thanks.
Saadat.
Re: [OMPI users] MPI_Comm_spawn error messages
Hi Saadat

Could you tell us something more about the system you are using? What type of processors, operating system, any resource manager (e.g., SLURM, PBS), etc.?

Thanks
Ralph
[OMPI users] MPI_Comm_spawn error messages
Good Day:

I am getting the following error messages every time I run a very simple program that spawns child processes:

[turkana:27949] [0,0,0] ORTE_ERROR_LOG: Not found in file base/soh_base_get_proc_soh.c at line 80
[turkana:27949] [0,0,0] ORTE_ERROR_LOG: Not found in file base/oob_base_xcast.c at line 108
[turkana:27949] [0,0,0] ORTE_ERROR_LOG: Not found in file base/rmgr_base_stage_gate.c at line 276
[turkana:27949] [0,0,0] ORTE_ERROR_LOG: Not found in file base/soh_base_get_proc_soh.c at line 80
[turkana:27949] [0,0,0] ORTE_ERROR_LOG: Not found in file base/oob_base_xcast.c at line 108
[turkana:27949] [0,0,0] ORTE_ERROR_LOG: Not found in file base/rmgr_base_stage_gate.c at line 276

These errors are being generated by the master process. Does anybody know what they mean?

Also, if I spawn four child processes, not all of them run to completion, i.e., until MPI_Finalize.

Saadat.
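One way to read the log in this post, sketched in Python: group the ORTE_ERROR_LOG lines by source file and line number to see how many distinct error sites are being reported. The regular expression is inferred only from the six lines quoted here and may not match other Open MPI output; `summarize_orte_errors` is a hypothetical helper, not an Open MPI tool.

```python
import re
from collections import Counter

# Matches lines shaped like the ones in this post, e.g.
# [turkana:27949] [0,0,0] ORTE_ERROR_LOG: Not found in file base/oob_base_xcast.c at line 108
# (pattern inferred from this thread only; an assumption, not a documented format)
LOG_PATTERN = re.compile(
    r"\[(?P<host>[^:\]]+):(?P<pid>\d+)\]\s+\[(?P<name>[\d,]+)\]\s+"
    r"ORTE_ERROR_LOG:\s+(?P<error>.+?)\s+in file\s+(?P<file>\S+)\s+"
    r"at line\s+(?P<line>\d+)"
)

# The three distinct reports from the post; the original log shows them twice.
DISTINCT = [
    "[turkana:27949] [0,0,0] ORTE_ERROR_LOG: Not found in file base/soh_base_get_proc_soh.c at line 80",
    "[turkana:27949] [0,0,0] ORTE_ERROR_LOG: Not found in file base/oob_base_xcast.c at line 108",
    "[turkana:27949] [0,0,0] ORTE_ERROR_LOG: Not found in file base/rmgr_base_stage_gate.c at line 276",
]
SAMPLE = "\n".join(DISTINCT * 2)


def summarize_orte_errors(text):
    """Count ORTE_ERROR_LOG reports per (source file, line number)."""
    counts = Counter()
    for match in LOG_PATTERN.finditer(text):
        counts[(match["file"], int(match["line"]))] += 1
    return counts


if __name__ == "__main__":
    for (source_file, line_no), count in summarize_orte_errors(SAMPLE).items():
        print(f"{source_file}:{line_no} reported {count} time(s)")
```

Grouping this way makes it easier to see that the six lines are the same three reports repeated, all from process [0,0,0] on host turkana.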