FWIW: there already is a flag in ORTE that gets set when procs are launched by a non-orterun entity: orte_standalone_operation. So all you would have to do is add an appropriate check for that flag to be true.
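To make the suggested check concrete, here is a minimal sketch of the decision it implies. It is not Open MPI code: `choose_debugger_wait` and `wait_action_t` are hypothetical names, the `standalone` argument stands in for orte_standalone_operation, and `debug_gate` stands in for MPIR_debug_gate, which is assumed here to become nonzero once the debugger releases the job.

```c
#include <stdbool.h>

/* Possible actions for a rank that has reached the debugger wait point. */
typedef enum {
    WAIT_FOR_HNP_RELEASE,  /* launched by orterun: the HNP sends the release message */
    SPIN_ON_DEBUG_GATE,    /* standalone launch: wait for the debugger to open the gate */
    PROCEED                /* nothing left to wait for */
} wait_action_t;

/*
 * Hypothetical helper, not part of Open MPI.
 *   standalone - stand-in for ORTE's orte_standalone_operation flag
 *   debug_gate - stand-in for MPIR_debug_gate; assumed to be nonzero
 *                once the debugger has released the job
 */
static wait_action_t choose_debugger_wait(bool standalone, int debug_gate)
{
    if (!standalone) {
        /* orterun is present, so the existing wait-for-message path applies. */
        return WAIT_FOR_HNP_RELEASE;
    }
    /* No orterun (e.g. direct srun launch): never block on a message
     * that nobody will send; fall back to the gate variable instead. */
    return (debug_gate == 0) ? SPIN_ON_DEBUG_GATE : PROCEED;
}
```

In a real patch, a branch along these lines would presumably sit at the top of ompi_wait_for_debugger(), so that a job started directly by srun skips the wait for the HNP message entirely.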

On Feb 10, 2011, at 9:18 AM, Jeff Squyres wrote:

> I think what Ralph was trying to say is that Open MPI doesn't (currently)
> support running parallel debuggers when only srun is used (and mpirun is not).
>
> We'd certainly be open to someone submitting a patch to enable this
> functionality, though!
>
>
> On Feb 10, 2011, at 8:02 AM, Nikolay Piskun wrote:
>
>> Actually in SLURM 2.2.0 that I am using now, there is a support for
>> parallel debugger and srun does provide needed info and fill proc_table and
>> set up all debug variable correctly. The only problem that I see so far is
>> the one that I described. Maybe the solution would be to check if job was
>> started by non orterun and then/or check for MPIR_debug_gate before waiting
>> for signal.
>>
>> Nikolay Piskun | Director of Continuing Engineering | Totalview Technologies
>> Rogue Wave Software Inc | 24 Prime Parkway, Natick, MA 01760 | p 508-652-7739
>> nikolay.pis...@roguewave.com
>> www.roguewave.com
>>
>> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On
>> Behalf Of Ralph Castain
>> Sent: Thursday, February 10, 2011 10:47 AM
>> To: Open MPI Developers
>> Subject: Re: [OMPI devel] Debugger problem with srun and openmpi 1.5 (hang
>> in OMPI)
>>
>> If you srun a job, then there is no "mpirun" to provide a proc_table. So
>> running a job directly via srun means you cannot run TV on it.
>>
>>
>> On Feb 10, 2011, at 8:34 AM, Nikolay Piskun wrote:
>>
>> Hi,
>> I am trying to use Totalview with srun and hit interesting problem. Looks
>> like if OMPI is started by “srun –mpi=ompi ”, mpi job is hang in
>> ompi_wait_for_debugger() subroutine. What happen, I think is ompi was
>> compiled without ORTE_DISABLE_FULL_SUPPORT and as result rank 0 is waiting
>> for message from HNP (by the way what is HNP?) that was supposed to be send
>> by orterun. The problem is that orterun was never invoked because MPI was
>> initiated by srun, not orterun. So what is the solution? Should we always
>> compile OMPI with ORTE_DISABLE_FULL_SUPPORT=true for anything that uses
>> different starters like srun from SLURM?
>> Thanks
>> Nikolay
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/