Thanks very much; it looks like this should work. The patch is a one-line change:
--------------------------------------------------------------
diff -c ompi_debuggers.c.old ompi_debuggers.c
*** ompi_debuggers.c.old Fri Jan 22 09:21:23 2010
--- ompi_debuggers.c Thu Feb 10 15:13:07 2011
***************
*** 222,228 ****
  mpimsgq_dll_locations = tmp1;
  mpidbg_dll_locations = tmp2;
! if (ORTE_DISABLE_FULL_SUPPORT) {
  /* spin until debugger attaches and releases us */
  while (MPIR_debug_gate == 0) {
  #if defined(__WINDOWS__)
--- 222,228 ----
  mpimsgq_dll_locations = tmp1;
  mpidbg_dll_locations = tmp2;
! if (ORTE_DISABLE_FULL_SUPPORT || orte_standalone_operation) {
  /* spin until debugger attaches and releases us */
  while (MPIR_debug_gate == 0) {
  #if defined(__WINDOWS__)
----------------------------------------------------------------
What would be the best way to put it in?
--
Nikolay Piskun
Director of Continuing Engineering
TotalView Technologies, Rogue Wave Software company
mailto:[email protected] phone: 508-652-7739
24 Prime Parkway, Natick, MA 01760
http://www.totalviewtech.com
________________________________________
From: [email protected] [[email protected]] On Behalf Of
Ralph Castain [[email protected]]
Sent: Thursday, February 10, 2011 12:42 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] Debugger problem with srun and openmpi 1.5 (hang
in OMPI)
FWIW: there already is a flag in ORTE that gets set when procs are launched by
a non-orterun entity: orte_standalone_operation. So all you would have to do is
add an appropriate check for that flag to be true.
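As a rough illustration of that check (a self-contained sketch, not the actual Open MPI code), the decision can be modelled as below. The names wait_mode_t, choose_wait_mode and spin_on_gate are hypothetical, and the two booleans merely stand in for ORTE_DISABLE_FULL_SUPPORT and orte_standalone_operation so that the example compiles without any Open MPI headers.

```c
#include <stdbool.h>

/* Hypothetical model of the two ways a rank can wait for the debugger. */
typedef enum {
    WAIT_SPIN_ON_GATE,    /* poll an MPIR_debug_gate-style variable directly */
    WAIT_FOR_HNP_MESSAGE  /* block until orterun (the HNP) sends a release */
} wait_mode_t;

/*
 * disable_full_support models ORTE_DISABLE_FULL_SUPPORT and
 * standalone_operation models orte_standalone_operation. Both are plain
 * parameters here; in the real code they are a build-time setting and a
 * runtime flag respectively.
 */
wait_mode_t choose_wait_mode(bool disable_full_support,
                             bool standalone_operation)
{
    if (disable_full_support || standalone_operation) {
        /* No orterun to hear from, so watch the gate ourselves. */
        return WAIT_SPIN_ON_GATE;
    }
    return WAIT_FOR_HNP_MESSAGE;
}

/*
 * Bounded stand-in for the spin loop: polls *gate up to max_polls times and
 * reports whether it was released (became non-zero). The bound exists only
 * so the sketch always terminates; the loop in the patch has no such limit.
 */
bool spin_on_gate(const volatile int *gate, int max_polls)
{
    for (int i = 0; i < max_polls; ++i) {
        if (*gate != 0) {
            return true;
        }
    }
    return false;
}
```

With both inputs false and no orterun running, this model takes the WAIT_FOR_HNP_MESSAGE branch, which corresponds to the hang reported under srun; setting the standalone flag routes the rank to the gate-polling branch instead.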
On Feb 10, 2011, at 9:18 AM, Jeff Squyres wrote:
> I think what Ralph was trying to say is that Open MPI doesn't (currently)
> support running parallel debuggers when only srun is used (and mpirun is not).
>
> We'd certainly be open to someone submitting a patch to enable this
> functionality, though!
>
>
> On Feb 10, 2011, at 8:02 AM, Nikolay Piskun wrote:
>
>> Actually, SLURM 2.2.0, which I am using now, does support parallel
>> debuggers: srun provides the needed information, fills in the proc_table,
>> and sets up all the debug variables correctly. The only problem I see so far
>> is the one I described. Maybe the solution would be to check whether the job
>> was started by something other than orterun, and/or to check
>> MPIR_debug_gate before waiting for the signal.
>>
>> Nikolay Piskun | Director of Continuing Engineering | Totalview Technologies
>> |
>> Rogue Wave Software Inc | 24 Prime Parkway, Natick, MA 01760 | p
>> 508-652-7739|
>> [email protected]
>> www.roguewave.com
>>
>> From: [email protected] [mailto:[email protected]] On
>> Behalf Of Ralph Castain
>> Sent: Thursday, February 10, 2011 10:47 AM
>> To: Open MPI Developers
>> Subject: Re: [OMPI devel] Debugger problem with srun and openmpi 1.5 (hang
>> in OMPI)
>>
>> If you srun a job, then there is no "mpirun" to provide a proc_table. So
>> running a job directly via srun means you cannot run TV on it.
>>
>>
>> On Feb 10, 2011, at 8:34 AM, Nikolay Piskun wrote:
>>
>>
>>
>> Hi,
>> I am trying to use TotalView with srun and have hit an interesting problem.
>> It looks like if OMPI is started by "srun --mpi=ompi", the MPI job hangs in
>> the ompi_wait_for_debugger() subroutine. What happens, I think, is that OMPI
>> was compiled without ORTE_DISABLE_FULL_SUPPORT, and as a result rank 0 is
>> waiting for a message from the HNP (by the way, what is HNP?) that was
>> supposed to be sent by orterun. The problem is that orterun was never
>> invoked, because the MPI job was launched by srun, not orterun. So what is
>> the solution? Should we always compile OMPI with
>> ORTE_DISABLE_FULL_SUPPORT=true for anything that uses a different starter,
>> such as srun from SLURM?
>> Thanks
>> Nikolay
>>
>>
>> _______________________________________________
>> devel mailing list
>> [email protected]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> --
> Jeff Squyres
> [email protected]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>