FWIW: HNP = head node process = mpirun.
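
To make the hang described below concrete, here is a toy Python model of a rank waiting on a release from its launcher. It is only a sketch, not Open MPI code: the names `wait_for_debugger`, `launch`, and `with_hnp` are made up for illustration, and the timeout is an assumption added so the example terminates (per the report below, the real wait does not return).

```python
import threading


def wait_for_debugger(release_gate, timeout):
    """Model of a rank blocking until the launcher releases it.

    Returns True if the release arrived, False if the wait timed out.
    """
    return release_gate.wait(timeout)


def launch(with_hnp, timeout=0.2):
    """Run one modeled rank, optionally with a launcher (HNP) thread."""
    gate = threading.Event()
    if with_hnp:
        # Launcher present (the mpirun/orterun case): it sends the release.
        hnp = threading.Thread(target=gate.set)
        hnp.start()
        hnp.join()
    # With no launcher (the direct srun case) nothing ever sets the gate,
    # so this wait can only end by timing out.
    return wait_for_debugger(gate, timeout)


if __name__ == "__main__":
    print("released with HNP:   ", launch(with_hnp=True))
    print("released without HNP:", launch(with_hnp=False))
```

In the model, the rank is released only when something plays the launcher's role; with a direct launch it just sits in the wait.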

On Feb 10, 2011, at 7:46 AM, Ralph Castain wrote:

> If you srun a job, then there is no "mpirun" to provide a proc_table. So 
> running a job directly via srun means you cannot run TV on it.
> 
> 
> On Feb 10, 2011, at 8:34 AM, Nikolay Piskun wrote:
> 
>>  
>>    Hi,
>> I am trying to use Totalview with srun and have hit an interesting problem.
>> It looks like if OMPI is started by "srun --mpi=ompi", the MPI job hangs in
>> the ompi_wait_for_debugger() subroutine. What happens, I think, is that OMPI
>> was compiled without ORTE_DISABLE_FULL_SUPPORT, and as a result rank 0 is
>> waiting for a message from the HNP (by the way, what is HNP?) that was
>> supposed to be sent by orterun. The problem is that orterun was never
>> invoked, because MPI was initiated by srun, not orterun. So what is the
>> solution? Should we always compile OMPI with ORTE_DISABLE_FULL_SUPPORT=true
>> for anything that uses a different starter, such as srun from SLURM?
>> Thanks
>> Nikolay
>>  
>> Nikolay Piskun | Director of Continuing Engineering | Totalview Technologies |
>> Rogue Wave Software Inc | 24 Prime Parkway, Natick, MA 01760 | p 508-652-7739 |
>> nikolay.pis...@roguewave.com
>> www.roguewave.com
>>  
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

