Thanks.

I ran

    qsub -I nsga2_job.sh

and got

    qsub: waiting for job 48270.clusterName to start

qstat shows the job name as "none", and no results show up.
No shell prompt appears; the command line just hangs there with no response.

Any help is appreciated.

Thanks,
Jack
Oct. 25 2010
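
On the interactive attempt above: in Torque/PBS, "qsub: waiting for job ... to start" usually just means the interactive job is still queued. As far as I know, with -I the script file is read only for its #PBS directives and its body is not run. A minimal sketch of asking for the resources on the command line instead, assuming a Torque-style scheduler that accepts the nodes=1:ppn=N form (check your site's documentation). interactive_cmd is a hypothetical helper that only prints the command; the queue name and walltime are the placeholders from the batch script quoted in this thread:

```shell
#!/bin/bash
# Hypothetical helper: prints (does not run) a Torque-style interactive request.
# Assumes the scheduler accepts "nodes=1:ppn=N"; some sites use other resource names.
interactive_cmd() {
    local cores=$1 walltime=$2 queue=$3
    printf 'qsub -I -q %s -l nodes=1:ppn=%s,walltime=%s\n' "$queue" "$cores" "$walltime"
}

# Values taken from the batch script quoted in this thread (queuename is a placeholder).
interactive_cmd 5 00:08:00 queuename
```

Once the prompt does appear, cd to $PBS_O_WORKDIR and run the mpirun line by hand.
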
> From: jsquy...@cisco.com
> Date: Mon, 25 Oct 2010 13:39:30 -0400
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] Open MPI program cannot complete
> 
> Can you use the interactive mode of PBS to get 5 cores on 1 node?  IIRC, 
> "qsub -I ..." ?
> 
> Then you get a shell prompt with your allocated cores and can run stuff 
> interactively.  I don't know if your site allows this, but interactive 
> debugging here might be *significantly* easier than trying to automate some 
> debugging.
> 
> 
> On Oct 25, 2010, at 1:35 PM, Jack Bryan wrote:
> 
> > thanks
> > 
> > I have to use a #PBS script to submit any job on my cluster. 
> > I cannot run a job directly from the command line on my cluster. 
> > 
> > this is my script: 
> > --------------------------------------
> > #!/bin/bash
> > #PBS -N jobname
> > #PBS -l walltime=00:08:00,nodes=1
> > #PBS -q queuename
> > COMMAND=/mypath/myprog
> > NCORES=5
> > 
> > cd $PBS_O_WORKDIR
> > NODES=`cat $PBS_NODEFILE | wc -l`
> > NPROC=$(( $NCORES * $NODES ))
> > 
> > mpirun -np $NPROC --mca btl self,sm,openib  $COMMAND
> > 
> > -------------------------------------------
> > 
> > Where in the script should I put the gdb command 
> > (gdb --batch -ex 'bt full' -ex 'info reg' -pid ZOMBIE_PID)? 
> > And how do I get ZOMBIE_PID from within the script? 
> > 
> > Any help is appreciated. 
> > 
> > thanks
> > 
> > Oct. 25 2010
> > 
> > Date: Mon, 25 Oct 2010 19:24:35 +0200
> > From: j...@59a2.org
> > To: us...@open-mpi.org
> > Subject: Re: [OMPI users] Open MPI program cannot complete
> > 
> > On Mon, Oct 25, 2010 at 19:07, Jack Bryan <dtustud...@hotmail.com> wrote:
> > > I need to use #PBS parallel job script to submit a job on MPI cluster. 
> > 
> > Is it not possible to reproduce locally?  Most clusters have a way to 
> > submit an interactive job (which would let you start this thing and then 
> > inspect individual processes).  Ashley's Padb suggestion will certainly be 
> > better in a non-interactive environment.
> >  
> > > Where should I put the (gdb --batch -ex 'bt full' -ex 'info reg' -pid 
> > > ZOMBIE_PID) in the script ? 
> > 
> > Is control returning to your script after rank 0 has exited?  In that case, 
> > you can just put this on the next line.
> >  
> > > How to get the ZOMBIE_PID ? 
> > 
> > "ps" from the command line, or getpid() from C code.
> > 
> > Jed
> > 
> > _______________________________________________ users mailing list 
> > us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
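
Jed's two answers above (run gdb on the line after mpirun, find the PID with ps) can be combined roughly as follows. This is only a sketch: find_lingering_pids and dump_backtraces are hypothetical helper names, pgrep is used here in place of reading ps output by hand, and it only sees processes on the node where the job script runs (fine for nodes=1). It also assumes that mpirun actually returns while some ranks are still alive, and that gdb and pgrep are installed on the compute node.

```shell
#!/bin/bash
# Hypothetical helpers; neither is part of Open MPI or PBS.

# Print the PIDs of our own processes whose command line matches $1.
# Note: -f matches the full command line, so anything that merely mentions
# the program path (for example a still-running mpirun) is listed too.
find_lingering_pids() {
    pgrep -u "$(id -u)" -f "$1"
}

# Attach gdb to each lingering PID and save one log file per process.
dump_backtraces() {
    local prog=$1 outdir=${2:-.} pid
    for pid in $(find_lingering_pids "$prog"); do
        gdb --batch -ex 'bt full' -ex 'info reg' -pid "$pid" \
            > "$outdir/gdb.$pid.log" 2>&1
    done
}

# In the job script, on the line after mpirun:
#   mpirun -np $NPROC --mca btl self,sm,openib $COMMAND
#   dump_backtraces "$COMMAND" "$PBS_O_WORKDIR"
```

The gdb.<pid>.log files then end up in the submission directory, next to the job's normal output.
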