On 26.06.2012, at 15:16, Semi wrote:

> I put this entry in rungms:
> 
> set TARGET=mpi
> /storage/openmpi-1.5_openib/bin/mpirun -np $NPROCS 
> /storage/app/ymiller/gamess_openib/gamess.$VERNO.x $JOB
> 
> when I run: 
> ./rungms exam01 00 20 > & exam01.log
> it works. log attached
> 
> when I try run it via SGE in such way I get error:
> #!/bin/sh 
> #$ -N test 
> #$ -o /storage/app/ymiller/gamess_openib/TEST/test.o -e 
> /storage/app/ymiller/gamess_openib/TEST/test.e 
> #$ -m ea
> #$ -A gamess_parallel
> #$ -R y
> #$ -pe ompi 10 
> export GAMESS=/storage/app/ymiller/gamess_openib
> export LD_LIBRARY_PATH=$GAMESS/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
> cd $TMPDIR
> cp $GAMESS/tests/exam01.inp exam01.F05
> JOB=exam01
> CUSTOM_TMPDIR=$GAMESS/TEST
> BINARY_LOCATION=$GAMESS
> . $GAMESS/subgms_export

If you use `rungms`, you don't need to set my variables beforehand. Setting 
them and then calling `rungms` anyway is a contradiction.
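A minimal submission script along these lines might work better (a sketch, assuming the PE is still called `ompi`, the paths are the ones from this thread, and `rungms` finds `exam01.inp` when started from the GAMESS directory, as in your working interactive run):

```
#!/bin/sh
#$ -N test
#$ -pe ompi 10
GAMESS=/storage/app/ymiller/gamess_openib
# Pass only the bare job name and let rungms do the file staging
# itself, with the SGE-provided $TMPDIR as scratch directory.
cd $GAMESS
./rungms exam01 00 $NSLOTS -scr $TMPDIR < /dev/null > exam01.out
```

That way `rungms` copies the input and sets all the `setenv` file assignments on its own, instead of fighting with values exported beforehand.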


> unset JOB
> unset CUSTOM_TMPDIR
> unset BINARY_LOCATION
> rm -f $IRCDATA
> rm -f $PUNCH
> rm -f $SIMEN
> rm -f $SIMCOR
> 
> HOSTFILE=$TMPDIR/machines
> awk '{ for (i=0;i<$2;++i) {print $1} }' $PE_HOSTFILE >> $HOSTFILE
> 
> $GAMESS/rungms $GAMESS/tests/exam01 00 $NSLOTS -scr $TMPDIR < /dev/null > 
> $GAMESS/TEST/exam01.out
> 
> more exam01.out
> ----- GAMESS execution script 'rungms' -----
> This job is running on host sge177
> under operating system Linux at Tue Jun 26 14:44:09 IDT 2012
> SGE has assigned the following compute nodes to this run:
> sge177
> Available scratch disk space (Kbyte units) at beginning of the job is
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sda1            206424760   6633828 189305172   4% /
> Copying input file /storage/app/ymiller/gamess_openib/tests/exam01.inp to 
> your run's scratch directory...
> 
> 
>  ERROR OPENING PRE-EXISTING FILE INPUT,
>  ASSIGNED TO EXPLICIT FILE NAME 
> /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05,
>  PLEASE CHECK THE -SETENV- FILE ASSIGNMENTS IN YOUR -RUNGMS- SCRIPT.
>  EXECUTION OF GAMESS TERMINATED -ABNORMALLY- AT Tue Jun 26 14:44:10 2012
>  CPU     0: STEP CPU TIME=     0.00 TOTAL CPU TIME=        0.0 (    0.0 MIN)
>  TOTAL WALL CLOCK TIME=        0.0 SECONDS, CPU UTILIZATION IS 100.00%
> more test.e
> cp /storage/app/ymiller/gamess_openib/tests/exam01.inp 
> /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openi
> b/tests/exam01.F05
> cp: cannot create regular file 
> `/tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05':
>  No

Did you solve it in the meantime?

SGE will create /tmp/6053922.1.bioinfo.q, but not the rest of the 
subdirectories mentioned below it. I wonder where this gets appended to the 
path. It looks like $GAMESS is always appended to the paths.
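The mangled name can be reproduced with plain shell string concatenation (a sketch; `SCR` and `JOB` are stand-ins for what the error message shows, not the actual `rungms` variable names):

```shell
# Sketch of the path mangling in the error message: the per-job
# scratch directory plus an *absolute* job name glued together.
SCR=/tmp/6053922.1.bioinfo.q                          # created by SGE
JOB=/storage/app/ymiller/gamess_openib/tests/exam01   # absolute job name passed to rungms
INPUT=$SCR/$JOB.F05                                   # "$SCR/$JOB.F05"-style concatenation
echo "$INPUT"
# prints /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05
```

None of the directories below $SCR exist, so every `cp`, `touch` and `grep` on that name fails. Passing just `exam01` as the job name (and starting from the directory that holds the input) avoids the doubled path.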


>  such file or directory
> unset echo
> setenv AUXDATA ./auxdata

Instead of ".", I suggest putting the actual (absolute) path to GAMESS in 
`rungms`.
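For example (a sketch in `rungms`'s csh syntax; `GMSPATH` is a hypothetical helper variable holding the installation path from this thread, and the environment variable names are the ones shown in your log):

```
set GMSPATH=/storage/app/ymiller/gamess_openib
setenv AUXDATA $GMSPATH/auxdata
setenv ERICFMT $GMSPATH/ericfmt.dat
setenv MCPPATH $GMSPATH/mcpdata
setenv BASPATH $GMSPATH/auxdata/BASES
```

With absolute paths these assignments keep working no matter which scratch directory the job `cd`s into.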


> setenv EXTBAS /dev/null
> setenv NUCBAS /dev/null
> setenv POSBAS /dev/null
> setenv ERICFMT ./ericfmt.dat
> setenv MCPPATH ./mcpdata
> setenv BASPATH ./auxdata/BASES
> setenv QUANPOL ./auxdata/QUANPOL
> setenv MAKEFP .///storage/app/ymiller/gamess_openib/tests/exam01.efp

In the stock `rungms` this line is:

setenv  MAKEFP ~$USER/scr/$JOB.efp

Did you edit the line by hand?


> setenv GAMMA .///storage/app/ymiller/gamess_openib/tests/exam01.gamma
> setenv TRAJECT .///storage/app/ymiller/gamess_openib/tests/exam01.trj
> setenv RESTART .///storage/app/ymiller/gamess_openib/tests/exam01.rst
> setenv INPUT 
> /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05
> setenv PUNCH .///storage/app/ymiller/gamess_openib/tests/exam01.dat
> setenv AOINTS 
> /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F08
> .......................................
> setenv GMCCCS 
> /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F99
> unset echo
> grep: 
> /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05: 
> No such file or directory
> grep: 
> /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05: 
> No such file or directory
> grep: 
> /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05: 
> No such file or directory
> grep: 
> /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05: 
> No such file or directory

There are exactly four `grep`s in `rungms`; due to the wrong path, the file isn't found.

-- Reuti


>  DDI Process 0: error code 911
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD 
> with errorcode 911.
> 
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 0 with PID 30229 on
> node sge177 exiting improperly. There are two reasons this could occur:
> 
> 1. this process did not call "init" before exiting, but others in
> the job did. This can cause a job to hang indefinitely while it waits
> for all processes to call "init". By rule, if one process calls "init",
> then ALL processes must call "init" prior to termination.
> 
> 2. this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination"
> 
> This may have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --------------------------------------------------------------------------
> touch: cannot touch 
> `/tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.nodes.mpd':
>  No such
>  file or directory
> uniq: 
> /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.nodes.mpd:
>  No such file or directo
> ry
> uniq: write error: No such file or directory
> wc: 
> /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.nodes.mpd:
>  No such file or directory
> NNODES: Subscript out of range.
> 
> On 6/25/2012 3:11 PM, Reuti wrote:
>> On 25.06.2012, at 14:03, Dave Love wrote:
>> 
>> 
>>> Reuti <re...@staff.uni-marburg.de>
>>>  writes:
>>> 
>>> 
>>>> Well, we also use GAMESS sometimes but just with the default socket 
>>>> communication.
>>>> 
>>> Did you ever manage to get that tightly integrated?  Alternatively, is
>>> there a good reason not to use the MPI support I seem to remember it has
>>> now?
>>> 
>> No, only the last state we talked about a year ago or so: it starts tightly 
>> integrated (i.e. with `qrsh -inherit ...` and ssh/rsh completely disabled in 
>> the cluster), but then jumps out of the process tree and there is no 
>> accounting.
>> 
>> I found the manual with the explanation about the MPI data servers quite 
>> confusing, and as we use it only once in a while I didn't spend more time 
>> on it.
>> 
>> -- Reuti
>> 
>> 
>> 
>>> -- 
>>> Community Grid Engine:  
>>> http://arc.liv.ac.uk/SGE/
>> 
>> _______________________________________________
>> users mailing list
>> 
>> users@gridengine.org
>> https://gridengine.org/mailman/listinfo/users
> 
> 
> <exam01.log.rtf>

