On 26.06.2012, at 15:16, Semi wrote:

> I put this entry in rungms:
>
> set TARGET=mpi
> /storage/openmpi-1.5_openib/bin/mpirun -np $NPROCS /storage/app/ymiller/gamess_openib/gamess.$VERNO.x $JOB
>
> When I run
>
> ./rungms exam01 00 20 >& exam01.log
>
> it works; log attached.
>
> When I try to run it via SGE in the following way, I get an error:
>
> #!/bin/sh
> #$ -N test
> #$ -o /storage/app/ymiller/gamess_openib/TEST/test.o -e /storage/app/ymiller/gamess_openib/TEST/test.e
> #$ -m ea
> #$ -A gamess_parallel
> #$ -R y
> #$ -pe ompi 10
> export GAMESS=/storage/app/ymiller/gamess_openib
> export LD_LIBRARY_PATH=$GAMESS/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
> cd $TMPDIR
> cp $GAMESS/tests/exam01.inp exam01.F05
> JOB=exam01
> CUSTOM_TMPDIR=$GAMESS/TEST
> BINARY_LOCATION=$GAMESS
> . $GAMESS/subgms_export

If you use `rungms`, you don't need to set my variables beforehand. It's a contradiction: either you source `subgms_export` and stage the files yourself, or you let `rungms` do the complete setup.
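Just to sketch what I mean (untested; it assumes that your `rungms` accepts the `-scr` flag the way your call further down does, and that it wants the bare job name rather than a full path -- adjust to your copy):

#!/bin/sh
#$ -N test
#$ -A gamess_parallel
#$ -R y
#$ -pe ompi 10
GAMESS=/storage/app/ymiller/gamess_openib
export GAMESS
# Run where exam01.inp lives and hand rungms only the bare job name,
# so that it builds $SCR/exam01.F05 instead of gluing a full path
# onto the scratch directory.
cd $GAMESS/tests
$GAMESS/rungms exam01 00 $NSLOTS -scr $TMPDIR < /dev/null > $GAMESS/TEST/exam01.out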
> unset JOB
> unset CUSTOM_TMPDIR
> unset BINARY_LOCATION
> rm -f $IRCDATA
> rm -f $PUNCH
> rm -f $SIMEN
> rm -f $SIMCOR
>
> HOSTFILE=$TMPDIR/machines
> awk '{ for (i=0;i<$2;++i) {print $1} }' $PE_HOSTFILE >> $HOSTFILE
>
> $GAMESS/rungms $GAMESS/tests/exam01 00 $NSLOTS -scr $TMPDIR < /dev/null > $GAMESS/TEST/exam01.out
>
> more exam01.out
> ----- GAMESS execution script 'rungms' -----
> This job is running on host sge177
> under operating system Linux at Tue Jun 26 14:44:09 IDT 2012
> SGE has assigned the following compute nodes to this run:
> sge177
> Available scratch disk space (Kbyte units) at beginning of the job is
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sda1            206424760   6633828 189305172   4% /
> Copying input file /storage/app/ymiller/gamess_openib/tests/exam01.inp to
> your run's scratch directory...
>
> ERROR OPENING PRE-EXISTING FILE INPUT,
> ASSIGNED TO EXPLICIT FILE NAME
> /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05,
> PLEASE CHECK THE -SETENV- FILE ASSIGNMENTS IN YOUR -RUNGMS- SCRIPT.
> EXECUTION OF GAMESS TERMINATED -ABNORMALLY- AT Tue Jun 26 14:44:10 2012
> CPU 0: STEP CPU TIME= 0.00 TOTAL CPU TIME= 0.0 ( 0.0 MIN)
> TOTAL WALL CLOCK TIME= 0.0 SECONDS, CPU UTILIZATION IS 100.00%
>
> more test.e
> cp /storage/app/ymiller/gamess_openib/tests/exam01.inp /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05
> cp: cannot create regular file `/tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05': No such file or directory

Did you solve it in the meantime? The /tmp/6053922.1.bioinfo.q will be created by SGE, but not the rest of the mentioned subdirectories. I wonder where this gets added to the path -- it looks like the $GAMESS path is always appended. (See the sketch after the log excerpt below.)

> unset echo
> setenv AUXDATA ./auxdata

Instead of "." I suggest putting the actual (absolute) path to GAMESS into `rungms`, e.g. setenv AUXDATA /storage/app/ymiller/gamess_openib/auxdata.

> setenv EXTBAS /dev/null
> setenv NUCBAS /dev/null
> setenv POSBAS /dev/null
> setenv ERICFMT ./ericfmt.dat
> setenv MCPPATH ./mcpdata
> setenv BASPATH ./auxdata/BASES
> setenv QUANPOL ./auxdata/QUANPOL
> setenv MAKEFP .///storage/app/ymiller/gamess_openib/tests/exam01.efp

In the stock script this line is:

setenv MAKEFP ~$USER/scr/$JOB.efp

Did you edit the line by hand?

> setenv GAMMA .///storage/app/ymiller/gamess_openib/tests/exam01.gamma
> setenv TRAJECT .///storage/app/ymiller/gamess_openib/tests/exam01.trj
> setenv RESTART .///storage/app/ymiller/gamess_openib/tests/exam01.rst
> setenv INPUT /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05
> setenv PUNCH .///storage/app/ymiller/gamess_openib/tests/exam01.dat
> setenv AOINTS /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F08
> .......................................
> setenv GMCCCS /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F99
> unset echo
> grep: /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05: No such file or directory
> grep: /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05: No such file or directory
> grep: /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05: No such file or directory
> grep: /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05: No such file or directory

There are exactly four `grep`s in `rungms`; due to the wrong path the input file is not found by any of them.
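To answer my own question above: the double slash in /tmp/6053922.1.bioinfo.q//storage/... suggests that the scratch directory and the absolute path you pass as first argument to `rungms` are simply concatenated. A sketch in sh of the logic I suspect (paraphrasing the stock csh `rungms`; your copy may differ):

SCR=$TMPDIR                  # /tmp/6053922.1.bioinfo.q, created by SGE
JOB=$GAMESS/tests/exam01     # the full path passed as first argument
INPUT=$SCR/$JOB.F05          # -> /tmp/6053922.1.bioinfo.q//storage/.../exam01.F05
# With a bare job name, the same assignment yields a usable file name:
JOB=exam01
INPUT=$SCR/$JOB.F05          # -> /tmp/6053922.1.bioinfo.q/exam01.F05

If that matches your copy, calling `rungms` with just `exam01` (and exam01.inp reachable in the current directory) should already give sane file names.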
-- Reuti


> DDI Process 0: error code 911
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> with errorcode 911.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 0 with PID 30229 on
> node sge177 exiting improperly. There are two reasons this could occur:
>
> 1. this process did not call "init" before exiting, but others in
> the job did. This can cause a job to hang indefinitely while it waits
> for all processes to call "init". By rule, if one process calls "init",
> then ALL processes must call "init" prior to termination.
>
> 2. this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination"
>
> This may have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --------------------------------------------------------------------------
> touch: cannot touch `/tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.nodes.mpd': No such file or directory
> uniq: /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.nodes.mpd: No such file or directory
> uniq: write error: No such file or directory
> wc: /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.nodes.mpd: No such file or directory
> NNODES: Subscript out of range.
>
> On 6/25/2012 3:11 PM, Reuti wrote:
>> On 25.06.2012, at 14:03, Dave Love wrote:
>>
>>> Reuti <re...@staff.uni-marburg.de> writes:
>>>
>>>> Well, we also use GAMESS sometimes, but just with the default socket
>>>> communication.
>>>
>>> Did you ever manage to get that tightly integrated? Alternatively, is
>>> there a good reason not to use the MPI support I seem to remember it has
>>> now?
>>
>> No, only the last state we talked about a year ago or so: it starts tightly
>> integrated (i.e. with `qrsh -inherit ...` and ssh/rsh completely disabled in
>> the cluster), but then jumps out of the process tree and there is no
>> accounting.
>>
>> I found the manual's explanation of the MPI data servers quite confusing,
>> and as we use it only once in a while I didn't spend more time on it.
>>
>> -- Reuti
>>
>>> --
>>> Community Grid Engine:
>>> http://arc.liv.ac.uk/SGE/
>>
>> _______________________________________________
>> users mailing list
>> users@gridengine.org
>> https://gridengine.org/mailman/listinfo/users
>
> <exam01.log.rtf>

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users