Re: [Wien] PBS run

2017-09-08 Thread Gavin Abo
You might have a look at the "WIEN2k-notes of the University of Texas" document (slide 7) at: http://susi.theochem.tuwien.ac.at/reg_user/faq/pbs.html The line: echo -n 'lapw0:' > .machines It looks like that writes only "lapw0:" to the .machines file. However, you need to have it write the
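
A minimal sketch of what that part of the .machines generation usually looks like in a PBS job script; the host names come from $PBS_NODEFILE, and the exact layout is an assumption based on the FAQ page cited above:

    # start the lapw0 line; -n suppresses the newline so a host can be appended to it
    echo -n 'lapw0:' > .machines
    # append the first host from the PBS node list, giving e.g. "lapw0:node01"
    head -1 "$PBS_NODEFILE" >> .machines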

[Wien] PBS run

2017-09-08 Thread Subrata Jana
Hi Gavin Abo, It looks like I am facing the same problem. ## #!/bin/bash #PBS -N wien2k #PBS -o out.log #PBS -j oe #PBS -l nodes=1:ppn=1 # Load Intel environment source /apps/intel_2016_u2/compilers_and_libraries_2016.2.181/linux/bin/compilervars.sh intel64 export
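
For reference, a hedged sketch of the exports that typically follow the Intel environment setup in such a script; the WIEN2k path is taken from a later message in this thread, and SCRATCH is a site-specific assumption:

    # make the WIEN2k executables (lapw0, lapw1, ...) visible inside the batch job
    export WIENROOT=/home/sjana/WIEN2k_14.2
    export PATH=$PATH:$WIENROOT
    # working scratch directory for the calculation (assumption)
    export SCRATCH=./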

Re: [Wien] PBS run

2017-09-08 Thread Gavin Abo
It looks like something is wrong with this line [ https://stackoverflow.com/questions/26816605/awk-fatal-cannot-open-file-for-reading-no-such-file-or-directory ]: awk '{print "1:"$1":1"}' $PBS_NODEFILE >>.machines Maybe quotes are needed around the $PBS_NODEFILE: awk '{print "1:"$1":1"}'
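
A hedged sketch of the quoted fix, with an added guard (an assumption, not part of the original message) that fails early if the node file cannot be read:

    # fail early if PBS did not provide a readable node file
    [ -r "$PBS_NODEFILE" ] || { echo "PBS_NODEFILE not set or not readable" ; exit 1 ; }
    # one k-point group per assigned core: lines of the form "1:node01:1"
    awk '{print "1:"$1":1"}' "$PBS_NODEFILE" >> .machines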

[Wien] PBS run

2017-09-08 Thread Subrata Jana
Hi Gavin Abo, I changed my job script as follows: # #!/bin/bash #PBS -N wien2k #PBS -o out.log #PBS -j oe #PBS -l nodes=1:ppn=1 # Load Intel environment source /apps/intel_2016_u2/compilers_and_libraries_2016.2.181/linux/bin/compilervars.sh intel64 cd
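
The excerpt is cut off after "cd"; a hedged sketch of how such a script commonly continues, where the cd target, the .machines layout and the final run command are assumptions following the FAQ page cited earlier in the thread:

    # return to the directory the job was submitted from
    cd "$PBS_O_WORKDIR"
    # build .machines for a k-point parallel run
    echo '#' > .machines
    awk '{print "1:"$1":1"}' "$PBS_NODEFILE" >> .machines
    echo 'granularity:1' >> .machines
    echo 'extrafine:1' >> .machines
    # start the parallel SCF cycle
    run_lapw -p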

Re: [Wien] PBS run

2017-09-08 Thread Gavin Abo
Does lapw0 exist in your WIEN2k directory (/home/sjana/WIEN2k_14.2)? Maybe #PBS -V is needed [ https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg15985.html ]. On 9/8/2017 1:42 AM, Subrata Jana wrote: Dear All,  I am trying to run WIEN2k parallel. My shell script is looking
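
A hedged sketch of both suggestions, i.e. passing the submission environment to the job and verifying the executable from inside the batch job (paths as quoted above):

    #PBS -V                                 # export the submitting shell's environment (PATH, WIENROOT, ...) to the job
    ls -l /home/sjana/WIEN2k_14.2/lapw0     # does the binary exist in the WIEN2k directory?
    which lapw0                             # is it reachable through PATH inside the batch job?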

[Wien] PBS run

2017-09-08 Thread Subrata Jana
Dear All, I am trying to run WIEN2k in parallel. My shell script looks like this. However, the out.log file shows "lapw0: Command not found" and "> stop error". Please help. ###

[Wien] PBS

2012-01-07 Thread Yundi Quan
Thanks to all for helping to tackle this problem. Actually, my system administrator seems to have done something which makes my life much easier. Now, everything is done automatically. When the job is killed, I will get the following. .machine0 : 80 processors Child id 1 Process

[Wien] PBS

2012-01-06 Thread Florent Boucher
Dear Laurence, your last lines are exactly what we need! Thank you for this. set remote = /bin/csh $WIENROOT/pbsh; $WIENROOT/pbsh is just: mpirun -x LD_LIBRARY_PATH -x PATH -np 1 --host $1 /bin/csh -c $2 I will try, but I am pretty sure that it will work fine. Regards Florent On 05/01/2012
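
A hedged sketch of such a pbsh wrapper as described in the message; the file name, shell and mpirun line are as quoted, while the surrounding script body is an assumption:

    #!/bin/csh -f
    # $WIENROOT/pbsh: launch a shell on the requested host via mpirun instead of ssh,
    # so the remote process stays under the control of the queuing system.
    # $1 = target host, $2 = command string passed by WIEN2k
    mpirun -x LD_LIBRARY_PATH -x PATH -np 1 --host $1 /bin/csh -c "$2"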

[Wien] PBS

2012-01-05 Thread Florent Boucher
Dear Yundi, this is a known limitation of ssh and rsh that does not pass the interrupt signal to the remote host. Under LSF I had a solution in the past: a specific rshlsf for doing this. Actually I use either SGE or PBS on two different clusters and the problem exists. You will see that

[Wien] PBS

2012-01-05 Thread Peter Blaha
I've never done this myself, but as far as I know one can define a prolog script in all those queuing systems, and this prolog script should ssh to all assigned nodes and kill all remaining jobs of this user. On 05.01.2012 10:17, Florent Boucher wrote: Dear Yundi, this is a known limitation
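
A hedged sketch of such a cleanup script, assuming the node list is available in $PBS_NODEFILE and that only the WIEN2k binaries of this user should be killed:

    #!/bin/bash
    # kill leftover WIEN2k processes of this user on every node assigned to the job
    for host in $(sort -u "$PBS_NODEFILE"); do
        ssh "$host" "pkill -u $USER lapw0 ; pkill -u $USER lapw1 ; pkill -u $USER lapw2"
    done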

[Wien] PBS

2012-01-05 Thread Laurence Marks
As Florent said, this is a known issue with some (not all) versions of ssh, and it is also a torque bug. What you have to do is use mpirun instead of ssh to launch jobs which I think you can do by setting the MPI_REMOTE/USE_REMOTE switches. I think I posted how to do this some time ago, so please
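
A hedged sketch of what those switches look like in $WIENROOT/parallel_options; the values shown are assumptions and depend on the WIEN2k version and the MPI installation:

    # parallel_options is sourced by the WIEN2k csh scripts
    setenv USE_REMOTE 0     # 0: start k-parallel jobs without ssh/rsh
    setenv MPI_REMOTE 0     # 0: let mpirun itself place the MPI processes on the remote hosts
    setenv WIEN_MPIRUN "mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_"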

[Wien] PBS

2012-01-05 Thread Peter Blaha
It is NOT true that queuing systems cannot do the WIEN2k style. We have two big clusters and run all three types of jobs on them: i) ssh only (k-parallel), ii) mpi-parallel only (no ssh), and also the mixed type. And of course the administrators configured the sun grid engine so that it makes sure

[Wien] PBS

2012-01-05 Thread Laurence Marks
I gave a slightly jetlagged response -- for certain, the WIEN2k style works fine with all queuing systems. But...it may not fit how the queuing system has been designed, and admins may not be accommodating. My understanding (second hand) is that torque is designed to work well with openmpi for

[Wien] PBS

2012-01-04 Thread Yundi Quan
I'm working on a cluster using the torque queue system. I can directly ssh to any node without using a password. When I use qdel (or canceljob) with the jobid to terminate a running job, the job will be terminated in the queue system. However, when I ssh to the nodes, the jobs are still running. Does anyone know
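
A hedged sketch of how such leftovers can be inspected by hand; the node names are hypothetical placeholders for the nodes the job ran on:

    # after qdel, check each node of the job for processes that survived
    for host in node01 node02; do
        ssh "$host" "ps -u $USER -o pid,etime,cmd | grep -i lapw"
    done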