On 30.09.2008 at 00:30, Zhiliang Hu wrote:

At 12:10 AM 9/30/2008 +0200, you wrote:

Can you please try this jobscript instead:

#!/bin/sh
set | grep PBS
/path/to/mpirun /path/to/my_program

All should be handled by Open MPI automatically. With the "set" bash
command you will get a list of all defined variables for further
analysis, where you can check for the variables set by Torque.
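
A slightly more verbose variant of the same check is sketched below, in case the plain "set | grep PBS" line prints nothing and it is unclear why. It assumes only that Torque exports its variables with a PBS_ prefix and sets PBS_NODEFILE inside a running job; the helper name show_pbs_env is made up for illustration, and the mpirun line from the jobscript above would follow it unchanged.

```shell
#!/bin/sh
# Diagnostic sketch, not a replacement for the jobscript above.

# Print every exported variable whose name starts with PBS_.
# The exit status is that of grep: non-zero when nothing matched.
show_pbs_env() {
    env | grep '^PBS_'
}

if ! show_pbs_env; then
    echo "no PBS_ variables found; the script may not be running under Torque" >&2
fi

# PBS_NODEFILE is set by Torque inside a job and lists the allocated hosts.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    echo "allocated nodes (count per host):"
    sort "$PBS_NODEFILE" | uniq -c
fi
```

If even the message on stderr never reaches the .e file, the job most likely never started on a node at all.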

-- Reuti

The "set | grep PBS" part produced no output.

Strange - did you check the .o and .e files of the job? - Reuti

There is nothing in either the -o or the -e output.  I had to kill the job.
I checked the Torque log (/var/spool/torque/server_logs); it shows:

09/29/2008 15:52:16;0100;PBS_Server;Job;799.xxx.xxx.xxx;enqueuing into default, state 1 hop 1
09/29/2008 15:52:16;0008;PBS_Server;Job;799.xxx.xxx.xxx;Job Queued at request of z...@xxx.xxx.xxx, owner = z...@xxx.xxx.xxx, job name = mpiblastn.sh, queue = default
09/29/2008 15:52:16;0040;PBS_Server;Svr;xxx.xxx.xxx;Scheduler sent command new
09/29/2008 15:52:16;0008;PBS_Server;Job;799.xxx.xxx.xxx;Job Modified at request of schedu...@xxx.xxx.xxx
09/29/2008 15:52:27;0008;PBS_Server;Job;799.xxx.xxx.xxx;Job deleted at request of z...@xxx.xxx.xxx
09/29/2008 15:52:27;0100;PBS_Server;Job;799.xxx.xxx.xxx;dequeuing from default, state EXITING
09/29/2008 15:52:27;0040;PBS_Server;Svr;xxx.xxx.xxx;Scheduler sent command term
09/29/2008 15:52:47;0001;PBS_Server;Svr;PBS_Server;is_request, bad attempt to connect from 172.16.100.1:1021 (address not trusted - check entry in server_priv/nodes)

As you blanked out some addresses: do the nodes and the headnode have
one or two network cards installed? Are all the names like node001 et
al. known on each node by the correct address? I.e. 172.16.100.1 = node001?
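
One way to answer this is sketched below: list the node names Torque knows about and print the address each one resolves to on the machine where the script runs. The nodes file path is an assumption pieced together from the log directory and the "server_priv/nodes" hint in the error message; the function names are made up for illustration, and getent is assumed to be available (it is on typical Linux installations).

```shell
#!/bin/sh
# Sketch: show which address each Torque node name resolves to on this host.
# NODES_FILE is an assumed location; adjust it to the actual installation.
NODES_FILE=${NODES_FILE:-/var/spool/torque/server_priv/nodes}

# Print the first field (the host name) of every line that is neither
# blank nor a comment.
list_node_names() {
    awk 'NF && $1 !~ /^#/ { print $1 }' "$1"
}

# Print "name address" per node, or "name UNRESOLVED" if the lookup fails.
check_nodes() {
    list_node_names "$1" | while read -r name; do
        # getent uses the system resolver order (/etc/hosts, DNS, ...).
        addr=$(getent hosts "$name" | awk '{ print $1; exit }')
        echo "$name ${addr:-UNRESOLVED}"
    done
}

if [ -r "$NODES_FILE" ]; then
    check_nodes "$NODES_FILE"
fi
```

The nodes file is usually readable only by root on the headnode, so for a run on a compute node a copy of the file can be supplied via NODES_FILE. If 172.16.100.1 shows up under a name that is missing from the file, or a listed name resolves to an address on the other network card, that would match the "address not trusted" message.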

-- Reuti

There should be no problem in this regard -- the setup was done by a
commercial company.

Okay, then they should solve the problem as you paid for it.

-- Reuti
