I tried some new parameters.
Output of 'print server' from qmgr:
----------------
create queue batch
set queue batch queue_type = Execution
set queue batch resources_default.mem = 2000mb
set queue batch resources_default.nodes = 1
set queue batch resources_default.pvmem = 16000mb
set queue batch resources_default.walltime = 06:00:00
set queue batch enabled = True
set queue batch started = True
#
# Set server attributes.
#
set server scheduling = True
set server managers = [EMAIL PROTECTED]
set server operators = [EMAIL PROTECTED]
set server default_queue = batch
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server pbs_version = 2.1.8
----------------------
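One thing I notice in the dump above: resources_default.pvmem = 16000mb is applied per process whenever a job does not request pvmem itself. If that default turns out to be the culprit (see my reading of the checkjob output below), it could be lowered or removed with qmgr. A minimal sketch, assuming standard qmgr syntax; the 7000mb value is only an example, not a tested figure:

# lower the per-process virtual memory default on the batch queue
qmgr -c "set queue batch resources_default.pvmem = 7000mb"
# or remove the default entirely, so jobs must request pvmem themselves
qmgr -c "unset queue batch resources_default.pvmem"
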
checkjob output:
----------------------
checking job 90 (RM job '90.em-research00')
State: Idle EState: Deferred
Creds: user:abaqus group:users class:batch qos:DEFAULT
WallTime: 00:00:00 of 5:00:00
SubmitTime: Tue May 15 11:59:03
(Time Queued Total: 1:58:17 Eligible: 00:00:00)
Total Tasks: 4
Req[0] TaskCount: 4 Partition: ALL
Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 15G
Opsys: [NONE] Arch: [NONE] Features: [NONE]
Exec: '' ExecSize: 0 ImageSize: 0
Dedicated Resources Per Task: PROCS: 1 MEM: 250M SWAP: 15G
NodeAccess: SHARED
TasksPerNode: 2 NodeCount: 2
IWD: [NONE] Executable: [NONE]
Bypass: 0 StartCount: 0
PartitionMask: [ALL]
SystemQueueTime: Tue May 15 13:00:06
Flags: RESTARTABLE
job is deferred. Reason: NoResources (cannot create reservation for
job '90' (intital reservation attempt)
)
Holds: Defer (hold reason: NoResources)
PE: 6.07 StartPriority: 57
cannot select job 90 for partition DEFAULT (job hold active)
-------------------
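If I'm reading the checkjob output above correctly, the swap numbers explain the deferral. A back-of-the-envelope check (my interpretation, using the per-task figures above and the checknode output quoted further down):

# per task:   PROCS: 1   MEM: 250M   SWAP: 15G  (15G matches the 16000mb pvmem default)
# layout:     TasksPerNode: 2, NodeCount: 2
# swap needed per node:  2 tasks x 15G = 30G
# configured swap:       em-research00: 33G (fits); second node: 17G (does not)
# => Maui cannot place two tasks on the smaller node, so no reservation
#    can be created and the job is deferred with NoResources.
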
PBS script:
-------------------
#!/bin/bash
#PBS -l nodes=2:ppn=2
#PBS -l walltime=05:00:00
#PBS -l mem=1000mb
#PBS -l vmem=7000mb
#PBS -j oe
#PBS -M [EMAIL PROTECTED]
#PBS -m bae
# Recreate the submit directory on this node, fetch the input, and cd into it
mkdir -p $PBS_O_WORKDIR
string="$PBS_O_WORKDIR/plus2gb.inp"
scp 10.1.0.52:$string $PBS_O_WORKDIR
cd $PBS_O_WORKDIR
#module load abaqus
#
/Apps/abaqus/Commands/abaqus job=plus2gb queue=abaqus4cpu \
    input=Standard_plus2gbyte.inp cpus=4
---------------------------
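Following Rishi's pvmem suggestion quoted further down, one variant still worth trying is to request per-process virtual memory in the script itself, so the 16000mb queue default is not applied on top of it. A sketch only; the 7000mb figure simply mirrors my current vmem request and is not a tested value:

#PBS -l nodes=2:ppn=2
#PBS -l pvmem=7000mb
# pvmem is a per-process limit, so with ppn=2 this amounts to roughly
# 14000mb of virtual memory per node, which both machines can supply.
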
Abaqus environment file:
--------------------------
import os
os.environ['LAMRSH'] = 'ssh'
max_cpus=6
mp_host_list=[['em-research00',3],['10.1.0.97',2]]
run_mode = BATCH
scratch = "/home/abaqus"
queue_name=["cpu","abaqus4cpu"]
queue_cmd="qsub -r n -q batch -S /bin/bash -V -l nodes=1:ppn=1 %S"
cpu="qsub -r n -q batch -S /bin/bash -V -l nodes=1:ppn=2 %S"
abaqus4cpu="qsub -r n -q batch -S /bin/bash -V -l nodes=2:ppn=2 %S"
pre_memory = "3000 mb"
standard_memory = "7000 mb"
---------------------------
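For what it's worth, my understanding of the queue definitions above (an assumption about how Abaqus expands them; %S should be replaced by the script Abaqus generates) is that queue=abaqus4cpu ends up running something like:

qsub -r n -q batch -S /bin/bash -V -l nodes=2:ppn=2 <generated script>

Note that this inner qsub carries no mem/vmem/pvmem request of its own, so the queue defaults would apply to it in full.
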
But still no change.
Thanks for all the help so far.
rishi pathak wrote:
> Also try, in your job script file:
> #PBS -l pvmem=<amount of virtual memory>
>
> On 5/15/07, *rishi pathak* <[EMAIL PROTECTED]> wrote:
>
> I did not see any specific queue in the submit script.
> Have you specified the following for the queue you are using?
>
> resources_default.mem    # available RAM
> resources_default.pvmem  # virtual memory
>
> On 5/15/07, *Daniel Boone* <[EMAIL PROTECTED]> wrote:
>
> Hi
>
> I need to use the swap. I know I don't have enough RAM, but the job
> must be able to run, even if it swaps a lot. Time is not an issue here.
> On one machine the job uses about 7.4GB of swap, and we don't have any
> other machines with more RAM to run it on. The other option would be to
> run the job outside torque/maui, but I'd rather not do that.
>
> Can someone tell me how to read the checkjob -v output? I don't
> understand how to find the errors in it.
>
> rishi pathak wrote:
> > Hi
> > The system memory (RAM) available per process is less than the
> > requested amount. It is not considering swap as an extension of RAM.
> > Try with a reduced system memory request.
> >
> > On 5/14/07, *Daniel Boone* <[EMAIL PROTECTED]> wrote:
> >
> > Hi
> >
> > I'm having the following problem. When I submit a very
> > memory-intensive (mostly swap) job, the job doesn't want to start.
> > It gives the error: cannot select job 62 for partition DEFAULT
> > (job hold active)
> > But I don't understand what the error means.
> >
> > I'm running torque 2.1.8 with maui 3.2.6p19.
> >
> > checkjob -v returns the following:
> > -------------------
> > checking job 62 (RM job '62.em-research00')
> >
> > State: Idle EState: Deferred
> > Creds: user:abaqus group:users class:batch qos:DEFAULT
> > WallTime: 00:00:00 of 6:00:00
> > SubmitTime: Mon May 14 14:13:41
> > (Time Queued Total: 1:53:39 Eligible: 00:00:00)
> >
> > Total Tasks: 4
> >
> > Req[0] TaskCount: 4 Partition: ALL
> > Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0
> > Opsys: [NONE] Arch: [NONE] Features: [NONE]
> > Exec: '' ExecSize: 0 ImageSize: 0
> > Dedicated Resources Per Task: PROCS: 1 MEM: 3875M
> > NodeAccess: SHARED
> > TasksPerNode: 2 NodeCount: 2
> >
> >
> > IWD: [NONE] Executable: [NONE]
> > Bypass: 0 StartCount: 0
> > PartitionMask: [ALL]
> > SystemQueueTime: Mon May 14 15:14:13
> >
> > Flags: RESTARTABLE
> >
> > job is deferred. Reason: NoResources (cannot create reservation for
> > job '62' (intital reservation attempt)
> > )
> > Holds: Defer (hold reason: NoResources)
> > PE: 19.27 StartPriority: 53
> > cannot select job 62 for partition DEFAULT (job hold active)
> > ------------------------
> > checknode of the two nodes:
> > ------------
> > checking node em-research00
> > State: Idle (in current state for 2:31:21)
> > Configured Resources: PROCS: 3 MEM: 2010M SWAP: 33G DISK: 72G
> >
> > Utilized Resources: DISK: 9907M
> > Dedicated Resources: [NONE]
> > Opsys: linux Arch: [NONE]
> > Speed: 1.00 Load: 0.000
> > Network: [DEFAULT]
> > Features: [F]
> > Attributes: [Batch]
> > Classes: [batch 3:3]
> >
> > Total Time: 2:29:18  Up: 2:29:18 (100.00%)  Active: 00:00:00 (0.00%)
> >
> > Reservations:
> > NOTE: no reservations on node
> >
> > --------------------
> > State: Idle (in current state for 2:31:52)
> > Configured Resources: PROCS: 2 MEM: 2012M SWAP: 17G DISK: 35G
> > Utilized Resources: DISK: 24G
> > Dedicated Resources: [NONE]
> > Opsys: linux Arch: [NONE]
> > Speed: 1.00 Load: 0.590
> > Network: [DEFAULT]
> > Features: [NONE]
> > Attributes: [Batch]
> > Classes: [batch 2:2]
> >
> > Total Time: 2:29:49  Up: 2:29:49 (100.00%)  Active: 00:00:00 (0.00%)
> >
> > Reservations:
> > NOTE: no reservations on node
> > -----------------
> > The PBS script I'm using:
> > #!/bin/bash
> > #PBS -l nodes=2:ppn=2
> > #PBS -l walltime=06:00:00
> > #PBS -l mem=15500mb
> > #PBS -j oe
> > # Recreate the submit directory on this node and fetch the input
> > mkdir -p $PBS_O_WORKDIR
> > string="$PBS_O_WORKDIR/plus2gb.inp"
> > scp 10.1.0.52:$string $PBS_O_WORKDIR
> > #scp 10.1.0.52:$PBS_O_WORKDIR'/'$PBS_JOBNAME ./
> > cd $PBS_O_WORKDIR
> > #module load abaqus
> > #
> > /Apps/abaqus/Commands/abaqus job=plus2gb queue=cpu2 \
> >     input=Standard_plus2gbyte.inp cpus=4 mem=15000mb
> > ---------------------------
> > If you need some extra info please let me know.
> >
> > Thank you
> >
> > _______________________________________________
> > mauiusers mailing list
> > [email protected]
> <mailto:[email protected]> <mailto:
> [email protected] <mailto:[email protected]>>
> > http://www.supercluster.org/mailman/listinfo/mauiusers
> >
> > --
> > Regards--
> > Rishi Pathak
> > National PARAM Supercomputing Facility
> > Center for Development of Advanced Computing(C-DAC)
> > Pune University Campus,Ganesh Khind Road
> > Pune-Maharastra
>
> --
> Regards--
> Rishi Pathak
> National PARAM Supercomputing Facility
> Center for Development of Advanced Computing(C-DAC)
> Pune University Campus,Ganesh Khind Road
> Pune-Maharastra
>
> --
> Regards--
> Rishi Pathak
> National PARAM Supercomputing Facility
> Center for Development of Advanced Computing(C-DAC)
> Pune University Campus,Ganesh Khind Road
> Pune-Maharastra
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers