On Mon, 22 Sep 2008 12:29:02 -0500 Tom Rudwick wrote:

Hi Tom,

> We have this in our mom configs:
> size [fs=/tmp]

Yep, something similar (/home instead of /tmp).

> This makes the size resource hold the space available in /tmp
>
> Then we do:
>
> qsub -l ddisk=5gb
>
> Or you could set a default for ddisk for a queue or server.

Exactly.

> It should only run on nodes with the proper amount of free
> space.

Sure, but what happens if I don't define that resource on some nodes? We have some nodes with big disks and others without.

> I'm sure you can do this other ways, but I don't have
> experience with those. Hopefully you can adapt this to what
> you need.

I was trying to define a dynamic resource: a simple script that returns the free disk space on nodes with small disks, and a big constant value on nodes with big disks. But I don't know how to use it: when I submit a job requesting more space than any node has, the job is always scheduled when it shouldn't be.

I have also modified the dynamic resource so that it is now a boolean value (1 if the node has enough space, 0 if not), but, again, if all nodes have the resource set to 0 and I submit a job requesting that resource, the job is still scheduled.

Following:
http://www.clusterresources.com/torquedocs21/a.cmomconfig.shtml

[EMAIL PROTECTED] ~]# pbsnodes -a|grep -c espacio
122
[EMAIL PROTECTED] ~]# pbsnodes -a|grep -c "espacio:0"
122

So, all nodes have the resource=0.

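To make the dynamic-resource idea concrete, here is a minimal sketch of the kind of script described above. It is not the script actually deployed: the threshold `BIG_DISK_KB`, the constant `BIG_CONSTANT_KB`, the function names, and the default path `/home` are all assumptions made for illustration. As far as I understand the mom config documentation, such a script would be attached to a resource name with a shell-escape entry (something like `espacio !/path/to/script`), and whatever it prints becomes the resource value.

```python
#!/usr/bin/env python3
"""Hypothetical dynamic-resource script for pbs_mom (illustrative sketch only)."""
import os
import sys

# Assumed values, not taken from the real setup:
BIG_DISK_KB = 500 * 1024 * 1024   # filesystems of 500 GB or more count as "big"
BIG_CONSTANT_KB = 10**9           # constant value reported by big-disk nodes


def disk_kb(path):
    """Return (total_kb, free_kb) for the filesystem that holds `path`."""
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize // 1024
    free = st.f_bavail * st.f_frsize // 1024   # space available to non-root users
    return total, free


def resource_value(total_kb, free_kb):
    """Big disks report a fixed large value; small disks report real free space."""
    if total_kb >= BIG_DISK_KB:
        return BIG_CONSTANT_KB
    return free_kb


if __name__ == "__main__":
    # The directory to check can be passed as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/home"
    print(resource_value(*disk_kb(path)))
```

Run by hand, it prints a single integer (kilobytes) on stdout, for example `./espacio.py /home`. Whether the scheduler then honours that value when placing jobs is exactly the open question in this mail.
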
I submit a job:

[EMAIL PROTECTED] ~]$ echo sleep 5|qsub -l other=espacio -q short
560171.pbs02.pic.es
[EMAIL PROTECTED] ~]$ qstat -f 560171.pbs02.pic.es
Job Id: 560171.pbs02.pic.es
    Job_Name = STDIN
    Job_Owner = [EMAIL PROTECTED]
    job_state = Q
    queue = short
    server = pbs02.pic.es
    Checkpoint = u
    ctime = Tue Sep 23 10:49:10 2008
    Error_Path = ui01.pic.es:/nfs/pic.es/user/a/arnaubria/STDIN.e560171
    Hold_Types = n
    Join_Path = n
    Keep_Files = n
    Mail_Points = a
    mtime = Tue Sep 23 10:49:10 2008
    Output_Path = ui01.pic.es:/nfs/pic.es/user/a/arnaubria/STDIN.o560171
    Priority = 0
    qtime = Tue Sep 23 10:49:10 2008
    Rerunable = True
    Resource_List.cput = 01:30:00
    Resource_List.other = espacio
    Resource_List.walltime = 03:00:00
    Variable_List = PBS_O_HOME=/nfs/pic.es/user/a/arnaubria,
        PBS_O_LANG=en_US.UTF-8,PBS_O_LOGNAME=arnaubria,
        PBS_O_PATH=/usr/kerberos/bin:/opt/glite/bin:/opt/glite/externals/bin:
        /opt/lcg/bin:/opt/lcg/sbin:/opt/edg/bin:/opt/edg/sbin:/opt/globus/sbin
        :/opt/globus/bin:/opt/gpt/sbin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6
        /bin:/opt/d-cache//srm/bin:/opt/d-cache//dcap/bin:/usr/java/jdk1.5.0_1
        4/bin:/nfs/pic.es/user/a/arnaubria/bin,
        PBS_O_MAIL=/var/spool/mail/arnaubria,PBS_O_SHELL=/bin/bash,
        PBS_O_HOST=ui01.pic.es,PBS_O_WORKDIR=/nfs/pic.es/user/a/arnaubria,
        PBS_O_QUEUE=short
    etime = Tue Sep 23 10:49:10 2008
    submit_args = -l other=espacio -q short

[EMAIL PROTECTED] ~]# qstat 560171
qstat: Unknown Job Id 560171.pbs02.pic.es
[EMAIL PROTECTED] ~]# checkjob 560171

checking job 560171

State: Running
Creds:  user:arnaubria  group:grid  class:short  qos:DEFAULT
WallTime: 00:00:00 of 3:00:00
SubmitTime: Tue Sep 23 10:49:10
  (Time Queued  Total: 00:01:46  Eligible: 00:01:46)

StartTime: Tue Sep 23 10:50:56
Total Tasks: 1

Req[0]  TaskCount: 1  Partition: DEFAULT
Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: [slc4]
Allocated Nodes:
[td060.pic.es:1]

IWD: [NONE]  Executable:  [NONE]
Bypass: 0  StartCount: 1
PartitionMask: [ALL]
Flags:       BACKFILL RESTARTABLE

Reservation '560171' (00:00:00 -> 3:00:00  Duration: 3:00:00)
PE:  1.00  StartPriority:  0
....

It shouldn't start...

If I request file=10000kb, as you do, it works, but then I lose the ability to specify a different directory at the worker node level.

> Tom

Cheers,
Arnau
