Hi Andre,

We have preemption working at our site on that version of Maui.

We have found that the settings below seem to be necessary for
preemption to work at our site. I don't see a SYSCFG line in your
config, or a GROUPCFG entry for the admins group. I may be off
base on these, since I know some bugs have been fixed since we
got this working, but you may want to try setting both.

On this line you set QDEF to the "sys" QOS, but I don't see a "sys" QOS defined anywhere else:

CLASSCFG[admins]        MAXPROC=280 QDEF=sys   PRIORITY=2001

The only matching QOS I see is this "admins" one:

QOSCFG[admins]          QFLAGS=PREEMPTOR  PRIORITY=1000
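
As a rough, untested sketch of what I mean (mapping the admins group and
class to the admins QOS is my assumption, and the QLIST contents are only
a guess based on the two QOS names in your config), the additions might
look something like this:

```
# Hypothetical additions to maui.cfg; untested on 3.2.6p21
SYSCFG            QLIST=admins,general
GROUPCFG[admins]  QDEF=admins QLIST=admins,general
# QDEF here names a QOS that has a matching QOSCFG line
CLASSCFG[admins]  MAXPROC=280 QDEF=admins PRIORITY=2001
```

After changing maui.cfg, restart Maui and rerun checkjob on the queued
job to see whether its qos and flags come out as you expect.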

Good luck,

Tom

(This is a fragment of our Maui config file ...)

QOSWEIGHT 1
SYSCFG QLIST=bigmem,integration,interactive,debug,regress,contingent
QOSCFG[bigmem]       PRIORITY=1  QFLAGS=PREEMPTOR,RESTARTPREEMPT
QOSCFG[integration]  PRIORITY=1  QFLAGS=USERESERVED
QOSCFG[interactive]  PRIORITY=2  QFLAGS=PREEMPTOR,RESTARTPREEMPT
QOSCFG[debug]        PRIORITY=1
QOSCFG[regress]      PRIORITY=-1
QOSCFG[contingent]   PRIORITY=-2 QFLAGS=PREEMPTEE
GROUPCFG[users] QDEF=DEFAULT QLIST=bigmem,integration,interactive,debug,regress,contingent
CLASSCFG[regress] QDEF=contingent



Andre Gauthier wrote:
> Hi, I'm trying to get preemption to work with Maui and Torque.  I
> have a dozen queues, but one is defined as a preemptee (general queue &
> qos) and another as a preemptor (admins queue & qos).  I submit a job
> to the queue that is a preemptee, then a job to the preemptor.  The
> preemptor does not run.  Maui version 3.2.6p21, Torque version
> 2.3.6-1.
>
> qstat:
>
> Job id                    Name             User            Time Use S Queue
> ------------------------- ---------------- --------------- -------- - -----
> 459.hpc-test              sleep.sh         user2           00:00:00 R general
> 460.hpc-test              sleep.sh         user1                  0 Q admins
>
>
> checkjob 460:
>
> checking job 460
>
> State: Idle  EState: Deferred
> Creds:  user:user1  group:admins  class:admins  qos:admins
> WallTime: 00:00:00 of 1:00:00
> SubmitTime: Tue Apr 20 11:41:28
>   (Time Queued  Total: 00:00:02  Eligible: 00:00:01)
>
> StartDate: 00:00:00  Tue Apr 20 11:41:30
> Total Tasks: 8
>
> Req[0]  TaskCount: 8  Partition: ALL
> Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
> Opsys: [NONE]  Arch: [NONE]  Features: [NONE]
> Dedicated Resources Per Task: PROCS: 1  MEM: 32M
>
>
> IWD: [NONE]  Executable:  [NONE]
> Bypass: 0  StartCount: 1
> PartitionMask: [ALL]
> Flags:       RESTARTABLE PREEMPTOR
>
> job is deferred.  Reason:  RMFailure  (cannot start job - RM failure,
> rc: 15044, msg: 'Resource temporarily unavailable MSG=job allocation
> request exceeds currently available cluster nodes, 1 requested, 0
> available')
> Holds:    Defer  (hold reason:  RMFailure)
> PE:  8.00  StartPriority:  3001
> cannot select job 460 for partition DEFAULT (job hold active)
>
>
> checkjob 459:
>
> checking job 459
>
> State: Running
> Creds:  user:user2  group:user2  class:general  qos:general
> WallTime: 00:03:05 of 1:00:00
> SubmitTime: Tue Apr 20 11:41:11
>   (Time Queued  Total: 00:00:19  Eligible: 00:00:01)
>
> StartTime: Tue Apr 20 11:41:30
> Total Tasks: 96
>
> Req[0]  TaskCount: 96  Partition: DEFAULT
> Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
> Opsys: [NONE]  Arch: [NONE]  Features: [NONE]
> Dedicated Resources Per Task: PROCS: 1  MEM: 2M
> Allocated Nodes:
> [compute-0-15:8][compute-0-13:8][compute-0-12:8][compute-0-11:8]
> [compute-0-10:8][compute-0-9:8][compute-0-8:8][compute-0-7:8]
> [compute-0-6:8][compute-0-5:8][compute-0-4:8][compute-0-3:8]
>
>
>
> IWD: [NONE]  Executable:  [NONE]
> Bypass: 0  StartCount: 2
> PartitionMask: [ALL]
> Flags:       RESTARTABLE PREEMPTEE
> Attr:        PREEMPTEE
>
> Reservation '459' (-00:03:06 -> 00:56:54  Duration: 1:00:00)
> PE:  96.00  StartPriority:  200
>
>
>
>
>
> showconfig:
>
> [r...@hpc-test maui]# showconfig
> # Maui version 3.2.6p21 (PID: 16046)
> # global policies
>
> REJECTNEGPRIOJOBS[0]              FALSE
> ENABLENEGJOBPRIORITY[0]           FALSE
> ENABLEMULTINODEJOBS[0]            TRUE
> ENABLEMULTIREQJOBS[0]             FALSE
> BFPRIORITYPOLICY[0]               [NONE]
> JOBPRIOACCRUALPOLICY            QUEUEPOLICY
> NODELOADPOLICY                  ADJUSTSTATE
> USEMACHINESPEED                 FALSE
> USESYSTEMQUEUETIME              TRUE
> USELOCALMACHINEPRIORITY         FALSE
> NODEUNTRACKEDLOADFACTOR         1.2
> JOBNODEMATCHPOLICY[0]
>
> JOBMAXSTARTTIME[0]                  INFINITY
>
> METAMAXTASKS[0]                   0
> NODESETPOLICY[0]                  [NONE]
> NODESETATTRIBUTE[0]               [NONE]
> NODESETLIST[0]
> NODESETDELAY[0]                   00:00:00
> NODESETPRIORITYTYPE[0]            MINLOSS
> NODESETTOLERANCE[0]                 0.00
>
> BACKFILLPOLICY[0]                 FIRSTFIT
> BACKFILLDEPTH[0]                  0
> BACKFILLPROCFACTOR[0]             0
> BACKFILLMAXSCHEDULES[0]           10000
> BACKFILLMETRIC[0]                 PROCS
>
> BFCHUNKDURATION[0]                00:00:00
> BFCHUNKSIZE[0]                    0
> PREEMPTPOLICY[0]                  REQUEUE
> MINADMINSTIME[0]                  00:00:00
> RESOURCELIMITPOLICY[0]
> NODEAVAILABILITYPOLICY[0]         COMBINED:[DEFAULT]
> NODEALLOCATIONPOLICY[0]           MINRESOURCE
> TASKDISTRIBUTIONPOLICY[0]         DEFAULT
> RESERVATIONPOLICY[0]              NEVER
> RESERVATIONRETRYTIME[0]           00:00:00
> RESERVATIONTHRESHOLDTYPE[0]       NONE
> RESERVATIONTHRESHOLDVALUE[0]      0
>
> FSPOLICY                        [NONE]
> FSPOLICY                        [NONE]
> FSINTERVAL                      12:00:00
> FSDEPTH                         8
> FSDECAY                         1.00
>
>
>
> # Priority Weights
>
> SERVICEWEIGHT[0]                  1
> TARGETWEIGHT[0]                   1
> CREDWEIGHT[0]                     1
> ATTRWEIGHT[0]                     1
> FSWEIGHT[0]                       1
> RESWEIGHT[0]                      1
> USAGEWEIGHT[0]                    1
> QUEUETIMEWEIGHT[0]                1
> XFACTORWEIGHT[0]                  0
> SPVIOLATIONWEIGHT[0]              0
> BYPASSWEIGHT[0]                   0
> TARGETQUEUETIMEWEIGHT[0]          0
> TARGETXFACTORWEIGHT[0]            0
> USERWEIGHT[0]                     1
> GROUPWEIGHT[0]                    1
> ACCOUNTWEIGHT[0]                  0
> QOSWEIGHT[0]                      1
> CLASSWEIGHT[0]                    1
> FSUSERWEIGHT[0]                   0
> FSGROUPWEIGHT[0]                  0
> FSACCOUNTWEIGHT[0]                0
> FSQOSWEIGHT[0]                    0
> FSCLASSWEIGHT[0]                  0
> ATTRATTRWEIGHT[0]                 0
> ATTRSTATEWEIGHT[0]                0
> NODEWEIGHT[0]                     0
> PROCWEIGHT[0]                     0
> MEMWEIGHT[0]                      0
> SWAPWEIGHT[0]                     0
> DISKWEIGHT[0]                     0
> PSWEIGHT[0]                       0
> PEWEIGHT[0]                       0
> WALLTIMEWEIGHT[0]                 0
> UPROCWEIGHT[0]                    0
> UJOBWEIGHT[0]                     0
> CONSUMEDWEIGHT[0]                 0
> USAGEEXECUTIONTIMEWEIGHT[0]       0
> REMAININGWEIGHT[0]                0
> PERCENTWEIGHT[0]                  0
> XFMINWCLIMIT[0]                   00:02:00
>
>
> # partition DEFAULT policies
>
> REJECTNEGPRIOJOBS[1]              FALSE
> ENABLENEGJOBPRIORITY[1]           FALSE
> ENABLEMULTINODEJOBS[1]            TRUE
> ENABLEMULTIREQJOBS[1]             FALSE
> BFPRIORITYPOLICY[1]               [NONE]
> JOBPRIOACCRUALPOLICY            QUEUEPOLICY
> NODELOADPOLICY                  ADJUSTSTATE
> JOBNODEMATCHPOLICY[1]
>
> JOBMAXSTARTTIME[1]                  INFINITY
>
> METAMAXTASKS[1]                   0
> NODESETPOLICY[1]                  [NONE]
> NODESETATTRIBUTE[1]               [NONE]
> NODESETLIST[1]
> NODESETDELAY[1]                   00:00:00
> NODESETPRIORITYTYPE[1]            MINLOSS
> NODESETTOLERANCE[1]                 0.00
>
> # Priority Weights
>
> XFMINWCLIMIT[1]                   00:00:00
>
> RMAUTHTYPE[0]                     CHECKSUM
>
> CLASSCFG[[NONE]]  DEFAULT.FEATURES=[NONE]
> CLASSCFG[[ALL]]  DEFAULT.FEATURES=[NONE]
> CLASSCFG[DEFAULT]  DEFAULT.FEATURES=[NONE]
> CLASSCFG[batch]  DEFAULT.FEATURES=[NONE]
> CLASSCFG[interactive]  DEFAULT.FEATURES=[NONE]
> CLASSCFG[general]  DEFAULT.FEATURES=[NONE]
> CLASSCFG[priya]  DEFAULT.FEATURES=[NONE]
> CLASSCFG[admins]  DEFAULT.FEATURES=[NONE]
> CLASSCFG[sohrab]  DEFAULT.FEATURES=[NONE]
> CLASSCFG[micro]  DEFAULT.FEATURES=[NONE]
> CLASSCFG[altonji]  DEFAULT.FEATURES=[NONE]
> CLASSCFG[easther]  DEFAULT.FEATURES=[NONE]
> CLASSCFG[berry]  DEFAULT.FEATURES=[NONE]
> CLASSCFG[hpcprog]  DEFAULT.FEATURES=[NONE]
> CLASSCFG[macro]  DEFAULT.FEATURES=[NONE]
> QOSPRIORITY[0]                    0
> QOSQTWEIGHT[0]                    0
> QOSXFWEIGHT[0]                    0
> QOSTARGETXF[0]                      0.00
> QOSTARGETQT[0]                    00:00:00
> QOSFLAGS[0]
> QOSPRIORITY[1]                    0
> QOSQTWEIGHT[1]                    0
> QOSXFWEIGHT[1]                    0
> QOSTARGETXF[1]                      0.00
> QOSTARGETQT[1]                    00:00:00
> QOSFLAGS[1]
> QOSPRIORITY[2]                    100
> QOSQTWEIGHT[2]                    0
> QOSXFWEIGHT[2]                    0
> QOSTARGETXF[2]                    100.00
> QOSTARGETQT[2]                    00:00:00
> QOSFLAGS[2]
> QOSPRIORITY[3]                    -1000
> QOSQTWEIGHT[3]                    0
> QOSXFWEIGHT[3]                    0
> QOSTARGETXF[3]                      0.00
> QOSTARGETQT[3]                    00:00:00
> QOSFLAGS[3]
> QOSPRIORITY[4]                    1000
> QOSQTWEIGHT[4]                    0
> QOSXFWEIGHT[4]                    0
> QOSTARGETXF[4]                      0.00
> QOSTARGETQT[4]                    00:00:00
> QOSFLAGS[4]                       PREEMPTOR
> QOSPRIORITY[5]                    100
> QOSQTWEIGHT[5]                    0
> QOSXFWEIGHT[5]                    0
> QOSTARGETXF[5]                      0.00
> QOSTARGETQT[5]                    00:00:00
> QOSFLAGS[5]                       PREEMPTEE
> # SERVER MODULES:  MX
> SERVERMODE                      NORMAL
> SERVERNAME
> SERVERHOST                      hpc-test.wss.yale.edu
> SERVERPORT                      42559
> LOGFILE                         maui.log
> LOGFILEMAXSIZE                  10000000
> LOGFILEROLLDEPTH                1
> LOGLEVEL                        3
> LOGFACILITY                     fALL
> SERVERHOMEDIR                   /opt/maui/
> TOOLSDIR                        /opt/maui/tools/
> LOGDIR                          /opt/maui/log/
> STATDIR                         /opt/maui/stats/
> LOCKFILE                        /opt/maui/maui.pid
> SERVERCONFIGFILE                /opt/maui/maui.cfg
> CHECKPOINTFILE                  /opt/maui/maui.ck
> CHECKPOINTINTERVAL              00:05:00
> CHECKPOINTEXPIRATIONTIME        3:11:20:00
> TRAPJOB
> TRAPNODE
> TRAPFUNCTION
> RESDEPTH                        24
>
> RMPOLLINTERVAL                  00:00:30
> NODEACCESSPOLICY                SHARED
> ALLOCLOCALITYPOLICY             [NONE]
> SIMTIMEPOLICY                   [NONE]
> ADMIN1                          maui root
> ADMINHOSTS                      ALL
> NODEPOLLFREQUENCY               0
> DISPLAYFLAGS
> DEFAULTDOMAIN
> DEFAULTCLASSLIST                [DEFAULT:1]
> FEATURENODETYPEHEADER
> FEATUREPROCSPEEDHEADER
> FEATUREPARTITIONHEADER
> DEFERTIME                       1:00:00
> DEFERCOUNT                      24
> DEFERSTARTCOUNT                 1
> JOBPURGETIME                    0
> NODEPURGETIME                   2140000000
> APIFAILURETHRESHHOLD            6
> NODESYNCTIME                    600
> JOBSYNCTIME                     600
> JOBMAXOVERRUN                   00:10:00
> NODEMAXLOAD                     0.0
>
> PLOTMINTIME                     120
> PLOTMAXTIME                     245760
> PLOTTIMESCALE                   11
> PLOTMINPROC                     1
> PLOTMAXPROC                     512
> PLOTPROCSCALE                   9
> SCHEDCFG[]                        MODE=NORMAL SERVER=hpc-test.wss.yale.edu:42559
> # RM MODULES: PBS SSS WIKI NATIVE
> RMCFG[base] AUTHTYPE=CHECKSUM EPORT=15004 TIMEOUT=00:01:30 TYPE=PBS
> SIMWORKLOADTRACEFILE            workload
> SIMRESOURCETRACEFILE            resource
> SIMAUTOSHUTDOWN                 OFF
> SIMSTARTTIME                    0
> SIMSCALEJOBRUNTIME              FALSE
> SIMFLAGS
> SIMJOBSUBMISSIONPOLICY          CONSTANTJOBDEPTH
> SIMINITIALQUEUEDEPTH            16
> SIMWCACCURACY                   0.00
> SIMWCACCURACYCHANGE             0.00
> SIMNODECOUNT                    0
> SIMNODECONFIGURATION            NORMAL
> SIMWCSCALINGPERCENT             100
> SIMCOMRATE                      0.10
> SIMCOMTYPE                      ROUNDROBIN
> COMINTRAFRAMECOST               0.30
> COMINTERFRAMECOST               0.30
> SIMSTOPITERATION                -1
> SIMEXITITERATION                -1
>
>
>
> cat maui.cfg:
>
>
> # maui.cfg.tmpl for Maui v3.2.5
>
> # full parameter docs at http://supercluster.org/mauidocs/a.fparameters.html
> # use the 'schedctl -l' command to display current configuration
>
> RMPOLLINTERVAL                00:00:30
>
> SERVERHOST            hpc-test.wss.yale.edu
> SERVERPORT            42559
> SERVERMODE            NORMAL
>
> RMCFG[base]           TYPE=PBS TIMEOUT=90
>
> # Admin: http://supercluster.org/mauidocs/a.esecurity.html
> # ADMIN1 users have full scheduler control
>
> ADMIN1                maui root
>
> LOGFILE               maui.log
> LOGFILEMAXSIZE        10000000
> LOGLEVEL              3
>
> # Job Priority: http://supercluster.org/mauidocs/5.1jobprioritization.html
>
> QUEUETIMEWEIGHT       1
>
> # FairShare: http://supercluster.org/mauidocs/6.3fairshare.html
>
> #FSPOLICY              PSDEDICATED
> #FSDEPTH               7
> #FSINTERVAL            86400
> #FSDECAY               0.80
>
> # Throttling Policies: http://supercluster.org/mauidocs/6.2throttlingpolicies.html
>
> # NONE SPECIFIED
>
> # Backfill: http://supercluster.org/mauidocs/8.2backfill.html
>
> BACKFILLPOLICY        FIRSTFIT
> RESERVATIONPOLICY     NEVER # set to never for preemption.
>
> # Node Allocation: http://supercluster.org/mauidocs/5.2nodeallocation.html
>
> NODEALLOCATIONPOLICY  MINRESOURCE
>
> # QOS: http://supercluster.org/mauidocs/7.3qos.html
>
>  QOSCFG[hi]  PRIORITY=100 XFTARGET=100 FLAGS=PREEMPTOR:IGNMAXJOB
>  QOSCFG[low] PRIORITY=-1000 FLAGS=PREEMPTEE
>
> # Standing Reservations: http://supercluster.org/mauidocs/7.1.3standingreservations.html
>
> # SRSTARTTIME[test] 8:00:00
> # SRENDTIME[test]   17:00:00
> # SRDAYS[test]      MON TUE WED THU FRI
> # SRTASKCOUNT[test] 20
> # SRMAXTIME[test]   0:30:00
>
> #PREEMPTPOLICY set by  AG
> PREEMPTIONPOLICY REQUEUE
>
> # Creds: http://supercluster.org/mauidocs/6.1fairnessoverview.html
>
>  USERCFG[DEFAULT]      FSTARGET=25.0
>  USERCFG[john]         PRIORITY=100  FSTARGET=10.0-
>  GROUPCFG[staff]       PRIORITY=1000 QLIST=hi:low QDEF=hi
>  CLASSCFG[batch]       FLAGS=PREEMPTEE
>  CLASSCFG[interactive] FLAGS=PREEMPTOR
>
> ### set QOS needed for preemption
> QOSWEIGHT 1
> QOSCFG[admins]                QFLAGS=PREEMPTOR  PRIORITY=1000
> QOSCFG[general]               QFLAGS=PREEMPTEE PRIORITY=100
>
> GROUPWEIGHT 1
> CLASSWEIGHT 1
> CREDWEIGHT 1
> USERWEIGHT 1
>
>
> CLASSCFG[general] QDEF=general PRIORITY=100
>
> GROUPWEIGHT 1
> CLASSCFG[DEFAULT]     MAXPROC=280 QDEF=general  PRIORITY=200
> CLASSCFG[admins]      MAXPROC=280 QDEF=sys   PRIORITY=2001
> _______________________________________________
> mauiusers mailing list
> [email protected]
> http://www.supercluster.org/mailman/listinfo/mauiusers
>
>   
