Moe,

Regarding PreemptMode=SUSPEND, maybe that's not what I want.  I just
realized that PreemptMode=SUSPEND sends a SIGSTOP/SIGCONT (right?),
which doesn't free resources (as I first thought), but merely pauses
the job's associated processes.  That works pretty well for CPU, but
not for memory!  I might have to rethink wanting to use SUSPEND, and
focus instead on getting a working REQUEUE for jobs in low-priority
partitions.  Unfortunately, I have had some problems with that also...

With PreemptMode=REQUEUE, ProctrackType=proctrack/pgid,
SchedulerType=sched/backfill, SelectType=select/linear, both
partitions Shared=NO.
  - the job in the lower-priority partition is "preempted," in that
its resource allocation is relinquished but it doesn't stop running.
  - slurm's job log lists the low-priority job as "CANCELLED"

With PreemptMode=CANCEL, ProctrackType=proctrack/pgid,
SchedulerType=sched/backfill, SelectType=select/linear, both
partitions Shared=NO.
  - the job in the lower-priority partition is "preempted," in that
its resource allocation is relinquished but it doesn't stop running.
  - slurm's job log lists the low-priority job as "CANCELLED"
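
For reference, here is the combined slurm.conf fragment I'm testing with, assembled from the values in this thread (node and partition definitions as quoted below; only PreemptMode differs between the two runs):

```
# Test config (Slurm 2.2.7 on Ubuntu 11.04)
SchedulerType=sched/backfill
SelectType=select/linear
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE           # or CANCEL -- same result either way
ProctrackType=proctrack/pgid
NodeName=noma Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
PartitionName=batch Nodes=noma Default=NO DefaultTime=INFINITE MaxTime=INFINITE State=UP Priority=10 Shared=NO
PartitionName=interactive Nodes=noma Default=NO MaxTime=INFINITE State=UP Priority=20 Shared=NO
```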

Interestingly, if I submit the jobs with `srun` instead of `salloc`
the preemption actually TERMINATES the jobs in the lower-priority
queue, with either PreemptMode=REQUEUE or PreemptMode=CANCEL.

Thanks again,

Alan

p.s. Forgive me, I forgot to CC the list in the previous message!

On Fri, Jul 15, 2011 at 6:07 PM,  <[email protected]> wrote:
> Quoting Alan Orth <[email protected]>:
>
>> Moe,
>>
>> Thanks for the quick response.  I've just updated my configuration to
>> include some of your tips, but I'm still having problems.  I can
>> confirm the same behavior happens with the linear select plugin; with
>> "PreemptMode=REQUEUE" the resource allocation is revoked from the job
>> in lower-priority partition, but the job continues to run (consuming
>> CPU resources).
>
> Is the job state CG (completing)? If so, then the problem isn't in the
> preemption logic, but in the configuration or communications (i.e. the
> slurmd daemon on the compute nodes isn't doing what the slurmctld (the slurm
> control daemon) is telling it to do). Alternately it might not be finding
> all of the processes due to the ProctrackType configuration. If that's the
> case, the SlurmdLogFile and SlurmctldLogFile should help to diagnose the
> problem. Running "scontrol show config" will show you what all of these
> values are.
>
>
>> With "PreemptMode=SUSPEND,GANG" the job submitted in
>> the higher-priority partition simply waits until there are free slots.
>>  The behavior doesn't seem to change with either select/linear or
>> select/cons_res.
>>
>> Again, relevant slurm.conf sections from my slurm 2.2.7 test
>> installation (on Ubuntu 11.04):
>>
>> SchedulerType=sched/backfill
>> SelectType=select/linear
>> PreemptType=preempt/partition_prio
>> NodeName=noma CoresPerSocket=4 ThreadsPerCore=1 Sockets=1 State=UNKNOWN
>> PartitionName=batch Nodes=noma Default=NO DefaultTime=INFINITE
>> MaxTime=INFINITE State=UP Priority=10 Shared=NO
>> PartitionName=interactive Nodes=noma Default=NO MaxTime=INFINITE
>> State=UP Priority=20 Shared=NO
>>
>> Regarding my previous use of the "Shared=Force:1" option in the
>> low-priority partition, I had specified it because the
>> documentation[1] mentions "By default the max_share value is 4. In
>> order to preempt jobs (and not gang schedule them), always set
>> max_share to 1."
>
> That is the correct configuration for preempting a job through suspending
> it, but not if you want its resources to be relinquished before starting
> another job on the same resources (i.e. with PreemptMode=Cancel or Requeue).
> In the latter case, you need Shared=NO.
>
>> Cheers and thanks,
>>
>> Alan
>>
>> [1] https://computing.llnl.gov/linux/slurm/preempt.html
>>
>> On Thu, Jul 14, 2011 at 6:15 PM,  <[email protected]> wrote:
>>>
>>> Alan,
>>>
>>> I believe that you need "Shared=NO" for both partitions and preemption
>>> modes
>>> PreemptMode=CANCEL or REQUEUE. For PreemptMode=Suspend, it seems to work
>>> fine for SelectType=select/linear, but not for
>>> SelectType=select/cons_res.
>>> I'll make a note of this bug in the select/cons_res plugin, but I'm not
>>> sure
>>> when it will get fixed.
>>>
>>> Moe Jette
>>>
>>>
>>> Quoting Alan Orth <[email protected]>:
>>>
>>>> I'm having problems getting basic partition-based preemption working.
>>>> For testing purposes I've set up a cluster with 4 CPUs and two
>>>> partitions (each with different priorities). I can't figure out how to
>>>> get the higher-priority partition to preempt the lower-priority
>>>> partition.  This test configuration has 4 CPU slots.
>>>>
>>>> First, ask for 4 CPUs, in the batch partition.
>>>> $ salloc -n4 -p batch openssl speed
>>>> salloc: Granted job allocation 68
>>>> Doing md2 for 3s on 16 size blocks: 305643 md2's in 2.97s
>>>>
>>>> Second, ask for 4 CPUs, in the interactive partition:
>>>> $ salloc -n4 -p interactive openssl speed
>>>> salloc: Pending job allocation 71
>>>> salloc: job 71 queued and waiting for resources
>>>>
>>>> With PreemptMode=SUSPEND it will wait until the low-priority job
>>>> finishes (as shown above).  If PreemptMode=CANCEL or REQUEUE, the
>>>> low-priority job allocation is "revoked", but the job keeps running!!!
>>>>  Have I misread or misunderstood something about Preemption in
>>>> partitions?
>>>>
>>>> Thanks!
>>>>
>>>> Here are the relevant configuration options I've set:
>>>>
>>>> From slurm.conf:
>>>> SchedulerType=sched/backfill
>>>> SelectType=select/cons_res
>>>> SelectTypeParameters=CR_CPU
>>>> PreemptMode=SUSPEND,GANG
>>>> PreemptType=preempt/partition_prio
>>>> NodeName=noma CoresPerSocket=4 ThreadsPerCore=1 Sockets=1 State=UNKNOWN
>>>> PartitionName=batch Nodes=noma Default=NO DefaultTime=INFINITE
>>>> MaxTime=INFINITE State=UP Priority=10 Shared=Force:1
>>>> PartitionName=interactive Nodes=noma Default=NO MaxTime=INFINITE
>>>> State=UP Priority=20 Shared=NO
>>>>
>>>> --
>>>> Alan Orth
>>>> [email protected]
>>>> http://alaninkenya.org
>>>> http://mjanja.co.ke
>>>> "You cannot simultaneously prevent and prepare for war." -Albert
>>>> Einstein
>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>> --
>> Alan Orth
>> [email protected]
>> http://alaninkenya.org
>> http://mjanja.co.ke
>> "You cannot simultaneously prevent and prepare for war." -Albert Einstein
>>
>
>
>
>



-- 
Alan Orth
[email protected]
http://alaninkenya.org
http://mjanja.co.ke
"You cannot simultaneously prevent and prepare for war." -Albert Einstein
