Re: Try setting kern.sched.preempt_thresh != 0

2018-04-04 Thread Peter

Stefan Esser wrote:


I'm guessing that the problem is caused by kern.sched.preempt_thresh=0, which
prevents preemption of low priority processes by interactive or I/O bound
processes.

For a quick test try:

# sysctl kern.sched.preempt_thresh=1


Hi Stefan,

Thank you, that's an interesting knob! But it is actually the other way 
round: it is not set to 0. My settings (the defaults) are:


kern.sched.steal_thresh: 2
kern.sched.steal_idle: 1
kern.sched.balance_interval: 127
kern.sched.balance: 1
kern.sched.affinity: 1
kern.sched.idlespinthresh: 157
kern.sched.idlespins: 1
kern.sched.static_boost: 152
kern.sched.preempt_thresh: 80
kern.sched.interact: 30
kern.sched.slice: 12
kern.sched.quantum: 94488
kern.sched.name: ULE
kern.sched.preemption: 1
kern.sched.cpusetsize: 4

But then, if I change kern.sched.preempt_thresh to 1 *OR* 0, things 
behave properly! To be precise, changing from 8 down to 7 changes things 
completely:


>pool      alloc   free   read  write   read  write
>cache         -      -      -      -      -      -
>  ada1s4  7.08G  10.9G    927      0  7.32M      0

>   PID USERNAME   PRI NICE   SIZE    RES STATE    TIME    WCPU COMMAND
>  1900 pgsql       82    0   618M 17532K RUN      0:53  34.90% postgres
>  1911 admin       81    0  7044K  2824K RUN      6:07  28.34% bash

(Notice the PRI values, which also look different now.)

rgds,
P.
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Try setting kern.sched.preempt_thresh != 0 (was: Re: kern.sched.quantum: Creepy, sadistic scheduler)

2018-04-04 Thread Stefan Esser
Am 04.04.18 um 12:39 schrieb Alban Hertroys:
> 
>> On 4 Apr 2018, at 2:52, Peter  wrote:
>>
>> Occasionally I noticed that the system would not quickly process the
>> tasks i need done, but instead prefer other, longrunning tasks. I
>> figured it must be related to the scheduler, and decided it hates me.
> 
> If it hated you, it would behave much worse.
> 
>> A closer look shows the behaviour as follows (single CPU):
> 
> A single CPU? That's becoming rare! Is that a VM? Old hardware? Something 
> really specific?
> 
>> Lets run an I/O-active task, e.g, postgres VACUUM that would
> 
> And you're running a multi-process database server on it no less. That is 
> going to hurt, no matter how well the scheduler works.
> 
>> continuously read from big files (while doing compute as well [1]):
>>> pool      alloc   free   read  write   read  write
>>> cache         -      -      -      -      -      -
>>>  ada1s4  7.08G  10.9G  1.58K      0  12.9M      0
>>
>> Now start an endless loop:
>> # while true; do :; done
>>
>> And the effect is:
>>> pool      alloc   free   read  write   read  write
>>> cache         -      -      -      -      -      -
>>>  ada1s4  7.08G  10.9G      9      0  76.8K      0
>>
>> The VACUUM gets almost stuck! This matches the WCPU figures in "top":
>>
>>>   PID USERNAME   PRI NICE   SIZE    RES STATE    TIME    WCPU COMMAND
>>> 85583 root        99    0  7044K  1944K RUN      1:06  92.21% bash
>>> 53005 pgsql       52    0   620M 91856K RUN      5:47   0.50% postgres
>>
>> Hacking on kern.sched.quantum makes it quite a bit better:
>> # sysctl kern.sched.quantum=1
>> kern.sched.quantum: 94488 -> 7874
>>
>>> pool      alloc   free   read  write   read  write
>>> cache         -      -      -      -      -      -
>>>  ada1s4  7.08G  10.9G    395      0  3.12M      0
>>
>>>   PID USERNAME   PRI NICE   SIZE    RES STATE    TIME    WCPU COMMAND
>>> 85583 root        94    0  7044K  1944K RUN      4:13  70.80% bash
>>> 53005 pgsql       52    0   276M 91856K RUN      5:52  11.83% postgres
>>
>>
>> Now, as usual, the "root-cause" questions arise: What exactly does
>> this "quantum" do? Is this solution a workaround, i.e. is something
>> else actually wrong, and does it have tradeoffs in other situations?
>> Or otherwise, why was such a default value chosen, which appears to be
>> ill-conceived?
>>
>> The docs for the quantum parameter are a bit unsatisfying - they say
>> it's the max number of ticks a process gets - and what happens when
>> they're exhausted? If by default the endless loop is actually allowed
>> to continue running for 94k ticks (or 94 ms, more likely) uninterrupted,
>> then that explains the perceived behaviour - but that's certainly not
>> what a scheduler should do when other procs are ready to run.
> 
> I can answer this from the operating systems course I followed recently. This 
> does not apply to FreeBSD specifically, it is general job scheduling theory. 
> I still need to read up on SCHED_ULE to see how the details were implemented 
> there. Or are you using the older SCHED_4BSD?
> Anyway...
> 
> Jobs that are ready to run are collected on a ready queue. Since you have a 
> single CPU, there can only be a single job active on the CPU. When that job 
> is finished, the scheduler takes the next job in the ready queue and assigns 
> it to the CPU, etc.
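
That ready-queue model can be sketched in a few lines. This is a toy
simulation for illustration only - the job names, tick counts, and I/O wait
are invented, and it is general scheduling theory, not SCHED_ULE code. With a
long quantum and no preemption on wakeup, an I/O-bound job that sleeps after
every tick of work is starved by a pure CPU loop, and shrinking the quantum
raises its share, much like the figures reported above:

```python
from collections import deque

def simulate(quantum, total_ticks=10_000, io_wait=5):
    """Non-preemptive round robin over a ready queue: the running job
    keeps the CPU until it blocks for I/O or its quantum expires."""
    ready = deque(["hog", "io"])   # both jobs start runnable
    cpu = {"hog": 0, "io": 0}
    io_ready_at = None             # tick when the blocked I/O job wakes
    t = 0
    while t < total_ticks:
        # A finished I/O request puts the job back on the ready queue --
        # but only at the tail, behind the CPU hog.
        if io_ready_at is not None and t >= io_ready_at:
            ready.append("io")
            io_ready_at = None
        job = ready.popleft()
        if job == "hog":           # pure CPU loop: burns the whole quantum
            cpu["hog"] += quantum
            t += quantum
            ready.append("hog")
        else:                      # I/O bound: 1 tick of work, then blocks
            cpu["io"] += 1
            t += 1
            io_ready_at = t + io_wait
    return cpu

for q in (94, 8):
    cpu = simulate(q)
    share = cpu["io"] / (cpu["io"] + cpu["hog"])
    print(f"quantum={q:3d}: io job got {share:6.2%} of the CPU")
```

In this model the I/O job never accumulates more than one tick at a time, so
its CPU share is roughly one tick per two hog quanta - which is why cutting
the quantum by an order of magnitude helps it so dramatically.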

I'm guessing that the problem is caused by kern.sched.preempt_thresh=0, which
prevents preemption of low priority processes by interactive or I/O bound
processes.

For a quick test try:

# sysctl kern.sched.preempt_thresh=1

to see whether it makes a difference. The value 1 is unreasonably low, but it
has the most visible effect in that any higher priority process can steal the
CPU from any lower priority one (high priority corresponds to low PRI values
as displayed by ps -l or top).

Reasonable values might be in the range of 80 to 224, depending on the system
usage scenario (that's what I found suggested on the mailing lists).

Higher values result in less preemption.
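
The behaviour described here boils down to a small predicate. The sketch below
is a toy model of that description only - the function name and the exact
comparison are invented for illustration, and the kernel's real decision is
made inside the ULE scheduler (sys/kern/sched_ule.c) and differs in detail:

```python
def should_preempt(new_pri, cur_pri, preempt_thresh):
    """Toy model of the preemption gate as described above (NOT the
    actual sched_ule.c logic).  Lower PRI number = higher priority.
    preempt_thresh == 0 disables preemption; otherwise a waking thread
    steals the CPU only when it has better (lower) priority AND the
    running thread's PRI lies above the threshold - so higher
    thresholds mean less preemption."""
    if preempt_thresh == 0:
        return False                 # preemption disabled entirely
    if new_pri >= cur_pri:
        return False                 # not actually a better priority
    return cur_pri > preempt_thresh  # only low-priority threads are victims

# preempt_thresh=1: an I/O-bound postgres (PRI 52) immediately steals
# the CPU from a looping bash (PRI 99):
print(should_preempt(52, 99, 1))    # True
# A very high threshold protects the running thread -> no preemption:
print(should_preempt(52, 99, 200))  # False
# And 0 switches preemption off altogether:
print(should_preempt(52, 99, 0))    # False
```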

Regards, Stefan