Re: [PATCH] kthread: always create the kernel threads with normal priority
* Michal Schmidt <[EMAIL PROTECTED]> wrote:

> Allow the administrator to change kthreadd's priority and affinity.
> Ensure that the kernel threads are created with the usual nice level
> and affinity even if kthreadd's properties were changed from the
> default.
>
> Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>

thanks Michal, applied your patch to sched-devel.git.

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] kthread: always create the kernel threads with normal priority
On Mon, 7 Jan 2008 09:29:56 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Mon, 7 Jan 2008 12:09:04 +0100 Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
> > > > This causes a practical problem. When a runaway real-time task
> > > > is eating 100% CPU and we attempt to put the CPU offline,
> > > > sometimes we block while waiting for the creation of the
> > > > highest-priority "kstopmachine" thread.
> >
> > sched-devel.git has new mechanisms against runaway RT tasks.
> > There's a new RLIMIT_RTTIME rlimit - if an RT task exceeds that
> > rlimit then it is sent SIGXCPU.
>
> Is that "total RT CPU time" or "elapsed time since last schedule()"?
>
> If the former, it is not useful for this problem.

It's "runtime since last sleep" so it is useful.

I still think the kthread patch is good to have anyway. The user can
have other reasons to change kthreadd's priority/cpumask.

Michal
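
The distinction raised above ("total RT CPU time" versus "runtime since last sleep") can be illustrated with a small userspace model. This is only a sketch of the accounting idea, not kernel code: the struct and function names are hypothetical, and the limit value is whatever the caller passes in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; these are not kernel identifiers. */
struct rt_task_model {
	uint64_t total_rt_usec;    /* RT CPU time consumed over the task's life */
	uint64_t since_sleep_usec; /* RT CPU time consumed since the last sleep */
};

/* Account `usec` of CPU time consumed without sleeping. */
void model_run(struct rt_task_model *t, uint64_t usec)
{
	assert(t != NULL);
	t->total_rt_usec += usec;
	t->since_sleep_usec += usec;
}

/* A sleep resets only the "since last sleep" counter. */
void model_sleep(struct rt_task_model *t)
{
	assert(t != NULL);
	t->since_sleep_usec = 0;
}

/* True if the task would be over a "runtime since last sleep" limit. */
bool model_over_limit(const struct rt_task_model *t, uint64_t limit_usec)
{
	assert(t != NULL);
	return t->since_sleep_usec > limit_usec;
}
```

Under this model a well-behaved RT task that sleeps regularly never trips the limit, however much total CPU time it accumulates, while a runaway task that never sleeps does. That is why the "since last sleep" reading is the useful one for the runaway case discussed here.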
Re: [PATCH] kthread: always create the kernel threads with normal priority
On Mon, 2008-01-07 at 09:29 -0800, Andrew Morton wrote:
> On Mon, 7 Jan 2008 12:09:04 +0100 Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
> > > > This causes a practical problem. When a runaway real-time task is
> > > > eating 100% CPU and we attempt to put the CPU offline, sometimes we
> > > > block while waiting for the creation of the highest-priority
> > > > "kstopmachine" thread.
> >
> > sched-devel.git has new mechanisms against runaway RT tasks. There's a
> > new RLIMIT_RTTIME rlimit - if an RT task exceeds that rlimit then it is
> > sent SIGXCPU.
>
> Is that "total RT CPU time" or "elapsed time since last schedule()"?
>
> If the former, it is not useful for this problem.
>
> > there's also a new group scheduling extension that is driven via a
> > sysctl:
> >
> >   /proc/sys/kernel/sched_rt_ratio
> >
> > this way if a user has a runaway RT task, other users (and root) will
> > still have some CPU time left. (in Peter's latest patchset that is
> > replaced via rt_runtime_ns - but this is a detail)
>
> Doesn't this make the RT task non-RT? Would need to understand more
> details to tell.

It's an artifact of RT group scheduling. Each group has to specify a
period and a runtime limit within it (and the normalized sum of these
must not exceed the total time available, otherwise the set is not
schedulable).

So say we have two groups, A and B. A has a period of 2 seconds and a
runtime limit of 1 second, which gives it an average of 50% CPU time.
If B then has a period of 1 second with a runtime limit of 0.25 s
(average 25%), the total time required to schedule the realtime groups
would be 75% on average.

Without group scheduling everything is considered one group, but we
still have the period and runtime limits.

So as long as the realtime CPU usage fits within the given limits, it
acts as before. Once it exceeds its limit it will be capped hard, which
is OK, since it exceeded its hard limit, and realtime applications are
supposed to be deterministic and thus be able to tell how much time
they'd require.

[ If only this model were true, but it's a model frequently used and
  quite accepted ]
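
The schedulability arithmetic in the message above (group A at 50%, group B at 25%, 75% in total) can be checked with a short calculation. This is a minimal sketch assuming each group is described only by a period and a runtime in nanoseconds; the struct and function names are hypothetical and are not taken from the scheduler patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one RT group; not a kernel structure. */
struct rt_group {
	uint64_t period_ns;  /* length of the accounting period */
	uint64_t runtime_ns; /* RT runtime allowed within one period */
};

#define SHARE_ONE_PPM 1000000ULL /* 100% expressed in parts per million */

/*
 * Average CPU share of one group, in parts per million.
 * Assumes runtime_ns is small enough that the multiplication fits in
 * 64 bits (true for runtimes up to several hours).
 */
uint64_t rt_group_share_ppm(const struct rt_group *g)
{
	assert(g != NULL && g->period_ns > 0);
	return g->runtime_ns * SHARE_ONE_PPM / g->period_ns;
}

/* Sum of the normalized shares of all groups. */
uint64_t rt_total_share_ppm(const struct rt_group *groups, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += rt_group_share_ppm(&groups[i]);
	return sum;
}

/* The set is schedulable only if the shares do not exceed 100%. */
bool rt_set_schedulable(const struct rt_group *groups, size_t n)
{
	return rt_total_share_ppm(groups, n) <= SHARE_ONE_PPM;
}
```

With A = {2 s, 1 s} and B = {1 s, 0.25 s} the shares are 500000 and 250000 ppm, so the total is 750000 ppm and the set passes the check. Integer parts-per-million are used instead of floating point only to keep the comparison exact.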
Re: [PATCH] kthread: always create the kernel threads with normal priority
On Mon, 7 Jan 2008 12:09:04 +0100 Ingo Molnar <[EMAIL PROTECTED]> wrote:

> > > This causes a practical problem. When a runaway real-time task is
> > > eating 100% CPU and we attempt to put the CPU offline, sometimes we
> > > block while waiting for the creation of the highest-priority
> > > "kstopmachine" thread.
>
> sched-devel.git has new mechanisms against runaway RT tasks. There's a
> new RLIMIT_RTTIME rlimit - if an RT task exceeds that rlimit then it is
> sent SIGXCPU.

Is that "total RT CPU time" or "elapsed time since last schedule()"?

If the former, it is not useful for this problem.

> there's also a new group scheduling extension that is driven via a
> sysctl:
>
>   /proc/sys/kernel/sched_rt_ratio
>
> this way if a user has a runaway RT task, other users (and root) will
> still have some CPU time left. (in Peter's latest patchset that is
> replaced via rt_runtime_ns - but this is a detail)

Doesn't this make the RT task non-RT? Would need to understand more
details to tell.

> so instead of the never-ending arms race of kernel thread priorities
> against RT task priorities, we are going towards making RT tasks safer
> on a policy level.
>
> 	Ingo
Re: [PATCH] kthread: always create the kernel threads with normal priority
Hello Michal,

> > Maybe we can find a way to use a similar mechanism as I used in my
> > patchset for the priorities of the remaining kthreads.
> > I do not like the way of forcing userland to change the priorities,
> > because that would require a userland with the chrt tool installed,
> > and that is not that practical for embedded systems (in which there
> > could be cases that there is no userland at all, or the init-process
> > is the whole embedded application). In that case an option to do it on
> > the kernel commandline is more practical.
> >
> > I propose this kernel cmd-line option:
> > kthread_pmap=somethread:50,otherthread:12,34
>
> I see. kthreadd would look up the priority for itself and
> kthread_create would consult the map for all other kernel threads.
> That should work.
> Your sirq_pmap would not be needed anymore, as kthread_pmap could be
> used for softirq threads too, right?

That is correct. The softirqs are just ordinary kernel threads, but
irq_pmap is still needed to set the priority of a certain interrupt
handler. In this case it is also possible to set the priority of the
IRQ kthreads as well as the priority of a certain interrupt handler.
This might give some conflicts, and I have to check how to resolve
these.

Kind Regards,

Remy

2008/1/7, Michal Schmidt <[EMAIL PROTECTED]>:
> On Mon, 7 Jan 2008 12:22:51 +0100
> "Remy Bohmer" <[EMAIL PROTECTED]> wrote:
>
> > Hello Michal and Andrew,
> >
> > > Let's not make the decision for the user. Just allow the
> > > administrator to change kthreadd's priority safely if he chooses to
> > > do it. Ensure that the kernel threads are created with the usual
> > > nice level even if kthreadd's priority is changed from the default.
> >
> > Last year, I posted a patchset (that was meant for Preempt-RT at that
> > time) to be able to prioritise the interrupt-handler-threads (which
> > are kthreads) and softirq-threads from the kernel commandline. See
> > http://lkml.org/lkml/2007/12/19/208
> >
> > Maybe we can find a way to use a similar mechanism as I used in my
> > patchset for the priorities of the remaining kthreads.
> > I do not like the way of forcing userland to change the priorities,
> > because that would require a userland with the chrt tool installed,
> > and that is not that practical for embedded systems (in which there
> > could be cases that there is no userland at all, or the init-process
> > is the whole embedded application). In that case an option to do it on
> > the kernel commandline is more practical.
> >
> > I propose this kernel cmd-line option:
> > kthread_pmap=somethread:50,otherthread:12,34
>
> I see. kthreadd would look up the priority for itself and
> kthread_create would consult the map for all other kernel threads.
> That should work.
> Your sirq_pmap would not be needed anymore, as kthread_pmap could be
> used for softirq threads too, right?
>
> Michal
Re: [PATCH] kthread: always create the kernel threads with normal priority
On Mon, 7 Jan 2008 02:25:13 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Mon, 7 Jan 2008 11:06:03 +0100 Michal Schmidt
> <[EMAIL PROTECTED]> wrote:
>
> > On Sat, 22 Dec 2007 01:30:21 -0800
> > Andrew Morton <[EMAIL PROTECTED]> wrote:
> >
> > > On Mon, 17 Dec 2007 23:43:14 +0100 Michal Schmidt
> > > <[EMAIL PROTECTED]> wrote:
> > >
> > > > kthreadd, the creator of other kernel threads, runs as a normal
> > > > priority task. This is a potential for priority inversion when a
> > > > task wants to spawn a high-priority kernel thread. A middle
> > > > priority SCHED_FIFO task can block kthreadd's execution
> > > > indefinitely and thus prevent the timely creation of the
> > > > high-priority kernel thread.
> > > >
> > > > This causes a practical problem. When a runaway real-time task
> > > > is eating 100% CPU and we attempt to put the CPU offline,
> > > > sometimes we block while waiting for the creation of the
> > > > highest-priority "kstopmachine" thread.
> > > >
> > > > The fix is to run kthreadd with the highest possible SCHED_FIFO
> > > > priority. Its children must still run as slightly negatively
> > > > reniced SCHED_NORMAL tasks.
> > >
> > > Did you hit this problem with the stock kernel, or have you been
> > > working on other stuff?
> >
> > This was with RHEL5 and with current Fedora kernels.
> >
> > > A locked-up SCHED_FIFO process will cause kernel threads all
> > > sorts of problems. You've hit one instance, but there will be
> > > others. (pdflush stops working, for one).
> > >
> > > The general approach we've taken to this is "don't do that".
> > > Yes, we could boost lots of kernel threads in the way which this
> > > patch does but this actually takes control *away* from
> > > userspace. Userspace no longer has the ability to guarantee
> > > itself minimum possible latency without getting preempted by
> > > kernel threads.
> > >
> > > And yes, giving userspace this minimum-latency capability does
> > > imply that userspace has a responsibility to not 100% starve
> > > kernel threads. It's a reasonable compromise, I think?
> >
> > You're right. We should not run kthreadd with SCHED_FIFO by default.
> > But the user should be able to change it using chrt if he wants to
> > avoid this particular problem. So how about this instead?:
> >
> >
> >
> > kthreadd, the creator of other kernel threads, runs as a normal
> > priority task. This is a potential for priority inversion when a
> > task wants to spawn a high-priority kernel thread. A middle
> > priority SCHED_FIFO task can block kthreadd's execution
> > indefinitely and thus prevent the timely creation of the
> > high-priority kernel thread.
> >
> > This causes a practical problem. When a runaway real-time task is
> > eating 100% CPU and we attempt to put the CPU offline, sometimes we
> > block while waiting for the creation of the highest-priority
> > "kstopmachine" thread.
> >
> > This could be solved by always running kthreadd with the highest
> > possible SCHED_FIFO priority, but that would be undesirable policy
> > decision in the kernel. kthreadd would cause unwanted latencies
> > even for the realtime users who know what they're doing.
> >
> > Let's not make the decision for the user. Just allow the
> > administrator to change kthreadd's priority safely if he chooses to
> > do it. Ensure that the kernel threads are created with the usual
> > nice level even if kthreadd's priority is changed from the default.
> >
> > Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>
> > ---
> >  kernel/kthread.c |   11 +++
> >  1 files changed, 11 insertions(+), 0 deletions(-)
> >
> > diff --git a/kernel/kthread.c b/kernel/kthread.c
> > index dcfe724..e832a85 100644
> > --- a/kernel/kthread.c
> > +++ b/kernel/kthread.c
> > @@ -94,10 +94,21 @@ static void create_kthread(struct kthread_create_info *create)
> >  	if (pid < 0) {
> >  		create->result = ERR_PTR(pid);
> >  	} else {
> > +		struct sched_param param = { .sched_priority = 0 };
> >  		wait_for_completion(&create->started);
> >  		read_lock(&tasklist_lock);
> >  		create->result = find_task_by_pid(pid);
> >  		read_unlock(&tasklist_lock);
> > +		/*
> > +		 * root may want to change our (kthreadd's) priority to
> > +		 * realtime to solve a corner case priority inversion problem
> > +		 * (a realtime task consuming 100% CPU blocking the creation of
> > +		 * kernel threads). The kernel thread should not inherit the
> > +		 * higher priority. Let's always create it with the usual nice
> > +		 * level.
> > +		 */
> > +		sched_setscheduler(create->result, SCHED_NORMAL, &param);
> > +		set_user_nice(create->result, -5);
> >  	}
> >  	complete(&create->done);
> >  }
>
> Seems reasonable.
>
> As a followup thing, we now have two hard-coded magical -5's in
> kthread.c. It'd be nice to add a #define for this.

Done.

> It'd be nicer to work out
Re: [PATCH] kthread: always create the kernel threads with normal priority
On Mon, 7 Jan 2008 12:22:51 +0100
"Remy Bohmer" <[EMAIL PROTECTED]> wrote:

> Hello Michal and Andrew,
>
> > Let's not make the decision for the user. Just allow the
> > administrator to change kthreadd's priority safely if he chooses to
> > do it. Ensure that the kernel threads are created with the usual
> > nice level even if kthreadd's priority is changed from the default.
>
> Last year, I posted a patchset (that was meant for Preempt-RT at that
> time) to be able to prioritise the interrupt-handler-threads (which
> are kthreads) and softirq-threads from the kernel commandline. See
> http://lkml.org/lkml/2007/12/19/208
>
> Maybe we can find a way to use a similar mechanism as I used in my
> patchset for the priorities of the remaining kthreads.
> I do not like the way of forcing userland to change the priorities,
> because that would require a userland with the chrt tool installed,
> and that is not that practical for embedded systems (in which there
> could be cases that there is no userland at all, or the init-process
> is the whole embedded application). In that case an option to do it on
> the kernel commandline is more practical.
>
> I propose this kernel cmd-line option:
> kthread_pmap=somethread:50,otherthread:12,34

I see. kthreadd would look up the priority for itself and
kthread_create would consult the map for all other kernel threads.
That should work.
Your sirq_pmap would not be needed anymore, as kthread_pmap could be
used for softirq threads too, right?

Michal
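
The lookup described above (a thread name is checked against a priority map given on the command line) might look roughly like the following userspace sketch. It is not code from Remy's patchset. It assumes a simplified `name:prio` format separated by commas and a priority range of 0 to 99; the thread does not define what a bare entry such as the trailing `34` in the proposed example means, so the sketch simply ignores entries without a name. The function name `pmap_lookup` is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up `name` in a map string of the assumed form "a:50,b:12".
 * Returns the priority, or -1 if the name is absent or its entry is
 * malformed. Entries without a "name:" prefix are skipped.
 */
int pmap_lookup(const char *map, const char *name)
{
	assert(map != NULL && name != NULL);

	size_t name_len = strlen(name);
	const char *p = map;

	while (*p != '\0') {
		const char *end = strchr(p, ',');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		/* The entry matches only if it starts with "name:". */
		if (len > name_len + 1 &&
		    strncmp(p, name, name_len) == 0 && p[name_len] == ':') {
			char *stop;
			long prio = strtol(p + name_len + 1, &stop, 10);

			/* The number must run to the end of this entry. */
			if (stop == p + len && prio >= 0 && prio <= 99)
				return (int)prio;
			return -1;
		}

		/* Advance to the next comma-separated entry. */
		p += len;
		if (*p == ',')
			p++;
	}
	return -1;
}
```

In this model kthreadd would call the lookup with its own name, and thread creation would call it with the name of each new thread, falling back to the default priority when -1 is returned.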
Re: [PATCH] kthread: always create the kernel threads with normal priority
Hello Michal and Andrew,

> Let's not make the decision for the user. Just allow the administrator to
> change kthreadd's priority safely if he chooses to do it. Ensure that the
> kernel threads are created with the usual nice level even if kthreadd's
> priority is changed from the default.

Last year, I posted a patchset (that was meant for Preempt-RT at that
time) to be able to prioritise the interrupt-handler-threads (which
are kthreads) and softirq-threads from the kernel commandline. See
http://lkml.org/lkml/2007/12/19/208

Maybe we can find a way to use a similar mechanism as I used in my
patchset for the priorities of the remaining kthreads.
I do not like the way of forcing userland to change the priorities,
because that would require a userland with the chrt tool installed,
and that is not that practical for embedded systems (in which there
could be cases that there is no userland at all, or the init-process
is the whole embedded application). In that case an option to do it on
the kernel commandline is more practical.

I propose this kernel cmd-line option:
kthread_pmap=somethread:50,otherthread:12,34

Then threads can be started as SCHED_NORMAL, and when overruled inside
the kthread-sources itself, or by the kernel commandline, the user can
set them to something else.

What do you think of this?
(notice that I am reworking the review comments I received on this
patch-series right now, and that I can take such change into account
immediately)

Kind Regards,

Remy

2008/1/7, Michal Schmidt <[EMAIL PROTECTED]>:
> On Sat, 22 Dec 2007 01:30:21 -0800
> Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> > On Mon, 17 Dec 2007 23:43:14 +0100 Michal Schmidt
> > <[EMAIL PROTECTED]> wrote:
> >
> > > kthreadd, the creator of other kernel threads, runs as a normal
> > > priority task. This is a potential for priority inversion when a
> > > task wants to spawn a high-priority kernel thread. A middle priority
> > > SCHED_FIFO task can block kthreadd's execution indefinitely and thus
> > > prevent the timely creation of the high-priority kernel thread.
> > >
> > > This causes a practical problem. When a runaway real-time task is
> > > eating 100% CPU and we attempt to put the CPU offline, sometimes we
> > > block while waiting for the creation of the highest-priority
> > > "kstopmachine" thread.
> > >
> > > The fix is to run kthreadd with the highest possible SCHED_FIFO
> > > priority. Its children must still run as slightly negatively reniced
> > > SCHED_NORMAL tasks.
> >
> > Did you hit this problem with the stock kernel, or have you been
> > working on other stuff?
>
> This was with RHEL5 and with current Fedora kernels.
>
> > A locked-up SCHED_FIFO process will cause kernel threads all sorts of
> > problems. You've hit one instance, but there will be others.
> > (pdflush stops working, for one).
> >
> > The general approach we've taken to this is "don't do that". Yes, we
> > could boost lots of kernel threads in the way which this patch does
> > but this actually takes control *away* from userspace. Userspace no
> > longer has the ability to guarantee itself minimum possible latency
> > without getting preempted by kernel threads.
> >
> > And yes, giving userspace this minimum-latency capability does imply
> > that userspace has a responsibility to not 100% starve kernel
> > threads. It's a reasonable compromise, I think?
>
> You're right. We should not run kthreadd with SCHED_FIFO by default.
> But the user should be able to change it using chrt if he wants to
> avoid this particular problem. So how about this instead?:
>
>
>
> kthreadd, the creator of other kernel threads, runs as a normal
> priority task. This is a potential for priority inversion when a task
> wants to spawn a high-priority kernel thread. A middle priority
> SCHED_FIFO task can block kthreadd's execution indefinitely and thus
> prevent the timely creation of the high-priority kernel thread.
>
> This causes a practical problem. When a runaway real-time task is
> eating 100% CPU and we attempt to put the CPU offline, sometimes we
> block while waiting for the creation of the highest-priority
> "kstopmachine" thread.
>
> This could be solved by always running kthreadd with the highest
> possible SCHED_FIFO priority, but that would be undesirable policy
> decision in the kernel. kthreadd would cause unwanted latencies even
> for the realtime users who know what they're doing.
>
> Let's not make the decision for the user. Just allow the administrator
> to change kthreadd's priority safely if he chooses to do it. Ensure
> that the kernel threads are created with the usual nice level even if
> kthreadd's priority is changed from the default.
>
> Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>
> ---
>  kernel/kthread.c |   11 +++
>  1 files changed, 11 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index dcfe724..e832a85 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -94,10 +94,21
Re: [PATCH] kthread: always create the kernel threads with normal priority
> > This causes a practical problem. When a runaway real-time task is
> > eating 100% CPU and we attempt to put the CPU offline, sometimes we
> > block while waiting for the creation of the highest-priority
> > "kstopmachine" thread.

sched-devel.git has new mechanisms against runaway RT tasks. There's a
new RLIMIT_RTTIME rlimit - if an RT task exceeds that rlimit then it is
sent SIGXCPU.

there's also a new group scheduling extension that is driven via a
sysctl:

  /proc/sys/kernel/sched_rt_ratio

this way if a user has a runaway RT task, other users (and root) will
still have some CPU time left. (in Peter's latest patchset that is
replaced via rt_runtime_ns - but this is a detail)

so instead of the never-ending arms race of kernel thread priorities
against RT task priorities, we are going towards making RT tasks safer
on a policy level.

	Ingo
Re: [PATCH] kthread: always create the kernel threads with normal priority
On Mon, 7 Jan 2008 11:06:03 +0100 Michal Schmidt <[EMAIL PROTECTED]> wrote:

> On Sat, 22 Dec 2007 01:30:21 -0800
> Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> > On Mon, 17 Dec 2007 23:43:14 +0100 Michal Schmidt
> > <[EMAIL PROTECTED]> wrote:
> >
> > > kthreadd, the creator of other kernel threads, runs as a normal
> > > priority task. This is a potential for priority inversion when a
> > > task wants to spawn a high-priority kernel thread. A middle priority
> > > SCHED_FIFO task can block kthreadd's execution indefinitely and thus
> > > prevent the timely creation of the high-priority kernel thread.
> > >
> > > This causes a practical problem. When a runaway real-time task is
> > > eating 100% CPU and we attempt to put the CPU offline, sometimes we
> > > block while waiting for the creation of the highest-priority
> > > "kstopmachine" thread.
> > >
> > > The fix is to run kthreadd with the highest possible SCHED_FIFO
> > > priority. Its children must still run as slightly negatively reniced
> > > SCHED_NORMAL tasks.
> >
> > Did you hit this problem with the stock kernel, or have you been
> > working on other stuff?
>
> This was with RHEL5 and with current Fedora kernels.
>
> > A locked-up SCHED_FIFO process will cause kernel threads all sorts of
> > problems. You've hit one instance, but there will be others.
> > (pdflush stops working, for one).
> >
> > The general approach we've taken to this is "don't do that". Yes, we
> > could boost lots of kernel threads in the way which this patch does
> > but this actually takes control *away* from userspace. Userspace no
> > longer has the ability to guarantee itself minimum possible latency
> > without getting preempted by kernel threads.
> >
> > And yes, giving userspace this minimum-latency capability does imply
> > that userspace has a responsibility to not 100% starve kernel
> > threads. It's a reasonable compromise, I think?
>
> You're right. We should not run kthreadd with SCHED_FIFO by default.
> But the user should be able to change it using chrt if he wants to
> avoid this particular problem. So how about this instead?:
>
>
>
> kthreadd, the creator of other kernel threads, runs as a normal
> priority task. This is a potential for priority inversion when a task
> wants to spawn a high-priority kernel thread. A middle priority
> SCHED_FIFO task can block kthreadd's execution indefinitely and thus
> prevent the timely creation of the high-priority kernel thread.
>
> This causes a practical problem. When a runaway real-time task is
> eating 100% CPU and we attempt to put the CPU offline, sometimes we
> block while waiting for the creation of the highest-priority
> "kstopmachine" thread.
>
> This could be solved by always running kthreadd with the highest
> possible SCHED_FIFO priority, but that would be undesirable policy
> decision in the kernel. kthreadd would cause unwanted latencies even
> for the realtime users who know what they're doing.
>
> Let's not make the decision for the user. Just allow the administrator
> to change kthreadd's priority safely if he chooses to do it. Ensure
> that the kernel threads are created with the usual nice level even if
> kthreadd's priority is changed from the default.
>
> Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>
> ---
>  kernel/kthread.c |   11 +++
>  1 files changed, 11 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index dcfe724..e832a85 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -94,10 +94,21 @@ static void create_kthread(struct kthread_create_info *create)
>  	if (pid < 0) {
>  		create->result = ERR_PTR(pid);
>  	} else {
> +		struct sched_param param = { .sched_priority = 0 };
>  		wait_for_completion(&create->started);
>  		read_lock(&tasklist_lock);
>  		create->result = find_task_by_pid(pid);
>  		read_unlock(&tasklist_lock);
> +		/*
> +		 * root may want to change our (kthreadd's) priority to
> +		 * realtime to solve a corner case priority inversion problem
> +		 * (a realtime task consuming 100% CPU blocking the creation of
> +		 * kernel threads). The kernel thread should not inherit the
> +		 * higher priority. Let's always create it with the usual nice
> +		 * level.
> +		 */
> +		sched_setscheduler(create->result, SCHED_NORMAL, &param);
> +		set_user_nice(create->result, -5);
>  	}
>  	complete(&create->done);
>  }

Seems reasonable.

As a followup thing, we now have two hard-coded magical -5's in
kthread.c. It'd be nice to add a #define for this.

It'd be nicer to work out where on earth that -5 came from too ;)

Readers might wonder why kthreadd children disinherit kthreadd's policy
and priority, but retain its cpus_allowed (and whatever other stuff root
could have altered?)
[PATCH] kthread: always create the kernel threads with normal priority
On Sat, 22 Dec 2007 01:30:21 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Mon, 17 Dec 2007 23:43:14 +0100 Michal Schmidt
> <[EMAIL PROTECTED]> wrote:
>
> > kthreadd, the creator of other kernel threads, runs as a normal
> > priority task. This is a potential for priority inversion when a
> > task wants to spawn a high-priority kernel thread. A middle priority
> > SCHED_FIFO task can block kthreadd's execution indefinitely and thus
> > prevent the timely creation of the high-priority kernel thread.
> >
> > This causes a practical problem. When a runaway real-time task is
> > eating 100% CPU and we attempt to put the CPU offline, sometimes we
> > block while waiting for the creation of the highest-priority
> > "kstopmachine" thread.
> >
> > The fix is to run kthreadd with the highest possible SCHED_FIFO
> > priority. Its children must still run as slightly negatively reniced
> > SCHED_NORMAL tasks.
>
> Did you hit this problem with the stock kernel, or have you been
> working on other stuff?

This was with RHEL5 and with current Fedora kernels.

> A locked-up SCHED_FIFO process will cause kernel threads all sorts of
> problems. You've hit one instance, but there will be others.
> (pdflush stops working, for one).
>
> The general approach we've taken to this is "don't do that". Yes, we
> could boost lots of kernel threads in the way which this patch does
> but this actually takes control *away* from userspace. Userspace no
> longer has the ability to guarantee itself minimum possible latency
> without getting preempted by kernel threads.
>
> And yes, giving userspace this minimum-latency capability does imply
> that userspace has a responsibility to not 100% starve kernel
> threads. It's a reasonable compromise, I think?

You're right. We should not run kthreadd with SCHED_FIFO by default.
But the user should be able to change it using chrt if he wants to
avoid this particular problem. So how about this instead?:



kthreadd, the creator of other kernel threads, runs as a normal
priority task. This is a potential for priority inversion when a task
wants to spawn a high-priority kernel thread. A middle priority
SCHED_FIFO task can block kthreadd's execution indefinitely and thus
prevent the timely creation of the high-priority kernel thread.

This causes a practical problem. When a runaway real-time task is
eating 100% CPU and we attempt to put the CPU offline, sometimes we
block while waiting for the creation of the highest-priority
"kstopmachine" thread.

This could be solved by always running kthreadd with the highest
possible SCHED_FIFO priority, but that would be undesirable policy
decision in the kernel. kthreadd would cause unwanted latencies even
for the realtime users who know what they're doing.

Let's not make the decision for the user. Just allow the administrator
to change kthreadd's priority safely if he chooses to do it. Ensure
that the kernel threads are created with the usual nice level even if
kthreadd's priority is changed from the default.

Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>
---
 kernel/kthread.c |   11 +++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index dcfe724..e832a85 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -94,10 +94,21 @@ static void create_kthread(struct kthread_create_info *create)
 	if (pid < 0) {
 		create->result = ERR_PTR(pid);
 	} else {
+		struct sched_param param = { .sched_priority = 0 };
 		wait_for_completion(&create->started);
 		read_lock(&tasklist_lock);
 		create->result = find_task_by_pid(pid);
 		read_unlock(&tasklist_lock);
+		/*
+		 * root may want to change our (kthreadd's) priority to
+		 * realtime to solve a corner case priority inversion problem
+		 * (a realtime task consuming 100% CPU blocking the creation of
+		 * kernel threads). The kernel thread should not inherit the
+		 * higher priority. Let's always create it with the usual nice
+		 * level.
+		 */
+		sched_setscheduler(create->result, SCHED_NORMAL, &param);
+		set_user_nice(create->result, -5);
 	}
 	complete(&create->done);
 }
--
1.5.3.3
Re: [PATCH] kthread: always create the kernel threads with normal priority
> This causes a practical problem. When a runaway real-time task is
> eating 100% CPU and we attempt to put the CPU offline, sometimes we
> block while waiting for the creation of the highest-priority
> "kstopmachine" thread.

sched-devel.git has new mechanisms against runaway RT tasks. There's a
new RLIMIT_RTTIME rlimit - if an RT task exceeds that rlimit then it is
sent SIGXCPU.

there's also a new group scheduling extension that is driven via a
sysctl:

  /proc/sys/kernel/sched_rt_ratio

this way if a user has a runaway RT task, other users (and root) will
still have some CPU time left. (in Peter's latest patchset that is
replaced via rt_runtime_ns - but this is a detail)

so instead of the never-ending arms race of kernel thread priorities
against RT task priorities, we are going towards making RT tasks safer
on a policy level.

	Ingo
Re: [PATCH] kthread: always create the kernel threads with normal priority
Hello Michal and Andrew,

> Let's not make the decision for the user. Just allow the administrator
> to change kthreadd's priority safely if he chooses to do it. Ensure
> that the kernel threads are created with the usual nice level even if
> kthreadd's priority is changed from the default.

Last year, I posted a patchset (that was meant for Preempt-RT at that
time) to be able to prioritise the interrupt-handler-threads (which are
kthreads) and softirq-threads from the kernel commandline.
See http://lkml.org/lkml/2007/12/19/208

Maybe we can find a way to use a similar mechanism as I used in my
patchset for the priorities of the remaining kthreads.
I do not like the way of forcing userland to change the priorities,
because that would require a userland with the chrt tool installed,
and that is not that practical for embedded systems (in which there
could be cases that there is no userland at all, or the init-process
is the whole embedded application). In that case an option to do it on
the kernel commandline is more practical.

I propose this kernel cmd-line option:
kthread_pmap=somethread:50,otherthread:12,34

Then threads can be started as SCHED_NORMAL, and when overruled inside
the kthread-sources itself, or by the kernel commandline, the user can
set them to something else.

What do you think of this?
(notice that I am reworking the review comments I received on this
patch-series right now, and that I can take such change into account
immediately)

Kind Regards,

Remy
Re: [PATCH] kthread: always create the kernel threads with normal priority
On Mon, 7 Jan 2008 12:22:51 +0100
Remy Bohmer <[EMAIL PROTECTED]> wrote:

> Hello Michal and Andrew,
>
> > Let's not make the decision for the user. Just allow the
> > administrator to change kthreadd's priority safely if he chooses to
> > do it. Ensure that the kernel threads are created with the usual
> > nice level even if kthreadd's priority is changed from the default.
>
> Last year, I posted a patchset (that was meant for Preempt-RT at that
> time) to be able to prioritise the interrupt-handler-threads (which
> are kthreads) and softirq-threads from the kernel commandline.
> See http://lkml.org/lkml/2007/12/19/208
>
> Maybe we can find a way to use a similar mechanism as I used in my
> patchset for the priorities of the remaining kthreads.
> I do not like the way of forcing userland to change the priorities,
> because that would require a userland with the chrt tool installed,
> and that is not that practical for embedded systems (in which there
> could be cases that there is no userland at all, or the init-process
> is the whole embedded application). In that case an option to do it on
> the kernel commandline is more practical.
>
> I propose this kernel cmd-line option:
> kthread_pmap=somethread:50,otherthread:12,34

I see. kthreadd would look up the priority for itself and
kthread_create would consult the map for all other kernel threads.
That should work.
Your sirq_pmap would not be needed anymore, as kthread_pmap could be
used for softirq threads too, right?

Michal
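The lookup Michal describes can be sketched in plain userspace C. This is only an illustration of the idea, not code from Remy's patchset: struct pmap, pmap_parse() and pmap_lookup() are hypothetical names, and the sketch assumes each entry is a `name:priority` pair with a SCHED_FIFO priority in the range 1..99. The thread does not say what the bare `34` in the proposed option string means, so tokens without a colon are simply skipped here.

```c
#include <stdlib.h>
#include <string.h>

#define PMAP_MAX_ENTRIES 16
#define PMAP_NAME_LEN    16	/* same size as the kernel's TASK_COMM_LEN */

struct pmap_entry {
	char name[PMAP_NAME_LEN];
	int prio;
};

struct pmap {
	struct pmap_entry entries[PMAP_MAX_ENTRIES];
	int count;
};

/*
 * Parse a comma-separated list of "name:prio" tokens into map.
 * Malformed tokens (no colon, empty name, non-numeric or out-of-range
 * priority, over-long name) are ignored. Returns the number of entries
 * stored.
 */
int pmap_parse(struct pmap *map, const char *arg)
{
	const char *p = arg;

	map->count = 0;
	while (*p && map->count < PMAP_MAX_ENTRIES) {
		size_t tok_len = strcspn(p, ",");
		const char *colon = memchr(p, ':', tok_len);

		if (colon && colon > p) {
			size_t name_len = (size_t)(colon - p);
			char *end;
			long prio = strtol(colon + 1, &end, 10);

			/* The number must be non-empty and end at the token end. */
			if (end != colon + 1 && end == p + tok_len &&
			    prio >= 1 && prio <= 99 &&
			    name_len < PMAP_NAME_LEN) {
				struct pmap_entry *e = &map->entries[map->count++];

				memcpy(e->name, p, name_len);
				e->name[name_len] = '\0';
				e->prio = (int)prio;
			}
		}
		p += tok_len;
		if (*p == ',')
			p++;
	}
	return map->count;
}

/* Return the configured priority for name, or -1 if it is not listed. */
int pmap_lookup(const struct pmap *map, const char *name)
{
	for (int i = 0; i < map->count; i++)
		if (strcmp(map->entries[i].name, name) == 0)
			return map->entries[i].prio;
	return -1;
}
```

In this model a thread creation path would call pmap_lookup() with the new thread's name and leave the thread at SCHED_NORMAL when the result is -1, which matches the "started as SCHED_NORMAL unless overruled" behaviour Remy proposes.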
Re: [PATCH] kthread: always create the kernel threads with normal priority
On Mon, 7 Jan 2008 02:25:13 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Mon, 7 Jan 2008 11:06:03 +0100 Michal Schmidt <[EMAIL PROTECTED]> wrote:
>
> > @@ -94,10 +94,21 @@ static void create_kthread(struct kthread_create_info *create)
> >  	if (pid < 0) {
> >  		create->result = ERR_PTR(pid);
> >  	} else {
> > +		struct sched_param param = { .sched_priority = 0 };
> >  		wait_for_completion(&create->started);
> >  		read_lock(&tasklist_lock);
> >  		create->result = find_task_by_pid(pid);
> >  		read_unlock(&tasklist_lock);
> > +		/*
> > +		 * root may want to change our (kthreadd's) priority to
> > +		 * realtime to solve a corner case priority inversion problem
> > +		 * (a realtime task consuming 100% CPU blocking the creation of
> > +		 * kernel threads). The kernel thread should not inherit the
> > +		 * higher priority. Let's always create it with the usual nice
> > +		 * level.
> > +		 */
> > +		sched_setscheduler(create->result, SCHED_NORMAL, &param);
> > +		set_user_nice(create->result, -5);
> >  	}
> >  	complete(&create->done);
> >  }
>
> Seems reasonable.
>
> As a followup thing, we now have two hard-coded magical -5's in
> kthread.c. It'd be nice to add a #define for this.

Done.

> It'd be nicer to work out where on earth that -5 came from too ;)
>
> Readers might wonder why kthreadd children disinherit kthreadd's
> policy and priority, but retain its cpus_allowed (and whatever other
> stuff root could have altered?)

It's a good idea to reset the CPU mask too.
Re: [PATCH] kthread: always create the kernel threads with normal priority
Hello Michal,

> > Maybe we can find a way to use a similar mechanism as I used in my
> > patchset for the priorities of the remaining kthreads.
> > I do not like the way of forcing userland to change the priorities,
> > because that would require a userland with the chrt tool installed,
> > and that is not that practical for embedded systems (in which there
> > could be cases that there is no userland at all, or the init-process
> > is the whole embedded application). In that case an option to do it
> > on the kernel commandline is more practical.
> >
> > I propose this kernel cmd-line option:
> > kthread_pmap=somethread:50,otherthread:12,34
>
> I see. kthreadd would look up the priority for itself and
> kthread_create would consult the map for all other kernel threads.
> That should work.
> Your sirq_pmap would not be needed anymore, as kthread_pmap could be
> used for softirq threads too, right?

That is correct. The soft-irqs are just ordinary kernel-threads, but
irq_pmap is still needed, to set the priority of a certain interrupt
handler.
In this case it is also possible to set the prio of the IRQ-kthreads as
well as the prio of a certain interrupt handler. This might give some
conflicts, and I have to check how to resolve these.

Kind Regards,

Remy
Re: [PATCH] kthread: always create the kernel threads with normal priority
On Mon, 7 Jan 2008 12:09:04 +0100 Ingo Molnar <[EMAIL PROTECTED]> wrote:

> > This causes a practical problem. When a runaway real-time task is
> > eating 100% CPU and we attempt to put the CPU offline, sometimes we
> > block while waiting for the creation of the highest-priority
> > "kstopmachine" thread.
>
> sched-devel.git has new mechanisms against runaway RT tasks. There's a
> new RLIMIT_RTTIME rlimit - if an RT task exceeds that rlimit then it
> is sent SIGXCPU.

Is that "total RT CPU time" or "elapsed time since last schedule()"?

If the former, it is not useful for this problem.

> there's also a new group scheduling extension that is driven via a
> sysctl:
>
>   /proc/sys/kernel/sched_rt_ratio
>
> this way if a user has a runaway RT task, other users (and root) will
> still have some CPU time left. (in Peter's latest patchset that is
> replaced via rt_runtime_ns - but this is a detail)

Doesn't this make the RT task non-RT? Would need to understand more
details to tell.

> so instead of the never-ending arms race of kernel thread priorities
> against RT task priorities, we are going towards making RT tasks
> safer on a policy level.
>
> 	Ingo
Re: [PATCH] kthread: always create the kernel threads with normal priority
On Mon, 2008-01-07 at 09:29 -0800, Andrew Morton wrote:
> On Mon, 7 Jan 2008 12:09:04 +0100 Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
> > > This causes a practical problem. When a runaway real-time task is
> > > eating 100% CPU and we attempt to put the CPU offline, sometimes
> > > we block while waiting for the creation of the highest-priority
> > > "kstopmachine" thread.
> >
> > sched-devel.git has new mechanisms against runaway RT tasks. There's
> > a new RLIMIT_RTTIME rlimit - if an RT task exceeds that rlimit then
> > it is sent SIGXCPU.
>
> Is that "total RT CPU time" or "elapsed time since last schedule()"?
>
> If the former, it is not useful for this problem.
>
> > there's also a new group scheduling extension that is driven via a
> > sysctl:
> >
> >   /proc/sys/kernel/sched_rt_ratio
> >
> > this way if a user has a runaway RT task, other users (and root)
> > will still have some CPU time left. (in Peter's latest patchset that
> > is replaced via rt_runtime_ns - but this is a detail)
>
> Doesn't this make the RT task non-RT? Would need to understand more
> details to tell.

It's an artifact of rt group scheduling. Each group will have to specify
a period and a runtime limit therein (and the normalized sum thereof
must not exceed the total time available - otherwise the set is not
schedulable).

So say we have two groups A and B. A has a period of 2 seconds and a
runtime limit of 1 second, which gives it an avg of 50% cpu time. If B
then has a period of 1 second with a runtime limit of .25s (avg 25%),
the total time required to schedule the realtime groups would be 75% on
average.

Without group scheduling everything is considered one group but we
still have the period and runtime limits.

So as long as the realtime cpu usage fits within the given limits it
acts as before. Once it exceeds its limit it will be capped hard - which
is ok, since it exceeded its hard limit, and realtime applications are
supposed to be deterministic and thus be able to tell how much time
they'd require.

[ If only this model were true, but it's a model frequently used and
  quite accepted ]
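The arithmetic in the A/B example can be written down as a short check. This is a sketch of the admission rule as described in the message above (the sum of runtime/period over all groups must not exceed the time available), not the scheduler's actual code; struct rt_group and both function names are invented for the illustration, and times are in seconds as in the example.

```c
#include <stddef.h>

/* One realtime group: it may run for runtime_s out of every period_s. */
struct rt_group {
	double period_s;
	double runtime_s;
};

/* Sum of runtime/period over all groups: the average CPU share requested. */
double rt_total_utilization(const struct rt_group *groups, size_t n)
{
	double total = 0.0;

	for (size_t i = 0; i < n; i++)
		total += groups[i].runtime_s / groups[i].period_s;
	return total;
}

/*
 * The set is schedulable only if the requested share fits in what is
 * available (1.0 stands for all of one CPU's time).
 */
int rt_groups_schedulable(const struct rt_group *groups, size_t n,
			  double available)
{
	return rt_total_utilization(groups, n) <= available;
}
```

With A = { 2.0, 1.0 } and B = { 1.0, 0.25 } the total is 0.5 + 0.25 = 0.75, so the pair fits on one CPU and leaves 25% on average for everything that is not realtime.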