Re: [PATCH sched-devel 0/7] CPU isolation extensions
Hi Peter,

Sorry for the delay in replying.

> Please, wrap your emails at 78 - most mailers can do this.

Done.

> On Fri, 2008-02-22 at 14:05 -0800, Max Krasnyanskiy wrote:
> > Peter Zijlstra wrote:
> > > On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote:
> > > > List of commits
> > > >    cpuisol: Make cpu isolation configrable and export isolated map
> > >
> > > cpu_isolated_map was a bad hack when it was introduced, I feel we
> > > should deprecate it and fully integrate the functionality into
> > > cpusets. That would give a much more flexible end-result.
> >
> > That's not currently possible and will introduce a lot of complexity.
> > I'm pretty sure you missed the discussion I had with Paul (you were
> > cc'ed on that btw). In fact I provided the link to that discussion in
> > the original email. Here it is again:
> > http://marc.info/?l=linux-kernel&m=120180692331461&w=2
>
> I read it, I just firmly disagree.
>
> > Basically the problem is very simple. CPU isolation needs a
> > simple/efficient way to check if CPU N is isolated.
>
> I'm not seeing the need for that outside of setting up the various
> states. That is, once all the affinities are set up, you'd hardly ever
> (or should - imho) need to know if a particular CPU is isolated or not.

Unless I'm missing something, that's only possible for a very static
system. What I mean is that yes, you could go and set irq affinity, app
affinity, workqueue thread affinity, etc so as not to run on the isolated
cpus. It works _until_ something changes, at which point the system needs
to know that it's not supposed to touch CPU N. For example a new IRQ is
registered, a new workqueue is created (fs mounted, new network interface
is created, etc), a new kthread is started, and so on. Sure, we can
introduce default affinity masks for irqs, workqueues, etc. But that's
essentially just duplicating cpu_isolated_map.

> > cpuset/cgroup APIs are not designed for that. In order to figure out
> > if a CPU N is isolated one has to iterate through all cpusets and
> > check their cpu maps. That requires several levels of locking (cgroup
> > and cpuset). The other issue is that cpusets are a bit too dynamic
> > (see the thread above for more details); we'd need notification
> > mechanisms to notify subsystems when CPUs become isolated. Again, more
> > complexity. Since I integrated cpu isolation with cpu hotplug it's
> > already addressed in a nice simple way.
>
> I guess you have another definition of nice than I do.

No, not really. Let's talk specifics. My goal was not to introduce a bunch
of new functionality and rewrite workqueues and stuff; instead I wanted to
integrate with existing mechanisms. CPU maps are used everywhere, and
exporting cpu_isolated_map was a natural way to make other parts of the
kernel aware of the isolated CPUs.

> > Please take a look at that discussion. I do not think it's worth the
> > effort to put this into cpusets. cpu_isolated_map is a very clean and
> > simple concept and integrates nicely with the rest of the cpu maps. ie
> > It's very much the same concept and API as cpu_online_map, etc.
>
> I'm thinking cpu_isolated_map is a very dirty hack.
>
> > If we want to integrate this stuff with cpusets I think the best
> > approach would be to have cpusets update the cpu_isolated_map just
> > like it currently updates scheduler domains.
> >
> > > CPU-sets can already isolate cpus by either creating a cpu outside
> > > of any set, or a set with a single cpu not shared by any other sets.
> >
> > This only works for user-space. As I mentioned above, for full CPU
> > isolation various kernel subsystems need to be aware that CPUs are
> > isolated in order to avoid activity on them.
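(For reference, the "simple/efficient way to check if CPU N is isolated"
being argued about above is a one-line bitmap test. With cpu_isolated_map
exported, an accessor in the style of the existing cpu_online() macro in
include/linux/cpumask.h would presumably look like this sketch against the
2.6.25-era cpumask API; the cpu_isolated() name is illustrative:

	extern cpumask_t cpu_isolated_map;

	/* test a CPU against the isolated map, the same way cpu_online()
	 * tests against cpu_online_map */
	#define cpu_isolated(cpu)	cpu_isset((cpu), cpu_isolated_map)

The cpuset alternative would replace this O(1) test with a walk over all
cpusets under cgroup/cpuset locking, which is the complexity Max objects
to.)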
> Yes, hence the proposed system flag to handle the kernel bits like
> unbounded kernel threads and IRQs.

I do not see a specific proposal here. The funny part is that we're not
even disagreeing on the high level. Yes, it'd be nice to have such a flag
;-). But how will the genirq subsystem, for example, be aware of that
flag? ie How would it know that by default it is not supposed to route
irqs to the CPUs in the cpusets with that flag? As I explained above,
setting affinity for existing irqs is not enough. Same for workqueues or
any other subsystem that wants to run per-cpu threads and stuff.

> > > This also allows for isolated groups, there are good reasons to
> > > isolate groups, esp. now that we have a stronger RT balancer. SMP
> > > and hard RT are not exclusive. A design that does not take that into
> > > account is too rigid.
> >
> > You're thinking scheduling only. Paul had the same confusion ;-)
>
> I'm not, I'm thinking it ought to allow for it.

One way I can think of to support groups and allow for the RT balancer is
this: make the scheduler ignore cpu_isolated_map and give cpusets full
control of the scheduler domains. Use cpu_isolated_map only for hw irqs
and other kernel subsystems. That way cpusets could mark cpus in a group
as isolated to get rid of the kernel activity, and build sched domains
such that tasks get balanced within the group. The thing I do not like
about it is that there is no way to boot the system with CPU N isolated
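(The genirq question above is about default behaviour for things nobody
has explicitly configured yet. For illustration, any subsystem choosing a
default placement mask would have to subtract the isolated CPUs itself,
roughly like this sketch; the primitives are standard 2.6.25-era cpumask
macros, but the helper name is hypothetical:

	/* Hypothetical helper: compute a default mask of CPUs that kernel
	 * activity (irq routing, unbound threads, ...) may use. How the
	 * result gets applied is subsystem specific and left out here. */
	static cpumask_t default_usable_cpus(void)
	{
		cpumask_t mask;

		cpus_andnot(mask, cpu_online_map, cpu_isolated_map);
		if (cpus_empty(mask))	/* never hand back an empty mask */
			mask = cpu_online_map;
		return mask;
	}

Whether the mask comes from an exported cpu_isolated_map or from a cpuset
"system" flag is exactly the disagreement in this thread.)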
Re: [PATCH sched-devel 0/7] CPU isolation extensions
Please, wrap your emails at 78 - most mailers can do this.

On Fri, 2008-02-22 at 14:05 -0800, Max Krasnyanskiy wrote:
> Peter Zijlstra wrote:
> > On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote:
> > > List of commits
> > >    cpuisol: Make cpu isolation configrable and export isolated map
> >
> > cpu_isolated_map was a bad hack when it was introduced, I feel we
> > should deprecate it and fully integrate the functionality into
> > cpusets. That would give a much more flexible end-result.
>
> That's not currently possible and will introduce a lot of complexity.
> I'm pretty sure you missed the discussion I had with Paul (you were
> cc'ed on that btw). In fact I provided the link to that discussion in
> the original email. Here it is again:
> http://marc.info/?l=linux-kernel&m=120180692331461&w=2

I read it, I just firmly disagree.

> Basically the problem is very simple. CPU isolation needs a
> simple/efficient way to check if CPU N is isolated.

I'm not seeing the need for that outside of setting up the various
states. That is, once all the affinities are set up, you'd hardly ever
(or should - imho) need to know if a particular CPU is isolated or not.

> cpuset/cgroup APIs are not designed for that. In order to figure out if
> a CPU N is isolated one has to iterate through all cpusets and check
> their cpu maps. That requires several levels of locking (cgroup and
> cpuset). The other issue is that cpusets are a bit too dynamic (see the
> thread above for more details); we'd need notification mechanisms to
> notify subsystems when CPUs become isolated. Again more complexity.
> Since I integrated cpu isolation with cpu hotplug it's already addressed
> in a nice simple way.

I guess you have another definition of nice than I do.

> Please take a look at that discussion. I do not think it's worth the
> effort to put this into cpusets. cpu_isolated_map is a very clean and
> simple concept and integrates nicely with the rest of the cpu maps. ie
> It's very much the same concept and API as cpu_online_map, etc.

I'm thinking cpu_isolated_map is a very dirty hack.

> If we want to integrate this stuff with cpusets I think the best
> approach would be to have cpusets update the cpu_isolated_map just like
> it currently updates scheduler domains.
>
> > CPU-sets can already isolate cpus by either creating a cpu outside of
> > any set, or a set with a single cpu not shared by any other sets.
>
> This only works for user-space. As I mentioned above, for full CPU
> isolation various kernel subsystems need to be aware that CPUs are
> isolated in order to avoid activity on them.

Yes, hence the proposed system flag to handle the kernel bits like
unbounded kernel threads and IRQs.

> > This also allows for isolated groups, there are good reasons to
> > isolate groups, esp. now that we have a stronger RT balancer. SMP and
> > hard RT are not exclusive. A design that does not take that into
> > account is too rigid.
>
> You're thinking scheduling only. Paul had the same confusion ;-)

I'm not, I'm thinking it ought to allow for it.

> As I explained before I'm redefining (or proposing to redefine) CPU
> isolation to
>
> 1. Isolated CPU(s) must not be subject to scheduler load balancing
>    Users must explicitly bind threads in order to run on those CPU(s).

I'm thinking that should be optional, load-balancing might make sense
under some scenarios.

> 2. By default interrupts must not be routed to the isolated CPU(s)
>    User must route interrupts (if any) to those CPUs explicitly.

That is what I allowed for by the system flag.

> 3. In general kernel subsystems must avoid activity on the isolated
>    CPU(s) as much as possible
>    Includes workqueues, per CPU threads, etc.
>    This feature is configurable and is disabled by default.

How am I refuting any of these points? I'm very clear on what you want,
I'm just saying I want to get there differently.

> > >    cpuisol: Do not route IRQs to the CPUs isolated at boot
> >
> > From the diffstat you're not touching the genirq stuff, but instead
> > hack a single architecture to support this feature. Sounds like an ill
> > designed hack.
>
> Ah, good point. These patches started before genirq was merged and I did
> not realize that there is a way to set default irq affinity with genirq.
> I'll definitely take a look at that.

I said so before, but good to see you've picked this up. I didn't know if
there was a genirq way to set smp affinities - but if there wasn't, this
was a good moment to introduce that.

> > A better approach would be to add a flag to the cpuset infrastructure
> > that says whether it's a system set or not. A system set would be one
> > that services the general purpose OS and would include things like the
> > IRQ affinity and unbound kernel threads (including unbound workqueues
> > - or single workqueues). This flag would default to on, and by
> > switching it off for the root set, and a select subset, you would push
> > the System away from those cpus, thereby isolating them.
Re: [PATCH sched-devel 0/7] CPU isolation extensions
Peter Zijlstra wrote:
> On Fri, 2008-02-22 at 08:38 -0500, Mark Hounschell wrote:
> > > > List of commits
> > > >    cpuisol: Make cpu isolation configrable and export isolated map
> > >
> > > cpu_isolated_map was a bad hack when it was introduced, I feel we
> > > should deprecate it and fully integrate the functionality into
> > > cpusets. That would give a much more flexible end-result.
> > >
> > > CPU-sets can already isolate cpus by either creating a cpu outside
> > > of any set, or a set with a single cpu not shared by any other sets.
> >
> > Peter, what about when I am NOT using cpusets and they are disabled in
> > my config but I still want to use this?
>
> Then you enable it?

I'm with Mark on this one. For example if I have a two core machine I do
not need cpusets to manage them. Plus, like I explained in the previous
email, cpuset is a higher level API. We can think of a way to integrate
them if needed.

> > > >    cpuisol: Do not schedule workqueues on the isolated CPUs
> > >
> > > (per-cpu workqueues, the single ones are treated in the previous
> > > section)
> > >
> > > I still strongly disagree with this approach. Workqueues are
> > > passive, they don't do anything unless work is provided to them. By
> > > blindly not starting them you handicap the system and services that
> > > rely on them.
> >
> > Have things changed since my first bad encounter with workqueues?
> > I am referring to this thread.
> > http://kerneltrap.org/mailarchive/linux-kernel/2007/5/29/97039
>
> Just means you get to fix those problems. By blindly not starting them
> you introduce others.

Please give me an example of what you have in mind.

Also, if you look at the patch (which I've now posted properly), it's not
just about not starting them. I also redirect all future scheduled work to
a non-isolated CPU. ie If work is scheduled on an isolated CPU, that work
is treated as if the workqueue were single threaded. As I explained
before, most subsystems do not care which CPU actually gets to execute the
work. Oprofile is the only one I know of that breaks, because it cannot
collect the stats from the isolated CPUs. I'm thinking of a different
solution for oprofile, maybe collecting samples through IPIs or something.

Max
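(For illustration, the redirection Max describes amounts to a target-CPU
fixup in the queueing path. This is a simplified sketch in the 2.6.25-era
cpumask API, not the posted patch — the real patch also has to deal with
the per-cpu worker threads themselves; the helper name is hypothetical:

	/* Sketch: pick where queued work actually runs. Work aimed at an
	 * isolated CPU falls back to the first non-isolated online CPU,
	 * as if the queue were single threaded. */
	static int wq_target_cpu(int cpu)
	{
		cpumask_t usable;

		if (!cpu_isolated(cpu))
			return cpu;
		cpus_andnot(usable, cpu_online_map, cpu_isolated_map);
		return first_cpu(usable); /* assumes one non-isolated CPU exists */
	}

This preserves correctness for work that is CPU-agnostic, and breaks only
callers like oprofile that queue work on a specific CPU in order to read
that CPU's per-cpu state.)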
Re: [PATCH sched-devel 0/7] CPU isolation extensions
Mark Hounschell wrote:
> Peter Zijlstra wrote:
> > On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote:
> > > As you suggested I'm sending CPU isolation patches for
> > > review/inclusion into sched-devel tree. They are against 2.6.25-rc2.
> > > You can also pull them from my GIT tree at
> > > git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git
> > > master
> >
> > Post patches! I can't review a git tree..
>
> Max, could you also post them for 2.6.24.2 stable please. Thanks

Will do.

Max
Re: [PATCH sched-devel 0/7] CPU isolation extensions
Peter Zijlstra wrote:
> On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote:
> > As you suggested I'm sending CPU isolation patches for
> > review/inclusion into sched-devel tree. They are against 2.6.25-rc2.
> > You can also pull them from my GIT tree at
> > git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git
> > master
>
> Post patches! I can't review a git tree..

I did. But it looks like I screwed up the --cc list. I just resent them.

> > List of commits
> >    cpuisol: Make cpu isolation configrable and export isolated map
>
> cpu_isolated_map was a bad hack when it was introduced, I feel we should
> deprecate it and fully integrate the functionality into cpusets. That
> would give a much more flexible end-result.

That's not currently possible and will introduce a lot of complexity.
I'm pretty sure you missed the discussion I had with Paul (you were cc'ed
on that btw). In fact I provided the link to that discussion in the
original email. Here it is again:
http://marc.info/?l=linux-kernel&m=120180692331461&w=2

Basically the problem is very simple. CPU isolation needs a
simple/efficient way to check if CPU N is isolated. cpuset/cgroup APIs are
not designed for that. In order to figure out if a CPU N is isolated one
has to iterate through all cpusets and check their cpu maps. That requires
several levels of locking (cgroup and cpuset). The other issue is that
cpusets are a bit too dynamic (see the thread above for more details);
we'd need notification mechanisms to notify subsystems when CPUs become
isolated. Again, more complexity. Since I integrated cpu isolation with
cpu hotplug it's already addressed in a nice simple way.

Please take a look at that discussion. I do not think it's worth the
effort to put this into cpusets. cpu_isolated_map is a very clean and
simple concept and integrates nicely with the rest of the cpu maps. ie
It's very much the same concept and API as cpu_online_map, etc.

If we want to integrate this stuff with cpusets I think the best approach
would be to have cpusets update the cpu_isolated_map just like it
currently updates scheduler domains.

> CPU-sets can already isolate cpus by either creating a cpu outside of
> any set, or a set with a single cpu not shared by any other sets.

This only works for user-space. As I mentioned above, for full CPU
isolation various kernel subsystems need to be aware that CPUs are
isolated in order to avoid activity on them.

> This also allows for isolated groups, there are good reasons to isolate
> groups, esp. now that we have a stronger RT balancer. SMP and hard RT
> are not exclusive. A design that does not take that into account is too
> rigid.

You're thinking scheduling only. Paul had the same confusion ;-)

As I explained before, I'm redefining (or proposing to redefine) CPU
isolation to:

1. Isolated CPU(s) must not be subject to scheduler load balancing
   Users must explicitly bind threads in order to run on those CPU(s).
2. By default interrupts must not be routed to the isolated CPU(s)
   User must route interrupts (if any) to those CPUs explicitly.
3. In general kernel subsystems must avoid activity on the isolated CPU(s)
   as much as possible
   Includes workqueues, per CPU threads, etc.
   This feature is configurable and is disabled by default.

Only #1 has to do with the scheduling. The rest has _nothing_ to do with
it.

> >    cpuisol: Do not route IRQs to the CPUs isolated at boot
>
> From the diffstat you're not touching the genirq stuff, but instead hack
> a single architecture to support this feature. Sounds like an ill
> designed hack.

Ah, good point. These patches started before genirq was merged and I did
not realize that there is a way to set default irq affinity with genirq.
I'll definitely take a look at that.

> A better approach would be to add a flag to the cpuset infrastructure
> that says whether it's a system set or not. A system set would be one
> that services the general purpose OS and would include things like the
> IRQ affinity and unbound kernel threads (including unbound workqueues -
> or single workqueues). This flag would default to on, and by switching
> it off for the root set, and a select subset, you would push the System
> away from those cpus, thereby isolating them.

You're talking about a very high level API. I'm totally all for it. What
the patches deal with is the actual low level stuff that is needed to
"push the system away from those cpus". As I mentioned above, we can have
cpusets update cpu_isolated_map, for example (sketched below).

> >    cpuisol: Do not schedule workqueues on the isolated CPUs
>
> (per-cpu workqueues, the single ones are treated in the previous section)
>
> I still strongly disagree with this approach. Workqueues are passive,
> they don't do anything unless work is provided to them. By blindly not
> starting them you handicap the system and services that rely on them.

Oh boy, back to square one. I covered this already. I even started a
thread on that and explained what this is and why it's needed.
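(The "have cpusets update cpu_isolated_map" idea above could mirror the
way cpusets already rebuild scheduler domains. A purely hypothetical hook
— no such function exists in the posted patches — with locking and
subsystem notification omitted:

	/* Hypothetical: recompute cpu_isolated_map when a cpuset's
	 * isolation state changes, analogous to how cpusets drive
	 * sched-domain rebuilds. */
	static void cpuset_update_isolated_map(const cpumask_t *isolated)
	{
		cpu_isolated_map = *isolated;
		/* irq, workqueue, etc. would need to be notified here */
	}

This would keep the O(1) map lookup for kernel subsystems while letting
cpusets be the high level policy interface.)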
Re: [PATCH sched-devel 0/7] CPU isolation extensions
Dmitry Adamushko wrote:
> Hi Max,
>
> [ ... ]
>
> > Last patch to the stop machine is potentially unsafe and is marked as
> > experimental. Unfortunately it's currently the only option that allows
> > dynamic module insertion/removal for above scenarios.
>
> I'm puzzled by the following part (can be a misunderstanding from my
> side)
>
> +config CPUISOL_STOPMACHINE
> +	bool "Do not halt isolated CPUs with Stop Machine (EXPERIMENTAL)"
> +	depends on CPUISOL && STOP_MACHINE && EXPERIMENTAL
> +	help
> +	  If this option is enabled kernel will not halt isolated CPUs
> +	  when Stop Machine is triggered. Stop Machine is currently only
> +	  used by the module insertion and removal.
>
> this "only" part. What about e.g. a 'cpu hotplug' case (_cpu_down())?
> (or we should abstract it a bit to the point that e.g. a cpu can be
> considered as 'a module'? :-)

My bad. I forgot to update that text. As you and other folks pointed out,
stopmachine is used in a few other places besides module loading. We had a
discussion about this a while ago; I just forgot to update the text. Will
do.

Max
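(An updated help text would presumably just stop claiming that module
load/unload is the only stop_machine user; for example — wording
illustrative, not from the posted patches:

+config CPUISOL_STOPMACHINE
+	bool "Do not halt isolated CPUs with Stop Machine (EXPERIMENTAL)"
+	depends on CPUISOL && STOP_MACHINE && EXPERIMENTAL
+	help
+	  If this option is enabled the kernel will not halt isolated CPUs
+	  when Stop Machine is triggered. Stop Machine is used by module
+	  insertion and removal, CPU hotplug (_cpu_down()) and a few other
+	  callers, which is why this option is marked experimental.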
Re: [PATCH sched-devel 0/7] CPU isolation extensions
On Fri, 2008-02-22 at 08:38 -0500, Mark Hounschell wrote:
> > > List of commits
> > >    cpuisol: Make cpu isolation configrable and export isolated map
> >
> > cpu_isolated_map was a bad hack when it was introduced, I feel we
> > should deprecate it and fully integrate the functionality into
> > cpusets. That would give a much more flexible end-result.
> >
> > CPU-sets can already isolate cpus by either creating a cpu outside of
> > any set, or a set with a single cpu not shared by any other sets.
>
> Peter, what about when I am NOT using cpusets and they are disabled in
> my config but I still want to use this?

Then you enable it?

> > >    cpuisol: Do not schedule workqueues on the isolated CPUs
> >
> > (per-cpu workqueues, the single ones are treated in the previous
> > section)
> >
> > I still strongly disagree with this approach. Workqueues are passive,
> > they don't do anything unless work is provided to them. By blindly not
> > starting them you handicap the system and services that rely on them.
>
> Have things changed since my first bad encounter with workqueues?
> I am referring to this thread.
> http://kerneltrap.org/mailarchive/linux-kernel/2007/5/29/97039

Just means you get to fix those problems. By blindly not starting them
you introduce others.
Re: [PATCH sched-devel 0/7] CPU isolation extensions
Peter Zijlstra wrote:
> On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote:
>
> > As you suggested I'm sending CPU isolation patches for
> > review/inclusion into sched-devel tree. They are against 2.6.25-rc2.
> > You can also pull them from my GIT tree at
> > git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git
> > master
>
> Post patches! I can't review a git tree..

Max, could you also post them for 2.6.24.2 stable please. Thanks

> > [ ... ]
> >
> > List of commits
> >    cpuisol: Make cpu isolation configrable and export isolated map
>
> cpu_isolated_map was a bad hack when it was introduced, I feel we should
> deprecate it and fully integrate the functionality into cpusets. That
> would give a much more flexible end-result.
>
> CPU-sets can already isolate cpus by either creating a cpu outside of
> any set, or a set with a single cpu not shared by any other sets.

Peter, what about when I am NOT using cpusets and they are disabled in my
config but I still want to use this?

> This also allows for isolated groups, there are good reasons to isolate
> groups, esp. now that we have a stronger RT balancer. SMP and hard RT
> are not exclusive. A design that does not take that into account is too
> rigid.
>
> >    cpuisol: Do not route IRQs to the CPUs isolated at boot
>
> From the diffstat you're not touching the genirq stuff, but instead hack
> a single architecture to support this feature. Sounds like an ill
> designed hack.
>
> A better approach would be to add a flag to the cpuset infrastructure
> that says whether it's a system set or not. A system set would be one
> that services the general purpose OS and would include things like the
> IRQ affinity and unbound kernel threads (including unbound workqueues -
> or single workqueues). This flag would default to on, and by switching
> it off for the root set, and a select subset, you would push the System
> away from those cpus, thereby isolating them.
>
> >    cpuisol: Do not schedule workqueues on the isolated CPUs
>
> (per-cpu workqueues, the single ones are treated in the previous
> section)
>
> I still strongly disagree with this approach. Workqueues are passive,
> they don't do anything unless work is provided to them. By blindly not
> starting them you handicap the system and services that rely on them.

Have things changed since my first bad encounter with workqueues?
I am referring to this thread.

http://kerneltrap.org/mailarchive/linux-kernel/2007/5/29/97039

> (you even acknowledged this problem, by saying it breaks oprofile for
> instance - still trying to push a change that knowingly breaks a lot of
> stuff is bad manners on lkml and not acceptable for mainline)
>
> The way to do this is to avoid the generation of work, not the execution
> of it.
>
> >    cpuisol: Move on-stack array used for boot cmd parsing into
> >             __initdata
> >    cpuisol: Documentation updates
> >    cpuisol: Minor updates to the Kconfig options
>
> No idea about these patches,...
>
> >    cpuisol: Do not halt isolated CPUs with Stop Machine
>
> Very strong NACK on this one, it breaks a lot of functionality in
> non-obvious ways, as has been pointed out to you numerous times. Such
> patches are just not acceptable for mainline - full stop.

Mark
Re: [PATCH sched-devel 0/7] CPU isolation extensions
On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote:
> As you suggested I'm sending CPU isolation patches for review/inclusion
> into sched-devel tree. They are against 2.6.25-rc2.
> You can also pull them from my GIT tree at
> git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git
> master

Post patches! I can't review a git tree..

> [ ... ]
>
> List of commits
>    cpuisol: Make cpu isolation configrable and export isolated map

cpu_isolated_map was a bad hack when it was introduced, I feel we should
deprecate it and fully integrate the functionality into cpusets. That
would give a much more flexible end-result.

CPU-sets can already isolate cpus by either creating a cpu outside of any
set, or a set with a single cpu not shared by any other sets.

This also allows for isolated groups, there are good reasons to isolate
groups, esp. now that we have a stronger RT balancer. SMP and hard RT are
not exclusive. A design that does not take that into account is too rigid.

>    cpuisol: Do not route IRQs to the CPUs isolated at boot

From the diffstat you're not touching the genirq stuff, but instead hack a
single architecture to support this feature. Sounds like an ill designed
hack.

A better approach would be to add a flag to the cpuset infrastructure that
says whether it's a system set or not. A system set would be one that
services the general purpose OS and would include things like the IRQ
affinity and unbound kernel threads (including unbound workqueues - or
single workqueues). This flag would default to on, and by switching it off
for the root set, and a select subset, you would push the System away from
those cpus, thereby isolating them.

>    cpuisol: Do not schedule workqueues on the isolated CPUs

(per-cpu workqueues, the single ones are treated in the previous section)

I still strongly disagree with this approach. Workqueues are passive, they
don't do anything unless work is provided to them. By blindly not starting
them you handicap the system and services that rely on them.

(you even acknowledged this problem, by saying it breaks oprofile for
instance - still trying to push a change that knowingly breaks a lot of
stuff is bad manners on lkml and not acceptable for mainline)

The way to do this is to avoid the generation of work, not the execution
of it.

>    cpuisol: Move on-stack array used for boot cmd parsing into __initdata
>    cpuisol: Documentation updates
>    cpuisol: Minor updates to the Kconfig options

No idea about these patches,...

>    cpuisol: Do not halt isolated CPUs with Stop Machine

Very strong NACK on this one, it breaks a lot of functionality in
non-obvious ways, as has been pointed out to you numerous times. Such
patches are just not acceptable for mainline - full stop.

Please, the ideas are not bad and have been suggested in the context of
-rt a number of times, it's just the execution. Design features so that
they are flexible and can be used in ways you never imagined them. But
more importantly so that they don't knowingly break tons of stuff.
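(For illustration, the "system set" flag Peter describes boils down to
maintaining a mask of CPUs that still service the general purpose OS. A
hypothetical sketch — the flag, the iterator and the mask are all
illustrative, not existing kernel API:

	/* Hypothetical: CPUs belonging to cpusets flagged as "system"
	 * are the only ones IRQ defaults and unbound kernel threads may
	 * use; everything outside this mask is effectively isolated. */
	static cpumask_t system_cpus;

	static void recompute_system_cpus(void)
	{
		struct cpuset *cs;

		cpus_clear(system_cpus);
		for_each_system_cpuset(cs)	/* hypothetical iterator */
			cpus_or(system_cpus, system_cpus, cs->cpus_allowed);
	}

The dispute in this thread is whether such a mask should be derived from
cpusets, as here, or exported directly as cpu_isolated_map.)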
Re: [PATCH sched-devel 0/7] CPU isolation extensions
Hi Max,

> [ ... ]
> Last patch to the stop machine is potentially unsafe and is marked as
> experimental. Unfortunately it's currently the only option that allows
> dynamic module insertion/removal for above scenarios.

I'm puzzled by the following part (can be a misunderstanding from my side)

+config CPUISOL_STOPMACHINE
+	bool "Do not halt isolated CPUs with Stop Machine (EXPERIMENTAL)"
+	depends on CPUISOL && STOP_MACHINE && EXPERIMENTAL
+	help
+	  If this option is enabled kernel will not halt isolated CPUs
+	  when Stop Machine is triggered. Stop Machine is currently only
+	  used by the module insertion and removal.

this "only" part. What about e.g. a 'cpu hotplug' case (_cpu_down())?
(or we should abstract it a bit to the point that e.g. a cpu can be
considered as 'a module'? :-)

--
Best regards,
Dmitry Adamushko
Re: [PATCH sched-devel 0/7] CPU isolation extensions
Hi Max, [ ... ] Last patch to the stop machine is potentially unsafe and is marked as experimental. Unfortunately it's currently the only option that allows dynamic module insertion/removal for above scenarios. I'm puzzled by the following part (can be a misunderstanding from my side) +config CPUISOL_STOPMACHINE + bool Do not halt isolated CPUs with Stop Machine (EXPERIMENTAL) + depends on CPUISOL STOP_MACHINE EXPERIMENTAL + help + If this option is enabled kernel will not halt isolated CPUs + when Stop Machine is triggered. Stop Machine is currently only + used by the module insertion and removal. this only part. What about e.g. a 'cpu hotplug' case (_cpu_down())? (or we should abstract it a bit to the point that e.g. a cpu can be considered as 'a module'? :-) -- Best regards, Dmitry Adamushko -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH sched-devel 0/7] CPU isolation extensions
On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote: As you suggested I'm sending CPU isolation patches for review/inclusion into sched-devel tree. They are against 2.6.25-rc2. You can also pull them from my GIT tree at git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git master Post patches! I can't review a git tree.. Diffstat: b/Documentation/ABI/testing/sysfs-devices-system-cpu | 41 ++ b/Documentation/cpu-isolation.txt| 114 ++- b/arch/x86/Kconfig |1 b/arch/x86/kernel/genapic_flat_64.c |5 b/drivers/base/cpu.c | 48 b/include/linux/cpumask.h|3 b/kernel/Kconfig.cpuisol | 15 ++ b/kernel/Makefile|4 b/kernel/cpu.c | 49 b/kernel/sched.c | 37 -- b/kernel/stop_machine.c |9 + b/kernel/workqueue.c | 31 +++-- kernel/Kconfig.cpuisol | 56 ++--- kernel/cpu.c | 16 +- 14 files changed, 356 insertions(+), 73 deletions(-) List of commits cpuisol: Make cpu isolation configrable and export isolated map cpu_isolated_map was a bad hack when it was introduced, I feel we should deprecate it and fully integrate the functionality into cpusets. That would give a much more flexible end-result. CPU-sets can already isolate cpus by either creating a cpu outside of any set, or a set with a single cpu not shared by any other sets. This also allows for isolated groups, there are good reasons to isolate groups, esp. now that we have a stronger RT balancer. SMP and hard RT are not exclusive. A design that does not take that into account is too rigid. cpuisol: Do not route IRQs to the CPUs isolated at boot From the diffstat you're not touching the genirq stuff, but instead hack a single architecture to support this feature. Sounds like an ill designed hack. A better approach would be to add a flag to the cpuset infrastructure that says whether its a system set or not. A system set would be one that services the general purpose OS and would include things like the IRQ affinity and unbound kernel threads (including unbound workqueues - or single workqueues). This flag would default to on, and by switching it off for the root set, and a select subset you would push the System away from those cpus, thereby isolating them. cpuisol: Do not schedule workqueues on the isolated CPUs (per-cpu workqueues, the single ones are treated in the previous section) I still strongly disagree with this approach. Workqueues are passive, they don't do anything unless work is provided to them. By blindly not starting them you handicap the system and services that rely on them. (you even acknowledged this problem, by saying it breaks oprofile for instance - still trying to push a change that knowingly breaks a lot of stuff is bad manners on lkml and not acceptable for mainline) The way to do this is to avoid the generation of work, not the execution of it. cpuisol: Move on-stack array used for boot cmd parsing into __initdata cpuisol: Documentation updates cpuisol: Minor updates to the Kconfig options No idea about these patches,... cpuisol: Do not halt isolated CPUs with Stop Machine Very strong NACK on this one, it breaks a lot of functionality in non-obvious ways, as has been pointed out to you numerous times. Such patches are just not acceptable for mainline - full stop. Please, the ideas are not bad and have been suggested in the context of -rt a number of times, its just the execution. Design features so that they are flexible and can be used in ways you never imagined them. But more importantly so that they don't knowingly break tons of stuff. 
-- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH sched-devel 0/7] CPU isolation extensions
Peter Zijlstra wrote: On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote: As you suggested I'm sending CPU isolation patches for review/inclusion into sched-devel tree. They are against 2.6.25-rc2. You can also pull them from my GIT tree at git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git master Post patches! I can't review a git tree.. Max, could you also post them for 2.6.24.2 stable please. Thanks Diffstat: b/Documentation/ABI/testing/sysfs-devices-system-cpu | 41 ++ b/Documentation/cpu-isolation.txt| 114 ++- b/arch/x86/Kconfig |1 b/arch/x86/kernel/genapic_flat_64.c |5 b/drivers/base/cpu.c | 48 b/include/linux/cpumask.h|3 b/kernel/Kconfig.cpuisol | 15 ++ b/kernel/Makefile|4 b/kernel/cpu.c | 49 b/kernel/sched.c | 37 -- b/kernel/stop_machine.c |9 + b/kernel/workqueue.c | 31 +++-- kernel/Kconfig.cpuisol | 56 ++--- kernel/cpu.c | 16 +- 14 files changed, 356 insertions(+), 73 deletions(-) List of commits cpuisol: Make cpu isolation configrable and export isolated map cpu_isolated_map was a bad hack when it was introduced, I feel we should deprecate it and fully integrate the functionality into cpusets. That would give a much more flexible end-result. CPU-sets can already isolate cpus by either creating a cpu outside of any set, or a set with a single cpu not shared by any other sets. Peter, what about when I am NOT using cpusets and are disabled in my config but I still want to use this? This also allows for isolated groups, there are good reasons to isolate groups, esp. now that we have a stronger RT balancer. SMP and hard RT are not exclusive. A design that does not take that into account is too rigid. cpuisol: Do not route IRQs to the CPUs isolated at boot From the diffstat you're not touching the genirq stuff, but instead hack a single architecture to support this feature. Sounds like an ill designed hack. A better approach would be to add a flag to the cpuset infrastructure that says whether its a system set or not. A system set would be one that services the general purpose OS and would include things like the IRQ affinity and unbound kernel threads (including unbound workqueues - or single workqueues). This flag would default to on, and by switching it off for the root set, and a select subset you would push the System away from those cpus, thereby isolating them. cpuisol: Do not schedule workqueues on the isolated CPUs (per-cpu workqueues, the single ones are treated in the previous section) I still strongly disagree with this approach. Workqueues are passive, they don't do anything unless work is provided to them. By blindly not starting them you handicap the system and services that rely on them. Have things changed since since my first bad encounter with Workqueues. I am referring to this thread. http://kerneltrap.org/mailarchive/linux-kernel/2007/5/29/97039 (you even acknowledged this problem, by saying it breaks oprofile for instance - still trying to push a change that knowingly breaks a lot of stuff is bad manners on lkml and not acceptable for mainline) The way to do this is to avoid the generation of work, not the execution of it. cpuisol: Move on-stack array used for boot cmd parsing into __initdata cpuisol: Documentation updates cpuisol: Minor updates to the Kconfig options No idea about these patches,... cpuisol: Do not halt isolated CPUs with Stop Machine Very strong NACK on this one, it breaks a lot of functionality in non-obvious ways, as has been pointed out to you numerous times. Such patches are just not acceptable for mainline - full stop. 
Mark -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH sched-devel 0/7] CPU isolation extensions
On Fri, 2008-02-22 at 08:38 -0500, Mark Hounschell wrote: List of commits cpuisol: Make cpu isolation configrable and export isolated map cpu_isolated_map was a bad hack when it was introduced, I feel we should deprecate it and fully integrate the functionality into cpusets. That would give a much more flexible end-result. CPU-sets can already isolate cpus by either creating a cpu outside of any set, or a set with a single cpu not shared by any other sets. Peter, what about when I am NOT using cpusets and are disabled in my config but I still want to use this? Then you enable it? cpuisol: Do not schedule workqueues on the isolated CPUs (per-cpu workqueues, the single ones are treated in the previous section) I still strongly disagree with this approach. Workqueues are passive, they don't do anything unless work is provided to them. By blindly not starting them you handicap the system and services that rely on them. Have things changed since since my first bad encounter with Workqueues. I am referring to this thread. http://kerneltrap.org/mailarchive/linux-kernel/2007/5/29/97039 Just means you get to fix those problems. By blindly not starting them you introduce others. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH sched-devel 0/7] CPU isolation extensions
Dmitry Adamushko wrote: Hi Max, [ ... ] Last patch to the stop machine is potentially unsafe and is marked as experimental. Unfortunately it's currently the only option that allows dynamic module insertion/removal for above scenarios. I'm puzzled by the following part (can be a misunderstanding from my side) +config CPUISOL_STOPMACHINE + bool Do not halt isolated CPUs with Stop Machine (EXPERIMENTAL) + depends on CPUISOL STOP_MACHINE EXPERIMENTAL + help + If this option is enabled kernel will not halt isolated CPUs + when Stop Machine is triggered. Stop Machine is currently only + used by the module insertion and removal. this only part. What about e.g. a 'cpu hotplug' case (_cpu_down())? (or we should abstract it a bit to the point that e.g. a cpu can be considered as 'a module'? :-) My bad. I forgot to update that text. As you and other folks pointed out stopmachine is used in a few other places besides module loading. We had a discussion about this awhile ago. I just forgot to update the text. Will do. Max -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH sched-devel 0/7] CPU isolation extensions
Peter Zijlstra wrote: On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote: As you suggested I'm sending CPU isolation patches for review/inclusion into sched-devel tree. They are against 2.6.25-rc2. You can also pull them from my GIT tree at git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git master Post patches! I can't review a git tree.. I did. But it looks like I screwed the --cc list. I just resent them. List of commits cpuisol: Make cpu isolation configrable and export isolated map cpu_isolated_map was a bad hack when it was introduced, I feel we should deprecate it and fully integrate the functionality into cpusets. That would give a much more flexible end-result. That's not not currently possible and will introduce a lot of complexity. I'm pretty sure you missied the discussion I had with Paul (you were cc'ed on that btw). In fact I provided the link to that discussion in the original email. Here it is again: http://marc.info/?l=linux-kernelm=120180692331461w=2 Basically the problem is very simple. CPU isolation needs a simple/efficient way to check if CPU N is isolated. cpuset/cgroup APIs are not designed for that. In other to figure out if a CPU N is isolated one has to iterate through all cpusets and checks their cpu maps. That requires several levels of locking (cgroup and cpuset). The other issue is that cpusets are a bit too dynamic (see the thread above for more details) we'd need notified mechanisms to notify subsystems when a CPUs become isolated. Again more complexity. Since I integrated cpu isolation with cpu hotplug it's already addressed in a nice simple way. Please take a look at that discussion. I do not think it's worth the effort to put this into cpusets. cpu_isolated_map is very clean and simple concept and integrates nicely with the rest of the cpu maps. ie It's very much the same concept and API as cpu_online_map, etc. If we want to integrate this stuff with cpusets I think the best approach would be is to have cpusets update the cpu_isolated_map just like it currently updates scheduler domains. CPU-sets can already isolate cpus by either creating a cpu outside of any set, or a set with a single cpu not shared by any other sets. This only works for user-space. As I mentioned about for full CPU isolation various kernel subsystems need to be aware of that CPUs are isolated in order to avoid activities on them. This also allows for isolated groups, there are good reasons to isolate groups, esp. now that we have a stronger RT balancer. SMP and hard RT are not exclusive. A design that does not take that into account is too rigid. You're thinking scheduling only. Paul had the same confusion ;-) As I explained before I'm redefining (or proposing to redefine) CPU isolation to 1. Isolated CPU(s) must not be subject to scheduler load balancing Users must explicitly bind threads in order to run on those CPU(s). 2. By default interrupts must not be routed to the isolated CPU(s) User must route interrupts (if any) to those CPUs explicitly. 3. In general kernel subsystems must avoid activity on the isolated CPU(s) as much as possible Includes workqueues, per CPU threads, etc. This feature is configurable and is disabled by default. Only #1 has to do with the scheduling. The rest has _nothing_ to do with it. cpuisol: Do not route IRQs to the CPUs isolated at boot From the diffstat you're not touching the genirq stuff, but instead hack a single architecture to support this feature. Sounds like an ill designed hack. Ah, good point. 
This patches started before genirq was merged and I did not realize that there is a way to set default irq affinity with genirq. I'll definitely take a look at that. A better approach would be to add a flag to the cpuset infrastructure that says whether its a system set or not. A system set would be one that services the general purpose OS and would include things like the IRQ affinity and unbound kernel threads (including unbound workqueues - or single workqueues). This flag would default to on, and by switching it off for the root set, and a select subset you would push the System away from those cpus, thereby isolating them. You're talking about very high level API. I'm totally all for it. What the patches deal with is actual low level stuff that is needed to do the push the system away from those cpus. As I mentioned above we can have cpuset update cpu_isolated_map for example. cpuisol: Do not schedule workqueues on the isolated CPUs (per-cpu workqueues, the single ones are treated in the previous section) I still strongly disagree with this approach. Workqueues are passive, they don't do anything unless work is provided to them. By blindly not starting them you handicap the system and services that rely on them. Oh boy, back to square one. I covered this already. I even started a thread on that and explained what this is and why its needed.
Re: [PATCH sched-devel 0/7] CPU isolation extensions
Mark Hounschell wrote: Peter Zijlstra wrote: On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote: As you suggested I'm sending CPU isolation patches for review/inclusion into sched-devel tree. They are against 2.6.25-rc2. You can also pull them from my GIT tree at git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git master Post patches! I can't review a git tree.. Max, could you also post them for 2.6.24.2 stable please. Thanks Will do. Max -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH sched-devel 0/7] CPU isolation extensions
Peter Zijlstra wrote: On Fri, 2008-02-22 at 08:38 -0500, Mark Hounschell wrote: List of commits
 cpuisol: Make cpu isolation configrable and export isolated map
cpu_isolated_map was a bad hack when it was introduced, I feel we should deprecate it and fully integrate the functionality into cpusets. That would give a much more flexible end-result. CPU-sets can already isolate cpus by either creating a cpu outside of any set, or a set with a single cpu not shared by any other sets.
Peter, what about when I am NOT using cpusets and they are disabled in my config, but I still want to use this?
Then you enable it?
I'm with Mark on this one. For example, if I have a two core machine I do not need cpusets to manage them. Plus, like I explained in the previous email, cpusets are a higher level API. We can think of a way to integrate them if needed.
 cpuisol: Do not schedule workqueues on the isolated CPUs
(per-cpu workqueues, the single ones are treated in the previous section)
I still strongly disagree with this approach. Workqueues are passive, they don't do anything unless work is provided to them. By blindly not starting them you handicap the system and services that rely on them.
Have things changed since my first bad encounter with workqueues? I am referring to this thread: http://kerneltrap.org/mailarchive/linux-kernel/2007/5/29/97039
Just means you get to fix those problems. By blindly not starting them you introduce others.
Please give me an example of what you have in mind. Also, if you look at the patch (which I've now posted properly), it's not just not starting them. I also redirected all future scheduled work to a non-isolated CPU. ie If work is scheduled on an isolated CPU, that work is treated as if the workqueue were single threaded. As I explained before, most subsystems do not care which CPU actually gets to execute the work. Oprofile is the only one I know of that breaks, because it cannot collect the stats from the isolated CPUs. I'm thinking of a different solution for oprofile, maybe collecting samples through IPIs or something.
Max
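[Editorial note: to illustrate the redirection described above, here is a rough sketch of the decision being made when work targets an isolated CPU. It reuses the hypothetical cpu_isolated() helper sketched earlier; this is the shape of the idea, not the actual workqueue patch.]

/* Sketch: when work targets an isolated CPU, fall back to any online,
 * non-isolated CPU, so per-cpu workqueues effectively behave like
 * single-threaded ones as far as isolated CPUs are concerned. */
static int wq_select_cpu(int cpu)
{
	int fallback;

	if (!cpu_isolated(cpu))
		return cpu;

	for_each_online_cpu(fallback)
		if (!cpu_isolated(fallback))
			return fallback;

	return cpu;	/* no non-isolated CPU online; leave work where it is */
}

[The underlying assumption, stated in the mail above, is that callers of schedule_work()/queue_work() rarely care which CPU runs the work, so redirecting it is usually safe; per-CPU statistics collectors like oprofile are the known exception.]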
[PATCH sched-devel 0/7] CPU isolation extensions
Ingo,
As you suggested I'm sending CPU isolation patches for review/inclusion into the sched-devel tree. They are against 2.6.25-rc2. You can also pull them from my GIT tree at
git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git master

Diffstat:
 b/Documentation/ABI/testing/sysfs-devices-system-cpu |  41 ++
 b/Documentation/cpu-isolation.txt                    | 114 ++-
 b/arch/x86/Kconfig                                    |   1
 b/arch/x86/kernel/genapic_flat_64.c                   |   5
 b/drivers/base/cpu.c                                  |  48
 b/include/linux/cpumask.h                             |   3
 b/kernel/Kconfig.cpuisol                              |  15 ++
 b/kernel/Makefile                                     |   4
 b/kernel/cpu.c                                        |  49
 b/kernel/sched.c                                      |  37 --
 b/kernel/stop_machine.c                               |   9 +
 b/kernel/workqueue.c                                  |  31 +++--
 kernel/Kconfig.cpuisol                                |  56 ++---
 kernel/cpu.c                                          |  16 +-
 14 files changed, 356 insertions(+), 73 deletions(-)

List of commits
 cpuisol: Make cpu isolation configrable and export isolated map
 cpuisol: Do not route IRQs to the CPUs isolated at boot
 cpuisol: Do not schedule workqueues on the isolated CPUs
 cpuisol: Move on-stack array used for boot cmd parsing into __initdata
 cpuisol: Documentation updates
 cpuisol: Minor updates to the Kconfig options
 cpuisol: Do not halt isolated CPUs with Stop Machine

This patch series extends CPU isolation support. The primary idea here is to be able to use some CPU cores as dedicated engines for running user-space code with minimal kernel overhead/intervention; think of it as an SPE in the Cell processor. I'd like to be able to run a CPU-intensive (100%) RT task on one of the processors without adversely affecting, or being affected by, the other system activities. System activities here include _kernel_ activities as well. I'm personally using this for hard realtime purposes. With CPU isolation it's very easy to achieve single digit usec worst case and around 200 nsec average response times on off-the-shelf multi-processor/core systems (vanilla kernel plus these patches) even under extreme system load. I'm working with legal folks on releasing a hard RT user-space framework for that. I believe with the current multi-core CPU trend we will see more and more applications that exploit this capability: RT gaming engines, simulators, hard RT apps, etc.

Hence the proposal is to extend the current CPU isolation feature. The new definition of CPU isolation would be:
---
1. Isolated CPU(s) must not be subject to scheduler load balancing. Users must explicitly bind threads in order to run on those CPU(s).
2. By default interrupts must not be routed to the isolated CPU(s). Users must route interrupts (if any) to those CPUs explicitly.
3. In general kernel subsystems must avoid activity on the isolated CPU(s) as much as possible. This includes workqueues, per-CPU threads, etc. This feature is configurable and is disabled by default.
---
I've been maintaining this stuff since around 2.6.18 and it's been running in production environments for a couple of years now. It's been tested on all kinds of machines, from NUMA boxes like HP xw9300/9400 to tiny uTCA boards like Mercury AXA110. The messiest part used to be the SLAB garbage collector changes. With the new SLUB all that mess goes away (ie no changes necessary). Also CFS seems to handle CPU hotplug much better than O(1) did (ie domains are recomputed dynamically), so isolation can be done at any time (via sysfs). So this seems like a good time to merge.

We've had scheduler support for CPU isolation ever since the O(1) scheduler went in. In other words #1 is already supported. These patches do not change/affect that functionality in any way. #2 is a trivial one-liner change to the IRQ init code.
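[Editorial note: for illustration, the kind of one-liner meant in #2 amounts to masking the isolated CPUs out of whatever mask the IRQ init code uses for default interrupt affinity. This is a sketch only; the mask name irq_default_affinity and the function are assumptions, and per the diffstat the actual change in this series lives in the x86 genapic code rather than generic genirq code.]

/* Sketch of point #2: keep isolated CPUs out of the default interrupt
 * affinity so newly registered IRQs are not routed to them.
 * "irq_default_affinity" is assumed here for illustration; the real
 * one-liner sits in the arch IRQ init path in this patch series. */
static void __init irq_exclude_isolated_cpus(void)
{
	cpus_andnot(irq_default_affinity, irq_default_affinity,
		    cpu_isolated_map);
}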
#3 is addressed by a couple of separate patches. The main problem here is that an RT thread can prevent kernel threads from running, and the machine gets stuck because other CPUs are waiting for those threads to run and report back.
Folks involved in the scheduler/cpuset development provided a lot of feedback on the first series of patches. I believe I managed to explain and clarify every aspect. Paul Jackson initially suggested implementing #2 and #3 using the cpusets subsystem. Paul and I looked at it more closely and determined that exporting cpu_isolated_map instead is a better option. Details here: http://marc.info/?l=linux-kernel&m=120180692331461&w=2
The last patch, to the stop machine code, is potentially unsafe and is marked as experimental. Unfortunately it's currently the only option that allows dynamic module insertion/removal for the above scenarios. From the previous discussions it's the only