Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-25 Thread Max Krasnyanskiy

Hi Peter,

Sorry for the delay in replying.

> Please, wrap your emails at 78 - most mailers can do this.

Done.


> On Fri, 2008-02-22 at 14:05 -0800, Max Krasnyanskiy wrote:
>
> > Peter Zijlstra wrote:
> >
> > > On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote:
> > >
> > >> List of commits
> > >>    cpuisol: Make cpu isolation configrable and export isolated map
> > >
> > > cpu_isolated_map was a bad hack when it was introduced, I feel we should
> > > deprecate it and fully integrate the functionality into cpusets. That would
> > > give a much more flexible end-result.
> >
> > That's not currently possible and would introduce a lot of complexity.
> > I'm pretty sure you missed the discussion I had with Paul (you were cc'ed on
> > that btw).
> > In fact I provided the link to that discussion in the original email.
> > Here it is again:
> >   http://marc.info/?l=linux-kernel&m=120180692331461&w=2


> I read it, I just firmly disagree.

> > Basically the problem is very simple. CPU isolation needs a
> > simple/efficient way to check if CPU N is isolated.

> I'm not seeing the need for that outside of setting up the various
> states. That is, once all the affinities are set up, you'd hardly ever
> (or should, imho) need to know if a particular CPU is isolated or not.

Unless I'm missing something, that's only possible for a very static system.
What I mean is that yes, you could go and set irq affinity, app affinity,
workqueue thread affinity, etc. not to run on the isolated cpus. It works
_until_ something changes, at which point the system needs to know that it's
not supposed to touch CPU N.
For example, a new IRQ is registered, a new workqueue is created (fs mounted,
network interface brought up, etc), a new kthread is started, and so on.
Sure, we can introduce default affinity masks for irqs, workqueues, etc. But
that's essentially just duplicating cpu_isolated_map.
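
To make this concrete, here is a minimal sketch (illustration only, not the
patch itself; cpu_isset() is the 2.6-era cpumask test, and
start_percpu_thread() is a made-up caller) of the O(1) check being argued
for, modeled on the cpu_online_map/cpu_online() pair:

    extern cpumask_t cpu_isolated_map;  /* e.g. populated from isolcpus= at boot */

    #define cpu_isolated(cpu)   cpu_isset((cpu), cpu_isolated_map)

    /* A subsystem deciding whether to start a per-cpu thread: */
    if (!cpu_isolated(cpu))
        start_percpu_thread(cpu);   /* hypothetical helper */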



> > cpuset/cgroup APIs are not designed for that. In order to figure out
> > whether CPU N is isolated one has to iterate through all cpusets and
> > check their cpu maps.
> > That requires several levels of locking (cgroup and cpuset).

> > The other issue is that cpusets are a bit too dynamic (see the thread
> > above for more details); we'd need notification mechanisms to tell
> > subsystems when CPUs become isolated. Again, more complexity. Since I
> > integrated cpu isolation with cpu hotplug it's already addressed in a
> > nice simple way.


> I guess you have another definition of nice than I do.

No, not really.
Let's talk specifics. My goal was not to introduce a bunch of new
functionality and rewrite workqueues and such; instead I wanted to integrate
with existing mechanisms. CPU maps are used everywhere, and exporting
cpu_isolated_map was a natural way to make other parts of the kernel aware
of the isolated CPUs.



> > Please take a look at that discussion. I do not think it's worth the
> > effort to put this into cpusets. cpu_isolated_map is a very clean and
> > simple concept and integrates nicely with the rest of the cpu maps,
> > i.e. it's very much the same concept and API as cpu_online_map, etc.


> I'm thinking cpu_isolated_map is a very dirty hack.

> > If we want to integrate this stuff with cpusets I think the best
> > approach would be to have cpusets update the cpu_isolated_map just like
> > it currently updates scheduler domains.

> > > CPU-sets can already isolate cpus by either creating a cpu outside of
> > > any set, or a set with a single cpu not shared by any other sets.

> > This only works for user-space. As I mentioned above, for full CPU
> > isolation various kernel subsystems need to be aware that CPUs are
> > isolated in order to avoid activity on them.

> Yes, hence the proposed system flag to handle the kernel bits like
> unbound kernel threads and IRQs.

I do not see a specific proposal here. The funny part is that we're not even
disagreeing on the high level. Yes, it'd be nice to have such a flag ;-)

But how will the genirq subsystem, for example, be aware of that flag?
I.e. how would it know that by default it is not supposed to route irqs to
the CPUs in the cpusets with that flag?

As I explained above, setting affinity for existing irqs is not enough.
Same for workqueues or any other subsystem that wants to run per-cpu threads
and the like.



> > > This also allows for isolated groups; there are good reasons to
> > > isolate groups, esp. now that we have a stronger RT balancer. SMP and
> > > hard RT are not exclusive. A design that does not take that into
> > > account is too rigid.

> > You're thinking scheduling only. Paul had the same confusion ;-)

> I'm not, I'm thinking it ought to allow for it.

One way I can think of to support groups and still allow for the RT balancer
is this: make the scheduler ignore cpu_isolated_map and give cpusets full
control of the scheduler domains, and use cpu_isolated_map only for hw irqs
and other kernel sub-systems. That way cpusets could mark cpus in a group as
isolated to get rid of the kernel activity, and build sched domains such
that tasks get balanced within the group.
The thing I do not like about it is that there is no way to boot the system
with CPU N isolated


Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-23 Thread Peter Zijlstra

Please, wrap your emails at 78 - most mailers can do this.

On Fri, 2008-02-22 at 14:05 -0800, Max Krasnyanskiy wrote:
> Peter Zijlstra wrote:
> > On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote:
> > 

> >> List of commits
> >>cpuisol: Make cpu isolation configrable and export isolated map
> >  
> > cpu_isolated_map was a bad hack when it was introduced, I feel we should
> > deprecate it and fully integrate the functionality into cpusets. That would
> > give a much more flexible end-result.
> That's not currently possible and would introduce a lot of complexity.
> I'm pretty sure you missed the discussion I had with Paul (you were cc'ed on 
> that btw).
> In fact I provided the link to that discussion in the original email.
> Here it is again:
>   http://marc.info/?l=linux-kernel&m=120180692331461&w=2

I read it, I just firmly disagree.

> Basically the problem is very simple. CPU isolation needs a simple/efficient 
> way to check 
> if CPU N is isolated.

I'm not seeing the need for that outside of setting up the various
states. That is, once all the affinities are set up, you'd hardly ever
(or should, imho) need to know if a particular CPU is isolated or not.

> cpuset/cgroup APIs are not designed for that. In order to figure
> out whether CPU N is isolated one has to iterate through all cpusets and
> check their cpu maps.
> That requires several levels of locking (cgroup and cpuset).

> The other issue is that cpusets are a bit too dynamic (see the thread above 
> for more details);
> we'd need notification mechanisms to tell subsystems when CPUs become 
> isolated. Again, more
> complexity. Since I integrated cpu isolation with cpu hotplug it's already 
> addressed in a nice
> simple way.

I guess you have another definition of nice than I do.

> Please take a look at that discussion. I do not think it's worth the effort 
> to put this into
> cpusets. cpu_isolated_map is a very clean and simple concept and integrates 
> nicely with the 
> rest of the cpu maps, i.e. it's very much the same concept and API as 
> cpu_online_map, etc.

I'm thinking cpu_isolated_map is a very dirty hack.

> If we want to integrate this stuff with cpusets I think the best approach 
> would be to 
> have cpusets update the cpu_isolated_map just like it currently updates 
> scheduler domains. 
> > CPU-sets can already isolate cpus by either creating a cpu outside of any 
> > set,
> > or a set with a single cpu not shared by any other sets.
> This only works for user-space. As I mentioned above, for full CPU isolation 
> various kernel 
> subsystems need to be aware that CPUs are isolated in order to avoid 
> activity on them.

Yes, hence the proposed system flag to handle the kernel bits like
unbound kernel threads and IRQs.

> > This also allows for isolated groups, there are good reasons to isolate 
> > groups,
> > esp. now that we have a stronger RT balancer. SMP and hard RT are not
> > exclusive. A design that does not take that into account is too rigid.

> You're thinking scheduling only. Paul had the same confusion ;-)

I'm not, I'm thinking it ought to allow for it.

> As I explained before I'm redefining (or proposing to redefine) CPU isolation 
> to
> 
> 1. Isolated CPU(s) must not be subject to scheduler load balancing
>  Users must explicitly bind threads in order to run on those CPU(s).

I'm thinking that should be optional, load-balancing might make sense
under some scenarios.

> 2. By default interrupts must not be routed to the isolated CPU(s)
>  User must route interrupts (if any) to those CPUs explicitly.

That is what I allowed for by the system flag.

> 3. In general kernel subsystems must avoid activity on the isolated CPU(s) as 
> much as possible
>  Includes workqueues, per CPU threads, etc.
>  This feature is configurable and is disabled by default.

How am I refuting any of these points? I'm very clear on what you want,
I'm just saying I want to get there differently.

> >>    cpuisol: Do not route IRQs to the CPUs isolated at boot
> >
> > From the diffstat you're not touching the genirq stuff, but instead hack a
> > single architecture to support this feature. Sounds like an ill designed 
> > hack.
> Ah, good point. These patches started before genirq was merged and I did not 
> realize
> that there is a way to set default irq affinity with genirq.
> I'll definitely take a look at that.

I said so before, but good to see you've picked this up. I didn't know
if there was a genirq way to set smp affinities - but if there wasn't
this was a good moment to introduce that.
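
For illustration, one shape such a default could take — a sketch only, with
assumed names (irq_default_affinity and the init hook are not claimed to be
the genirq code of the time; cpus_andnot() is the 2.6-era cpumask helper,
and cpu_isolated_map is the exported mask discussed above):

    /* Sketch: a default affinity mask that newly requested IRQs would
     * inherit, with the isolated CPUs masked out. */
    cpumask_t irq_default_affinity;

    static void init_irq_default_affinity(void)
    {
        /* default = online CPUs minus isolated CPUs */
        cpus_andnot(irq_default_affinity, cpu_online_map, cpu_isolated_map);
    }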

> > A better approach would be to add a flag to the cpuset infrastructure
> > that says
> > whether it's a system set or not. A system set would be one that services
> > the
> > general purpose OS and would include things like the IRQ affinity and
> > unbound
> > kernel threads (including unbound workqueues - or single workqueues).
> > This flag
> > would default to on, and by switching it off for the root


Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-22 Thread Max Krasnyanskiy

Peter Zijlstra wrote:

> On Fri, 2008-02-22 at 08:38 -0500, Mark Hounschell wrote:
>
> > >> List of commits
> > >>    cpuisol: Make cpu isolation configrable and export isolated map
> > >
> > > cpu_isolated_map was a bad hack when it was introduced, I feel we should
> > > deprecate it and fully integrate the functionality into cpusets. That would
> > > give a much more flexible end-result.
> > >
> > > CPU-sets can already isolate cpus by either creating a cpu outside of any set,
> > > or a set with a single cpu not shared by any other sets.
> >
> > Peter, what about when I am NOT using cpusets and they are disabled in my
> > config, but I still want to use this?
>
> Then you enable it?

I'm with Mark on this one. For example, if I have a two-core machine I do
not need cpusets to manage them.
Plus, as I explained in a previous email, cpusets are a higher level API. We
can think of a way to integrate them if needed.


> > >>    cpuisol: Do not schedule workqueues on the isolated CPUs
> > >
> > > (per-cpu workqueues, the single ones are treated in the previous section)
> > >
> > > I still strongly disagree with this approach. Workqueues are passive, they
> > > don't do anything unless work is provided to them. By blindly not starting
> > > them you handicap the system and services that rely on them.
> >
> > Have things changed since my first bad encounter with workqueues?
> > I am referring to this thread:
> >
> > http://kerneltrap.org/mailarchive/linux-kernel/2007/5/29/97039
>
> Just means you get to fix those problems. By blindly not starting them
> you introduce others.

Please give me an example of what you have in mind.
Also, if you look at the patch (which I've now posted properly), it's not
just about not starting them.
I also redirect all future scheduled work to a non-isolated CPU; i.e. if
work is scheduled on an isolated CPU, that work is treated as if the
workqueue were single threaded. As I explained before, most subsystems do
not care which CPU actually gets to execute the work.
Oprofile is the only one I know of that breaks, because it cannot collect
the stats from the isolated CPUs. I'm thinking of a different solution for
oprofile, maybe collecting samples through IPIs or something.
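
Roughly, the redirection described above amounts to something like this
sketch (invented helper name, not the literal patch; first_cpu() is the
2.6-era cpumask helper, cpu_isolated() is the test sketched earlier, and it
is assumed the boot CPU is never isolated):

    /* Sketch: pick where queued work actually runs; work aimed at an
     * isolated CPU falls back to the first non-isolated online CPU. */
    static int choose_work_cpu(int cpu)
    {
        if (!cpu_isolated(cpu))
            return cpu;
        return first_cpu(cpu_online_map);   /* assumes CPU 0 is not isolated */
    }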

Max


Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-22 Thread Max Krasnyanskiy

Mark Hounschell wrote:

> Peter Zijlstra wrote:
>
> > On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote:
> >
> >> As you suggested I'm sending CPU isolation patches for review/inclusion into 
> >> sched-devel tree. They are against 2.6.25-rc2.
> >> You can also pull them from my GIT tree at
> >>    git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git 
> >> master
> >
> > Post patches! I can't review a git tree..
>
> Max, could you also post them for 2.6.24.2 stable please. Thanks

Will do.

Max


Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-22 Thread Max Krasnyanskiy

Peter Zijlstra wrote:

> On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote:
>
> > As you suggested I'm sending CPU isolation patches for review/inclusion into 
> > sched-devel tree. They are against 2.6.25-rc2.
> > You can also pull them from my GIT tree at
> >    git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git 
> > master
>
> Post patches! I can't review a git tree..

I did. But it looks like I screwed up the --cc list. I just resent them.
 

> > List of commits
> >    cpuisol: Make cpu isolation configrable and export isolated map
>
> cpu_isolated_map was a bad hack when it was introduced, I feel we should
> deprecate it and fully integrate the functionality into cpusets. That would
> give a much more flexible end-result.

That's not currently possible and would introduce a lot of complexity.
I'm pretty sure you missed the discussion I had with Paul (you were cc'ed on 
that btw).
In fact I provided the link to that discussion in the original email.
Here it is again:
http://marc.info/?l=linux-kernel&m=120180692331461&w=2

Basically the problem is very simple. CPU isolation needs a simple/efficient 
way to check if CPU N is isolated. cpuset/cgroup APIs are not designed for
that. In order to figure out whether CPU N is isolated one has to iterate
through all cpusets and check their cpu maps.
That requires several levels of locking (cgroup and cpuset).
The other issue is that cpusets are a bit too dynamic (see the thread above
for more details); we'd need notification mechanisms to tell subsystems when
CPUs become isolated. Again, more complexity. Since I integrated cpu
isolation with cpu hotplug it's already addressed in a nice simple way.
Please take a look at that discussion. I do not think it's worth the effort
to put this into cpusets. cpu_isolated_map is a very clean and simple
concept and integrates nicely with the rest of the cpu maps, i.e. it's very
much the same concept and API as cpu_online_map, etc.


If we want to integrate this stuff with cpusets I think the best approach
would be to have cpusets update the cpu_isolated_map just like it currently
updates scheduler domains.

> CPU-sets can already isolate cpus by either creating a cpu outside of any set,
> or a set with a single cpu not shared by any other sets.

This only works for user-space. As I mentioned above, for full CPU isolation
various kernel subsystems need to be aware that CPUs are isolated in order
to avoid activity on them.



> This also allows for isolated groups, there are good reasons to isolate groups,
> esp. now that we have a stronger RT balancer. SMP and hard RT are not
> exclusive. A design that does not take that into account is too rigid.

You're thinking scheduling only. Paul had the same confusion ;-)
As I explained before, I'm redefining (or proposing to redefine) CPU
isolation to mean:

1. Isolated CPU(s) must not be subject to scheduler load balancing.
   Users must explicitly bind threads in order to run on those CPU(s).

2. By default, interrupts must not be routed to the isolated CPU(s).
   The user must route interrupts (if any) to those CPUs explicitly.

3. In general, kernel subsystems must avoid activity on the isolated CPU(s)
   as much as possible.
   This includes workqueues, per-CPU threads, etc.
   This feature is configurable and is disabled by default.

Only #1 has to do with scheduling. The rest has _nothing_ to do with it.
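
To illustrate the "explicitly bind" part of point 1, a user-space sketch
under stated assumptions (the box was booted with isolcpus=3, which is what
populates cpu_isolated_map, and the thread binds itself):

    #define _GNU_SOURCE
    #include <sched.h>

    /* Bind the calling thread to an isolated CPU, e.g. bind_to_cpu(3)
     * after booting with isolcpus=3. */
    static int bind_to_cpu(int cpu)
    {
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(cpu, &set);
        return sched_setaffinity(0, sizeof(set), &set); /* 0 == this thread */
    }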


>    cpuisol: Do not route IRQs to the CPUs isolated at boot
>
> From the diffstat you're not touching the genirq stuff, but instead hack a
> single architecture to support this feature. Sounds like an ill designed hack.

Ah, good point. These patches started before genirq was merged and I did not
realize that there is a way to set default irq affinity with genirq.
I'll definitely take a look at that.


> A better approach would be to add a flag to the cpuset infrastructure that says
> whether it's a system set or not. A system set would be one that services the
> general purpose OS and would include things like the IRQ affinity and unbound
> kernel threads (including unbound workqueues - or single workqueues). This flag
> would default to on, and by switching it off for the root set, and a select
> subset, you would push the System away from those cpus, thereby isolating them.

You're talking about a very high level API. I'm totally for it. What the
patches deal with is the actual low level stuff that is needed to do the
"push the system away from those cpus" part.
As I mentioned above, we can have cpusets update cpu_isolated_map, for example.


>    cpuisol: Do not schedule workqueues on the isolated CPUs
>
> (per-cpu workqueues, the single ones are treated in the previous section)
>
> I still strongly disagree with this approach. Workqueues are passive, they
> don't do anything unless work is provided to them. By blindly not starting them
> you handicap the system and services that rely on them.

Oh boy, back to square one. I covered this already.
I even started a thread on that and explained what this is and why it's needed.

Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-22 Thread Max Krasnyanskiy

Dmitry Adamushko wrote:

> Hi Max,
>
> > [ ... ]
> > Last patch, to the stop machine, is potentially unsafe and is marked as
> > experimental. Unfortunately it's currently the only option that allows
> > dynamic module insertion/removal for the above scenarios.
>
> I'm puzzled by the following part (it may be a misunderstanding on my side)
>
> +config CPUISOL_STOPMACHINE
> +   bool "Do not halt isolated CPUs with Stop Machine (EXPERIMENTAL)"
> +   depends on CPUISOL && STOP_MACHINE && EXPERIMENTAL
> +   help
> + If this option is enabled kernel will not halt isolated CPUs
> + when Stop Machine is triggered. Stop Machine is currently only
> + used by the module insertion and removal.
>
> this "only" part. What about e.g. the 'cpu hotplug' case (_cpu_down())?
> (Or should we abstract it a bit, to the point that e.g. a cpu can be
> considered as 'a module'? :-)


My bad, I forgot to update that text. As you and other folks pointed out,
stopmachine is used in a few other places besides module loading. We had
a discussion about this a while ago; I just forgot to update the text.
Will do.


Max


Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-22 Thread Peter Zijlstra

On Fri, 2008-02-22 at 08:38 -0500, Mark Hounschell wrote:

> >> List of commits
> >>cpuisol: Make cpu isolation configrable and export isolated map
> >  
> > cpu_isolated_map was a bad hack when it was introduced, I feel we should
> > deprecate it and fully integrate the functionality into cpusets. That would
> > give a much more flexible end-result.
> > 
> > CPU-sets can already isolate cpus by either creating a cpu outside of any 
> > set,
> > or a set with a single cpu not shared by any other sets.
> > 
> 
> Peter, what about when I am NOT using cpusets and they are disabled in my
> config, but I still want to use this?

Then you enable it?

> >>cpuisol: Do not schedule workqueues on the isolated CPUs
> >  
> > (per-cpu workqueues, the single ones are treated in the previous section)
> > 
> > I still strongly disagree with this approach. Workqueues are passive, they
> > don't do anything unless work is provided to them. By blindly not starting 
> > them
> > you handicap the system and services that rely on them.
> > 
> 
> Have things changed since my first bad encounter with workqueues?
> I am referring to this thread. 
> 
> http://kerneltrap.org/mailarchive/linux-kernel/2007/5/29/97039 

Just means you get to fix those problems. By blindly not starting them
you introduce others.



Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-22 Thread Mark Hounschell
Peter Zijlstra wrote:
> On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote:
> 
>> As you suggested I'm sending CPU isolation patches for review/inclusion into 
>> sched-devel tree. They are against 2.6.25-rc2.
>> You can also pull them from my GIT tree at
>>  git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git 
>> master
>  
> Post patches! I can't review a git tree..
>  

Max, could you also post them for 2.6.24.2 stable please. Thanks


>> Diffstat:
>>  b/Documentation/ABI/testing/sysfs-devices-system-cpu |   41 ++
>>  b/Documentation/cpu-isolation.txt|  114 ++-
>>  b/arch/x86/Kconfig   |1 
>>  b/arch/x86/kernel/genapic_flat_64.c  |5 
>>  b/drivers/base/cpu.c |   48 
>>  b/include/linux/cpumask.h|3 
>>  b/kernel/Kconfig.cpuisol |   15 ++
>>  b/kernel/Makefile|4 
>>  b/kernel/cpu.c   |   49 
>>  b/kernel/sched.c |   37 --
>>  b/kernel/stop_machine.c  |9 +
>>  b/kernel/workqueue.c |   31 +++--
>>  kernel/Kconfig.cpuisol   |   56 ++---
>>  kernel/cpu.c |   16 +-
>>  14 files changed, 356 insertions(+), 73 deletions(-)
>>
>> List of commits
>>cpuisol: Make cpu isolation configrable and export isolated map
>  
> cpu_isolated_map was a bad hack when it was introduced, I feel we should
> deprecate it and fully integrate the functionality into cpusets. That would
> give a much more flexible end-result.
> 
> CPU-sets can already isolate cpus by either creating a cpu outside of any set,
> or a set with a single cpu not shared by any other sets.
> 

Peter, what about when I am NOT using cpusets and they are disabled in my
config, but I still want to use this?

> This also allows for isolated groups, there are good reasons to isolate 
> groups,
> esp. now that we have a stronger RT balancer. SMP and hard RT are not
> exclusive. A design that does not take that into account is too rigid.
> 
>>cpuisol: Do not route IRQs to the CPUs isolated at boot
> 
> From the diffstat you're not touching the genirq stuff, but instead hack a
> single architecture to support this feature. Sounds like an ill designed hack.
> 
> A better approach would be to add a flag to the cpuset infrastructure that 
> says
> whether it's a system set or not. A system set would be one that services the
> general purpose OS and would include things like the IRQ affinity and unbound
> kernel threads (including unbound workqueues - or single workqueues). This 
> flag
> would default to on, and by switching it off for the root set, and a select
> subset you would push the System away from those cpus, thereby isolating them.
> 
>>cpuisol: Do not schedule workqueues on the isolated CPUs
>  
> (per-cpu workqueues, the single ones are treated in the previous section)
> 
> I still strongly disagree with this approach. Workqueues are passive, they
> don't do anything unless work is provided to them. By blindly not starting 
> them
> you handicap the system and services that rely on them.
> 

Have things changed since my first bad encounter with workqueues?
I am referring to this thread. 

http://kerneltrap.org/mailarchive/linux-kernel/2007/5/29/97039 

> (you even acknowledged this problem, by saying it breaks oprofile for instance
> - still trying to push a change that knowingly breaks a lot of stuff is bad
> manners on lkml and not acceptable for mainline)
> 
> The way to do this is to avoid the generation of work, not the execution of 
> it.
> 
>>cpuisol: Move on-stack array used for boot cmd parsing into __initdata
>>cpuisol: Documentation updates
>>cpuisol: Minor updates to the Kconfig options
>  
> No idea about these patches,... 
> 
>>cpuisol: Do not halt isolated CPUs with Stop Machine
> 
> Very strong NACK on this one, it breaks a lot of functionality in non-obvious
> ways, as has been pointed out to you numerous times. Such patches are just not
> acceptable for mainline - full stop.
> 
> 

Mark


Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-22 Thread Peter Zijlstra

On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote:

> As you suggested I'm sending CPU isolation patches for review/inclusion into 
> sched-devel tree. They are against 2.6.25-rc2.
> You can also pull them from my GIT tree at
>   git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git 
> master
 
Post patches! I can't review a git tree..
 
> Diffstat:
>  b/Documentation/ABI/testing/sysfs-devices-system-cpu |   41 ++
>  b/Documentation/cpu-isolation.txt|  114 ++-
>  b/arch/x86/Kconfig   |1 
>  b/arch/x86/kernel/genapic_flat_64.c  |5 
>  b/drivers/base/cpu.c |   48 
>  b/include/linux/cpumask.h|3 
>  b/kernel/Kconfig.cpuisol |   15 ++
>  b/kernel/Makefile|4 
>  b/kernel/cpu.c   |   49 
>  b/kernel/sched.c |   37 --
>  b/kernel/stop_machine.c  |9 +
>  b/kernel/workqueue.c |   31 +++--
>  kernel/Kconfig.cpuisol   |   56 ++---
>  kernel/cpu.c |   16 +-
>  14 files changed, 356 insertions(+), 73 deletions(-)
> 
> List of commits
>cpuisol: Make cpu isolation configrable and export isolated map
 
cpu_isolated_map was a bad hack when it was introduced, I feel we should
deprecate it and fully integrate the functionality into cpusets. That would
give a much more flexible end-result.

CPU-sets can already isolate cpus by either creating a cpu outside of any set,
or a set with a single cpu not shared by any other sets.

This also allows for isolated groups, there are good reasons to isolate groups,
esp. now that we have a stronger RT balancer. SMP and hard RT are not
exclusive. A design that does not take that into account is too rigid.

>cpuisol: Do not route IRQs to the CPUs isolated at boot

From the diffstat you're not touching the genirq stuff, but instead hack a
single architecture to support this feature. Sounds like an ill designed hack.

A better approach would be to add a flag to the cpuset infrastructure that says
whether it's a system set or not. A system set would be one that services the
general purpose OS and would include things like the IRQ affinity and unbound
kernel threads (including unbound workqueues - or single workqueues). This flag
would default to on, and by switching it off for the root set, and a select
subset you would push the System away from those cpus, thereby isolating them.
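
As a sketch of the data-structure side of this proposal (names invented here
purely for illustration; this is not an existing cpuset API):

    /* Hypothetical per-cpuset flag: */
    struct cpuset {
        /* ... existing fields ... */
        unsigned int is_system:1;   /* services the general purpose OS */
    };

    /* The CPUs that kernel facilities (IRQ defaults, unbound kthreads,
     * single workqueues) may use would then be the union of the cpu maps
     * of all sets with is_system set: */
    cpumask_t system_cpus;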

>cpuisol: Do not schedule workqueues on the isolated CPUs
 
(per-cpu workqueues, the single ones are treated in the previous section)

I still strongly disagree with this approach. Workqueues are passive, they
don't do anything unless work is provided to them. By blindly not starting them
you handicap the system and services that rely on them.

(you even acknowledged this problem, by saying it breaks oprofile for instance
- still trying to push a change that knowingly breaks a lot of stuff is bad
manners on lkml and not acceptable for mainline)

The way to do this is to avoid the generation of work, not the execution of it.
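
In sketch form, with the check living where work is generated (my_work is a
hypothetical DEFINE_PER_CPU(struct delayed_work, my_work); cpu_isolated() is
the test sketched earlier in the thread; the workqueue APIs shown are the
2.6.20+ delayed-work interfaces):

    /* Sketch: isolated CPUs simply never receive any per-cpu work,
     * so nothing needs to be blocked at execution time. */
    int cpu;

    for_each_online_cpu(cpu) {
        if (cpu_isolated(cpu))      /* nothing queued, nothing to run */
            continue;
        schedule_delayed_work_on(cpu, &per_cpu(my_work, cpu), HZ);
    }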

>cpuisol: Move on-stack array used for boot cmd parsing into __initdata
>cpuisol: Documentation updates
>cpuisol: Minor updates to the Kconfig options
 
No idea about these patches,... 

>cpuisol: Do not halt isolated CPUs with Stop Machine

Very strong NACK on this one, it breaks a lot of functionality in non-obvious
ways, as has been pointed out to you numerous times. Such patches are just not
acceptable for mainline - full stop.


Please, the ideas are not bad and have been suggested in the context of -rt a
number of times; it's just the execution. Design features so that they are
flexible and can be used in ways you never imagined them. But more
importantly, so that they don't knowingly break tons of stuff.




Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-22 Thread Dmitry Adamushko
Hi Max,

>  [ ... ]
>  Last patch to the stop machine is potentially unsafe and is marked as 
> experimental. Unfortunately
>  it's currently the only option that allows dynamic module insertion/removal 
> for above scenarios.

I'm puzzled by the following part (it may be a misunderstanding on my side)

+config CPUISOL_STOPMACHINE
+   bool "Do not halt isolated CPUs with Stop Machine (EXPERIMENTAL)"
+   depends on CPUISOL && STOP_MACHINE && EXPERIMENTAL
+   help
+ If this option is enabled kernel will not halt isolated CPUs
+ when Stop Machine is triggered. Stop Machine is currently only
+ used by the module insertion and removal.

this "only" part. What about e.g. a 'cpu hotplug' case (_cpu_down())?
(or we should abstract it a bit to the point that e.g. a cpu can be
considered as 'a module'? :-)


-- 
Best regards,
Dmitry Adamushko


Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-22 Thread Dmitry Adamushko
Hi Max,

  [ ... ]
  Last patch to the stop machine is potentially unsafe and is marked as 
 experimental. Unfortunately
  it's currently the only option that allows dynamic module insertion/removal 
 for above scenarios.

I'm puzzled by the following part (can be a misunderstanding from my side)

+config CPUISOL_STOPMACHINE
+   bool Do not halt isolated CPUs with Stop Machine (EXPERIMENTAL)
+   depends on CPUISOL  STOP_MACHINE  EXPERIMENTAL
+   help
+ If this option is enabled kernel will not halt isolated CPUs
+ when Stop Machine is triggered. Stop Machine is currently only
+ used by the module insertion and removal.

this only part. What about e.g. a 'cpu hotplug' case (_cpu_down())?
(or we should abstract it a bit to the point that e.g. a cpu can be
considered as 'a module'? :-)


-- 
Best regards,
Dmitry Adamushko
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-22 Thread Peter Zijlstra

On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote:

 As you suggested I'm sending CPU isolation patches for review/inclusion into 
 sched-devel tree. They are against 2.6.25-rc2.
 You can also pull them from my GIT tree at
   git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git 
 master
 
Post patches! I can't review a git tree..
 
 Diffstat:
  b/Documentation/ABI/testing/sysfs-devices-system-cpu |   41 ++
  b/Documentation/cpu-isolation.txt|  114 
 ++-
  b/arch/x86/Kconfig   |1 
  b/arch/x86/kernel/genapic_flat_64.c  |5 
  b/drivers/base/cpu.c |   48 
  b/include/linux/cpumask.h|3 
  b/kernel/Kconfig.cpuisol |   15 ++
  b/kernel/Makefile|4 
  b/kernel/cpu.c   |   49 
  b/kernel/sched.c |   37 --
  b/kernel/stop_machine.c  |9 +
  b/kernel/workqueue.c |   31 +++--
  kernel/Kconfig.cpuisol   |   56 ++---
  kernel/cpu.c |   16 +-
  14 files changed, 356 insertions(+), 73 deletions(-)
 
 List of commits
cpuisol: Make cpu isolation configrable and export isolated map
 
cpu_isolated_map was a bad hack when it was introduced, I feel we should
deprecate it and fully integrate the functionality into cpusets. That would
give a much more flexible end-result.

CPU-sets can already isolate cpus by either creating a cpu outside of any set,
or a set with a single cpu not shared by any other sets.

This also allows for isolated groups, there are good reasons to isolate groups,
esp. now that we have a stronger RT balancer. SMP and hard RT are not
exclusive. A design that does not take that into account is too rigid.

cpuisol: Do not route IRQs to the CPUs isolated at boot

From the diffstat you're not touching the genirq stuff, but instead hack a
single architecture to support this feature. Sounds like an ill designed hack.

A better approach would be to add a flag to the cpuset infrastructure that says
whether its a system set or not. A system set would be one that services the
general purpose OS and would include things like the IRQ affinity and unbound
kernel threads (including unbound workqueues - or single workqueues). This flag
would default to on, and by switching it off for the root set, and a select
subset you would push the System away from those cpus, thereby isolating them.

cpuisol: Do not schedule workqueues on the isolated CPUs
 
(per-cpu workqueues, the single ones are treated in the previous section)

I still strongly disagree with this approach. Workqueues are passive, they
don't do anything unless work is provided to them. By blindly not starting them
you handicap the system and services that rely on them.

(you even acknowledged this problem, by saying it breaks oprofile for instance
- still trying to push a change that knowingly breaks a lot of stuff is bad
manners on lkml and not acceptable for mainline)

The way to do this is to avoid the generation of work, not the execution of it.

cpuisol: Move on-stack array used for boot cmd parsing into __initdata
cpuisol: Documentation updates
cpuisol: Minor updates to the Kconfig options
 
No idea about these patches,... 

cpuisol: Do not halt isolated CPUs with Stop Machine

Very strong NACK on this one, it breaks a lot of functionality in non-obvious
ways, as has been pointed out to you numerous times. Such patches are just not
acceptable for mainline - full stop.


Please, the ideas are not bad and have been suggested in the context of -rt a
number of times, its just the execution. Design features so that they are
flexible and can be used in ways you never imagined them. But more
importantly so that they don't knowingly break tons of stuff.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-22 Thread Mark Hounschell
Peter Zijlstra wrote:
 On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote:
 
 As you suggested I'm sending CPU isolation patches for review/inclusion into 
 sched-devel tree. They are against 2.6.25-rc2.
 You can also pull them from my GIT tree at
  git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git 
 master
  
 Post patches! I can't review a git tree..
  

Max, could you also post them for 2.6.24.2 stable please. Thanks


 Diffstat:
  b/Documentation/ABI/testing/sysfs-devices-system-cpu |   41 ++
  b/Documentation/cpu-isolation.txt|  114 
 ++-
  b/arch/x86/Kconfig   |1 
  b/arch/x86/kernel/genapic_flat_64.c  |5 
  b/drivers/base/cpu.c |   48 
  b/include/linux/cpumask.h|3 
  b/kernel/Kconfig.cpuisol |   15 ++
  b/kernel/Makefile|4 
  b/kernel/cpu.c   |   49 
  b/kernel/sched.c |   37 --
  b/kernel/stop_machine.c  |9 +
  b/kernel/workqueue.c |   31 +++--
  kernel/Kconfig.cpuisol   |   56 ++---
  kernel/cpu.c |   16 +-
  14 files changed, 356 insertions(+), 73 deletions(-)

 List of commits
cpuisol: Make cpu isolation configrable and export isolated map
  
 cpu_isolated_map was a bad hack when it was introduced, I feel we should
 deprecate it and fully integrate the functionality into cpusets. That would
 give a much more flexible end-result.
 
 CPU-sets can already isolate cpus by either creating a cpu outside of any set,
 or a set with a single cpu not shared by any other sets.
 

Peter, what about when I am NOT using cpusets and are disabled in my config but
I still want to use this?

 This also allows for isolated groups, there are good reasons to isolate 
 groups,
 esp. now that we have a stronger RT balancer. SMP and hard RT are not
 exclusive. A design that does not take that into account is too rigid.
 
cpuisol: Do not route IRQs to the CPUs isolated at boot
 
From the diffstat you're not touching the genirq stuff, but instead hack a
 single architecture to support this feature. Sounds like an ill designed hack.
 
 A better approach would be to add a flag to the cpuset infrastructure that 
 says
 whether its a system set or not. A system set would be one that services the
 general purpose OS and would include things like the IRQ affinity and unbound
 kernel threads (including unbound workqueues - or single workqueues). This 
 flag
 would default to on, and by switching it off for the root set, and a select
 subset you would push the System away from those cpus, thereby isolating them.
 
cpuisol: Do not schedule workqueues on the isolated CPUs
  
 (per-cpu workqueues, the single ones are treated in the previous section)
 
 I still strongly disagree with this approach. Workqueues are passive, they
 don't do anything unless work is provided to them. By blindly not starting 
 them
 you handicap the system and services that rely on them.
 

Have things changed since since my first bad encounter with Workqueues.
I am referring to this thread. 

http://kerneltrap.org/mailarchive/linux-kernel/2007/5/29/97039 

 (you even acknowledged this problem, by saying it breaks oprofile for instance
 - still trying to push a change that knowingly breaks a lot of stuff is bad
 manners on lkml and not acceptable for mainline)
 
 The way to do this is to avoid the generation of work, not the execution of 
 it.
 
cpuisol: Move on-stack array used for boot cmd parsing into __initdata
cpuisol: Documentation updates
cpuisol: Minor updates to the Kconfig options
  
 No idea about these patches,... 
 
cpuisol: Do not halt isolated CPUs with Stop Machine
 
 Very strong NACK on this one, it breaks a lot of functionality in non-obvious
 ways, as has been pointed out to you numerous times. Such patches are just not
 acceptable for mainline - full stop.
 
 

Mark
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-22 Thread Peter Zijlstra

On Fri, 2008-02-22 at 08:38 -0500, Mark Hounschell wrote:

  List of commits
 cpuisol: Make cpu isolation configrable and export isolated map
   
  cpu_isolated_map was a bad hack when it was introduced, I feel we should
  deprecate it and fully integrate the functionality into cpusets. That would
  give a much more flexible end-result.
  
  CPU-sets can already isolate cpus by either creating a cpu outside of any 
  set,
  or a set with a single cpu not shared by any other sets.
  
 
 Peter, what about when I am NOT using cpusets and are disabled in my config 
 but
 I still want to use this?

Then you enable it?

 cpuisol: Do not schedule workqueues on the isolated CPUs
   
  (per-cpu workqueues, the single ones are treated in the previous section)
  
  I still strongly disagree with this approach. Workqueues are passive, they
  don't do anything unless work is provided to them. By blindly not starting 
  them
  you handicap the system and services that rely on them.
  
 
 Have things changed since since my first bad encounter with Workqueues.
 I am referring to this thread. 
 
 http://kerneltrap.org/mailarchive/linux-kernel/2007/5/29/97039 

Just means you get to fix those problems. By blindly not starting them
you introduce others.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-22 Thread Max Krasnyanskiy

Dmitry Adamushko wrote:

Hi Max,


 [ ... ]
 Last patch to the stop machine is potentially unsafe and is marked as 
experimental. Unfortunately
 it's currently the only option that allows dynamic module insertion/removal 
for above scenarios.


I'm puzzled by the following part (can be a misunderstanding from my side)

+config CPUISOL_STOPMACHINE
+   bool Do not halt isolated CPUs with Stop Machine (EXPERIMENTAL)
+   depends on CPUISOL  STOP_MACHINE  EXPERIMENTAL
+   help
+ If this option is enabled kernel will not halt isolated CPUs
+ when Stop Machine is triggered. Stop Machine is currently only
+ used by the module insertion and removal.

this only part. What about e.g. a 'cpu hotplug' case (_cpu_down())?
(or we should abstract it a bit to the point that e.g. a cpu can be
considered as 'a module'? :-)


My bad. I forgot to update that text. As you and other folks pointed out
stopmachine is used in a few other places besides module loading. We had 
a discussion about this awhile ago. I just forgot to update the text. 
Will do.


Max
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-22 Thread Max Krasnyanskiy

Peter Zijlstra wrote:

On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote:

As you suggested I'm sending CPU isolation patches for review/inclusion into 
sched-devel tree. They are against 2.6.25-rc2.

You can also pull them from my GIT tree at
git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git 
master
 
Post patches! I can't review a git tree..

I did. But it looks like I screwed the --cc list. I just resent them.
 

List of commits
   cpuisol: Make cpu isolation configrable and export isolated map
 
cpu_isolated_map was a bad hack when it was introduced, I feel we should

deprecate it and fully integrate the functionality into cpusets. That would
give a much more flexible end-result.

That's not not currently possible and will introduce a lot of complexity.
I'm pretty sure you missied the discussion I had with Paul (you were cc'ed on 
that btw).
In fact I provided the link to that discussion in the original email.
Here it is again:
http://marc.info/?l=linux-kernelm=120180692331461w=2

Basically the problem is very simple. CPU isolation needs a simple/efficient way to check 
if CPU N is isolated. cpuset/cgroup APIs are not designed for that. In other to figure

out if a CPU N is isolated one has to iterate through all cpusets and checks 
their cpu maps.
That requires several levels of locking (cgroup and cpuset).
The other issue is that cpusets are a bit too dynamic (see the thread above for 
more details)
we'd need notified mechanisms to notify subsystems when a CPUs become isolated. 
Again more
complexity. Since I integrated cpu isolation with cpu hotplug it's already 
addressed in a nice
simple way.
Please take a look at that discussion. I do not think it's worth the effort to 
put this into
cpusets. cpu_isolated_map is very clean and simple concept and integrates nicely with the 
rest of the cpu maps. ie It's very much the same concept and API as cpu_online_map, etc.


If we want to integrate this stuff with cpusets I think the best approach would be is to 
have cpusets update the cpu_isolated_map just like it currently updates scheduler domains. 


CPU-sets can already isolate cpus by either creating a cpu outside of any set,
or a set with a single cpu not shared by any other sets.
This only works for user-space. As I mentioned about for full CPU isolation various kernel 
subsystems need to be aware of that CPUs are isolated in order to avoid activities on them.



This also allows for isolated groups, there are good reasons to isolate groups,
esp. now that we have a stronger RT balancer. SMP and hard RT are not
exclusive. A design that does not take that into account is too rigid.

You're thinking scheduling only. Paul had the same confusion ;-)
As I explained before I'm redefining (or proposing to redefine) CPU isolation to

1. Isolated CPU(s) must not be subject to scheduler load balancing
Users must explicitly bind threads in order to run on those CPU(s).

2. By default interrupts must not be routed to the isolated CPU(s)
User must route interrupts (if any) to those CPUs explicitly.

3. In general kernel subsystems must avoid activity on the isolated CPU(s) as 
much as possible
Includes workqueues, per CPU threads, etc.
This feature is configurable and is disabled by default.

Only #1 has to do with the scheduling. The rest has _nothing_ to do with it.


   cpuisol: Do not route IRQs to the CPUs isolated at boot



From the diffstat you're not touching the genirq stuff, but instead hack a

single architecture to support this feature. Sounds like an ill designed hack.

Ah, good point. This patches started before genirq was merged and I did not 
realize
that there is a way to set default irq affinity with genirq.
I'll definitely take a look at that.


A better approach would be to add a flag to the cpuset infrastructure that
says whether it's a system set or not. A system set would be one that
services the general purpose OS and would include things like the IRQ
affinity and unbound kernel threads (including unbound workqueues - or
single workqueues). This flag would default to on, and by switching it off
for the root set, and a select subset, you would push the System away from
those cpus, thereby isolating them.
You're talking about a very high-level API. I'm totally for it. What the
patches deal with is the actual low-level stuff that is needed to push the
system away from those cpus.

As I mentioned above, we can have cpusets update cpu_isolated_map, for
example.
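
Purely hypothetical glue, just to show how small that integration could be
(nothing like this is in the patches, and the real thing would need the same
locking that rebuild_sched_domains() takes):

#include <linux/cpumask.h>

extern cpumask_t cpu_isolated_map;	/* exported by the patches */

/* Hypothetical: cpusets publishes its isolated set into the map. */
static void cpuset_update_isolated_map(const cpumask_t *isolated)
{
	cpu_isolated_map = *isolated;
}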


   cpuisol: Do not schedule workqueues on the isolated CPUs
 
(per-cpu workqueues, the single ones are treated in the previous section)


I still strongly disagree with this approach. Workqueues are passive, they
don't do anything unless work is provided to them. By blindly not starting them
you handicap the system and services that rely on them.

Oh boy, back to square one. I covered this already. I even started a thread
on that and explained what this is and why it's needed.

Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-22 Thread Max Krasnyanskiy

Mark Hounschell wrote:

Peter Zijlstra wrote:

On Thu, 2008-02-21 at 18:38 -0800, Max Krasnyanskiy wrote:

As you suggested I'm sending CPU isolation patches for review/inclusion into 
sched-devel tree. They are against 2.6.25-rc2.

You can also pull them from my GIT tree at
git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git 
master
 
Post patches! I can't review a git tree..
 

Max, could you also post them for 2.6.24.2 stable, please? Thanks

Will do.

Max


Re: [PATCH sched-devel 0/7] CPU isolation extensions

2008-02-22 Thread Max Krasnyanskiy

Peter Zijlstra wrote:

On Fri, 2008-02-22 at 08:38 -0500, Mark Hounschell wrote:


List of commits
   cpuisol: Make cpu isolation configurable and export isolated map
 
cpu_isolated_map was a bad hack when it was introduced; I feel we should
deprecate it and fully integrate the functionality into cpusets. That would
give a much more flexible end result.

CPU-sets can already isolate cpus by either creating a cpu outside of any set,
or a set with a single cpu not shared by any other sets.


Peter, what about when I am NOT using cpusets and they are disabled in my
config, but I still want to use this?


Then you enable it?

I'm with Mark on this one. For example, if I have a two-core machine I do
not need cpusets to manage it.
Plus, as I explained in the previous email, cpusets are a higher-level API.
We can think of a way to integrate them if needed.


   cpuisol: Do not schedule workqueues on the isolated CPUs
 
(per-cpu workqueues, the single ones are treated in the previous section)


I still strongly disagree with this approach. Workqueues are passive, they
don't do anything unless work is provided to them. By blindly not starting them
you handicap the system and services that rely on them.


Have things changed since my first bad encounter with workqueues?
I am referring to this thread:

http://kerneltrap.org/mailarchive/linux-kernel/2007/5/29/97039 


Just means you get to fix those problems. By blindly not starting them
you introduce others.


Please give me an example of what you have in mind.
Also, if you look at the patch (which I've now posted properly), it's not
just a matter of not starting them. I also redirect all future scheduled
work to a non-isolated CPU, i.e. if work is scheduled on an isolated CPU,
that work is treated as if the workqueue were single-threaded. As I
explained before, most subsystems do not care which CPU actually gets to
execute the work. Oprofile is the only one I know of that breaks, because it
cannot collect the stats from the isolated CPUs. I'm thinking of a different
solution for oprofile, maybe collecting samples through IPIs or something.
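
Roughly the shape of that redirect logic (a sketch, not the literal patch
hunk; the helper name is mine):

#include <linux/cpumask.h>

extern cpumask_t cpu_isolated_map;	/* exported by the patches */

/*
 * Work queued on an isolated CPU gets handed to a CPU that still runs
 * kernel services, as if the workqueue were single-threaded.
 */
static int wq_target_cpu(int cpu)
{
	cpumask_t allowed;

	if (!cpu_isset(cpu, cpu_isolated_map))
		return cpu;

	cpus_andnot(allowed, cpu_online_map, cpu_isolated_map);
	return first_cpu(allowed);	/* at least one non-isolated CPU */
}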

Max


[PATCH sched-devel 0/7] CPU isolation extensions

2008-02-21 Thread Max Krasnyanskiy

Ingo,

As you suggested I'm sending CPU isolation patches for review/inclusion into 
sched-devel tree. They are against 2.6.25-rc2.

You can also pull them from my GIT tree at
git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git 
master

Diffstat:
 b/Documentation/ABI/testing/sysfs-devices-system-cpu |   41 ++
 b/Documentation/cpu-isolation.txt                    |  114 ++-
 b/arch/x86/Kconfig                                   |    1
 b/arch/x86/kernel/genapic_flat_64.c                  |    5
 b/drivers/base/cpu.c                                 |   48
 b/include/linux/cpumask.h                            |    3
 b/kernel/Kconfig.cpuisol                             |   15 ++
 b/kernel/Makefile                                    |    4
 b/kernel/cpu.c                                       |   49
 b/kernel/sched.c                                     |   37 --
 b/kernel/stop_machine.c                              |    9 +
 b/kernel/workqueue.c                                 |   31 +++--
 kernel/Kconfig.cpuisol                               |   56 ++---
 kernel/cpu.c                                         |   16 +-
 14 files changed, 356 insertions(+), 73 deletions(-)

List of commits
  cpuisol: Make cpu isolation configurable and export isolated map
  cpuisol: Do not route IRQs to the CPUs isolated at boot
  cpuisol: Do not schedule workqueues on the isolated CPUs
  cpuisol: Move on-stack array used for boot cmd parsing into __initdata
  cpuisol: Documentation updates
  cpuisol: Minor updates to the Kconfig options
  cpuisol: Do not halt isolated CPUs with Stop Machine

This patch series extends CPU isolation support.
The primary idea here is to be able to use some CPU cores as dedicated
engines for running user-space code with minimal kernel
overhead/intervention; think of it as an SPE in the Cell processor. I'd like
to be able to run a CPU-intensive (100%) RT task on one of the processors
without adversely affecting, or being affected by, the other system
activities. System activities here include _kernel_ activities as well.

I'm personally using this for hard realtime purposes. With CPU isolation
it's very easy to achieve single-digit usec worst case and around 200 nsec
average response times on off-the-shelf multi-processor/core systems
(vanilla kernel plus these patches) even under extreme system load. I'm
working with legal folks on releasing a hard RT user-space framework for
that.
I believe with the current multi-core CPU trend we will see more and more
applications that exploit this capability: RT gaming engines, simulators,
hard RT apps, etc.


Hence the proposal is to extend the current CPU isolation feature.
The new definition of CPU isolation would be:
---
1. Isolated CPU(s) must not be subject to scheduler load balancing
 Users must explicitly bind threads in order to run on those CPU(s).

2. By default interrupts must not be routed to the isolated CPU(s)
 User must route interrupts (if any) to those CPUs explicitly.

3. In general kernel subsystems must avoid activity on the isolated CPU(s) as 
much as possible
 Includes workqueues, per CPU threads, etc.
 This feature is configurable and is disabled by default.  
---


I've been maintaining this stuff since around 2.6.18 and it's been running
in production environments for a couple of years now. It's been tested on
all kinds of machines, from NUMA boxes like the HP xw9300/9400 to tiny uTCA
boards like the Mercury AXA110.
The messiest part used to be the SLAB garbage collector changes. With the
new SLUB all that mess goes away (i.e. no changes necessary). Also, CFS
seems to handle CPU hotplug much better than O(1) did (i.e. domains are
recomputed dynamically), so isolation can be done at any time (via sysfs).
So this seems like a good time to merge.


We've had scheduler support for CPU isolation ever since the O(1) scheduler
went in. In other words, #1 is already supported. These patches do not
change/affect that functionality in any way.
#2 is a trivial one-liner change to the IRQ init code.
#3 is addressed by a couple of separate patches. The main problem here is
that an RT thread can prevent kernel threads from running, and the machine
gets stuck because other CPUs are waiting for those threads to run and
report back.
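
The stop machine patch follows directly from that. The idea, sketched (names
are mine; the real change is more involved and, as noted below, marked
experimental):

#include <linux/cpumask.h>

extern cpumask_t cpu_isolated_map;	/* exported by the patches */

/*
 * Run the kstopmachine threads everywhere except the isolated CPUs, so
 * a 100% RT thread on an isolated CPU cannot wedge module
 * insertion/removal while everyone waits for it to report back.
 */
static void stop_machine_cpus(cpumask_t *stop_cpus)
{
	cpus_andnot(*stop_cpus, cpu_online_map, cpu_isolated_map);
}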

Folks involved in scheduler/cpuset development provided a lot of feedback on
the first series of patches. I believe I managed to explain and clarify
every aspect.
Paul Jackson initially suggested implementing #2 and #3 using the cpusets
subsystem. Paul and I looked at it more closely and determined that
exporting cpu_isolated_map instead is a better option.

Details here
http://marc.info/?l=linux-kernel&m=120180692331461&w=2

The last patch, to stop machine, is potentially unsafe and is marked as
experimental. Unfortunately it's currently the only option that allows
dynamic module insertion/removal for the above scenarios.
From the previous discussions it's the only 
