Re: [git pull] CPU isolation extensions (updated)

2008-02-13 Thread Max Krasnyansky
Ingo Molnar wrote:
> * Max Krasnyansky <[EMAIL PROTECTED]> wrote:
> 
>> Ingo said a few different things (a bit too large to quote). 
> 
> [...]
>> And at the end he said:
>>> Also, i'd not mind some test-coverage in sched.git as well.
> 
>> As far as I know "do not mind" does not mean "must go to" ;-). [...]
> 
> the CPU isolation related patches have typically flown through 
> sched.git/sched-devel.git, so yes, you can take my "i'd not mind" 
> comment as "i'd not mind it at all". That's the tree that all the folks 
> who deal with this (such as Paul) are following. So let's go via the 
> normal contribution cycle and let this trickle through with all the 
> scheduler folks? I'd say 2.6.26 would be a tentative target, if it holds 
> up to scrutiny in sched-devel.git (both testing and review wise). And 
> because Andrew tracks sched-devel.git it will thus show up in -mm too.

Sounds good. Can you pull my tree then? Or do you want me to resend the
patches?
The tree is here:
git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git
Take the for-linus branch.
Or, as I said, let me know and I'll resend the patches.

Thanx
Max
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] CPU isolation extensions (updated)

2008-02-13 Thread Ingo Molnar

* Max Krasnyansky <[EMAIL PROTECTED]> wrote:

> Ingo said a few different things (a bit too large to quote). 

[...]
> And at the end he said:
> > Also, i'd not mind some test-coverage in sched.git as well.

> As far as I know "do not mind" does not mean "must go to" ;-). [...]

the CPU isolation related patches have typically flown through 
sched.git/sched-devel.git, so yes, you can take my "i'd not mind" 
comment as "i'd not mind it at all". That's the tree that all the folks 
who deal with this (such as Paul) are following. So let's go via the 
normal contribution cycle and let this trickle through with all the 
scheduler folks? I'd say 2.6.26 would be a tentative target, if it holds 
up to scrutiny in sched-devel.git (both testing and review wise). And 
because Andrew tracks sched-devel.git it will thus show up in -mm too.

Ingo


Re: [git pull] CPU isolation extensions (updated)

2008-02-09 Thread Max Krasnyansky
Paul Jackson wrote:
> Max wrote:
>> Linus, please pull CPU isolation extensions from
> 
> Did I miss something in this discussion?  I thought
> Ingo was quite clear, and Linus pretty clear too,
> that this patch should bake in *-mm or some such
> place for a bit first.
> 

Andrew said:
> The feature as a whole seems useful, and I don't actually oppose the merge
> based on what I see here.  As long as you're really sure that cpusets are
> inappropriate (and bear in mind that Paul has a track record of being wrong
> on this :)).  But I see a few glitches 

As far as I can understand, Andrew is ok with the merge. And I addressed all
his comments.

Linus said:
> Have these been in -mm and widely discussed etc? I'd like to start more 
> carefully, and (a) have that controversial last patch not merged initially 
> and (b) make sure everybody is on the same page wrt this all..

As far as I can understand, Linus _asked_ whether it was in -mm or not and
whether everybody's on the same page. He did not say "this must be in -mm first".
I explained that it has not been in -mm and who it was discussed with, did a
bunch more testing/investigation on the controversial patch, and explained why I
think it's not that controversial any more.

Ingo said a few different things (a bit too large to quote).
- That it was not discussed. I explained that it was in fact discussed and
  provided a bunch of pointers to the mail threads.
- That he thinks that cpuset is the way to do it. Again I explained why it's
  not.
And at the end he said:
> Also, i'd not mind some test-coverage in sched.git as well.

As far as I know "do not mind" does not mean "must go to" ;-). Also I replied
that I did not mind either, but I do not think that it has much (if anything)
to do with the scheduler.

Anyway. I think I mentioned that I did not mind -mm either. I think it's ready
for the mainline. But if people still strongly feel that it has to be in -mm,
that's fine. Let's just do s/Linus/Andrew/ on the first line and move on. But
if Linus pulls it now, even better ;-)

Andrew, Linus, I'll let you guys decide which tree it needs to go into.

Max


Re: [git pull] CPU isolation extensions (updated)

2008-02-09 Thread Paul Jackson
Max wrote:
> Linus, please pull CPU isolation extensions from

Did I miss something in this discussion?  I thought
Ingo was quite clear, and Linus pretty clear too,
that this patch should bake in *-mm or some such
place for a bit first.

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.940.382.4214


[git pull] CPU isolation extensions (updated)

2008-02-09 Thread Max Krasnyansky
Linus, please pull CPU isolation extensions from

git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus
 
Diffstat:
 Documentation/ABI/testing/sysfs-devices-system-cpu |   41 +++
 Documentation/cpu-isolation.txt                    |  113 +
 arch/x86/Kconfig                                   |    1
 arch/x86/kernel/genapic_flat_64.c                  |    4
 drivers/base/cpu.c                                 |   48
 include/linux/cpumask.h                            |    3
 kernel/Kconfig.cpuisol                             |   42 +++
 kernel/Makefile                                    |    4
 kernel/cpu.c                                       |   54 ++
 kernel/sched.c                                     |   36 --
 kernel/stop_machine.c                              |    8 +
 kernel/workqueue.c                                 |   30 -
 12 files changed, 337 insertions(+), 47 deletions(-)

This addresses all Andrew's comments for the last submission. Details here:
   http://marc.info/?l=linux-kernel&m=120236394012766&w=2

There are no code changes since last time, besides a minor fix moving an
on-stack array to __initdata as suggested by Andrew. Other stuff is just
documentation updates.

List of commits
   cpuisol: Make cpu isolation configrable and export isolated map
   cpuisol: Do not route IRQs to the CPUs isolated at boot
   cpuisol: Do not schedule workqueues on the isolated CPUs
   cpuisol: Move on-stack array used for boot cmd parsing into __initdata
   cpuisol: Documentation updates
   cpuisol: Minor updates to the Kconfig options
   cpuisol: Do not halt isolated CPUs with Stop Machine

As suggested by Ingo, I'm CC'ing everyone who is even remotely
connected/affected ;-)

Ingo, Peter - Scheduler.
   There are _no_ changes in this area besides moving cpu_*_map maps from
   kernel/sched.c to kernel/cpu.c.

Paul - Cpuset
   Again there are _no_ changes in this area.
   For reasons why cpuset is not the right mechanism for cpu isolation see this
   thread: http://marc.info/?l=linux-kernel&m=120180692331461&w=2

Rusty - Stop machine.
   After doing a bunch of testing over the last three days I actually downgraded
   the stop machine changes from [highly experimental] to simply [experimental].
   Please see this thread for more info:
   http://marc.info/?l=linux-kernel&m=120243837206248&w=2
   Short story is that I ran several insmod/rmmod workloads on live multi-core
   boxes with stop machine _completely_ disabled and did not see any issues.
   Rusty did not get a chance to reply yet; I'm hoping that we'll be able to make
   "stop machine" completely optional for some configurations.

Greg - ABI documentation.
   Nothing interesting here. I simply added
   Documentation/ABI/testing/sysfs-devices-system-cpu
   and documented some of the attributes exposed in there.
   Suggested by Andrew.

I believe this is ready for inclusion and my impression is that Andrew is
ok with that.
Most changes are very simple and do not affect existing behavior. As I
mentioned before, I've been using the Workqueue and StopMachine changes in
production for a couple of years now and have high confidence in them. Yet
they are marked as experimental for now, just to be safe.

My original explanation is included below.

btw I'll be out skiing/snowboarding for the next 4 days and will have sporadic
email access.
Will do my best to address questions/concerns (if any) during that time.

Thanx
Max

--
This patch series extends CPU isolation support. Yes, most people want to
virtualize CPUs these days and I want to isolate them  :) .

The primary idea here is to be able to use some CPU cores as dedicated
engines for running user-space code with minimal kernel overhead/intervention;
think of it as an SPE in the Cell processor. I'd like to be able to run a CPU
intensive (100%) RT task on one of the processors without adversely affecting
or being affected by the other system activities.
System activities here include _kernel_ activities as well.

I'm personally using this for hard realtime purposes. With CPU isolation it's
very easy to achieve single digit usec worst case and around 200 nsec average
response times on off-the-shelf multi-processor/core systems (vanilla kernel
plus these patches) even under extreme system load.
I'm working with legal folks on releasing a hard RT user-space framework for that.
I believe with the current multi-core CPU trend we will see more and more
applications that explore this capability: RT gaming engines, simulators, hard
RT apps, etc.

Hence the proposal is to extend the current CPU isolation feature.
The new definition of CPU isolation would be:
---
1. Isolated CPU(s) must not be subject to scheduler load balancing
  Users must explicitly bind threads in order to run on those CPU(s).

2. By default interrupts 
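
Item 1 above says that threads must be bound explicitly to run on an isolated
CPU. A minimal user-space sketch of such binding follows, assuming a Linux host
and Python's os.sched_setaffinity() wrapper around sched_setaffinity(2). The
helper name pin_to_cpu and the CPU it picks are illustrative only and are not
part of the patch series.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict `pid` (0 means the calling thread) to a single CPU.

    pin_to_cpu is a hypothetical helper name, not something from the patches.
    Returns the affinity mask the kernel reports afterwards.
    """
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    # Use a CPU we are already allowed to run on, so the call cannot fail
    # with EINVAL; on a real system this would be the isolated CPU's number.
    target = min(os.sched_getaffinity(0))
    print(pin_to_cpu(target))
```

Since the load balancer leaves an isolated CPU alone, a task should end up
there only after a call like this (or the equivalent taskset invocation).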


Re: [git pull] CPU isolation extensions

2008-02-07 Thread Max Krasnyansky
Hi Ingo,

Thanks for your reply.

> * Linus Torvalds <[EMAIL PROTECTED]> wrote:
> 
>> On Wed, 6 Feb 2008, Max Krasnyansky wrote:
>>> Linus, please pull CPU isolation extensions from
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus
>> Have these been in -mm and widely discussed etc? I'd like to start 
>> more carefully, and (a) have that controversial last patch not merged 
>> initially and (b) make sure everybody is on the same page wrt this 
>> all..
> 
> no, they have not been under nearly enough testing and review - these 
> patches surfaced on lkml for the first time one week ago (!). 
Almost two weeks actually. Ok 1.8 :)

> I find the pull request totally premature, this stuff has not been discussed
> and agreed on _at all_.
Ingo, I may have the wrong impression, but my impression is that you ignored all
the other emails and just read Linus' reply. I do not believe this accusation is
valid.
I apologize if my impression is incorrect.
Since the patches _do not_ change/affect existing scheduler/cpuset
functionality I did not know who to CC in the first email that I sent. Luckily
Peter picked it up and CC'ed a bunch of folks, including Paul, Steven and you.
All of them replied and had questions/concerns. As I mentioned before I believe
I addressed all of them.
 
> None of the people who maintain and have interest in 
> this code and participated in the (short) one-week discussion were 
> Cc:-ed to the pull request.
Ok. I did not realize I'm supposed to do that.
I got no replies to the second round of patches (take 2), which again was
CC'ed to the same people that Peter CC'ed, so I assumed that people are ok with
it. That's what the discussion on the first take ended with.

> I think these patches also need a buy-in from Peter Zijlstra and Paul 
> Jackson (or really good reasoning why any objections from them should 
> be overridden) - all of whom deal with the code affected by these changes 
> on a daily basis and have an interest in CPU isolation features.
See above.
The following issues were raised:
1. Peter and Steven initially thought that workqueue isolation is not needed.
2. Paul thought that it should be implemented on top of cpusets.
3. Peter thought that the stopmachine change is not safe.
There were a couple of other minor misunderstandings (for example Peter thought
that I'm completely disallowing IRQs on isolated CPUs, which is obviously not
the case). I clarified all of them.

#1 I explained in the original thread and then followed up with a concrete code
example of why it is needed.
http://marc.info/?l=linux-kernel&m=120217173001671&w=2
Got no replies so far. So I'm assuming folks are happy.

#2 I started a separate thread on that
http://marc.info/?l=linux-kernel&m=120180692331461&w=2
The conclusion was, well let me just quote exactly what Paul had said:

> Paul Jackson wrote:
>> Max wrote:
>>> Looks like I failed to explain what I'm trying to achieve. So let me try 
>>> again.
>> 
>> Well done.  I read through that, expecting to disagree or at least
>> to not understand at some point, and got all the way through nodding
>> my head in agreement.  Good.
>> 
>> Whether the earlier confusions were lack of clarity in the presentation,
>> or lack of competence in my brain ... well guess I don't want to ask that
>> question ;).


And #3 Peter did not agree with me but said that it's up to Linus or Andrew to
decide whether it's appropriate in mainline or not. I _clearly_ indicated that
this part is somewhat controversial and maybe dangerous; I'm _not_ trying to
sneak something in.
Andrew picked it up and I'm going to do some more investigation on whether it's
really not safe or is actually fine (about to send an email to Rusty).

> Generally i think that cpusets is actually the feature and API that 
> should be used (and extended) for CPU isolation - and we already 
> extended it recently in the direction of CPU isolation. Most enterprise 
> distros have cpusets enabled so it's in use. Also, cpusets has the 
> appeal of being commonly used in the "big honking boxes" arena, so 
> reusing the same concept for RT and virtualization stuff would be the 
> natural approach. It already ties in to the scheduler domains code 
> dynamically and is flexible and scalable. I resisted ad-hoc CPU 
> isolation patches in -rt for that reason. 
That's exactly what Paul proposed initially. I completely disagree with that
but I did look at it in _detail_.
Please take a look here for a detailed explanation
http://marc.info/?l=linux-kernel&m=120180692331461&w=2
This email is getting too long and I did not want to inline everything.

> Also, i'd not mind some test-coverage in sched.git as well.
I believe it has _nothing_ to do with the "scheduler" but I do not mind it
being in that tree.
Please read this email on why it has nothing to do with the scheduler
http://marc.info/?l=linux-kernel&m=120210515323578&w=2
That's the email that convinced 

Re: [git pull] CPU isolation extensions

2008-02-07 Thread Ingo Molnar

* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> On Wed, 6 Feb 2008, Max Krasnyansky wrote:
> >
> > Linus, please pull CPU isolation extensions from
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus
> 
> Have these been in -mm and widely discussed etc? I'd like to start 
> more carefully, and (a) have that controversial last patch not merged 
> initially and (b) make sure everybody is on the same page wrt this 
> all..

no, they have not been under nearly enough testing and review - these 
patches surfaced on lkml for the first time one week ago (!). I find the 
pull request totally premature, this stuff has not been discussed and 
agreed on _at all_. None of the people who maintain and have interest in 
this code and participated in the (short) one-week discussion were 
Cc:-ed to the pull request.

I think these patches also need a buy-in from Peter Zijlstra and Paul 
Jackson (or really good reasoning why any objections from them should 
be overridden) - all of whom deal with the code affected by these changes 
on a daily basis and have an interest in CPU isolation features.

Generally i think that cpusets is actually the feature and API that 
should be used (and extended) for CPU isolation - and we already 
extended it recently in the direction of CPU isolation. Most enterprise 
distros have cpusets enabled so it's in use. Also, cpusets has the 
appeal of being commonly used in the "big honking boxes" arena, so 
reusing the same concept for RT and virtualization stuff would be the 
natural approach. It already ties in to the scheduler domains code 
dynamically and is flexible and scalable. I resisted ad-hoc CPU 
isolation patches in -rt for that reason. Also, i'd not mind some 
test-coverage in sched.git as well.

Ingo


Re: [git pull] CPU isolation extensions

2008-02-07 Thread Andrew Morton
On Thu, 07 Feb 2008 09:22:34 -0800 Max Krasnyansky <[EMAIL PROTECTED]> wrote:

> > - There are two separate and identical implementations of
> >   cpu_unusable(cpu).  Please do it once, in a header, preferably with C
> >   function, not macros.
> 
> Those are local versions that depend on whether a feature is enabled or not.
> If CONFIG_CPUISOL_WORKQUEUE is disabled we want cpu_unusable()
> in workqueue.c to be a noop, and if it's enabled that macro resolves to
> cpu_isolated().
> Same thing for stopmachine.c. If CONFIG_CPUISOL_STOPMACHIN is disabled
> cpu_unusable() is a noop.
> In other words cpu_isolated() is the one common macro that a subsystem may
> want to stub out.
> Do you see another way of doing this?

ah, I missed that.  Yup, the implementation you have there looks OK.


Re: [git pull] CPU isolation extensions

2008-02-07 Thread Max Krasnyansky


Paul Jackson wrote:
> Max - Andrew wondered if the rt tree had seen the
> code or commented on it.  What became of that?
I just replied to Andrew. It's not an RT feature per se.
And yes, Peter CC'ed the RT folks. You probably did not get a chance to read all
the replies.
They had some questions/concerns and stuff. I believe I answered/clarified all
of them.

> My two cents isn't worth a plug nickel here, but
> I'm inclined to nod in agreement when Linus wants
> to see these patches get some more exposure before
> going into Linus's tree.  ... what's the hurry?
No hurry I guess. I did mention in the introductory email that I've been
maintaining this stuff for a while now. The SLAB patches used to be messy; with
the new SLUB the mess goes away.
CFS handles CPU hotplug much better than O(1), and cpu hotplug is needed to be
able to change the isolated bit from sysfs. That's why I think it's a good time
to merge.
I don't mind of course if we put this stuff in -mm first. Although the first
part of the patchset (ie exporting isolated map, sysfs interface, etc) seems
very simple and totally not controversial.
The stop machine patch is really the only thing that may look suspicious.
 
Max


Re: [git pull] CPU isolation extensions

2008-02-07 Thread Max Krasnyansky
Andrew Morton wrote:
> On Thu, 7 Feb 2008 01:59:54 -0600 Paul Jackson <[EMAIL PROTECTED]> wrote:
> 
>> but hard real time is not my expertise
> 
> Speaking of which..  there is the -rt tree.  Have those people had a look
> at the feature, perhaps played with the code?

Peter Z. and Steven R. sent me some comments; I believe I explained and
addressed them.
Ingo's been quiet. Probably too busy.

btw It's not an RT feature per se. It certainly helps RT by removing all the
latency sources from isolated CPUs. But in general it's just a "reducing kernel
overhead on some CPUs" kind of feature.

Max 


Re: [git pull] CPU isolation extensions

2008-02-07 Thread Paul Jackson
Max - Andrew wondered if the rt tree had seen the
code or commented on it.  What became of that?

My two cents isn't worth a plug nickel here, but
I'm inclined to nod in agreement when Linus wants
to see these patches get some more exposure before
going into Linus's tree.  ... what's the hurry?

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.940.382.4214


Re: [git pull] CPU isolation extensions

2008-02-07 Thread Max Krasnyansky
Paul Jackson wrote:
> Andrew wrote:
>>  (and bear in mind that Paul has a track record of being wrong
>>  on this :))
> 
> heh - I saw that <grin>.
> 
> Max - Andrew's about right, as usual.  You answered my initial
> questions on this patch set adequately, but hard real time is
> not my expertise, so in the final analysis, other than my saying
> I don't have any more objections, my input doesn't mean much
> either way.

I honestly think this one is a no-brainer and I do not think this one will hurt 
Paul's track record :).
Paul initially disagreed with me and that's when he was wrong ;-))
 
Andrew, I looked at this in detail and here is an explanation that 
I sent to Paul a few days ago (a bit shortened/updated version).


I thought some more about your proposal to use the sched_load_balance flag in 
cpusets instead of extending cpu_isolated_map. I looked at cpusets and cgroups, 
and here are my thoughts on this.
Here is the list of issues with the sched_load_balance flag from the CPU 
isolation perspective:
-- 
(1) Boot time isolation is not possible. There is currently no way to set up a 
cpuset at boot time. For example, we won't be able to isolate cpus from irqs 
and workqueues at boot.
Not a major issue but still an inconvenience.

-- 
(2) There is currently no easy way to figure out what cpuset a cpu belongs to 
in order to query its sched_load_balance flag. In order to do that we need a 
method that iterates all active cpusets and checks their cpus_allowed masks. 
This implies holding cgroup and cpuset mutexes. It's not clear whether it's ok 
to do that from the contexts CPU isolation happens in (apic, sched, workqueue). 
It seems that the cgroup/cpuset api is designed for top-down access, ie adding 
a cpu to a set and then recomputing domains. That makes perfect sense for the 
common cpuset usecase but is not what cpu isolation needs.
In other words I think it's much simpler and cleaner to use the 
cpu_isolated_map for isolation purposes. No locks, no races, etc.

-- 
(3) cpusets are a bit too dynamic  :) . What I mean by this is that the 
sched_load_balance flag can be changed at any time without bringing a CPU 
offline. That means we'll need some notifier mechanism for killing and 
restarting workqueue threads when that flag changes. 
Also we'd need some logic that makes sure that a user does not disable load 
balancing on all cpus, because that effectively will kill workqueues on all the 
cpus.
This particular case is already handled very nicely in my patches. The isolated 
bit can be set only when a cpu is offline, and it cannot be set on the first 
online cpu. Workqueues and other subsystems already handle cpu hotplug events 
nicely and can easily ignore isolated cpus when they come online.
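
For illustration only (this is not code from the patch set): point (2) in the list above contrasts a bottom-up query through cpusets with a direct bit test. A minimal user-space sketch of that contrast follows. Every name in it (toy_cpumask_t, struct toy_cpuset, toy_cpu_load_balanced, toy_cpu_isolated) is hypothetical, the cpusets are flattened into an array although real cpusets are hierarchical, and no locking is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy CPU mask: one bit per CPU, at most 8 * sizeof(unsigned long) CPUs. */
typedef unsigned long toy_cpumask_t;

/* Hypothetical, flattened stand-in for a cpuset. */
struct toy_cpuset {
	toy_cpumask_t cpus_allowed;
	bool sched_load_balance;
};

/*
 * Bottom-up query through cpusets: walk every set to find one that
 * contains the CPU. In the kernel such a walk would need the cgroup and
 * cpuset mutexes; this single-threaded sketch takes no lock.
 */
static bool toy_cpu_load_balanced(const struct toy_cpuset *sets, size_t nsets,
				  unsigned int cpu)
{
	assert(cpu < 8 * sizeof(toy_cpumask_t));
	for (size_t i = 0; i < nsets; i++)
		if (sets[i].cpus_allowed & (1UL << cpu))
			return sets[i].sched_load_balance;
	return true;	/* assumption: a CPU in no set counts as balanced */
}

/* Direct query: a single bit test on one global map. */
static bool toy_cpu_isolated(toy_cpumask_t isolated_map, unsigned int cpu)
{
	assert(cpu < 8 * sizeof(toy_cpumask_t));
	return (isolated_map & (1UL << cpu)) != 0;
}
```

The first function has to visit every set and would need protection against concurrent changes; the second is one mask test that a hot path (apic, sched, workqueue) could call directly.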

--
#1 is probably unfixable. #2 and #3 can be fixed but at the expense of extra 
complexity across the board. I seriously doubt that I'll be able to push that 
through the reviews ;-).

Also personally I still think cpusets and cpu isolation attack two different 
problems. cpusets is about partitioning cpus and memory nodes, and managing 
tasks. Most of the cgroups/cpuset APIs are designed to deal with tasks. 
CPU isolation is much simpler and is at the lower layer. It deals with IRQs, 
kernel per cpu threads, etc. 
The only intersection I see is that both features affect scheduling domains. 
CPU isolation is again simple here: it uses the existing logic in sched.c and 
does not change anything in this area. 

-

Andrew, hopefully that clarifies it. Let me know if you're not convinced.

Max


Re: [git pull] CPU isolation extensions

2008-02-07 Thread Max Krasnyansky
Hi Linus,

Linus Torvalds wrote:
> 
> On Wed, 6 Feb 2008, Max Krasnyansky wrote:
>> Linus, please pull CPU isolation extensions from
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus
> 
> Have these been in -mm and widely discussed etc? I'd like to start more 
> carefully, and (a) have that controversial last patch not merged initially 
> and (b) make sure everybody is on the same page wrt this all..

They've been discussed with RT/scheduler/cpuset folks.
Andrew is definitely in the loop. He just replied and asked for some fixes and
clarifications. He seems to be ok with merging this in general.

The last patch may not be as bad as I originally thought. We'll discuss it some
more with Andrew. I'll also check with Rusty who wrote the stopmachine in the 
first place. It actually seems like overkill at this point. My impression is 
that it was supposed to be a safety net in case some refcounting/locking is not 
fully safe, and it may not be needed or as critical anymore. 
I may be wrong of course. So I'll find that out :)

Thanx
Max


Re: [git pull] CPU isolation extensions

2008-02-07 Thread Max Krasnyansky
Andrew Morton wrote:
> On Wed, 06 Feb 2008 21:32:55 -0800 Max Krasnyansky <[EMAIL PROTECTED]> wrote:
> 
>> Linus, please pull CPU isolation extensions from
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus
> 
> The feature as a whole seems useful, and I don't actually oppose the merge
> based on what I see here.  
Awesome :) I think it'll get more and more useful as people start trying
to figure out what the heck they are supposed to do with the spare CPU cores.
I mean pretty soon most machines will have 4 cores and some will have 8.
One way to use those cores is the "dedicated engine" model.  

> As long as you're really sure that cpusets are
> inappropriate (and bear in mind that Paul has a track record of being wrong
> on this :)).
I'll cover this in a separate email with more details.
  
> But I see a few glitches
Good catches. Thanks for reviewing.

> - There are two separate and identical implementations of
>   cpu_unusable(cpu).  Please do it once, in a header, preferably with C
>   function, not macros.

Those are local versions that depend on whether a feature is enabled or not.
If CONFIG_CPUISOL_WORKQUEUE is disabled we want cpu_unusable()
in workqueue.c to be a noop, and if it's enabled that macro resolves to 
cpu_isolated(). 
Same thing for stopmachine.c. If CONFIG_CPUISOL_STOPMACHINE is disabled
cpu_unusable() is a noop. 
In other words cpu_isolated() is the one common macro that a subsystem may
want to stub out. 
Do you see another way of doing this ?
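
For illustration only (not the code from the patches): the per-subsystem stub described above could look roughly like the sketch below. It assumes a toy one-word bitmap in place of cpu_isolated_map, and the wq_/sm_ prefixes are hypothetical; they exist only so both local definitions fit in one file, whereas in the thread each source file has its own cpu_unusable().

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for cpu_isolated_map: one bit per CPU (toy, not the kernel type). */
static unsigned long toy_isolated_map;

/* Stand-in for the common helper cpu_isolated(). */
static inline bool toy_cpu_isolated(unsigned int cpu)
{
	assert(cpu < 8 * sizeof(toy_isolated_map));
	return (toy_isolated_map >> cpu) & 1UL;
}

/* Pretend configuration: workqueue isolation on, stop-machine isolation off. */
#define CONFIG_CPUISOL_WORKQUEUE 1
/* CONFIG_CPUISOL_STOPMACHINE is intentionally left undefined. */

/* What workqueue.c could define locally. */
#ifdef CONFIG_CPUISOL_WORKQUEUE
#define wq_cpu_unusable(cpu)	toy_cpu_isolated(cpu)
#else
#define wq_cpu_unusable(cpu)	(false)
#endif

/* What stop_machine.c could define locally. */
#ifdef CONFIG_CPUISOL_STOPMACHINE
#define sm_cpu_unusable(cpu)	toy_cpu_isolated(cpu)
#else
#define sm_cpu_unusable(cpu)	(false)
#endif
```

With this configuration, an isolated CPU is reported unusable by the workqueue variant but not by the stop-machine variant, which is why one shared definition in a header would not behave the same way.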

> - The Kconfig help is a bit scraggly:
> 
> +config CPUISOL_STOPMACHINE
> + bool "Do not halt isolated CPUs with Stop Machine (HIGHLY EXPERIMENTAL)"
> + depends on CPUISOL && STOP_MACHINE && EXPERIMENTAL
> + help
> +  If this option is enabled kernel will not halt isolated CPUs when Stop Machine
> 
> "the kernel"
> 
> text is too wide
Got it. Will fix asap.
 
> +  is triggered.
> +  Stop Machine is currently only used by the module insertion and removal logic.
> +  Please note that at this point this feature is highly experimental and maybe
> +  dangerous. It is not known to really brake anything but can potentially 
> +  introduce an instability.
> 
> s/maybe/may be/
> s/brake/break/

Man, the typos are killing me :). Will fix.

> Neither this text, nor the changelog nor the code comments tell us what the
> potential instability with stopmachine *is*?  Or maybe I missed it.
That's the thing, we don't really know :). In real life it does not seem to be 
a problem at all.
As I mentioned in previous emails, we've been running all kinds of machines 
with this enabled, and inserting all kinds of modules left and right. Never 
seen any crashes or anything.
But the fact that stopmachine is supposed to halt all cpus during module 
insertion/removal seems to imply that something bad may happen if some cpus are 
not halted. It may very well turn out that it's no longer needed because our 
locking and refcounting handles this just fine.
I mean ideally we should not have to halt the entire box, it causes terrible 
latencies.
 
> - Adding new sysfs files without updating Documentation/ABI/ makes Greg cry.
Oh, did not know that. Will fix.

> 
> - Why is cpu_isolated_map exported to modules?  Just for api consistency,
>   it appears?
Yes. For consistency. We'd want cpu_isolated() to work everywhere.
 
> pre-existing problems:
> 
> - isolated_cpu_setup() has an on-stack array of NR_CPUS integers.  This
>   will consume 4k of stack on ia64 (at least).  We'll just squeak through
>   for a little while, but this needs to be fixed.  Just move it into
>   __initdata.
Will do.
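
For illustration only: a user-space sketch of the fix Andrew asks for, with the scratch array in static storage instead of on the stack. This is not the kernel's isolated_cpu_setup(). The names toy_isolated_cpu_setup, toy_isol_ints and toy_isolated_map are hypothetical, TOY_INITDATA merely stands in for the kernel's __initdata annotation, NR_CPUS is assumed to be 64 so the map fits one word, and the parsing uses strtol rather than any kernel helper.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS 64	/* assumption: small enough to fit one 64-bit mask */
#define TOY_INITDATA	/* stands in for the kernel's __initdata; empty here */

static unsigned long long toy_isolated_map;	/* stand-in for cpu_isolated_map */

/* Static storage instead of an NR_CPUS-sized array on the stack. */
static int toy_isol_ints[NR_CPUS] TOY_INITDATA;

/* Parse a list such as "1,3,5"; returns the number of values read. */
static int toy_isolated_cpu_setup(const char *str)
{
	int n = 0;

	assert(str != NULL);
	toy_isolated_map = 0;
	while (n < NR_CPUS) {
		char *end;
		long v = strtol(str, &end, 10);

		if (end == str)
			break;		/* no digits at this position */
		toy_isol_ints[n++] = (int)v;
		if (*end != ',')
			break;		/* end of list */
		str = end + 1;
	}
	/* Out-of-range CPU numbers are counted but do not set a bit. */
	for (int i = 0; i < n; i++)
		if (toy_isol_ints[i] >= 0 && toy_isol_ints[i] < NR_CPUS)
			toy_isolated_map |= 1ULL << toy_isol_ints[i];
	return n;
}
```

With 4-byte integers and NR_CPUS at 1024, the on-stack version costs the 4k Andrew mentions; the static array costs nothing on the stack.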
 
> - isolated_cpu_setup() expects that the user can provide an up-to-1024
>   character kernel boot parameter.  Is this reasonable given cpu command
>   line limits, and given that NR_CPUS will surely grow beyond 1024 in the
>   future?
I'm thinking that is reasonable for now.

I'll fix and resend the patches asap.

Thanx
Max


Re: [git pull] CPU isolation extensions

2008-02-07 Thread Linus Torvalds


On Wed, 6 Feb 2008, Max Krasnyansky wrote:
>
> Linus, please pull CPU isolation extensions from
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus

Have these been in -mm and widely discussed etc? I'd like to start more 
carefully, and (a) have that controversial last patch not merged initially 
and (b) make sure everybody is on the same page wrt this all..

Linus


Re: [git pull] CPU isolation extensions

2008-02-07 Thread Andrew Morton
On Thu, 7 Feb 2008 01:59:54 -0600 Paul Jackson <[EMAIL PROTECTED]> wrote:

> but hard real time is
> not my expertise

Speaking of which..  there is the -rt tree.  Have those people had a look
at the feature, perhaps played with the code?


Re: [git pull] CPU isolation extensions

2008-02-07 Thread Paul Jackson
Andrew wrote:
>  (and bear in mind that Paul has a track record of being wrong
>  on this :))

heh - I saw that <grin>.

Max - Andrew's about right, as usual.  You answered my initial
questions on this patch set adequately, but hard real time is
not my expertise, so in the final analysis, other than my saying
I don't have any more objections, my input doesn't mean much
either way.

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.940.382.4214


Re: [git pull] CPU isolation extensions

2008-02-07 Thread Ingo Molnar

* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> On Wed, 6 Feb 2008, Max Krasnyansky wrote:
> >
> > Linus, please pull CPU isolation extensions from
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus
> 
> Have these been in -mm and widely discussed etc? I'd like to start 
> more carefully, and (a) have that controversial last patch not merged 
> initially and (b) make sure everybody is on the same page wrt this 
> all..

no, they have not been under nearly enough testing and review - these 
patches surfaced on lkml for the first time one week ago (!). I find the 
pull request totally premature, this stuff has not been discussed and 
agreed on _at all_. None of the people who maintain and have interest in 
this code and participated in the (short) one-week discussion were 
Cc:-ed to the pull request.

I think these patches also need a buy-in from Peter Zijlstra and Paul 
Jackson (or really good reasoning why any objections from them should 
be overridden) - all of whom deal with the code affected by these changes 
on a daily basis and have an interest in CPU isolation features.

Generally i think that cpusets is actually the feature and API that 
should be used (and extended) for CPU isolation - and we already 
extended it recently in the direction of CPU isolation. Most enterprise 
distros have cpusets enabled so it's in use. Also, cpusets has the 
appeal of being commonly used in the big honking boxes arena, so 
reusing the same concept for RT and virtualization stuff would be the 
natural approach. It already ties in to the scheduler domains code 
dynamically and is flexible and scalable. I resisted ad-hoc CPU 
isolation patches in -rt for that reason. Also, i'd not mind some 
test-coverage in sched.git as well.

Ingo


Re: [git pull] CPU isolation extensions

2008-02-07 Thread Andrew Morton
On Thu, 07 Feb 2008 09:22:34 -0800 Max Krasnyansky <[EMAIL PROTECTED]> wrote:

> > - There are two separate and identical implementations of
> >   cpu_unusable(cpu).  Please do it once, in a header, preferably with C
> >   function, not macros.
> 
> Those are local versions that depend on whether a feature is enabled or not.
> If CONFIG_CPUISOL_WORKQUEUE is disabled we want cpu_unusable()
> in workqueue.c to be a noop, and if it's enabled that macro resolves to 
> cpu_isolated(). 
> Same thing for stopmachine.c. If CONFIG_CPUISOL_STOPMACHINE is disabled
> cpu_unusable() is a noop. 
> In other words cpu_isolated() is the one common macro that a subsystem may
> want to stub out. 
> Do you see another way of doing this ?

ah, I missed that.  Yup, the implementation you have there looks OK.


Re: [git pull] CPU isolation extensions

2008-02-07 Thread Max Krasnyansky


Paul Jackson wrote:
> Max - Andrew wondered if the rt tree had seen the
> code or commented on it.  What became of that?
I just replied to Andrew. It's not an RT feature per se.
And yes Peter CC'ed RT folks. You probably did not get a chance to read all 
the replies.
They had some questions/concerns and stuff. I believe I answered/clarified all 
of them.

> My two cents isn't worth a plug nickel here, but
> I'm inclined to nod in agreement when Linus wants
> to see these patches get some more exposure before
> going into Linus's tree.  ... what's the hurry?
No hurry I guess. I did mention in the introductory email that I've been 
maintaining this stuff for a while now. SLAB patches used to be messy; with the 
new SLUB the mess goes away.
CFS handles CPU hotplug much better than O(1), and cpu hotplug is needed to be 
able to change the isolated bit from sysfs. That's why I think it's a good time 
to merge.
I don't mind of course if we put this stuff in -mm first. Although the first 
part of the patchset (ie exporting isolated map, sysfs interface, etc) seems 
very simple and totally not controversial.
The stop machine patch is really the only thing that may look suspicious. 
 
Max


Re: [git pull] CPU isolation extensions

2008-02-07 Thread Max Krasnyansky
Hi Ingo,

Thanks for your reply.

> * Linus Torvalds <[EMAIL PROTECTED]> wrote:
> 
>> On Wed, 6 Feb 2008, Max Krasnyansky wrote:
>>> Linus, please pull CPU isolation extensions from
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus
>> Have these been in -mm and widely discussed etc? I'd like to start 
>> more carefully, and (a) have that controversial last patch not merged 
>> initially and (b) make sure everybody is on the same page wrt this 
>> all..
> 
> no, they have not been under nearly enough testing and review - these 
> patches surfaced on lkml for the first time one week ago (!). 
Almost two weeks actually. Ok 1.8 :)

> I find the pull request totally premature, this stuff has not been discussed 
> and agreed on _at all_.
Ingo, I may have the wrong impression but my impression is that you ignored all 
the other emails and just read Linus' reply. I do not believe this accusation 
is valid.
I apologize if my impression is incorrect.
Since the patches _do not_ change/affect existing scheduler/cpuset 
functionality I did not know who to CC in the first email that I sent. Luckily 
Peter picked it up and CC'ed a bunch of folks, including Paul, Steven and you.
All of them replied and had questions/concerns. As I mentioned before I believe 
I addressed all of them.
 
> None of the people who maintain and have interest in 
> this code and participated in the (short) one-week discussion were 
> Cc:-ed to the pull request.
Ok. I did not realize I'm supposed to do that. 
Since I got no replies to the second round of patches (take 2), which again was 
CC'ed to the same people that Peter CC'ed, I assumed that people are ok with 
it. That's what the discussion on the first take ended with.

> I think these patches also need a buy-in from Peter Zijlstra and Paul 
> Jackson (or really good reasoning why any objections from them should 
> be overridden) - all of whom deal with the code affected by these changes 
> on a daily basis and have an interest in CPU isolation features.
See above. 
The following issues were raised:
1. Peter and Steven initially thought that workqueue isolation is not needed.
2. Paul thought that it should be implemented on top of cpusets.
3. Peter thought that the stopmachine change is not safe.
There were a couple of other minor misunderstandings (for example Peter thought 
that I'm completely disallowing IRQs on isolated CPUs, which is obviously not
the case). I clarified all of them.

#1 I explained in the original thread and then followed up with a concrete code 
example of why it is needed.
http://marc.info/?l=linux-kernel&m=120217173001671&w=2
Got no replies so far. So I'm assuming folks are happy.

#2 I started a separate thread on that
http://marc.info/?l=linux-kernel&m=120180692331461&w=2
The conclusion was, well let me just quote exactly what Paul had said:

> Paul Jackson wrote:
> Max wrote:
>> Looks like I failed to explain what I'm trying to achieve. So let me try 
>> again.
> 
> Well done.  I read through that, expecting to disagree or at least
> to not understand at some point, and got all the way through nodding
> my head in agreement.  Good.
> 
> Whether the earlier confusions were lack of clarity in the presentation,
> or lack of competence in my brain ... well guess I don't want to ask that
> question ;).


And #3 Peter did not agree with me but said that it's up to Linus or Andrew to 
decide whether it's appropriate in mainline or not. I _clearly_ indicated that 
this part is somewhat controversial and may be dangerous; I'm _not_ trying to 
sneak something in. 
Andrew picked it up and I'm going to do some more investigation on whether it's 
really not safe or is actually fine (about to send an email to Rusty).

> Generally i think that cpusets is actually the feature and API that 
> should be used (and extended) for CPU isolation - and we already 
> extended it recently in the direction of CPU isolation. Most enterprise 
> distros have cpusets enabled so it's in use. Also, cpusets has the 
> appeal of being commonly used in the big honking boxes arena, so 
> reusing the same concept for RT and virtualization stuff would be the 
> natural approach. It already ties in to the scheduler domains code 
> dynamically and is flexible and scalable. I resisted ad-hoc CPU 
> isolation patches in -rt for that reason. 
That's exactly what Paul proposed initially. I completely disagree with that 
but I did look at it in _detail_. 
Please take a look here for a detailed explanation
http://marc.info/?l=linux-kernel&m=120180692331461&w=2
This email is getting too long and I did not want to inline everything.

> Also, i'd not mind some test-coverage in sched.git as well.
I believe it has _nothing_ to do with the scheduler but I do not mind it 
being in that tree.
Please read this email on why it has nothing to do with the scheduler
http://marc.info/?l=linux-kernel&m=120210515323578&w=2
That's the email that convinced Paul.

To sum it up. It has been discussed with the right people. I do 

Re: [git pull] CPU isolation extensions

2008-02-06 Thread Andrew Morton
On Wed, 06 Feb 2008 21:32:55 -0800 Max Krasnyansky <[EMAIL PROTECTED]> wrote:

> Linus, please pull CPU isolation extensions from
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus

The feature as a whole seems useful, and I don't actually oppose the merge
based on what I see here.  As long as you're really sure that cpusets are
inappropriate (and bear in mind that Paul has a track record of being wrong
on this :)).  But I see a few glitches


- There are two separate and identical implementations of
  cpu_unusable(cpu).  Please do it once, in a header, preferably with C
  function, not macros.

- The Kconfig help is a bit scraggly:

+config CPUISOL_STOPMACHINE
+   bool "Do not halt isolated CPUs with Stop Machine (HIGHLY EXPERIMENTAL)"
+   depends on CPUISOL && STOP_MACHINE && EXPERIMENTAL
+   help
+  If this option is enabled kernel will not halt isolated CPUs when Stop Machine

"the kernel"

text is too wide

+  is triggered.
+  Stop Machine is currently only used by the module insertion and removal logic.
+  Please note that at this point this feature is highly experimental and maybe
+  dangerous. It is not known to really brake anything but can potentially 
+  introduce an instability.

s/maybe/may be/
s/brake/break/


Neither this text, nor the changelog nor the code comments tell us what the
potential instability with stopmachine *is*?  Or maybe I missed it.

- Adding new sysfs files without updating Documentation/ABI/ makes Greg
  cry.

- Why is cpu_isolated_map exported to modules?  Just for api consistency,
  it appears?


pre-existing problems:

- isolated_cpu_setup() has an on-stack array of NR_CPUS integers.  This
  will consume 4k of stack on ia64 (at least).  We'll just squeak through
  for a little while, but this needs to be fixed.  Just move it into
  __initdata.

- isolated_cpu_setup() expects that the user can provide an up-to-1024
  character kernel boot parameter.  Is this reasonable given cpu command
  line limits, and given that NR_CPUS will surely grow beyond 1024 in the
  future?




[git pull] CPU isolation extensions

2008-02-06 Thread Max Krasnyansky
Linus, please pull CPU isolation extensions from

git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus
 
Diffstat:

b/arch/x86/Kconfig                  |    1
b/arch/x86/kernel/genapic_flat_64.c |    5 ++-
b/drivers/base/cpu.c                |   48 +++
b/include/linux/cpumask.h           |    3 ++
b/kernel/Kconfig.cpuisol            |   15 +++
b/kernel/Makefile                   |    4 +-
b/kernel/cpu.c                      |   49
b/kernel/sched.c                    |   37 ---
b/kernel/stop_machine.c             |    9 +-
b/kernel/workqueue.c                |   31 --
kernel/Kconfig.cpuisol              |   26 ++-
11 files changed, 176 insertions(+), 52 deletions(-)

The patchset consists of 4 patches:
cpuisol: Make cpu isolation configurable and export isolated map
cpuisol: Do not route IRQs to the CPUs isolated at boot
cpuisol: Do not schedule workqueues on the isolated CPUs
cpuisol: Do not halt isolated CPUs with Stop Machine

The first two are very simple. They make "CPU isolation" a configurable
feature, export cpu_isolated_map and provide some helper functions to
access it (just like cpu_online() and friends).
The last two patches add support for isolating CPUs from running workqueues
and stop machine. The last patch is kind of controversial; let me know if
you think it's too ugly and I'll resend without it.
For more details see below.


This patch series extends CPU isolation support. Yes, most people want to
virtualize CPUs these days and I want to isolate them  :) .

The primary idea here is to be able to use some CPU cores as dedicated
engines for running user-space code with minimal kernel
overhead/intervention; think of it as an SPE in the Cell processor. I'd
like to be able to run a CPU-intensive (100%) RT task on one of the
processors without adversely affecting or being affected by the other
system activities.
System activities here include _kernel_ activities as well.

I'm personally using this for hard realtime purposes. With CPU isolation
it's very easy to achieve single-digit usec worst-case and around 200 nsec
average response times on off-the-shelf multi-processor/core systems
(vanilla kernel plus these patches) even under extreme system load.
I'm working with legal folks on releasing a hard RT user-space framework
for that.
I believe with the current multi-core CPU trend we will see more and more
applications that explore this capability: RT gaming engines, simulators,
hard RT apps, etc.

Hence the proposal is to extend the current CPU isolation feature.
The new definition of CPU isolation would be:
---
1. Isolated CPU(s) must not be subject to scheduler load balancing
  Users must explicitly bind threads in order to run on those CPU(s).

2. By default interrupts must not be routed to the isolated CPU(s)
  User must route interrupts (if any) to those CPUs explicitly.

3. In general kernel subsystems must avoid activity on the isolated CPU(s)
  as much as possible
  Includes workqueues, per CPU threads, etc.
  This feature is configurable and is disabled by default.  
---
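
Point 1 of the definition above leaves thread placement to the user.  As a
rough illustration (not part of the patch series), a user-space task could
bind itself with sched_setaffinity(2).  The helper name pin_to_cpu() is
made up for this sketch, and which CPU is actually isolated is up to the
system setup.

```c
#define _GNU_SOURCE	/* for CPU_SET() and friends; must come first */
#include <assert.h>	/* not used by the helper; for caller-side checks */
#include <errno.h>
#include <sched.h>

/*
 * Bind the calling thread to exactly one CPU.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	if (cpu < 0 || cpu >= CPU_SETSIZE) {
		errno = EINVAL;
		return -1;
	}

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);

	/* A pid of 0 means "the calling thread". */
	return sched_setaffinity(0, sizeof(set), &set);
}
```

An RT task would typically call pin_to_cpu() with the number of an
isolated CPU before entering its main loop.  Since the load balancer does
not touch isolated CPUs, nothing else moves the task there on its behalf.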

I've been maintaining this stuff since around 2.6.18 and it's been running
in a production environment for a couple of years now. It's been tested on
all kinds of machines, from NUMA boxes like HP xw9300/9400 to tiny uTCA
boards like Mercury AXA110.
The messiest part used to be the SLAB garbage collector changes. With the
new SLUB all that mess goes away (i.e. no changes necessary). Also CFS seems
to handle CPU hotplug much better than O(1) did (i.e. domains are recomputed
dynamically) so that isolation can be done at any time (via sysfs).
So this seems like a good time to merge.

We've had scheduler support for CPU isolation ever since the O(1) scheduler
went in. In other words #1 is already supported. These patches do not
change/affect that functionality in any way.
#2 is a trivial one-liner change to the IRQ init code.
#3 is addressed by a couple of separate patches. The main problem here is
that an RT thread can prevent kernel threads from running, and the machine
gets stuck because other CPUs are waiting for those threads to run and
report back.

Folks involved in the scheduler/cpuset development provided a lot of
feedback on the first series of patches. I believe I managed to explain and
clarify every aspect.
Paul Jackson initially suggested implementing #2 and #3 using the cpusets
subsystem. Paul and I looked at it more closely and determined that
exporting cpu_isolated_map instead is a better option.

The last patch, to stop machine, is potentially unsafe and is marked as
highly experimental. Unfortunately it's currently the only option that
allows dynamic module insertion/removal for the above scenarios.
If people still feel that it's too ugly I can revert that change and keep it
in the separate tree
