Re: [git pull] CPU isolation extensions (updated)
Ingo Molnar wrote: > * Max Krasnyansky <[EMAIL PROTECTED]> wrote: > >> Ingo said a few different things (a bit too large to quote). > > [...] >> And at the end he said: >>> Also, i'd not mind some test-coverage in sched.git as well. > >> I far as I know "do not mind" does not mean "must go to" ;-). [...] > > the CPU isolation related patches have typically flown through > sched.git/sched-devel.git, so yes, you can take my "i'd not mind" > comment as "i'd not mind it at all". That's the tree that all the folks > who deal with this (such as Paul) are following. So lets go via the > normal contribution cycle and let this trickle through with all the > scheduler folks? I'd say 2.6.26 would be a tentative target, if it holds > up to scrutiny in sched-devel.git (both testing and review wise). And > because Andrew tracks sched-devel.git it will thus show up in -mm too. Sounds good. Can you pull my tree then ? Or do you want me to resend the patches. The tree is here: git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git Take the for-linus branch. Or as I said please let me know and I'll resend the patches. Thanx Max -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions (updated)
* Max Krasnyansky <[EMAIL PROTECTED]> wrote: > Ingo said a few different things (a bit too large to quote). [...] > And at the end he said: > > Also, i'd not mind some test-coverage in sched.git as well. > I far as I know "do not mind" does not mean "must go to" ;-). [...] the CPU isolation related patches have typically flown through sched.git/sched-devel.git, so yes, you can take my "i'd not mind" comment as "i'd not mind it at all". That's the tree that all the folks who deal with this (such as Paul) are following. So lets go via the normal contribution cycle and let this trickle through with all the scheduler folks? I'd say 2.6.26 would be a tentative target, if it holds up to scrutiny in sched-devel.git (both testing and review wise). And because Andrew tracks sched-devel.git it will thus show up in -mm too. Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions (updated)
* Max Krasnyansky [EMAIL PROTECTED] wrote: Ingo said a few different things (a bit too large to quote). [...] And at the end he said: Also, i'd not mind some test-coverage in sched.git as well. I far as I know do not mind does not mean must go to ;-). [...] the CPU isolation related patches have typically flown through sched.git/sched-devel.git, so yes, you can take my i'd not mind comment as i'd not mind it at all. That's the tree that all the folks who deal with this (such as Paul) are following. So lets go via the normal contribution cycle and let this trickle through with all the scheduler folks? I'd say 2.6.26 would be a tentative target, if it holds up to scrutiny in sched-devel.git (both testing and review wise). And because Andrew tracks sched-devel.git it will thus show up in -mm too. Ingo -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions (updated)
Ingo Molnar wrote: * Max Krasnyansky [EMAIL PROTECTED] wrote: Ingo said a few different things (a bit too large to quote). [...] And at the end he said: Also, i'd not mind some test-coverage in sched.git as well. I far as I know do not mind does not mean must go to ;-). [...] the CPU isolation related patches have typically flown through sched.git/sched-devel.git, so yes, you can take my i'd not mind comment as i'd not mind it at all. That's the tree that all the folks who deal with this (such as Paul) are following. So lets go via the normal contribution cycle and let this trickle through with all the scheduler folks? I'd say 2.6.26 would be a tentative target, if it holds up to scrutiny in sched-devel.git (both testing and review wise). And because Andrew tracks sched-devel.git it will thus show up in -mm too. Sounds good. Can you pull my tree then ? Or do you want me to resend the patches. The tree is here: git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git Take the for-linus branch. Or as I said please let me know and I'll resend the patches. Thanx Max -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions (updated)
Paul Jackson wrote: > Max wrote: >> Linus, please pull CPU isolation extensions from > > Did I miss something in this discussion? I thought > Ingo was quite clear, and Linus pretty clear too, > that this patch should bake in *-mm or some such > place for a bit first. > Andrew said: > The feature as a whole seems useful, and I don't actually oppose the merge > based on what I see here. As long as you're really sure that cpusets are > inappropriate (and bear in mind that Paul has a track record of being wrong > on this :)). But I see a few glitches As far as I can understand Andrew is ok with the merge. And I addressed all his comments. Linus said: > Have these been in -mm and widely discussed etc? I'd like to start more > carefully, and (a) have that controversial last patch not merged initially > and (b) make sure everybody is on the same page wrt this all.. As far as I can understand Linus _asked_ whether it was in -mm or not and whether everybody's on the same page. He did not say "this must be in -mm first". I explained that it has not been in -mm, and who it was discussed with, and did a bunch more testing/investigation on the controversial patch and explained why I think it's not that controversial any more. Ingo said a few different things (a bit too large to quote). - That it was not discussed. I explained that it was in fact discussed and provided a bunch of pointers to the mail threads. - That he thinks that cpuset is the way to do it. Again I explained why it's not. And at the end he said: > Also, i'd not mind some test-coverage in sched.git as well. I far as I know "do not mind" does not mean "must go to" ;-). Also I replied that I did not mind either but I do not think that it has much (if anything) to do with the scheduler. Anyway. I think I mentioned that I did not mind -mm either. I think it's ready for the mainline. But if people still strongly feel that it has to be in -mm that's fine. Lets just do s/Linus/Andrew/ on the first line and move on. But if Linus pulls it now even better ;-) Andrew, Linus, I'll let you guys decide which tree it needs to go. Max -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions (updated)
Max wrote: > Linus, please pull CPU isolation extensions from Did I miss something in this discussion? I thought Ingo was quite clear, and Linus pretty clear too, that this patch should bake in *-mm or some such place for a bit first. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.940.382.4214 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[git pull] CPU isolation extensions (updated)
Linus, please pull CPU isolation extensions from git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus Diffstat: Documentation/ABI/testing/sysfs-devices-system-cpu | 41 +++ Documentation/cpu-isolation.txt| 113 + arch/x86/Kconfig |1 arch/x86/kernel/genapic_flat_64.c |4 drivers/base/cpu.c | 48 include/linux/cpumask.h|3 kernel/Kconfig.cpuisol | 42 +++ kernel/Makefile|4 kernel/cpu.c | 54 ++ kernel/sched.c | 36 -- kernel/stop_machine.c |8 + kernel/workqueue.c | 30 - 12 files changed, 337 insertions(+), 47 deletions(-) This addresses all Andrew's comments for the last submission. Details here: http://marc.info/?l=linux-kernel=120236394012766=2 There are no code changes since last time, besides minor fix for moving on-stack array to __initdata as suggested by Andrew. Other stuff is just documentation updates. List of commits cpuisol: Make cpu isolation configrable and export isolated map cpuisol: Do not route IRQs to the CPUs isolated at boot cpuisol: Do not schedule workqueues on the isolated CPUs cpuisol: Move on-stack array used for boot cmd parsing into __initdata cpuisol: Documentation updates cpuisol: Minor updates to the Kconfig options cpuisol: Do not halt isolated CPUs with Stop Machine I suggested by Ingo I'm CC'ing everyone who is even remotely connected/affected ;-) Ingo, Peter - Scheduler. There are _no_ changes in this area besides moving cpu_*_map maps from kerne/sched.c to kernel/cpu.c. Paul - Cpuset Again there are _no_ changes in this area. For reasons why cpuset is not the right mechanism for cpu isolation see this thread http://marc.info/?l=linux-kernel=120180692331461=2 Rusty - Stop machine. After doing a bunch of testing last three days I actually downgraded stop machine changes from [highly experimental] to simply [experimental]. Pleas see this thread for more info: http://marc.info/?l=linux-kernel=120243837206248=2 Short story is that I ran several insmod/rmmod workloads on live multi-core boxes with stop machine _completely_ disabled and did no see any issues. Rusty did not get a chance to reply yet, I hopping that we'll be able to make "stop machine" completely optional for some configurations. Gerg - ABI documentation. Nothing interesting here. I simply added Documentation/ABI/testing/sysfs-devices-system-cpu and documented some of the attributes exposed in there. Suggested by Andrew. I believe this is ready for the inclusion and my impression is that Andrew is ok with that. Most changes are very simple and do not affect existing behavior. As I mentioned before I've been using Workqueue and StopMachine changes in production for a couple of years now and have high confidence in them. Yet they are marked as experimental for now, just to be safe. My original explanation is included below. btw I'll be out skiing/snow boarding for the next 4 days and will have sporadic email access. Will do my best to address question/concerns (if any) during that time. Thanx Max -- This patch series extends CPU isolation support. Yes, most people want to virtuallize CPUs these days and I want to isolate them :) . The primary idea here is to be able to use some CPU cores as the dedicated engines for running user-space code with minimal kernel overhead/intervention, think of it as an SPE in the Cell processor. I'd like to be able to run a CPU intensive (%100) RT task on one of the processors without adversely affecting or being affected by the other system activities. System activities here include _kernel_ activities as well. I'm personally using this for hard realtime purposes. With CPU isolation it's very easy to achieve single digit usec worst case and around 200 nsec average response times on off-the-shelf multi- processor/core systems (vanilla kernel plus these patches) even under extreme system load. I'm working with legal folks on releasing hard RT user-space framework for that. I believe with the current multi-core CPU trend we will see more and more applications that explore this capability: RT gaming engines, simulators, hard RT apps, etc. Hence the proposal is to extend current CPU isolation feature. The new definition of the CPU isolation would be: --- 1. Isolated CPU(s) must not be subject to scheduler load balancing Users must explicitly bind threads in order to run on those CPU(s). 2. By default interrupts
[git pull] CPU isolation extensions (updated)
Linus, please pull CPU isolation extensions from git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus Diffstat: Documentation/ABI/testing/sysfs-devices-system-cpu | 41 +++ Documentation/cpu-isolation.txt| 113 + arch/x86/Kconfig |1 arch/x86/kernel/genapic_flat_64.c |4 drivers/base/cpu.c | 48 include/linux/cpumask.h|3 kernel/Kconfig.cpuisol | 42 +++ kernel/Makefile|4 kernel/cpu.c | 54 ++ kernel/sched.c | 36 -- kernel/stop_machine.c |8 + kernel/workqueue.c | 30 - 12 files changed, 337 insertions(+), 47 deletions(-) This addresses all Andrew's comments for the last submission. Details here: http://marc.info/?l=linux-kernelm=120236394012766w=2 There are no code changes since last time, besides minor fix for moving on-stack array to __initdata as suggested by Andrew. Other stuff is just documentation updates. List of commits cpuisol: Make cpu isolation configrable and export isolated map cpuisol: Do not route IRQs to the CPUs isolated at boot cpuisol: Do not schedule workqueues on the isolated CPUs cpuisol: Move on-stack array used for boot cmd parsing into __initdata cpuisol: Documentation updates cpuisol: Minor updates to the Kconfig options cpuisol: Do not halt isolated CPUs with Stop Machine I suggested by Ingo I'm CC'ing everyone who is even remotely connected/affected ;-) Ingo, Peter - Scheduler. There are _no_ changes in this area besides moving cpu_*_map maps from kerne/sched.c to kernel/cpu.c. Paul - Cpuset Again there are _no_ changes in this area. For reasons why cpuset is not the right mechanism for cpu isolation see this thread http://marc.info/?l=linux-kernelm=120180692331461w=2 Rusty - Stop machine. After doing a bunch of testing last three days I actually downgraded stop machine changes from [highly experimental] to simply [experimental]. Pleas see this thread for more info: http://marc.info/?l=linux-kernelm=120243837206248w=2 Short story is that I ran several insmod/rmmod workloads on live multi-core boxes with stop machine _completely_ disabled and did no see any issues. Rusty did not get a chance to reply yet, I hopping that we'll be able to make stop machine completely optional for some configurations. Gerg - ABI documentation. Nothing interesting here. I simply added Documentation/ABI/testing/sysfs-devices-system-cpu and documented some of the attributes exposed in there. Suggested by Andrew. I believe this is ready for the inclusion and my impression is that Andrew is ok with that. Most changes are very simple and do not affect existing behavior. As I mentioned before I've been using Workqueue and StopMachine changes in production for a couple of years now and have high confidence in them. Yet they are marked as experimental for now, just to be safe. My original explanation is included below. btw I'll be out skiing/snow boarding for the next 4 days and will have sporadic email access. Will do my best to address question/concerns (if any) during that time. Thanx Max -- This patch series extends CPU isolation support. Yes, most people want to virtuallize CPUs these days and I want to isolate them :) . The primary idea here is to be able to use some CPU cores as the dedicated engines for running user-space code with minimal kernel overhead/intervention, think of it as an SPE in the Cell processor. I'd like to be able to run a CPU intensive (%100) RT task on one of the processors without adversely affecting or being affected by the other system activities. System activities here include _kernel_ activities as well. I'm personally using this for hard realtime purposes. With CPU isolation it's very easy to achieve single digit usec worst case and around 200 nsec average response times on off-the-shelf multi- processor/core systems (vanilla kernel plus these patches) even under extreme system load. I'm working with legal folks on releasing hard RT user-space framework for that. I believe with the current multi-core CPU trend we will see more and more applications that explore this capability: RT gaming engines, simulators, hard RT apps, etc. Hence the proposal is to extend current CPU isolation feature. The new definition of the CPU isolation would be: --- 1. Isolated CPU(s) must not be subject to scheduler load balancing Users must explicitly bind threads in order to run on those CPU(s). 2. By default
Re: [git pull] CPU isolation extensions (updated)
Max wrote: Linus, please pull CPU isolation extensions from Did I miss something in this discussion? I thought Ingo was quite clear, and Linus pretty clear too, that this patch should bake in *-mm or some such place for a bit first. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson [EMAIL PROTECTED] 1.940.382.4214 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions (updated)
Paul Jackson wrote: Max wrote: Linus, please pull CPU isolation extensions from Did I miss something in this discussion? I thought Ingo was quite clear, and Linus pretty clear too, that this patch should bake in *-mm or some such place for a bit first. Andrew said: The feature as a whole seems useful, and I don't actually oppose the merge based on what I see here. As long as you're really sure that cpusets are inappropriate (and bear in mind that Paul has a track record of being wrong on this :)). But I see a few glitches As far as I can understand Andrew is ok with the merge. And I addressed all his comments. Linus said: Have these been in -mm and widely discussed etc? I'd like to start more carefully, and (a) have that controversial last patch not merged initially and (b) make sure everybody is on the same page wrt this all.. As far as I can understand Linus _asked_ whether it was in -mm or not and whether everybody's on the same page. He did not say this must be in -mm first. I explained that it has not been in -mm, and who it was discussed with, and did a bunch more testing/investigation on the controversial patch and explained why I think it's not that controversial any more. Ingo said a few different things (a bit too large to quote). - That it was not discussed. I explained that it was in fact discussed and provided a bunch of pointers to the mail threads. - That he thinks that cpuset is the way to do it. Again I explained why it's not. And at the end he said: Also, i'd not mind some test-coverage in sched.git as well. I far as I know do not mind does not mean must go to ;-). Also I replied that I did not mind either but I do not think that it has much (if anything) to do with the scheduler. Anyway. I think I mentioned that I did not mind -mm either. I think it's ready for the mainline. But if people still strongly feel that it has to be in -mm that's fine. Lets just do s/Linus/Andrew/ on the first line and move on. But if Linus pulls it now even better ;-) Andrew, Linus, I'll let you guys decide which tree it needs to go. Max -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
Hi Ingo, Thanks for your reply. > * Linus Torvalds <[EMAIL PROTECTED]> wrote: > >> On Wed, 6 Feb 2008, Max Krasnyansky wrote: >>> Linus, please pull CPU isolation extensions from >>> >>> git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git >>> for-linus >> Have these been in -mm and widely discussed etc? I'd like to start >> more carefully, and (a) have that controversial last patch not merged >> initially and (b) make sure everybody is on the same page wrt this >> all.. > > no, they have not been under nearly enough testing and review - these > patches surfaced on lkml for the first time one week ago (!). Almost two weeks actually. Ok 1.8 :) > I find the pull request totally premature, this stuff has not been discussed > and > agreed on _at all_. Ingo, I may have the wrong impression but my impression is that you ignored all the other emails and just read Linus' reply. I do not believe this accusation is valid. I apologize if my impression is incorrect. Since the patches _do not_ change/affect existing scheduler/cpuset functionality I did not know who to CC in the first email that I sent. Luckily Peter picked it up and CC'ed a bunch of folks, including Paul, Steven and You. All of them replied and had questions/concerns. As I mentioned before I believe I addressed all of them. > None of the people who maintain and have interest in > this code and participated in the (short) one-week discussion were > Cc:-ed to the pull request. Ok. I did not realize I'm supposed to do that. Since I got no replies to the second round of patches (take 2), which again was CC'ed to the same people that Peter CC'ed. I assumed that people are ok with it. That's what discussion on the first take ended with. > I think these patches also need a buy-in from Peter Zijlstra and Paul > Jackson (or really good reasoning while any objections from them should > be overriden) - all of whom deal with the code affected by these changes > on a daily basis and have an interest in CPU isolation features. See above. Following issues were raised: 1. Peter and Steven initially thought that workqueue isolation is not needed. 2. Paul thought that it should be implemented on top of cpusets. 3. Peter thought that stopmachine change is not safe. There were a couple of other minor misunderstandings (for example Peter thought that I'm completely disallowing IRQs on isolated CPUs, which is obviously not the case). I clarified all of them. #1 I explained in the original thread and then followed up with concrete code example of why it is needed. http://marc.info/?l=linux-kernel=120217173001671=2 Got no replies so far. So I'm assuming folks are happy. #2 I started a separate thread on that http://marc.info/?l=linux-kernel=120180692331461=2 The conclusion was, well let me just quote exactly what Paul had said: > Paul Jackson wrote: >> Max wrote: >>> Looks like I failed to explain what I'm trying to achieve. So let me try >>> again. >> >> Well done. I read through that, expecting to disagree or at least >> to not understand at some point, and got all the way through nodding >> my head in agreement. Good. >> >> Whether the earlier confusions were lack of clarity in the presentation, >> or lack of competence in my brain ... well guess I don't want to ask that >> question ;). And #3 Peter did not agree with me but said that it's up to Linus or Andrew to decide whether it's appropriate in mainline or not. I _clearly_ indicated that this part is somewhat controversial and maybe dangerous, I'm _not_ trying to sneak something in. Andrew picked it up and I'm going to do some more investigation on whether it's really not safe or is actually fine (about to send an email to Rusty). > Generally i think that cpusets is actually the feature and API that > should be used (and extended) for CPU isolation - and we already > extended it recently in the direction of CPU isolation. Most enterprise > distros have cpusets enabled so it's in use. Also, cpusets has the > appeal of being commonly used in the "big honking boxes" arena, so > reusing the same concept for RT and virtualization stuff would be the > natural approach. It already ties in to the scheduler domains code > dynamically and is flexible and scalable. I resisted ad-hoc CPU > isolation patches in -rt for that reason. That's exactly what Paul proposed initially. I completely disagree with that but I did look at it in _detail_. Please take a look here for detailed explanation http://marc.info/?l=linux-kernel=120180692331461=2 This email getting to long and I did not want to inline everything. > Also, i'd not mind some test-coverage in sched.git as well. I believe it has _nothing_ to do with the "scheduler" but I do not mind it being in that tree. Please read this email on why it has nothing to do with the scheduler http://marc.info/?l=linux-kernel=120210515323578=2 That's the email that convinced
Re: [git pull] CPU isolation extensions
* Linus Torvalds <[EMAIL PROTECTED]> wrote: > On Wed, 6 Feb 2008, Max Krasnyansky wrote: > > > > Linus, please pull CPU isolation extensions from > > > > git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git > > for-linus > > Have these been in -mm and widely discussed etc? I'd like to start > more carefully, and (a) have that controversial last patch not merged > initially and (b) make sure everybody is on the same page wrt this > all.. no, they have not been under nearly enough testing and review - these patches surfaced on lkml for the first time one week ago (!). I find the pull request totally premature, this stuff has not been discussed and agreed on _at all_. None of the people who maintain and have interest in this code and participated in the (short) one-week discussion were Cc:-ed to the pull request. I think these patches also need a buy-in from Peter Zijlstra and Paul Jackson (or really good reasoning while any objections from them should be overriden) - all of whom deal with the code affected by these changes on a daily basis and have an interest in CPU isolation features. Generally i think that cpusets is actually the feature and API that should be used (and extended) for CPU isolation - and we already extended it recently in the direction of CPU isolation. Most enterprise distros have cpusets enabled so it's in use. Also, cpusets has the appeal of being commonly used in the "big honking boxes" arena, so reusing the same concept for RT and virtualization stuff would be the natural approach. It already ties in to the scheduler domains code dynamically and is flexible and scalable. I resisted ad-hoc CPU isolation patches in -rt for that reason. Also, i'd not mind some test-coverage in sched.git as well. Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
On Thu, 07 Feb 2008 09:22:34 -0800 Max Krasnyansky <[EMAIL PROTECTED]> wrote: > > - There are two separate and identical implementations of > > cpu_unusable(cpu). Please do it once, in a header, preferably with C > > function, not macros. > > Those are local versions that depend whether a feature is enabled or not. > If CONFIG_CPUISOL_WORKQUEUE is disabled we want to cpu_unusable() > in the workqueue.c to be a noop, and if it's enabled that macro resolve to > cpu_isolated(). > Same thing for the stopmachine.c. If CONFIG_CPUISOL_STOPMACHIN is disabled > cpu_unusable() is a noop. > In other words cpu_isolated() is the one common macro that subsystem may > want to stub out. > Do you see another way of doing this ? ah, I missed that. Yup, the implementation you have there looks OK. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
Paul Jackson wrote: > Max - Andrew wondered if the rt tree had seen the > code or commented it on it. What became of that? I just replied to Andrew. It's not an RT feature per se. And yes Peter CC'ed RT folks. You probably did not get a chance to read all replies. They had some questions/concerns and stuff. I believe I answered/clarified all of them. > My two cents isn't worth a plug nickel here, but > I'm inclined to nod in agreement when Linus wants > to see these patches get some more exposure before > going into Linus's tree. ... what's the hurry? No hurry I guess. I did mentioned in the introductory email that I've been maintaining this stuff for awhile now. SLAB patches used to be messy, with new SLUB the mess goes away. CFS handles CPU hotplug much better than O(1), cpu hotplug is needed to be able to change isolated bit from sysfs. That's why I think it's a good time to merge. I don't mind of course if we put this stuff in -mm first. Although first part of the patchset (ie exporting isolated map, sysfs interface, etc) seem very simple and totally not controversial. Stop machine patch is really the only thing that may look suspicious. Max -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
Andrew Morton wrote: > On Thu, 7 Feb 2008 01:59:54 -0600 Paul Jackson <[EMAIL PROTECTED]> wrote: > >> but hard real time is not my expertise > > Speaking of which.. there is the -rt tree. Have those people had a look > at the feature, perhaps played with the code? Peter Z. and Steven R. sent me some comments, I believe I explained and addressed them. Ingo's been quite. Probably too busy. btw It's not an RT feature per se. It certainly helps RT but removing all the latency sources from isolated CPUs. But in general it's just "reducing kernel overhead on some CPUs" kind of feature. Max -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
Max - Andrew wondered if the rt tree had seen the code or commented it on it. What became of that? My two cents isn't worth a plug nickel here, but I'm inclined to nod in agreement when Linus wants to see these patches get some more exposure before going into Linus's tree. ... what's the hurry? -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.940.382.4214 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
Paul Jackson wrote: > Andrew wrote: >> (and bear in mind that Paul has a track record of being wrong >> on this :)) > > heh - I saw that . > > Max - Andrew's about right, as usual. You answered my initial > questions on this patch set adequately, but hard real time is > not my expertise, so in the final analysis, other than my saying > I don't have any more objections, my input doesn't mean much > either way. I honestly think this one is no brainer and I do not think this one will hurt Paul's track record :). Paul initially disagreed with me and that's when he was wrong ;-)) Andrew, I looked at this in detail and here is an explanation that I sent to Paul a few days ago (a bit shortened/updated version). I thought some more about your proposal to use sched_load_balance flag in cpusets instead of extending cpu_isolated_map. I looked at the cpusets, cgroups and here are my thoughts on this. Here is the list of issues with sched_load_balance flag from CPU isolation perspective: -- (1) Boot time isolation is not possible. There is currently no way to setup a cpuset at boot time. For example we won't be able to isolate cpus from irqs and workqueues at boot. Not a major issue but still an inconvenience. -- (2) There is currently no easy way to figure out what cpuset a cpu belongs to in order to query it's sched_load_balance flag. In order to do that we need a method that iterates all active cpusets and checks their cpus_allowed masks. This implies holding cgroup and cpuset mutexes. It's not clear whether it's ok to do that from the the contexts CPU isolation happens in (apic, sched, workqueue). It seems that cgroup/cpuset api is designed from top down access. ie adding a cpu to a set and then recomputing domains. Which makes perfect sense for the common cpuset usecase but is not what cpu isolation needs. In other words I think it's much simpler and cleaner to use the cpu_isolated_map for isolation purposes. No locks, no races, etc. -- (3) cpusets are a bit too dynamic :) . What I mean by this is that sched_load_balance flag can be changed at any time without bringing a CPU offline. What that means is that we'll need some notifier mechanisms for killing and restarting workqueue threads when that flag changes. Also we'd need some logic that makes sure that a user does not disable load balancing on all cpus because that effectively will kill workqueues on all the cpus. This particular case is already handled very nicely in my patches. Isolated bit can be set only when cpu is offline and it cannot be set on the first online cpu. Workqueus and other subsystems already handle cpu hotplug events nicely and can easily ignore isolated cpus when they come online. -- #1 is probably unfixable. #2 and #3 can be fixed but at the expense of extra complexity across the board. I seriously doubt that I'll be able to push that through the reviews ;-). Also personally I still think cpusets and cpu isolation attack two different problems. cpusets is about partitioning cpus and memory nodes, and managing tasks. Most of the cgroups/cpuset APIs are designed to deal with tasks. CPU isolation is much simpler and is at the lower layer. It deals with IRQs, kernel per cpu threads, etc. The only intersection I see is that both features affect scheduling domains. CPU isolation is again simple here it uses existing logic in sched.c it does not change anything in this area. - Andrew, hopefully that clarifies it. Let me know if you're not convinced. Max -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
Hi Linus, Linus Torvalds wrote: > > On Wed, 6 Feb 2008, Max Krasnyansky wrote: >> Linus, please pull CPU isolation extensions from >> >> git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus > > Have these been in -mm and widely discussed etc? I'd like to start more > carefully, and (a) have that controversial last patch not merged initially > and (b) make sure everybody is on the same page wrt this all.. They've been discussed with RT/scheduler/cpuset folks. Andrew is definitely in the loop. He just replied and asked for some fixes and clarifications. He seems to be ok with merging this in general. The last patch may not be as bad as I originally thought. We'll discuss it some more with Andrew. I'll also check with Rusty who wrote the stopmachine in the first place. It actually seems like an overkill at this point. My impression is that it was supposed to be a safety net if some refcounting/locking is not fully safe and may not be needed or as critical anymore. I'm maybe wrong of course. So I'll find that out :) Thanx Max -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
Andrew Morton wrote: > On Wed, 06 Feb 2008 21:32:55 -0800 Max Krasnyansky <[EMAIL PROTECTED]> wrote: > >> Linus, please pull CPU isolation extensions from >> >> git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus > > The feature as a whole seems useful, and I don't actually oppose the merge > based on what I see here. Awesome :) I think it's get more and more useful as people will start trying to figure out what the heck there is supposed to do with the spare CPU cores. I mean pretty soon most machines will have 4 cores and some will have 8. One way to use those cores is the "dedicated engine" model. > As long as you're really sure that cpusets are > inappropriate (and bear in mind that Paul has a track record of being wrong > on this :)). I'll cover this in a separate email with more details. > But I see a few glitches Good catches. Thanks for reviewing. > - There are two separate and identical implementations of > cpu_unusable(cpu). Please do it once, in a header, preferably with C > function, not macros. Those are local versions that depend whether a feature is enabled or not. If CONFIG_CPUISOL_WORKQUEUE is disabled we want to cpu_unusable() in the workqueue.c to be a noop, and if it's enabled that macro resolve to cpu_isolated(). Same thing for the stopmachine.c. If CONFIG_CPUISOL_STOPMACHIN is disabled cpu_unusable() is a noop. In other words cpu_isolated() is the one common macro that subsystem may want to stub out. Do you see another way of doing this ? > - The Kconfig help is a bit scraggly: > > +config CPUISOL_STOPMACHINE > + bool "Do not halt isolated CPUs with Stop Machine (HIGHLY EXPERIMENTAL)" > + depends on CPUISOL && STOP_MACHINE && EXPERIMENTAL > + help > + If this option is enabled kernel will not halt isolated CPUs when > Stop Machine > > "the kernel" > > text is too wide Got it. Will fix asap. > + is triggered. > + Stop Machine is currently only used by the module insertion and > removal logic. > + Please note that at this point this feature is highly experimental > and maybe > + dangerous. It is not known to really brake anything but can > potentially > + introduce an instability. > > s/maybe/may be/ > s/brake/break/ Man, the typos are killing me :). Will fix. > Neither this text, nor the changelog nor the code comments tell us what the > potential instability with stopmachine *is*? Or maybe I missed it. That's the thing, we don't really know :). In real life does not seem to be a problem at all. As I mentioned in prev emails. We've been running all kinds of machines with this enabled, and inserting all kinds of modules left and right. Never seen any crashes or anything. But the fact that stopmachine is supposed to halt all cpus during module insertion/removal seems to imply that something bad may happen if some cpus are not halted. It may very well turnout that it's no longer needed because our locking and refcounting handles this just fine. I mean ideally we should not have to halt the entire box, it causes terrible latencies. > - Adding new sysfs files without updating Documentation/ABI/ makes Greg cry. Oh, did not know that. Will fix. > > - Why is cpu_isolated_map exported to modules? Just for api consistency, it > appears? Yes. For consistency. We'd want cpu_isolated() to work everywhere. > pre-existing problems: > > - isolated_cpu_setup() has an on-stack array of NR_CPUS integers. This > will consume 4k of stack on ia64 (at least). We'll just squeak through > for a ittle while, but this needs to be fixed. Just move it into > __initdata. Will do. > - isolated_cpu_setup() expects that the user can provide an up-to-1024 > character kernel boot parameter. Is this reasonable given cpu command > line limits, and given that NR_CPUS will surely grow beyond 1024 in the > future? I'm thinking that is reasonable for now. I'll fix and resend the patches asap. Thanx Max -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
On Wed, 6 Feb 2008, Max Krasnyansky wrote: > > Linus, please pull CPU isolation extensions from > > git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus Have these been in -mm and widely discussed etc? I'd like to start more carefully, and (a) have that controversial last patch not merged initially and (b) make sure everybody is on the same page wrt this all.. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
On Thu, 7 Feb 2008 01:59:54 -0600 Paul Jackson <[EMAIL PROTECTED]> wrote: > but hard real time is > not my expertise Speaking of which.. there is the -rt tree. Have those people had a look at the feature, perhaps played with the code? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
Andrew wrote: > (and bear in mind that Paul has a track record of being wrong > on this :)) heh - I saw that . Max - Andrew's about right, as usual. You answered my initial questions on this patch set adequately, but hard real time is not my expertise, so in the final analysis, other than my saying I don't have any more objections, my input doesn't mean much either way. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.940.382.4214 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
On Thu, 7 Feb 2008 01:59:54 -0600 Paul Jackson [EMAIL PROTECTED] wrote: but hard real time is not my expertise Speaking of which.. there is the -rt tree. Have those people had a look at the feature, perhaps played with the code? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
Andrew wrote: (and bear in mind that Paul has a track record of being wrong on this :)) heh - I saw that grin. Max - Andrew's about right, as usual. You answered my initial questions on this patch set adequately, but hard real time is not my expertise, so in the final analysis, other than my saying I don't have any more objections, my input doesn't mean much either way. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson [EMAIL PROTECTED] 1.940.382.4214 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
Andrew Morton wrote: On Thu, 7 Feb 2008 01:59:54 -0600 Paul Jackson [EMAIL PROTECTED] wrote: but hard real time is not my expertise Speaking of which.. there is the -rt tree. Have those people had a look at the feature, perhaps played with the code? Peter Z. and Steven R. sent me some comments, I believe I explained and addressed them. Ingo's been quite. Probably too busy. btw It's not an RT feature per se. It certainly helps RT but removing all the latency sources from isolated CPUs. But in general it's just reducing kernel overhead on some CPUs kind of feature. Max -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
* Linus Torvalds [EMAIL PROTECTED] wrote: On Wed, 6 Feb 2008, Max Krasnyansky wrote: Linus, please pull CPU isolation extensions from git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus Have these been in -mm and widely discussed etc? I'd like to start more carefully, and (a) have that controversial last patch not merged initially and (b) make sure everybody is on the same page wrt this all.. no, they have not been under nearly enough testing and review - these patches surfaced on lkml for the first time one week ago (!). I find the pull request totally premature, this stuff has not been discussed and agreed on _at all_. None of the people who maintain and have interest in this code and participated in the (short) one-week discussion were Cc:-ed to the pull request. I think these patches also need a buy-in from Peter Zijlstra and Paul Jackson (or really good reasoning while any objections from them should be overriden) - all of whom deal with the code affected by these changes on a daily basis and have an interest in CPU isolation features. Generally i think that cpusets is actually the feature and API that should be used (and extended) for CPU isolation - and we already extended it recently in the direction of CPU isolation. Most enterprise distros have cpusets enabled so it's in use. Also, cpusets has the appeal of being commonly used in the big honking boxes arena, so reusing the same concept for RT and virtualization stuff would be the natural approach. It already ties in to the scheduler domains code dynamically and is flexible and scalable. I resisted ad-hoc CPU isolation patches in -rt for that reason. Also, i'd not mind some test-coverage in sched.git as well. Ingo -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
On Thu, 07 Feb 2008 09:22:34 -0800 Max Krasnyansky [EMAIL PROTECTED] wrote: - There are two separate and identical implementations of cpu_unusable(cpu). Please do it once, in a header, preferably with C function, not macros. Those are local versions that depend whether a feature is enabled or not. If CONFIG_CPUISOL_WORKQUEUE is disabled we want to cpu_unusable() in the workqueue.c to be a noop, and if it's enabled that macro resolve to cpu_isolated(). Same thing for the stopmachine.c. If CONFIG_CPUISOL_STOPMACHIN is disabled cpu_unusable() is a noop. In other words cpu_isolated() is the one common macro that subsystem may want to stub out. Do you see another way of doing this ? ah, I missed that. Yup, the implementation you have there looks OK. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
Hi Linus, Linus Torvalds wrote: On Wed, 6 Feb 2008, Max Krasnyansky wrote: Linus, please pull CPU isolation extensions from git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus Have these been in -mm and widely discussed etc? I'd like to start more carefully, and (a) have that controversial last patch not merged initially and (b) make sure everybody is on the same page wrt this all.. They've been discussed with RT/scheduler/cpuset folks. Andrew is definitely in the loop. He just replied and asked for some fixes and clarifications. He seems to be ok with merging this in general. The last patch may not be as bad as I originally thought. We'll discuss it some more with Andrew. I'll also check with Rusty who wrote the stopmachine in the first place. It actually seems like an overkill at this point. My impression is that it was supposed to be a safety net if some refcounting/locking is not fully safe and may not be needed or as critical anymore. I'm maybe wrong of course. So I'll find that out :) Thanx Max -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
Max - Andrew wondered if the rt tree had seen the code or commented it on it. What became of that? My two cents isn't worth a plug nickel here, but I'm inclined to nod in agreement when Linus wants to see these patches get some more exposure before going into Linus's tree. ... what's the hurry? -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson [EMAIL PROTECTED] 1.940.382.4214 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
Paul Jackson wrote: Max - Andrew wondered if the rt tree had seen the code or commented it on it. What became of that? I just replied to Andrew. It's not an RT feature per se. And yes Peter CC'ed RT folks. You probably did not get a chance to read all replies. They had some questions/concerns and stuff. I believe I answered/clarified all of them. My two cents isn't worth a plug nickel here, but I'm inclined to nod in agreement when Linus wants to see these patches get some more exposure before going into Linus's tree. ... what's the hurry? No hurry I guess. I did mentioned in the introductory email that I've been maintaining this stuff for awhile now. SLAB patches used to be messy, with new SLUB the mess goes away. CFS handles CPU hotplug much better than O(1), cpu hotplug is needed to be able to change isolated bit from sysfs. That's why I think it's a good time to merge. I don't mind of course if we put this stuff in -mm first. Although first part of the patchset (ie exporting isolated map, sysfs interface, etc) seem very simple and totally not controversial. Stop machine patch is really the only thing that may look suspicious. Max -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
Paul Jackson wrote: Andrew wrote: (and bear in mind that Paul has a track record of being wrong on this :)) heh - I saw that grin. Max - Andrew's about right, as usual. You answered my initial questions on this patch set adequately, but hard real time is not my expertise, so in the final analysis, other than my saying I don't have any more objections, my input doesn't mean much either way. I honestly think this one is no brainer and I do not think this one will hurt Paul's track record :). Paul initially disagreed with me and that's when he was wrong ;-)) Andrew, I looked at this in detail and here is an explanation that I sent to Paul a few days ago (a bit shortened/updated version). I thought some more about your proposal to use sched_load_balance flag in cpusets instead of extending cpu_isolated_map. I looked at the cpusets, cgroups and here are my thoughts on this. Here is the list of issues with sched_load_balance flag from CPU isolation perspective: -- (1) Boot time isolation is not possible. There is currently no way to setup a cpuset at boot time. For example we won't be able to isolate cpus from irqs and workqueues at boot. Not a major issue but still an inconvenience. -- (2) There is currently no easy way to figure out what cpuset a cpu belongs to in order to query it's sched_load_balance flag. In order to do that we need a method that iterates all active cpusets and checks their cpus_allowed masks. This implies holding cgroup and cpuset mutexes. It's not clear whether it's ok to do that from the the contexts CPU isolation happens in (apic, sched, workqueue). It seems that cgroup/cpuset api is designed from top down access. ie adding a cpu to a set and then recomputing domains. Which makes perfect sense for the common cpuset usecase but is not what cpu isolation needs. In other words I think it's much simpler and cleaner to use the cpu_isolated_map for isolation purposes. No locks, no races, etc. -- (3) cpusets are a bit too dynamic :) . What I mean by this is that sched_load_balance flag can be changed at any time without bringing a CPU offline. What that means is that we'll need some notifier mechanisms for killing and restarting workqueue threads when that flag changes. Also we'd need some logic that makes sure that a user does not disable load balancing on all cpus because that effectively will kill workqueues on all the cpus. This particular case is already handled very nicely in my patches. Isolated bit can be set only when cpu is offline and it cannot be set on the first online cpu. Workqueus and other subsystems already handle cpu hotplug events nicely and can easily ignore isolated cpus when they come online. -- #1 is probably unfixable. #2 and #3 can be fixed but at the expense of extra complexity across the board. I seriously doubt that I'll be able to push that through the reviews ;-). Also personally I still think cpusets and cpu isolation attack two different problems. cpusets is about partitioning cpus and memory nodes, and managing tasks. Most of the cgroups/cpuset APIs are designed to deal with tasks. CPU isolation is much simpler and is at the lower layer. It deals with IRQs, kernel per cpu threads, etc. The only intersection I see is that both features affect scheduling domains. CPU isolation is again simple here it uses existing logic in sched.c it does not change anything in this area. - Andrew, hopefully that clarifies it. Let me know if you're not convinced. Max -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] CPU isolation extensions
Hi Ingo, Thanks for your reply. * Linus Torvalds [EMAIL PROTECTED] wrote: On Wed, 6 Feb 2008, Max Krasnyansky wrote: Linus, please pull CPU isolation extensions from git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus Have these been in -mm and widely discussed etc? I'd like to start more carefully, and (a) have that controversial last patch not merged initially and (b) make sure everybody is on the same page wrt this all.. no, they have not been under nearly enough testing and review - these patches surfaced on lkml for the first time one week ago (!). Almost two weeks actually. Ok 1.8 :) I find the pull request totally premature, this stuff has not been discussed and agreed on _at all_. Ingo, I may have the wrong impression but my impression is that you ignored all the other emails and just read Linus' reply. I do not believe this accusation is valid. I apologize if my impression is incorrect. Since the patches _do not_ change/affect existing scheduler/cpuset functionality I did not know who to CC in the first email that I sent. Luckily Peter picked it up and CC'ed a bunch of folks, including Paul, Steven and You. All of them replied and had questions/concerns. As I mentioned before I believe I addressed all of them. None of the people who maintain and have interest in this code and participated in the (short) one-week discussion were Cc:-ed to the pull request. Ok. I did not realize I'm supposed to do that. Since I got no replies to the second round of patches (take 2), which again was CC'ed to the same people that Peter CC'ed. I assumed that people are ok with it. That's what discussion on the first take ended with. I think these patches also need a buy-in from Peter Zijlstra and Paul Jackson (or really good reasoning while any objections from them should be overriden) - all of whom deal with the code affected by these changes on a daily basis and have an interest in CPU isolation features. See above. Following issues were raised: 1. Peter and Steven initially thought that workqueue isolation is not needed. 2. Paul thought that it should be implemented on top of cpusets. 3. Peter thought that stopmachine change is not safe. There were a couple of other minor misunderstandings (for example Peter thought that I'm completely disallowing IRQs on isolated CPUs, which is obviously not the case). I clarified all of them. #1 I explained in the original thread and then followed up with concrete code example of why it is needed. http://marc.info/?l=linux-kernelm=120217173001671w=2 Got no replies so far. So I'm assuming folks are happy. #2 I started a separate thread on that http://marc.info/?l=linux-kernelm=120180692331461w=2 The conclusion was, well let me just quote exactly what Paul had said: Paul Jackson wrote: Max wrote: Looks like I failed to explain what I'm trying to achieve. So let me try again. Well done. I read through that, expecting to disagree or at least to not understand at some point, and got all the way through nodding my head in agreement. Good. Whether the earlier confusions were lack of clarity in the presentation, or lack of competence in my brain ... well guess I don't want to ask that question ;). And #3 Peter did not agree with me but said that it's up to Linus or Andrew to decide whether it's appropriate in mainline or not. I _clearly_ indicated that this part is somewhat controversial and maybe dangerous, I'm _not_ trying to sneak something in. Andrew picked it up and I'm going to do some more investigation on whether it's really not safe or is actually fine (about to send an email to Rusty). Generally i think that cpusets is actually the feature and API that should be used (and extended) for CPU isolation - and we already extended it recently in the direction of CPU isolation. Most enterprise distros have cpusets enabled so it's in use. Also, cpusets has the appeal of being commonly used in the big honking boxes arena, so reusing the same concept for RT and virtualization stuff would be the natural approach. It already ties in to the scheduler domains code dynamically and is flexible and scalable. I resisted ad-hoc CPU isolation patches in -rt for that reason. That's exactly what Paul proposed initially. I completely disagree with that but I did look at it in _detail_. Please take a look here for detailed explanation http://marc.info/?l=linux-kernelm=120180692331461w=2 This email getting to long and I did not want to inline everything. Also, i'd not mind some test-coverage in sched.git as well. I believe it has _nothing_ to do with the scheduler but I do not mind it being in that tree. Please read this email on why it has nothing to do with the scheduler http://marc.info/?l=linux-kernelm=120210515323578w=2 That's the email that convinced Paul. To sum it up. It has been discussed with the right people. I do
Re: [git pull] CPU isolation extensions
On Wed, 06 Feb 2008 21:32:55 -0800 Max Krasnyansky <[EMAIL PROTECTED]> wrote: > Linus, please pull CPU isolation extensions from > > git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus The feature as a whole seems useful, and I don't actually oppose the merge based on what I see here. As long as you're really sure that cpusets are inappropriate (and bear in mind that Paul has a track record of being wrong on this :)). But I see a few glitches - There are two separate and identical implementations of cpu_unusable(cpu). Please do it once, in a header, preferably with C function, not macros. - The Kconfig help is a bit scraggly: +config CPUISOL_STOPMACHINE + bool "Do not halt isolated CPUs with Stop Machine (HIGHLY EXPERIMENTAL)" + depends on CPUISOL && STOP_MACHINE && EXPERIMENTAL + help + If this option is enabled kernel will not halt isolated CPUs when Stop Machine "the kernel" text is too wide + is triggered. + Stop Machine is currently only used by the module insertion and removal logic. + Please note that at this point this feature is highly experimental and maybe + dangerous. It is not known to really brake anything but can potentially + introduce an instability. s/maybe/may be/ s/brake/break/ Neither this text, nor the changelog nor the code comments tell us what the potential instability with stopmachine *is*? Or maybe I missed it. - Adding new sysfs files without updating Documentation/ABI/ makes Greg cry. - Why is cpu_isolated_map exported to modules? Just for api consistency, it appears? pre-existing problems: - isolated_cpu_setup() has an on-stack array of NR_CPUS integers. This will consume 4k of stack on ia64 (at least). We'll just squeak through for a ittle while, but this needs to be fixed. Just move it into __initdata. - isolated_cpu_setup() expects that the user can provide an up-to-1024 character kernel boot parameter. Is this reasonable given cpu command line limits, and given that NR_CPUS will surely grow beyond 1024 in the future? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[git pull] CPU isolation extensions
Linus, please pull CPU isolation extensions from git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus Diffstat: b/arch/x86/Kconfig |1 b/arch/x86/kernel/genapic_flat_64.c |5 ++- b/drivers/base/cpu.c| 48 +++ b/include/linux/cpumask.h |3 ++ b/kernel/Kconfig.cpuisol| 15 +++ b/kernel/Makefile |4 +- b/kernel/cpu.c | 49 b/kernel/sched.c| 37 --- b/kernel/stop_machine.c |9 +- b/kernel/workqueue.c| 31 -- kernel/Kconfig.cpuisol | 26 ++- 11 files changed, 176 insertions(+), 52 deletions(-) The patchset consist of 4 patches. cpuisol: Make cpu isolation configrable and export isolated map cpuisol: Do not route IRQs to the CPUs isolated at boot cpuisol: Do not schedule workqueues on the isolated CPUs cpuisol: Do not halt isolated CPUs with Stop Machine First two are very simple. They simply make "CPU isolation" a configurable feature, export cpu_isolated_map and provide some helper functions to access it (just like cpu_online() and friends). Last two patches add support for isolating CPUs from running workqueus and stop machine. Last patch is kind of controversial let me know if you think it's too ugly and I'll resend without it. For more details see below. This patch series extends CPU isolation support. Yes, most people want to virtuallize CPUs these days and I want to isolate them :) . The primary idea here is to be able to use some CPU cores as the dedicated engines for running user-space code with minimal kernel overhead/intervention, think of it as an SPE in the Cell processor. I'd like to be able to run a CPU intensive (%100) RT task on one of the processors without adversely affecting or being affected by the other system activities. System activities here include _kernel_ activities as well. I'm personally using this for hard realtime purposes. With CPU isolation it's very easy to achieve single digit usec worst case and around 200 nsec average response times on off-the-shelf multi- processor/core systems (vanilla kernel plus these patches) even under exteme system load. I'm working with legal folks on releasing hard RT user-space framework for that. I believe with the current multi-core CPU trend we will see more and more applications that explore this capability: RT gaming engines, simulators, hard RT apps, etc. Hence the proposal is to extend current CPU isolation feature. The new definition of the CPU isolation would be: --- 1. Isolated CPU(s) must not be subject to scheduler load balancing Users must explicitly bind threads in order to run on those CPU(s). 2. By default interrupts must not be routed to the isolated CPU(s) User must route interrupts (if any) to those CPUs explicitly. 3. In general kernel subsystems must avoid activity on the isolated CPU(s) as much as possible Includes workqueues, per CPU threads, etc. This feature is configurable and is disabled by default. --- I've been maintaining this stuff since around 2.6.18 and it's been running in production environment for a couple of years now. It's been tested on all kinds of machines, from NUMA boxes like HP xw9300/9400 to tiny uTCA boards like Mercury AXA110. The messiest part used to be SLAB garbage collector changes. With the new SLUB all that mess goes away (ie no changes necessary). Also CFS seems to handle CPU hotplug much better than O(1) did (ie domains are recomputed dynamically) so that isolation can be done at any time (via sysfs). So this seems like a good time to merge. We've had scheduler support for CPU isolation ever since O(1) scheduler went it. In other words #1 is already supported. These patches do not change/affect that functionality in any way. #2 is trivial one liner change to the IRQ init code. #3 is addressed by a couple of separate patches. The main problem here is that RT thread can prevent kernel threads from running and machine gets stuck because other CPUs are waiting for those threads to run and report back. Folks involved in the scheduler/cpuset development provided a lot of feedback on the first series of patches. I believe I managed to explain and clarify every aspect. Paul Jackson initially suggested to implement #2 and #3 using cpusets subsystem. Paul and I looked at it more closely and determined that exporting cpu_isolated_map instead is a better option. Last patch to the stop machine is potentially unsafe and is marked as highly experimental. Unfortunately it's currently the only option that allows dynamic module insertion/removal for above scenarios. If people still feel that it's t ugly I can revert that change and keep it in the separate tree
[git pull] CPU isolation extensions
Linus, please pull CPU isolation extensions from git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus Diffstat: b/arch/x86/Kconfig |1 b/arch/x86/kernel/genapic_flat_64.c |5 ++- b/drivers/base/cpu.c| 48 +++ b/include/linux/cpumask.h |3 ++ b/kernel/Kconfig.cpuisol| 15 +++ b/kernel/Makefile |4 +- b/kernel/cpu.c | 49 b/kernel/sched.c| 37 --- b/kernel/stop_machine.c |9 +- b/kernel/workqueue.c| 31 -- kernel/Kconfig.cpuisol | 26 ++- 11 files changed, 176 insertions(+), 52 deletions(-) The patchset consist of 4 patches. cpuisol: Make cpu isolation configrable and export isolated map cpuisol: Do not route IRQs to the CPUs isolated at boot cpuisol: Do not schedule workqueues on the isolated CPUs cpuisol: Do not halt isolated CPUs with Stop Machine First two are very simple. They simply make CPU isolation a configurable feature, export cpu_isolated_map and provide some helper functions to access it (just like cpu_online() and friends). Last two patches add support for isolating CPUs from running workqueus and stop machine. Last patch is kind of controversial let me know if you think it's too ugly and I'll resend without it. For more details see below. This patch series extends CPU isolation support. Yes, most people want to virtuallize CPUs these days and I want to isolate them :) . The primary idea here is to be able to use some CPU cores as the dedicated engines for running user-space code with minimal kernel overhead/intervention, think of it as an SPE in the Cell processor. I'd like to be able to run a CPU intensive (%100) RT task on one of the processors without adversely affecting or being affected by the other system activities. System activities here include _kernel_ activities as well. I'm personally using this for hard realtime purposes. With CPU isolation it's very easy to achieve single digit usec worst case and around 200 nsec average response times on off-the-shelf multi- processor/core systems (vanilla kernel plus these patches) even under exteme system load. I'm working with legal folks on releasing hard RT user-space framework for that. I believe with the current multi-core CPU trend we will see more and more applications that explore this capability: RT gaming engines, simulators, hard RT apps, etc. Hence the proposal is to extend current CPU isolation feature. The new definition of the CPU isolation would be: --- 1. Isolated CPU(s) must not be subject to scheduler load balancing Users must explicitly bind threads in order to run on those CPU(s). 2. By default interrupts must not be routed to the isolated CPU(s) User must route interrupts (if any) to those CPUs explicitly. 3. In general kernel subsystems must avoid activity on the isolated CPU(s) as much as possible Includes workqueues, per CPU threads, etc. This feature is configurable and is disabled by default. --- I've been maintaining this stuff since around 2.6.18 and it's been running in production environment for a couple of years now. It's been tested on all kinds of machines, from NUMA boxes like HP xw9300/9400 to tiny uTCA boards like Mercury AXA110. The messiest part used to be SLAB garbage collector changes. With the new SLUB all that mess goes away (ie no changes necessary). Also CFS seems to handle CPU hotplug much better than O(1) did (ie domains are recomputed dynamically) so that isolation can be done at any time (via sysfs). So this seems like a good time to merge. We've had scheduler support for CPU isolation ever since O(1) scheduler went it. In other words #1 is already supported. These patches do not change/affect that functionality in any way. #2 is trivial one liner change to the IRQ init code. #3 is addressed by a couple of separate patches. The main problem here is that RT thread can prevent kernel threads from running and machine gets stuck because other CPUs are waiting for those threads to run and report back. Folks involved in the scheduler/cpuset development provided a lot of feedback on the first series of patches. I believe I managed to explain and clarify every aspect. Paul Jackson initially suggested to implement #2 and #3 using cpusets subsystem. Paul and I looked at it more closely and determined that exporting cpu_isolated_map instead is a better option. Last patch to the stop machine is potentially unsafe and is marked as highly experimental. Unfortunately it's currently the only option that allows dynamic module insertion/removal for above scenarios. If people still feel that it's t ugly I can revert that change and keep it in the separate tree
Re: [git pull] CPU isolation extensions
On Wed, 06 Feb 2008 21:32:55 -0800 Max Krasnyansky [EMAIL PROTECTED] wrote: Linus, please pull CPU isolation extensions from git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git for-linus The feature as a whole seems useful, and I don't actually oppose the merge based on what I see here. As long as you're really sure that cpusets are inappropriate (and bear in mind that Paul has a track record of being wrong on this :)). But I see a few glitches - There are two separate and identical implementations of cpu_unusable(cpu). Please do it once, in a header, preferably with C function, not macros. - The Kconfig help is a bit scraggly: +config CPUISOL_STOPMACHINE + bool Do not halt isolated CPUs with Stop Machine (HIGHLY EXPERIMENTAL) + depends on CPUISOL STOP_MACHINE EXPERIMENTAL + help + If this option is enabled kernel will not halt isolated CPUs when Stop Machine the kernel text is too wide + is triggered. + Stop Machine is currently only used by the module insertion and removal logic. + Please note that at this point this feature is highly experimental and maybe + dangerous. It is not known to really brake anything but can potentially + introduce an instability. s/maybe/may be/ s/brake/break/ Neither this text, nor the changelog nor the code comments tell us what the potential instability with stopmachine *is*? Or maybe I missed it. - Adding new sysfs files without updating Documentation/ABI/ makes Greg cry. - Why is cpu_isolated_map exported to modules? Just for api consistency, it appears? pre-existing problems: - isolated_cpu_setup() has an on-stack array of NR_CPUS integers. This will consume 4k of stack on ia64 (at least). We'll just squeak through for a ittle while, but this needs to be fixed. Just move it into __initdata. - isolated_cpu_setup() expects that the user can provide an up-to-1024 character kernel boot parameter. Is this reasonable given cpu command line limits, and given that NR_CPUS will surely grow beyond 1024 in the future? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/