Re: [zones-discuss] improved zones/RM integration
Jeff Victor wrote:
> Just curious: which process(es) gets "billed" for shared text pages?

Jeff,

We keep track of shared COW segments so they are not double counted. Off the top of my head, I think we just credit that to the first process we see using the segment; however, I'll let Steve chime in here since he did the implementation of this piece.

Jerry
___
zones-discuss mailing list
zones-discuss@opensolaris.org
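The single-counting idea Jerry describes can be sketched as follows. This is an illustrative Python sketch only, not the actual rcapd/prstat implementation (which is in C inside the kernel/userland tooling); the function and segment names here are made up:

```python
# Sketch: credit each shared segment's RSS to the first process that is
# seen referencing it, so the aggregate RSS counts it exactly once.

def aggregate_rss(processes):
    """processes: list of (pid, [(segment_id, rss_bytes), ...]) tuples."""
    seen_segments = set()
    rss_by_pid = {}
    for pid, segments in processes:
        total = 0
        for seg_id, rss in segments:
            if seg_id not in seen_segments:
                seen_segments.add(seg_id)  # "bill" this segment here only
                total += rss
        rss_by_pid[pid] = total
    return rss_by_pid

# Two processes sharing libc's text segment: the shared segment is
# billed to pid 100 only, so the totals don't double count it.
procs = [(100, [("libc.text", 1_000_000), ("heap.100", 4_000_000)]),
         (200, [("libc.text", 1_000_000), ("heap.200", 2_000_000)])]
print(aggregate_rss(procs))  # {100: 5000000, 200: 2000000}
```

Note that with this scheme the per-process numbers depend on visitation order, but the sum across the zone (7,000,000 here) is stable, which is what rcapd cares about.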
Re: [zones-discuss] improved zones/RM integration
Jerry Jelinek wrote:
> Mike Gerdts wrote:
>> In the proposal you say:
>>> As part of this overall project, we will be enhancing the internal
>>> rcapd rss accounting so that rcapd will have a more accurate
>>> measurement of the overall rss for each zone.
>> Does this spill over to prstat such that there may finally be a fix for:
>> 4754856 prstat -atJTZ should count shared segments only once
>
> Yes, we are addressing this bug as part of this work. prstat will be
> able to report an accurate rss number for processes, users, projects
> and tasks as well as zones. prstat and rcapd will use the same, new
> underlying rss counting code we have developed.

Just curious: which process(es) gets "billed" for shared text pages?

--
Jeff VICTOR          Sun Microsystems, jeff.victor@sun.com
OS Ambassador        Sr. Technical Specialist
Solaris 10 Zones FAQ: http://www.opensolaris.org/os/community/zones/faq
___
zones-discuss mailing list
zones-discuss@opensolaris.org
Re: [zones-discuss] improved zones/RM integration
Mike,

Mike Gerdts wrote:
> On 6/26/06, Gerald A. Jelinek <[EMAIL PROTECTED]> wrote:
>> Attached is a description of a project we have been refining for a
>> while now. The idea is to improve the integration of zones with some
>> of the existing resource management features in Solaris.
>
> In the proposal you say:
>> As part of this overall project, we will be enhancing the internal
>> rcapd rss accounting so that rcapd will have a more accurate
>> measurement of the overall rss for each zone.
> Does this spill over to prstat such that there may finally be a fix for:
> 4754856 prstat -atJTZ should count shared segments only once

Yes, we are addressing this bug as part of this work. prstat will be able to report an accurate rss number for processes, users, projects and tasks as well as zones. prstat and rcapd will use the same, new underlying rss counting code we have developed.

Jerry
___
zones-discuss mailing list
zones-discuss@opensolaris.org
Re: [zones-discuss] improved zones/RM integration
On 6/26/06, Gerald A. Jelinek <[EMAIL PROTECTED]> wrote:
> Attached is a description of a project we have been refining for a
> while now. The idea is to improve the integration of zones with some
> of the existing resource management features in Solaris.

In the proposal you say:
> As part of this overall project, we will be enhancing the internal
> rcapd rss accounting so that rcapd will have a more accurate
> measurement of the overall rss for each zone.
Does this spill over to prstat such that there may finally be a fix for:
4754856 prstat -atJTZ should count shared segments only once

As I am looking forward to using zfs, I am trying to figure out how I can tell how much memory is being used aside from the zfs buffer cache.

Mike
--
Mike Gerdts
http://mgerdts.blogspot.com/
___
zones-discuss mailing list
zones-discuss@opensolaris.org
Re: [zones-discuss] improved zones/RM integration
> Could we somehow work the zone name into this? It would be nice for
> e.g. poolstat(1) observability. Otherwise the user experience is
> going to be all about trying to work out what 'SUNWzone34' maps to,
> which seems poor.

We need to have the name begin with SUNW or we could have collisions with existing pools. I suppose instead of zone{id}, it could be SUNW{zonename}, although you lose the visibility that the pool is associated with a zone. Maybe SUNWzone_{zonename}? Or perhaps SUNWtemp_{zonename}?

dsc
___
zones-discuss mailing list
zones-discuss@opensolaris.org
Re: [zones-discuss] improved zones/RM integration
Dan,

Thanks for your detailed comments. My responses are in-line.

Dan Price wrote:
> Very belatedly, I'm just getting around to reviewing this. Overall I
> think it looks good. Comments in-line.
>
>> 1) "Hard" vs. "Soft" RM configuration within zonecfg
>> ...
>> dedicated-cpu
>>     ncpus (a positive integer or range, default value 1)
>>     importance (a positive integer, default value 1)
>>     max-lwps (an integer >= 100)
>
> why >= 100? I can envision a minimized zone where this is too many.

I picked 100 since I had a hard time getting a zone to boot with much less. Obviously this will vary somewhat depending on the services enabled. Is 100 really a problem as a lower limit? Part of what we are trying to do here is help the user configure a reasonable RM configuration, especially if they don't know a lot about RM. Allowing them to set a limit which prevents the zone from booting seems bad. However, we could also just let them do that if 100 seems too high for some reason. Unfortunately, it is hard to know in advance exactly how many threads will be needed to boot the zone.

>> capped-cpu
>>     cpu-cap (a positive integer, default value 100 which
>>     represents 100% of one cpu)
>
> I'm scared of this default. To put it another way, why did you pick
> 100? Should there be a value which represents infinity? What is the
> meaning of specifying 0, or is that an error?

We don't have to have a default here, I guess. I picked 100 because it seemed to be symmetrical with the dedicated-cpu case, where the lower number is 1. As far as the other values (infinity and 0), that should be covered by the cpu-caps ARC case. I am not sure if Andrei has finished that case yet.

>>     max-lwps (an integer >= 100)
>>     cpu-shares (a positive integer)
>> dedicated-memory
>>     TBD - once msets [12] are completed
>> capped-memory
>>     cap (a positive decimal number with optional k, m, g, or t as a
>>     modifier, no modifier defaults to units of megabytes (m), must
>>     be at least 1m)
>
> I think this set of rules is too complex and too confusing for
> users -- it's weird to have the default units be larger than the
> smallest available units. Let's mandate that the user *always*
> specify units.

OK.

>> 2) Temporary Pools.
>> ...
>> If a dedicated-cpu (or eventually a dedicated-memory) resource is
>> configured for the zone, then when the zone boots zoneadmd will
>> create a temporary pool dedicated for the zone's use. Zoneadmd will
>> dynamically create a pool & pset (or eventually a mset) and assign
>> the number of cpus specified in zonecfg to that pset. The temporary
>> pool & pset will be named 'SUNWzone{zoneid}'.
>
> Could we somehow work the zone name into this? It would be nice for
> e.g. poolstat(1) observability. Otherwise the user experience is
> going to be all about trying to work out what 'SUNWzone34' maps to,
> which seems poor.

We need to have the name begin with SUNW or we could have collisions with existing pools. I suppose instead of zone{id}, it could be SUNW{zonename}, although you lose the visibility that the pool is associated with a zone. Maybe SUNWzone_{zonename}?

>> Zoneadmd will set the 'pset.min' and 'pset.max' pset properties, as
>> well as the 'pool.importance' pool property, based on the values
>> specified for dedicated-cpu's 'ncpus' and 'importance' properties
>> in zonecfg.
>
> Is importance mandatory? Will it have a default value? What values
> can it have? What does it mean? Please be a little more specific.

Yes; the default is 1. I will add more details, referring to the pools documentation on importance.

>> If the cpu (or memory) resources needed to create the temporary pool
>> are unavailable, zoneadmd will issue an error and the zone won't boot.
>>
>> When the zone is halted, the temporary pool & pset will be destroyed.
>
> What about during a reboot? It seems like it'd be good to not tear
> down the temporary pool during reboot, but maybe that's hard. It
> would seem weird to me if my pool was 2-4 CPUs, and I had 2, then
> rebooted and had 4.

We won't destroy the pool on reboot; it is preserved. Although it is not a big deal right now, it will become more of an issue when we have memory sets, so I made sure the pool is preserved across reboot. I'll clarify that.

>> We will add a new boolean property ('temporary') that can exist on
>> pools and any resource set. The 'temporary' property indicates that
>> the pool or resource set should never be committed to a static
>> configuration (e.g. pooladm -s) and that it should never be
>> destroyed when updating the dynamic configuration from a static
>> configuration (e.g. pooladm -c). These temporary pools/resources
>> can only be managed in the dynamic configuration.
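The mandatory-units rule agreed on above (for the capped-memory 'cap' property) could be validated along these lines. This is a Python sketch of the parsing rule only; zonecfg itself is not Python, and the function name here is hypothetical:

```python
import re

# Binary unit multipliers for the k, m, g, t suffixes.
_UNITS = {"k": 1 << 10, "m": 1 << 20, "g": 1 << 30, "t": 1 << 40}

def parse_mem_cap(value):
    """Parse a capped-memory value like '512m' or '1.5g' into bytes.

    Per the discussion above, a unit suffix (k, m, g, or t) is
    required: a bare number is rejected rather than silently
    defaulting to megabytes. The cap must be at least 1m.
    """
    m = re.fullmatch(r"(\d+(?:\.\d+)?)([kmgt])", value.strip().lower())
    if m is None:
        raise ValueError("memory cap must include a unit: k, m, g, or t")
    size = float(m.group(1)) * _UNITS[m.group(2)]
    if size < 1 << 20:
        raise ValueError("memory cap must be at least 1m")
    return int(size)

print(parse_mem_cap("512m"))  # 536870912
```

Requiring the unit up front avoids exactly the confusion Dan points out: the default unit (megabytes) being larger than the smallest accepted unit (kilobytes).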
Re: [zones-discuss] improved zones/RM integration
Very belatedly, I'm just getting around to reviewing this. Overall I think it looks good. Comments in-line.

> 1) "Hard" vs. "Soft" RM configuration within zonecfg
> ...
> dedicated-cpu
>     ncpus (a positive integer or range, default value 1)
>     importance (a positive integer, default value 1)
>     max-lwps (an integer >= 100)

why >= 100? I can envision a minimized zone where this is too many.

> capped-cpu
>     cpu-cap (a positive integer, default value 100 which
>     represents 100% of one cpu)

I'm scared of this default. To put it another way, why did you pick 100? Should there be a value which represents infinity? What is the meaning of specifying 0, or is that an error?

>     max-lwps (an integer >= 100)
>     cpu-shares (a positive integer)
> dedicated-memory
>     TBD - once msets [12] are completed
> capped-memory
>     cap (a positive decimal number with optional k, m, g, or t as a
>     modifier, no modifier defaults to units of megabytes (m), must
>     be at least 1m)

I think this set of rules is too complex and too confusing for users -- it's weird to have the default units be larger than the smallest available units. Let's mandate that the user *always* specify units.

> 2) Temporary Pools.
> ...
> If a dedicated-cpu (or eventually a dedicated-memory) resource is
> configured for the zone, then when the zone boots zoneadmd will create
> a temporary pool dedicated for the zone's use. Zoneadmd will
> dynamically create a pool & pset (or eventually a mset) and assign the
> number of cpus specified in zonecfg to that pset. The temporary pool
> & pset will be named 'SUNWzone{zoneid}'.

Could we somehow work the zone name into this? It would be nice for e.g. poolstat(1) observability. Otherwise the user experience is going to be all about trying to work out what 'SUNWzone34' maps to, which seems poor.

> Zoneadmd will set the 'pset.min' and 'pset.max' pset properties, as
> well as the 'pool.importance' pool property, based on the values
> specified for dedicated-cpu's 'ncpus' and 'importance' properties
> in zonecfg.

Is importance mandatory? Will it have a default value? What values can it have? What does it mean? Please be a little more specific.

> If the cpu (or memory) resources needed to create the temporary pool
> are unavailable, zoneadmd will issue an error and the zone won't boot.
>
> When the zone is halted, the temporary pool & pset will be destroyed.

What about during a reboot? It seems like it'd be good to not tear down the temporary pool during reboot, but maybe that's hard. It would seem weird to me if my pool was 2-4 CPUs, and I had 2, then rebooted and had 4.

> We will add a new boolean property ('temporary') that can exist on
> pools and any resource set. The 'temporary' property indicates that
> the pool or resource set should never be committed to a static
> configuration (e.g. pooladm -s) and that it should never be destroyed
> when updating the dynamic configuration from a static configuration
> (e.g. pooladm -c). These temporary pools/resources can only be managed
> in the dynamic configuration. These changes will be implemented within
> libpool(3LIB).
>
> It is our expectation that most users will never need to manage
> temporary pools through the existing poolcfg(1M) commands. For users
> who need more sophisticated pool configuration and management, the
> existing 'pool' resource within zonecfg should be used and users
> should manually create a permanent pool using the existing mechanisms.
>
> 3) Resource controls in zonecfg will be simplified.
> ...
> Here are the aliases we will define for the rctls:
>
>     alias        rctl
>     -----        ----
>     max-lwps     zone.max-lwps
>     cpu-shares   zone.cpu-shares
>     cpu-cap      zone.cpu-cap (future, once cpu-caps integrate)

You've mentioned that you will substitute in sort of "the right" defaults for the privileged and action fields. It seems like you should spell out what those will be, e.g.:

    alias           rctl
    -----           ----
    cpu-shares=X    zone.cpu-shares (privileged, X, none)
    ...

> If an rctl was already defined that did not match the expected value
> (e.g. it had 'action=none' or multiple values), then the 'max-lwps'
> alias will be disabled. An attempt to set 'max-lwps' within
> 'dedicated-cpu' would print the following error:
> "One or more incompatible rctls already exist for thi
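The alias substitution Dan is asking to have spelled out might look like the following. This is an illustrative Python sketch; the (privilege, action) defaults shown are my assumption of "the right" defaults for each rctl, not values confirmed by the proposal:

```python
# Sketch: expand a zonecfg rctl alias into a full rctl definition.
# Each alias maps to (rctl name, default privilege, default action);
# the privilege/action values here are assumed, not from the proposal.
ALIASES = {
    "max-lwps":   ("zone.max-lwps",   "privileged", "deny"),
    "cpu-shares": ("zone.cpu-shares", "privileged", "none"),
    "cpu-cap":    ("zone.cpu-cap",    "privileged", "deny"),
}

def expand_alias(alias, limit):
    """Turn e.g. ('cpu-shares', 10) into the equivalent full rctl."""
    name, priv, action = ALIASES[alias]
    return {"name": name,
            "value": {"priv": priv, "limit": limit, "action": action}}

print(expand_alias("cpu-shares", 10))
```

The point of the table is that a user who types `set cpu-shares=10` never has to know about the underlying (priv, limit, action) triple, while scripts that read the full rctl form keep working.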
Re: [zones-discuss] improved zones/RM integration
Amol,

Thanks for your comments. I have some responses in-line.

Amol A Chiplunkar wrote:
> These are very exciting features !! Some comments.
>
>> If a dedicated-cpu (or eventually a dedicated-memory) resource is
>> configured for the zone, then when the zone boots zoneadmd will
>> create a temporary pool dedicated for the zone's use.
>
> The temp pool created is going to show up in pooladm output, right ?
> If someone uses such a pool in a zone configuration done in
> traditional style (set pool=SUNWzone{zoneid}), is zonecfg going to
> reject it ? If not, it won't be "dedicated" anymore.

That is a good point. We will make sure that zonecfg disallows that, and I will clarify that in the proposal. One thing I wanted to point out is that the name of the pool/pset will actually change when you reboot the zone, since the zoneid will change at that time. There is no easy way to predict what the name will be, or whether any particular name will exist, since the zoneid is fairly dynamic. We are not going too far out of our way to try to prevent you from using the temp pool for something else, but I will make sure zonecfg doesn't allow that.

> Also, is there something that's going to stop poolbind from moving
> zones to and from temp pools and permanent pools ? What if a zone was
> created as a result of which a temp pool was created, and poolbind
> moves it to a permanent pool ? Is the temp pool deleted ?

That is a good question. I need to think about what we should do there, but I am inclined to say that we should disallow that. Once you start allowing stuff like that, we are back to the problem of where the configuration data is stored and managed, and that is what we are trying to avoid with the whole idea of the temporary pool.

>> Resource controls in zonecfg will be simplified.
>
> Do you think we need prctl enhancements that will allow setting rctls
> using the new aliases directly ? That would be good for consistency.

That is another good idea. I'd like to keep that separate for now, but we'll keep it on our plate.

> Also, you mention that backward compatibility ensures the existing
> tools that parse zonecfg info / export output are unaffected. That's
> true to some extent, but some of them treat "unknown" resources as a
> no-op and display them as-is. Which means some tools will show the
> new resources as unknown, which is harmless but could be confusing
> sometimes.

We are continuing to add new resources; 'limitpriv' and 'bootargs' are two recent ones. We won't stop adding new resources; that would be too constraining, but we will continue to try to make sure we don't break scripts that depend on resources that they know about.

> Maybe having a command line option such as zonecfg info -legacy that
> would suppress the new resources could help.

I think it would be better to plan for new resources. Otherwise, what would legacy be? Just the resources that were in the original S10 release?

>> We will enhance zoneadmd so that if the zone.cpu-shares rctl is set
>> and FSS is not already the default scheduling class, zoneadmd will
>> set the scheduling class to be FSS for processes in the zone.
>
> On the fly ? i.e. if a zone didn't have zone.cpu-shares when it was
> booted and someone did prctl to set it, zoneadmd will change the
> sched class of all processes in the zone to FSS ? That's cool.

That is not what we plan on doing. What we are planning is to set FSS when the zone boots, if it has cpu-shares. These enhancements are really targeted at people who don't know a lot about the existing RM features. If you know enough to use prctl to do this kind of thing, then we will expect you to be able to fully manage your system using all of the existing features.

Thanks again for your comments,
Jerry
___
zones-discuss mailing list
zones-discuss@opensolaris.org
Re: [zones-discuss] improved zones/RM integration
Mike,

I think most of your comments were addressed in my response to Jeff, but I did want to make sure one thing was clear.

Mike Gerdts wrote:
> On 6/27/06, Jeff Victor <[EMAIL PROTECTED]> wrote:
>> 4) Aliases: The notion of aliases also creates redundant output
>> which could be confusing. I like the simplification of aliases as
>> described, but I wish I had a good solution that wouldn't break
>> existing tools that parse "zonecfg info" output - if such tools exist.
>
> Because key features are missing from zones, I have been writing
> scripts that sometimes parse the output of "zonecfg info". Changing
> this format stands a good chance of breaking my scripts.
>
> It sounds like if I set pool and rctl resources, they would still be
> displayed as rctl and not translated to the syntax associated with
> temporary pools. Is this correct?

Yes, we considered lots of alternatives, but we wanted to make sure that any scripts would continue to work. So if your scripts are setting or looking at the rctl entries, they should continue to work, even if you also start to use the new resources.

Thanks,
Jerry
___
zones-discuss mailing list
zones-discuss@opensolaris.org
Re: [zones-discuss] improved zones/RM integration
Jeff,

Thanks for your comments. I have a few responses in-line.

Jeff Victor wrote:
> 1) General comment: I agree that this will provide needed clarity to
> the seemingly unorganized RM features that we have scattered through
> Solaris during the last decade. The automation of certain activities
> (e.g. starting rcapd in the GZ when needed) will also be extremely
> beneficial.
>
> 2) Terminology: Although your use of the phrase "hard partition"
> should be clear to most people, through experience with other
> partitioning technologies, the use of "soft partition" is less clear.
> To many people, a "soft" limit is one that can be exceeded
> occasionally or in certain situations, or is merely an advisory
> limit. It also conflicts with SVM "soft partitions."

We started using "hard" and "soft" to describe the general idea amongst ourselves when we first started talking about this, but we were never very satisfied with those terms either. These two terms will not necessarily be used in the final documentation and are not used in the resource names themselves. Naming always seems to be a difficult area. The key technical issue to focus on is the resource names and properties being proposed, as opposed to the overall words we are using to describe the general ideas. I am guessing we were able to communicate the general idea using "hard" and "soft", so I think we are ok there. We will have to figure out the best way to document this when the time comes. It is hard to find good terms that are not already used by other parts of the system. The word "resource" is a good example and is probably more confusing than "soft partition".

> 3) Lwps context: Why is the lwps alias defined in the context of
> dedicated-cpu? Lwps seem to be unrelated to hard partitions. Further,
> ncpus represents a subset of a finite resource. Lwps are not finite.
> The two should be separate.

Being able to set max-lwps is a useful limit for both processor sets and cpu-caps, which is why it is available in both resources. The global zone still manages all processes, even when you are using a processor set, so a fork bomb can still affect the responsiveness of the system as a whole. However, max-lwps is optional in both the dedicated-cpu and capped-cpu resources. I should have made that clearer; I will update the document to clarify that.

> 4) Aliases: The notion of aliases also creates redundant output which
> could be confusing. I like the simplification of aliases as
> described, but I wish I had a good solution that wouldn't break
> existing tools that parse "zonecfg info" output - if such tools exist.

This is part of the proposal which we struggled with a lot. In the end, we decided that we needed to maintain compatibility so that we did not break scripts that talked to the CLI directly. This was the compromise we came up with that allows us to do that.

> 5) Another RM setting: While we're integrating RM settings, I think
> we should consider adding project.max-address-space using the same
> semantics as {project,zone}.max-lwps. A zone-specific setting,
> zone.max-address-space, could be added along with a zonecfg alias,
> max-address-space. This would allow the GZ admin to cap the virtual
> memory space available to that zone. This would not take the place of
> swap sets, but would be valuable if this RM integration work might be
> complete before swap sets.

We are looking at additional rctls; they are just not part of this project.

Thanks again for your comments,
Jerry
___
zones-discuss mailing list
zones-discuss@opensolaris.org
Re: [zones-discuss] improved zones/RM integration
These are very exciting features !! Some comments.

> If a dedicated-cpu (or eventually a dedicated-memory) resource is
> configured for the zone, then when the zone boots zoneadmd will
> create a temporary pool dedicated for the zone's use.

The temp pool created is going to show up in pooladm output, right ? If someone uses such a pool in a zone configuration done in traditional style (set pool=SUNWzone{zoneid}), is zonecfg going to reject it ? If not, it won't be "dedicated" anymore.

Also, is there something that's going to stop poolbind from moving zones to and from temp pools and permanent pools ? What if a zone was created as a result of which a temp pool was created, and poolbind moves it to a permanent pool ? Is the temp pool deleted ?

> Resource controls in zonecfg will be simplified.

Do you think we need prctl enhancements that will allow setting rctls using the new aliases directly ? That would be good for consistency.

Also, you mention that backward compatibility ensures the existing tools that parse zonecfg info / export output are unaffected. That's true to some extent, but some of them treat "unknown" resources as a no-op and display them as-is. Which means some tools will show the new resources as unknown, which is harmless but could be confusing sometimes. Maybe having a command line option such as zonecfg info -legacy that would suppress the new resources could help.

> We will enhance zoneadmd so that if the zone.cpu-shares rctl is set
> and FSS is not already the default scheduling class, zoneadmd will
> set the scheduling class to be FSS for processes in the zone.

On the fly ? i.e. if a zone didn't have zone.cpu-shares when it was booted and someone did prctl to set it, zoneadmd will change the sched class of all processes in the zone to FSS ? That's cool. On the other hand, how about when this zone is sharing a pool with another zone without cpu-shares, and the pool's sched class is also not FSS ? We don't recommend processes running in different scheduling classes sharing CPUs, right ? This feature may take us to that situation.

thanks
- Amol

Gerald A. Jelinek wrote:
> Attached is a description of a project we have been refining for a
> while now. The idea is to improve the integration of zones with some
> of the existing resource management features in Solaris. I would
> appreciate hearing any suggestions or questions. I'd like to submit
> this proposal to our internal architectural review process by
> mid-July. I have also posted a few slides that give an overview of
> the project. Those are available on the zones files page
> (http://www.opensolaris.org/os/community/zones/files/).
>
> Thanks,
> Jerry
>
> This message posted from opensolaris.org
>
> SUMMARY:
>
> This project enhances Solaris zones[1], pools[2-4] and resource
> caps[5,6] to improve the integration of zones with resource
> management (RM). It addresses existing RFEs[7-10] in this area and
> lays the groundwork for simplified, coherent management of the
> various RM features exposed through zones. We will integrate some
> basic pool configuration with zones, implement the concept of
> "temporary pools" that are dynamically created/destroyed when a zone
> boots/halts, and we will simplify the setting of resource controls
> within zonecfg. We will enhance rcapd so that it can cap a zone's
> memory while rcapd is running in the global zone. We will also make
> a few other changes to provide a better overall experience when
> using zones with RM. Patch binding is requested for these new
> interfaces and the stability of most of these interfaces is
> "evolving" (see interface table for complete list).
>
> PROBLEM:
>
> Although zones are fairly easy to configure and install, it appears
> that many customers have difficulty setting up a good RM
> configuration to accompany their zone configuration. Understanding
> RM involves many new terms and concepts along with lots of
> documentation to understand. This leads to the problem that many
> customers either do not configure RM with their zones, or configure
> it incorrectly, leading them to be disappointed when zones, by
> themselves, do not provide all of the containment that they expect.
> This problem will just get worse in the near future with the
> additional RM features that are coming, such as cpu-caps[11], memory
> sets[12] and swap sets[13].
>
> PROPOSAL:
>
> There are 7 different enhancements outlined below.
>
> 1) "Hard" vs. "Soft" RM configuration within zonecfg
>
> We will enhance zonecfg(1M) so that the user can configure basic RM
> capabilities in a structured way. The various existing and upcoming
> RM features can be broken down into "hard" vs. "soft" pa
Re: [zones-discuss] improved zones/RM integration
On 6/27/06, Jeff Victor <[EMAIL PROTECTED]> wrote:
> 1) General comment: I agree that this will provide needed clarity to
> the seemingly unorganized RM features that we have scattered through
> Solaris during the last decade. The automation of certain activities
> (e.g. starting rcapd in the GZ when needed) will also be extremely
> beneficial.

Most definitely.

> 2) Terminology: Although your use of the phrase "hard partition"
> should be clear to most people, through experience with other
> partitioning technologies, the use of "soft partition" is less clear.
> To many people, a "soft" limit is one that can be exceeded
> occasionally or in certain situations, or is merely an advisory
> limit. It also conflicts with SVM "soft partitions." I suggest the
> phrases "dedicated partition" (or "private partition") and "shared
> partition" instead to clarify the intent. Choosing "dedicated
> partition" might then require re-naming "dedicated-cpu" to
> "guaranteed-cpu" or "private-cpu" and re-naming "dedicated-memory"
> to "guaranteed-memory" or "private-memory". Or we could leave "hard
> partition" alone and simply change "soft partition" to "shared
> partition."

I think that "soft partition" is more clear. "Shared partition" suggests to me that you are sharing a single hard or soft partition for multiple workloads.

> 3) Lwps context: Why is the lwps alias defined in the context of
> dedicated-cpu? Lwps seem to be unrelated to hard partitions. Further,
> ncpus represents a subset of a finite resource. Lwps are not finite.
> The two should be separate.

Something seemed odd with that to me too. I didn't see it as terribly harmful there, but it is an excellent point. This is analogous to maxuprc, typically set in /etc/system. What are the chances that each zone in the future gets file descriptor limits or other similar limits that should be in the same section as the lwp limit?

> 4) Aliases: The notion of aliases also creates redundant output which
> could be confusing. I like the simplification of aliases as
> described, but I wish I had a good solution that wouldn't break
> existing tools that parse "zonecfg info" output - if such tools exist.

Because key features are missing from zones, I have been writing scripts that sometimes parse the output of "zonecfg info". Changing this format stands a good chance of breaking my scripts.

It sounds like if I set pool and rctl resources, they would still be displayed as rctl and not translated to the syntax associated with temporary pools. Is this correct?

Mike
--
Mike Gerdts
http://mgerdts.blogspot.com/
___
zones-discuss mailing list
zones-discuss@opensolaris.org
Re: [zones-discuss] improved zones/RM integration
1) General comment: I agree that this will provide needed clarity to the seemingly unorganized RM features that we have scattered through Solaris during the last decade. The automation of certain activities (e.g. starting rcapd in the GZ when needed) will also be extremely beneficial. 2) Terminology: Although your use of the phrase "hard partition" should be clear to most people, through experience with other partitioning technologies, the use of "soft partition" is less clear. To many people, a "soft" limit is one that can be exceeded occasionally or in certain situations, or is merely an advisory limit. It also conflicts with SVM "soft partitions." I suggest the phrases "dedicated partition" (or "private partition") and "shared partition" instead to clarify the intent. Choosing "dedicated partition" might then require re-naming "dedicated-cpu" to "guaranteed-cpu" or "private-cpu" and re-naming "dedicated-memory" to "guaranteed-memory" or "private memory". Or we could leave "hard partition" alone and simply change "soft partition" to "shared partition." 3) Lwps context: Why is the lwps alias defined in the context of dedicated-cpu? Lwps seem to be unrelated to hard partitions. Further, ncpus represents a subset of a finite resource. Lwps are not finite. The two should be separate. 4) Aliases: The notion of aliases also creates redundant output which could be confusing. I like the simplification of aliases as described, but I wish I had a good solution that wouldn't break existing tools that parse "zonecfg info" output - if such tools exist. 5) Another RM setting: While we're integrating RM settings, I think we should consider adding project.max-address-space using the same semantics as {project,zone}.max-lwps. A zone-specific setting, zone.max-address-space, could be added along with a zonecfg alias, max-address-space. This would allow the GZ admin to cap the virtual memory space available to that zone. 
This would not take the place of swap sets, but would be valuable if this RM integration work might be complete before swap sets. 6) From your "Project Alternatives" slide: Should we have yet another standalone GUI? Absolutely not. Unless it was a wizard - IMO Solaris adoption would accelerate greatly if it had wizards to aid potential adopters. However this ends up looking, I look forward to seeing this integration! Gerald A. Jelinek wrote: Attached is a description of a project we have been refining for a while now. The idea is to improve the integration of zones with some of the existing resource management features in Solaris. I would appreciate hearing any suggestions or questions. I'd like to submit this proposal to our internal architectural review process by mid-July. I have also posted a few slides that give an overview of the project. Those are available on the zones files page (http://www.opensolaris.org/os/community/zones/files/). Thanks, Jerry This message posted from opensolaris.org SUMMARY: This project enhances Solaris zones[1], pools[2-4] and resource caps[5,6] to improve the integration of zones with resource management (RM). It addresses existing RFEs[7-10] in this area and lays the groundwork for simplified, coherent management of the various RM features exposed through zones. We will integrate some basic pool configuration with zones, implement the concept of "temporary pools" that are dynamically created/destroyed when a zone boots/halts and we will simplify the setting of resource controls within zonecfg. We will enhance rcapd so that it can cap a zone's memory while rcapd is running in the global zone. We will also make a few other changes to provide a better overall experience when using zones with RM. Patch binding is requested for these new interfaces and the stability of most of these interfaces is "evolving" (see interface table for complete list). 
PROBLEM:

Although zones are fairly easy to configure and install, it appears that many customers have difficulty setting up a good RM configuration to accompany their zone configuration. Understanding RM involves many new terms and concepts, along with a great deal of documentation. As a result, many customers either do not configure RM with their zones, or configure it incorrectly, leading them to be disappointed when zones, by themselves, do not provide all of the containment that they expect. This problem will only get worse in the near future with the additional RM features that are coming, such as cpu-caps[11], memory sets[12] and swap sets[13].

PROPOSAL:

There are 7 different enhancements outlined below.

1) "Hard" vs. "Soft" RM configuration
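The zone-wide controls discussed in item 5 of the reply above follow the existing rctl model. As a sketch (not part of the proposal), this is how a zone-wide resource control is set with today's zonecfg syntax; zone.max-lwps already exists, while the suggested zone.max-address-space is hypothetical and shown only in the comment:

```shell
# Sketch: configuring the existing zone.max-lwps resource control for a
# zone named "myzone" (zone and limit values are illustrative). The
# proposed zone.max-address-space control from item 5 would presumably
# use the same add-rctl pattern, but it does not exist yet.
zonecfg -z myzone <<'EOF'
add rctl
set name=zone.max-lwps
add value (priv=privileged,limit=1000,action=deny)
end
commit
EOF
```

The proposal's aliases would collapse this multi-line rctl block into a single property setting, which is the simplification being reviewed here.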
[zones-discuss] improved zones/RM integration
Attached is a description of a project we have been refining for a while now. The idea is to improve the integration of zones with some of the existing resource management features in Solaris. I would appreciate hearing any suggestions or questions. I'd like to submit this proposal to our internal architectural review process by mid-July. I have also posted a few slides that give an overview of the project. Those are available on the zones files page (http://www.opensolaris.org/os/community/zones/files/). Thanks, Jerry

This message posted from opensolaris.org

SUMMARY:

This project enhances Solaris zones[1], pools[2-4] and resource caps[5,6] to improve the integration of zones with resource management (RM). It addresses existing RFEs[7-10] in this area and lays the groundwork for simplified, coherent management of the various RM features exposed through zones. We will integrate some basic pool configuration with zones, implement the concept of "temporary pools" that are dynamically created/destroyed when a zone boots/halts, and we will simplify the setting of resource controls within zonecfg. We will enhance rcapd so that it can cap a zone's memory while rcapd is running in the global zone. We will also make a few other changes to provide a better overall experience when using zones with RM. Patch binding is requested for these new interfaces, and the stability of most of these interfaces is "evolving" (see the interface table for the complete list).

PROBLEM:

Although zones are fairly easy to configure and install, it appears that many customers have difficulty setting up a good RM configuration to accompany their zone configuration. Understanding RM involves many new terms and concepts, along with a great deal of documentation. As a result, many customers either do not configure RM with their zones, or configure it incorrectly, leading them to be disappointed when zones, by themselves, do not provide all of the containment that they expect.
This problem will only get worse in the near future with the additional RM features that are coming, such as cpu-caps[11], memory sets[12] and swap sets[13].

PROPOSAL:

There are 7 different enhancements outlined below.

1) "Hard" vs. "Soft" RM configuration within zonecfg

We will enhance zonecfg(1M) so that the user can configure basic RM capabilities in a structured way. The various existing and upcoming RM features can be broken down into "hard" vs. "soft" partitioning of the system's resources. With "hard" partitioning, resources are dedicated to the zone using processor sets (psets) and memory sets (msets). With "soft" partitioning, resources are shared, but capped, with an upper limit on their use by the zone.

         | Hard  | Soft
  -------+-------+---------
  cpu    | psets | cpu-caps
  memory | msets | rcapd

There are also some existing rctls (zone.cpu-shares, zone.max-lwps) which will be integrated into this overall concept.

Within zonecfg we will organize the various RM features into four basic zonecfg resources so that it is simple for a user to understand and configure the RM features that are to be used with their zone. Note that zonecfg "resources" are not the same as "resource management". Within zonecfg, a "resource" is the name of a top-level property of the zone (see zonecfg(1M) for more information). The four new zonecfg resources are:

  dedicated-cpu
  capped-cpu (future, once cpu-caps are integrated)
  dedicated-memory (future, once memory sets are integrated)
  capped-memory

Each of these zonecfg resources will have properties that are appropriate to the RM capabilities associated with that resource. Zonecfg will only allow one instance of each of these resources to be configured, and it will not allow conflicting resources to be added (e.g. dedicated-cpu and capped-cpu are mutually exclusive).
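To make the intent concrete, here is a sketch of what configuring one of the proposed zonecfg resources might look like. This is not from the proposal itself: the interfaces are "evolving", only the ncpus property name appears in the text, and the exact syntax could change before integration:

```shell
# Sketch of the proposed simplified syntax (hypothetical; based on the
# dedicated-cpu resource and ncpus property named in this proposal).
# This would dedicate 2 cpus to "myzone" via a temporary pset that is
# created at zone boot and destroyed at zone halt.
zonecfg -z myzone <<'EOF'
add dedicated-cpu
set ncpus=2
end
commit
EOF
```

Compared with configuring pools and psets by hand, the administrator states only the resource requirement and the system manages the underlying pool configuration for the life of the zone.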
The mapping of these new zonecfg resources to the primary underlying RM feature is:

  dedicated-cpu    -> temporary pset
  dedicated-memory -> temporary mset
  capped-cpu       -> cpu-cap rctl [11]
  capped-memory    -> rcapd running in GZ

Temporary psets and msets are described below, in section 2. Rcapd enhancements for running in the global zone are described below, in section 4.

The valid properties for each of these new zonecfg resources will be:

  dedicated-cpu
    ncpus (