Jerry, I am not sure whether this was described in the proposal. Would it be possible to run s10 brands on top of future s10 global zones?
-----Original Message-----
From: Jerry Jelinek <gerald.jeli...@sun.com>
Sent: 12.5.'09, 13:28

> Enclosed is a first draft of a spec. for the S10
> brand which we plan to submit for a PSARC
> inception review. Please send us any comments
> or questions.
>
> Thanks,
> Jerry
>
> ---
>
> S10C: A Solaris 10 Branded Zone for Solaris.Next
>
> Gerald Jelinek, Jordan Vaughan
> Solaris Virtualization Technologies
>
>
> [A note on terminology: This document uses the terms "Solaris 10" and
> "Solaris.Next" very frequently. As such, the abbreviations "S10" and
> "S.next" respectively are used interchangeably with the longer forms.
> The term "virtualization" is abbreviated as V12N.]
>
>
> Part 1: Introduction
> ____________________________
>
> Each new minor release of Solaris brings with it the well-known problems
> of slow user adoption, slow ISV support and concerns about compatibility.
> The compatibility concerns will be more pronounced with the release of
> S.next since it's anticipated that there will be greater than normal
> user-visible changes (e.g. the packaging system, etc.).
>
> Fortunately, since the last minor release of Solaris (Solaris 10), V12N
> techniques have become widespread and V12N can be used as a solution to
> ease the transition to the new version of Solaris. Zones[1] combined
> with a brand[2] are particularly well suited for this task since the host
> system is actually running S.next, whereas this is not necessarily the
> case with other V12N solutions. In addition, zones are usable on any
> system which runs S.next, which is also not the case with other V12N
> alternatives.
>
> We already have a proven track record delivering this sort of
> zones/brand based solution to enable running earlier versions of Solaris
> on S10 [3, 4], so in one sense this case breaks little new ground.
> However, the earlier 'solaris8' and 'solaris9' brands were used to host
> releases that are very static as compared to hosting a zone running S10.
> In addition, S.next can be expected to continue to change rapidly for
> the foreseeable future. Given this, a 'solaris10' brand for S.next poses
> additional challenges for projects on both the S10 and S.next sides of
> the system. Many of these challenges are outside of the scope of an
> architectural review and include developer education, testing and
> procedural changes. However, the existence of this brand could
> potentially impact future projects in various ways and at a minimum will
> require ARC consideration for future reviews. The existence of this
> brand can be seen as a potential "tax" on all projects which work on both
> sides of the user/kernel boundary for both S10 and S.next.
>
> The benefits of the brand are as follows:
>
> For customers:
> - Provides a solution to cope with compatibility differences between
>   S10 and S.next
> - Protects investment in S10 infrastructure, training, and internal
>   support
> - Minimizes the cost of consolidating Solaris 10 systems
> - Enables deployment of new technologies in S.next (e.g., crossbow)
>   while still running applications on S10, thereby limiting risk to
>   production environment
> - Avoids or delays required application recertification
>
> For Sun:
> - S.next is adopted sooner
> - Provides a Solaris compatibility environment for S.next
> - Sun is a solution provider easing the burden of getting to S.next
> - Provides a cross-platform virtualization solution for S.next across
>   all hardware (it is the only V12N solution on M-Series)
>
> This has been identified as a required feature for S.next.
>
> === Project Overview ===
>
> As with the earlier 'solaris8' and 'solaris9' brands, this project
> delivers the following:
>
> - A Branded Container which emulates Solaris 10's user environment,
>   based on the BrandZ infrastructure provided with zones.
>   This brand is called 'solaris10'. Only Solaris 10u8 and
>   beyond will be supported and tested in the zone.
>
> - A mechanism for archiving existing Solaris 10 systems and for
>   redeploying those archives into the branded zone. This
>   process is referred to as p2v and uses the same techniques
>   as the 'solaris8' and 'solaris9' brands.
>
> In addition, the following additional capabilities will be provided
> as compared to the 'solaris8' and 'solaris9' brands.
>
> - This brand will be supported on all hardware architectures
>   that run S.next (sun4v, sun4u and x86). The specific platforms,
>   particularly sun4u, will be the same as are certified for S.next.
>
> - A "virtual to virtual" or v2v mechanism for archiving existing
>   Solaris 10 native zones and for redeploying those archives into
>   the branded zone on S.next will be provided. The process will be
>   very similar to the existing zone migration [5] feature except that
>   the zone's brand will be changed as part of the process. In
>   addition, if the zone is sparse on S10 it must be converted to
>   a whole-root zone during the migration.
>
> Part 2: solaris10 Brand
> ____________________________
>
> The solaris10 brand is conceptually similar to the existing solaris8
> and solaris9 brands and builds directly on the BrandZ infrastructure
> that was created to support the lx brand. Familiarity with BrandZ
> and the solaris8 and solaris9 brands is assumed.
>
> At this time the design and development of the brand is only
> supporting the shared stack [6] networking model in which the zone's
> network is managed by the global zone. The exclusive stack model
> is anticipated to require more complex solutions or emulation due
> to the introduction of Crossbow [7] into S.next. The exclusive
> stack issues will be resolved before commitment review.
>
> The ZFS ioctls have been audited and no issues have been seen. Because
> so much of ZFS has been backported to S10 updates earlier than the first
> S10 version being supported in the brand (S10u8), ZFS delegated datasets
> appear to work fine.
> Further testing needs to be done and future ZFS enhancements might
> require work at some point.
>
> === System Call Emulation ===
>
> This section details the system call emulation provided by the current
> solaris10 brand module.
>
> The following system calls are currently being emulated.
>
>     SYS_exec            11
>     SYS_ioctl           54
>     SYS_execve          59
>     SYS_acctctl         71
>     SYS_getpagesizes    73
>     SYS_issetugid       75
>     SYS_uname           135
>     SYS_pwrite          174
>     SYS_sigqueue        190
>     SYS_pwrite64        223
>     SYS_zone            227
>
> SYS_exec
> SYS_execve
>     The emulator interposes on these system calls to provide a
>     convenient mechanism for branded processes to be able to spawn
>     native processes.
>
> SYS_ioctl
>     Emulate process contract ioctls for init(1M) because the
>     ioctl parameter structure changed between S10 and Nevada.
>
> SYS_acctctl
>     The mode shift, mode mask and option mask for acctctl changed for
>     crossbow.
>
> SYS_getpagesizes
>     New first arg "legacy" must be set to 1.
>
> SYS_issetugid
>     S10's issetugid() syscall is now a subcode to privsys().
>
> SYS_uname
>     The emulator simply passes this through, then modifies the result
>     upon return, so that the system call returns 5.10 for the 'release'
>     field and 'Generic_Virtual' for the 'version' field.
>
> SYS_pwrite
> SYS_pwrite64
>     pwrite's behavior differs between S10 and Nevada when applied to
>     files opened with O_APPEND. The offset argument is ignored and the
>     buffer is appended to the target file in S10, whereas the current
>     file position is ignored in Nevada (i.e., pwrite() acts as though
>     the target file wasn't opened with O_APPEND). This is a result of
>     the fix for:
>         6655660 pwrite() must ignore the O_APPEND/FAPPEND flag.
>     Emulate the old S10 pwrite() behavior by checking whether the target
>     file was opened with O_APPEND. If it was, then invoke the write()
>     system call instead of pwrite(); otherwise, invoke the pwrite()
>     system call as usual.
>
> SYS_sigqueue
>     New last arg "block" flag should be zero.
>     The block flag is used by the OpenSolaris AIO implementation,
>     which is now part of libc.
>
> SYS_zone
>     See discussion below.
>
> === zone(2) support ===
>
> Zones have been part of S10 since its FCS, so in general S10 is
> already zone-aware and does the right thing in most cases. Commands
> that are zone-aware will continue to work as they do today in
> S10 native zones. One set of commands which does require emulation
> is the S10 SVr4 packaging and patch commands. Those commands are
> zone-aware and in some cases will check if they are running in the
> global zone and refuse to function if not. If running in the global
> zone they will also attempt to look for other zones to operate on.
>
> The brand emulation interposes on the zone syscall and selectively
> provides emulation when the running command is one of the SVr4
> package or patch commands. In these cases the emulation indicates
> that it is the global zone (zoneid 0) and various zone attributes,
> such as the zone brand itself, are emulated. In all other cases
> the syscall is passed through so that the other S10 commands continue
> to behave as they do currently. Because the solaris10 branded zones
> are whole-root zones, all packaging and patch operations will
> be successful, although the kernel components of the package or
> patch are not used. This is exactly the same behavior as on the
> solaris8 and solaris9 branded zones.
>
> One further consideration for zones is related to the p2v process.
> During p2v there may be zones on the original physical system.
> Since zones do not nest, p2v-ing these systems means that the zones
> themselves are not usable inside the branded zone. This is detected
> when the zone is installed and a warning is issued indicating that
> any nested zones will not be usable and that the disk space could be
> recovered. Those zones can be migrated ahead of time using the v2v
> feature described below.
> In addition, a future project is planned which will assess a system
> prior to p2v and report any possible issues that may arise. Detecting
> zones would be part of that report.
>
> === solaris10 Brand: What's Not Emulated ===
>
> This project does not make any changes to existing native zone
> limitations. One point to note is that TX will continue to
> be incompatible with branded zones. Customers using TX on S10
> systems will need to transition to a certified, native S.next TX
> solution. Discussions with the TX team indicate that this is
> the normal behavior for users of TX, since the base OS itself must
> be certified for TX.
>
> === Versioning ===
>
> Because of the potential issues with compatibility of various releases
> of S10 hosted on differing releases of S.next, a basic versioning
> system is incorporated into the brand. This versioning system works
> both ways. That is, the brand emulation can check which version of
> S10 is being hosted in the zone and adjust the emulation accordingly.
> Likewise, future S10 updates which require specific emulation can
> indicate that a specific version of the emulation is required. If
> necessary, they can also check if they are running in a branded zone
> and, if so, determine what version of emulation is available. The
> initial release of the software won't need this versioning mechanism,
> but it is being included to cope with possible future enhancements to
> either S10 or S.next.
>
> If a change is made to S10 which requires an enhancement to the brand
> emulation library, it is expected that this change would be delivered
> in an S10 KU patch which provides components on both sides of the
> user/kernel boundary. When the branded zone boots, the brand boot hook
> determines the minimal version of the KU that is installed in the zone
> to verify that the zone's release is supported (i.e. currently the
> minimal KU will be the one from S10u8). It then makes the associated
> version (i.e.
> version 0 of the emulation) available as an attribute on the zone.
> The brand library can then use this information to provide
> conditional emulation if needed. Future projects that enhance the
> emulation for new features in S10 can add a check for a different KU
> version number which would then provide associated versions (e.g. 1, 2,
> etc.) to the brand library.
>
> If the KU version is not sufficient, future S10 projects may need to
> design some other version check for the brand to enable it to properly
> detect the S10 changes. The ability to detect the KU version is
> already covered by the contract on the zone "update on attach"
> feature [8].
>
> The situation is more complicated for future changes within the S10
> code base which will require associated enhancements to the brand
> emulation. There are two mechanisms being proposed.
>
> The first mechanism is that the future version of S10 can specify
> that it requires a minimal version of the brand emulation. It does
> this by delivering a version number into the
> '/usr/lib/brand/solaris10/version' file on S10. When this future
> version of S10 is p2v-ed into a solaris10 branded zone, the
> solaris10 brand will check for the presence of this file and if it
> exists, the brand will verify that the brand's version is greater than
> or equal to the version specified in the S10 file. If not, then an
> error will be emitted and the zone p2v will fail, leaving the zone in
> the configured state.
>
> If the '/usr/lib/brand/solaris10/version' file is missing on S10, that
> indicates that the version of S10 is still compatible with the initial
> release of the solaris10 brand emulation. The first time a project is
> backported to S10 which requires an enhancement to the emulation, this
> file must be created and the version number in the file will be bumped.
>
> This first mechanism is useful if a future S10 update is fundamentally
> incompatible with an older version of the S.next brand emulation.
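The first mechanism's check could be sketched roughly as follows. This is only an illustration of the comparison described above, not the brand's actual code: it assumes the version file holds a single non-negative integer, and the function names, the BRAND_EMUL_VERSION constant and the zone_root argument are all hypothetical.

```python
import os

# Hypothetical: emulation version shipped by the solaris10 brand on S.next.
BRAND_EMUL_VERSION = 0

# Version file inside the S10 image, relative to the zone root
# (path taken from the proposal).
VERSION_FILE = "usr/lib/brand/solaris10/version"


def required_emul_version(zone_root):
    """Return the minimal emulation version the S10 image asks for.

    A missing file means the image is still compatible with the initial
    release of the emulation, which this sketch numbers as 0.
    Assumption: the file contains one non-negative integer.
    """
    path = os.path.join(zone_root, VERSION_FILE)
    try:
        with open(path) as f:
            return int(f.read().strip())
    except FileNotFoundError:
        return 0


def p2v_version_ok(zone_root, brand_version=BRAND_EMUL_VERSION):
    """True if the brand's version is >= the version the image requires.

    When this returns False, the proposal says p2v fails and the zone
    stays in the configured state.
    """
    return brand_version >= required_emul_version(zone_root)
```

With no version file under the zone root the check passes for any brand version; once an image delivers, say, 2 in the file, only a brand at version 2 or later would accept it.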
>
> The second mechanism allows projects that have been backported to S10
> to actually be brand aware. A new zone attribute will be available
> indicating which version of the brand emulation is currently installed
> on the system. For these future S10 updates, if they deliver a new
> feature which requires changes to the brand library, that S10 feature
> can also determine if it is running in a branded zone and, if so,
> whether the necessary emulation is available. If the newer S10 update
> is running in a zone on an older version of S.next which does not
> provide the required emulation, the S10 feature can adjust its behavior
> in the appropriate manner.
>
> The existing getzoneid() and zone_getattr(ZONE_ATTR_BRAND) functions can
> be used by S10 code to determine if it is running in a non-global zone
> and if that zone is a 'solaris10' branded zone. A new solaris10
> brand-specific zone attribute, S10_EMUL_VERSION_NUM, is defined. The
> S10 feature can use the zone_getattr(S10_EMUL_VERSION_NUM) function to
> determine if the brand emulation supports the feature. The getzoneid()
> and zone_getattr() functions are already used throughout the ON
> consolidation for code that is zone-aware. These functions will continue
> to be consolidation private.
>
> Engineers backporting features to a future S10 update will need to
> first determine if that feature requires enhancements to the solaris10
> brand library. If so, they will then have to enhance the emulation
> in S.next and bump the emulation version number. They can then either
> bump the minimal emulation version number in the
> /usr/lib/brand/solaris10/version file on S10 during the S10 backport or
> they can add the appropriate checks to the backported S10 code so that
> it can determine if the support is available in the brand library and
> change behavior accordingly.
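The decision a brand-aware S10 feature would make under the second mechanism might look something like the sketch below. It only models the logic: the three inputs stand in for the results of getzoneid(), zone_getattr(ZONE_ATTR_BRAND) and zone_getattr(S10_EMUL_VERSION_NUM), which are consolidation-private C interfaces and are not called here. The function name and the needed_version parameter are hypothetical.

```python
GLOBAL_ZONEID = 0
SOLARIS10_BRAND = "solaris10"


def emulation_available(zoneid, brand, emul_version, needed_version):
    """Decide whether a backported S10 feature can use its new code path.

    zoneid         -- stand-in for the result of getzoneid()
    brand          -- stand-in for zone_getattr(ZONE_ATTR_BRAND)
    emul_version   -- stand-in for zone_getattr(S10_EMUL_VERSION_NUM);
                      only meaningful inside a solaris10 branded zone
    needed_version -- hypothetical emulation version the feature needs
    """
    if zoneid == GLOBAL_ZONEID:
        # Global zone of a real S10 system: no brand emulation involved.
        return True
    if brand != SOLARIS10_BRAND:
        # Non-global zone that is not solaris10 branded, i.e. a zone
        # hosted by an S10 kernel: nothing to emulate.
        return True
    # solaris10 branded zone on S.next: depends on the emulation level.
    return emul_version >= needed_version
```

A feature that gets False back would fall back to its older behavior rather than fail, which is the "adjust its behavior" case described above.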
>
> This obviously adds a great deal of complexity to projects backporting
> features to future S10 updates if those features require emulation to
> function correctly in the branded zone. Ideally, projects requiring
> such enhancements to the brand emulation will not be backported.
> Perhaps the presence of the S10 brand on S.next may discourage projects
> from backporting since the brand provides S10 compatibility on S.next.
> Future projects which cross the user/kernel boundary and which request
> patch binding should be reviewed by the ARCs to determine if those
> projects must take the solaris10 brand into account.
>
> In addition to the above, any changes integrating into S.next which might
> impact the solaris10 brand will need to test the supported versions
> of S10 in the branded zone and make any needed changes to the solaris10
> emulation.
>
>
> Part 3: Archiving, Installation, p2v & v2v
> ____________________________
>
> The p2v process for the solaris10 brand is the same as for
> the solaris8, solaris9 and native [9] brands. A contract will
> be included with this case for the flar command to explicitly
> call out the use of flash archives for migrating system images
> into zones.
>
> The v2v process for migrating S10 native zones to solaris10 branded
> zones will support the same archive formats as p2v. This process will
> use the 'zoneadm attach' subcommand since that's the existing
> interface for migrating [5] zones from one system to another. The
> solaris10 brand attach subcommand will be extended to accept the
> following options which correspond to the same options in the install
> subcommand.
>
>     -a {path} - specifies a path to an archive to unpack into the zone
>     -d {path} - specifies a path to a tree of files as the source for the
>                 installation.
>
> One issue with v2v of a S10 zone is that those zones can be sparse
> but the solaris10 branded zone must be whole root. The current plan
> is that the zone must be readied on the source system.
> This will mount any inherited-pkg-dirs and an archive can then be made
> of the readied zone.
>
> The p2v conversion during the installation of the zone will again
> be similar to the native p2v process [9].
>
> === Interface Table ===
>
> The solaris10 brand seeks minor release binding.
>
> Exported Interfaces                        Stability
> ----------------------------------------------------------------------
> "solaris10" brand name                     Committed
> "SUNWsolaris10" brand template name        Committed
>
> For the solaris10 brand
>   brand-specific install and
>   attach subcommand options                Committed
>   documented in this case
>
> /usr/lib/brand/solaris10 directory         Committed
>
> SUNWs10brandr, SUNWs10brandu packages      Committed
>
> /usr/lib/brand/solaris10/version           Committed
>
> getzoneid(), zone_getattr(),
>   ZONE_ATTR_BRAND and
>   S10_EMUL_VERSION_NUM attributes          Consolidation Private
>
>
> Imported Interfaces                        Stability
> ----------------------------------------------------------------------
> brandz [2]                                 Project Private
>
> Nevada syscall traps documented above      Consolidation Private
>
> flar(1M)                                   Evolving
>   Contract included with this case
>
>
> REFERENCES
>
> 1. PSARC 2002/174 Virtualization and Namespace Isolation in Solaris
> 2. PSARC 2005/471 BrandZ: Support for non-native zones
> 3. PSARC 2007/350 Etude: Migration Technology
> 4. PSARC 2008/125 Etude Part Deux
> 5. PSARC 2006/030 Zone migration
> 6. PSARC 2006/366 Stack instances: Exclusive IP stack per zone
> 7. PSARC 2006/357 Crossbow - Network Virtualization and Resource Management
> 8. PSARC 2007/621 zone update on attach
> 9. PSARC 2008/766 native zones p2v
> _______________________________________________
> zones-discuss mailing list
> zones-discuss@opensolaris.org