Jerry,
I am not sure if that was described in the proposal.
Would it be possible to run s10 brands on top of future s10 global zones?

-----Original Message-----
From: Jerry Jelinek <gerald.jeli...@sun.com>
Sent: 12.5.'09,  13:28

> Enclosed is a first draft of a spec. for the S10
> brand which we plan to submit for a PSARC
> inception review.  Please send us any comments
> or questions.
>
> Thanks,
> Jerry
>
> ---
>
>             S10C: A Solaris 10 Branded Zone for Solaris.Next
>
>                       Gerald Jelinek, Jordan Vaughan
>                     Solaris Virtualization Technologies
>
>
> [A note on terminology: This document uses the terms "Solaris 10" and
>   "Solaris.Next" very frequently.  As such, the abbreviations "S10" and
>    "S.next" respectively are used interchangeably with the longer forms.
>    The term "virtualization" is abbreviated as V12N.]
>
>
> Part 1: Introduction
> ____________________________
>
> Each new minor release of Solaris brings with it the well known problems
> of slow user adoption, slow ISV support and concerns about compatibility.
> The compatibility concerns will be more pronounced with the release of
> S.next since it's anticipated that there will be greater than normal
> user-visible changes (e.g. the packaging system, etc.).
>
> Fortunately, since the last minor release of Solaris (Solaris 10), V12N
> techniques have become widespread and V12N can be used as a solution to
> ease the transition to the new version of Solaris.  Zones[1] combined
> with a brand[2] are particularly well suited for this task since the host
> system is actually running S.next, whereas this is not necessarily the
> case with other V12N solutions.  In addition, zones are usable on any
> system which runs S.next, which is also not the case with other V12N
> alternatives.
>
> We already have a proven track record delivering this sort of
> zones/brand based solution to enable running earlier versions of Solaris
> on S10 [3, 4], so in one sense this case breaks little new ground.
> However, the earlier 'solaris8' and 'solaris9' brands were used to host
> releases that are very static as compared to hosting a zone running S10.
> In addition, S.next can be expected to continue to change rapidly for
> the foreseeable future.  Given this, a 'solaris10' brand for S.next poses
> additional challenges for projects on both the S10 and S.next sides of
> the system.  Many of these challenges are outside of the scope of an
> architectural review and include developer education, testing and
> procedural changes.  However, the existence of this brand could
> potentially impact future projects in various ways and at a minimum will
> require ARC consideration for future reviews.  The existence of this
> brand can be seen as a potential "tax" on all projects which work on both
> sides of the user/kernel boundary for both S10 and S.next.
>
> The benefits of the brand are as follows:
>
>    For customers:
>      - Provides a solution to cope with compatibility differences between
>        S10 and S.next
>      - Protects investment in S10 infrastructure, training, and internal
>        support
>      - Minimizes the cost of consolidating Solaris 10 systems
>      - Enables deployment of new technologies in S.next (e.g., crossbow)
>        while still running applications on S10, thereby limiting risk to
>        production environment
>      - Avoids or delays required application recertification
>
>    For Sun:
>      - S.next is adopted sooner
>      - Provides a Solaris compatibility environment for S.next
>      - Sun is a solution provider easing the burden of getting to S.next
>      - Provides a cross-platform virtualization solution for S.next across
>        all hardware (it is the only V12N solution on M-Series)
>
> This has been identified as a required feature for S.next.
>
> === Project Overview ===
>
> As with the earlier 'solaris8' and 'solaris9' brands, this project
> delivers the following:
>
>     - A Branded Container which emulates Solaris 10's user environment,
>       based on the BrandZ infrastructure provided with zones.
>       This brand is called 'solaris10'.  Only Solaris 10u8 and
>       beyond will be supported and tested in the zone.
>
>     - A mechanism for archiving existing Solaris 10 systems and for
>       redeploying those archives into the branded zone. This
>       process is referred to as p2v and uses the same techniques
>       as the 'solaris8' and 'solaris9' brands.
>
> In addition, the following additional capabilities will be provided
> as compared to the 'solaris8' and 'solaris9' brands.
>
>     - This brand will be supported on all hardware architectures
>       that run S.next (sun4v, sun4u and x86).  The specific platforms,
>       particularly sun4u, will be the same as are certified for S.next.
>
>     - A "virtual to virtual" or v2v mechanism for archiving existing
>       Solaris 10 native zones and for redeploying those archives into
>       the branded zone on S.next will be provided.  The process will be
>       very similar to the existing zone migration [5] feature except that
>       the zone's brand will be changed as part of the process.  In
>       addition, if the zone is sparse on S10 it must be converted to
>       a whole-root zone during the migration.
>
> Part 2: solaris10 Brand
> ____________________________
>
> The solaris10 brand is conceptually similar to the existing solaris8
> and solaris9 brands and builds directly on the BrandZ infrastructure
> that was created to support the lx brand.  Familiarity with BrandZ
> and the solaris8 and solaris9 brands is assumed.
>
> At this time the brand supports only the shared stack [6] networking
> model, in which the zone's
> network is managed by the global zone.  The exclusive stack model
> is anticipated to require more complex solutions or emulation due
> to the introduction of Crossbow [7] into S.next.  The exclusive
> stack issues will be resolved before commitment review.
>
> The ZFS ioctls have been audited and no issues have been seen.  Because
> so much of ZFS has been backported to S10 updates earlier than the first
> S10 version being supported in the brand (S10u8), ZFS delegated datasets
> appear to work fine.  Further testing needs to be done and future
> ZFS enhancements might require work at some point.
>
> === System Call Emulation ===
>
> This section details the system call emulation provided by the current
> solaris10 brand module.
>
> The following system calls are currently being emulated.
>
>          SYS_exec                11
>          SYS_ioctl               54
>          SYS_execve              59
>          SYS_acctctl             71
>          SYS_getpagesizes        73
>          SYS_issetugid           75
>          SYS_uname               135
>          SYS_pwrite              174
>          SYS_sigqueue            190
>          SYS_pwrite64            223
>          SYS_zone                227
>
>      SYS_exec
>      SYS_execve
>          The emulator interposes on these system calls to provide a
>          convenient mechanism for branded processes to be able to spawn
>          native processes.
>
>      SYS_ioctl
>          Emulate process contract ioctls for init(1M) because the
>          ioctl parameter structure changed between S10 and Nevada.
>
>      SYS_acctctl
>          The mode shift, mode mask and option mask for acctctl changed for
>          crossbow.
>
>      SYS_getpagesizes
>          New first arg "legacy" must be set to 1.
>
>      SYS_issetugid
>          S10's issetugid() syscall is now a subcode to privsys().
>
>      SYS_uname
>          The emulator simply passes this through, then modifies the result
>          upon return, so that the system call returns 5.10 for the 'release'
>          field and 'Generic_Virtual' for the 'version' field.
>
>      SYS_pwrite
>      SYS_pwrite64
>          pwrite's behavior differs between S10 and Nevada when applied to
>          files opened with O_APPEND.  The offset argument is ignored and the
>          buffer is appended to the target file in S10, whereas the current
>          file position is ignored in Nevada (i.e., pwrite() acts as though
>          the target file wasn't opened with O_APPEND).  This is a result of
>          the fix for:
>             6655660 pwrite() must ignore the O_APPEND/FAPPEND flag.
>          Emulate the old S10 pwrite() behavior by checking whether the target
>          file was opened with O_APPEND.  If it was, then invoke the write()
>          system call instead of pwrite(); otherwise, invoke the pwrite()
>          system call as usual.
>
>      SYS_sigqueue
>          New last arg "block" flag should be zero.  The block flag is used
>          by the OpenSolaris AIO implementation, which is now part of libc.
>
>      SYS_zone
>          See discussion below.
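
For what it's worth, the SYS_uname entry above can be pictured with a small user-level sketch. This is only an illustration, not the brand library's code: the name s10_uname is made up, and only the two replacement strings come from the spec.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Illustrative sketch of the uname emulation: pass the call through to
 * the host, then rewrite the fields a Solaris 10 process expects.
 * s10_uname is a hypothetical name, not taken from the brand source.
 */
int
s10_uname(struct utsname *un)
{
    int rv;

    assert(un != NULL);

    /* Pass the call through to the host kernel first. */
    rv = uname(un);
    if (rv < 0)
        return (rv);

    /* Modify the result on return, as the spec describes. */
    (void) snprintf(un->release, sizeof (un->release), "%s", "5.10");
    (void) snprintf(un->version, sizeof (un->version), "%s",
        "Generic_Virtual");
    return (rv);
}
```

A caller would see the host's sysname, nodename and machine unchanged, with only 'release' and 'version' replaced.
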
>
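
The pwrite()/pwrite64() entry above is easier to follow with a sketch. The following is a rough user-level illustration under my own assumptions: the name s10_pwrite is hypothetical, and using fcntl(F_GETFL) to find out whether the file was opened with O_APPEND is my guess at one way to do the check, not something the spec states.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Illustrative sketch of the old S10 pwrite() behaviour: when the
 * descriptor has O_APPEND set, ignore the offset and append through
 * write(); otherwise call pwrite() as usual.
 * s10_pwrite is a hypothetical name.
 */
ssize_t
s10_pwrite(int fd, const void *buf, size_t nbytes, off_t off)
{
    int flags;

    assert(buf != NULL);

    /* Assumption: query the open flags from user level. */
    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return (-1);    /* errno is already set by fcntl() */

    if (flags & O_APPEND)
        return (write(fd, buf, nbytes));

    return (pwrite(fd, buf, nbytes, off));
}
```

One difference a real implementation would have to think about: write() advances the file position while pwrite() does not.
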
> === zone(2) support ===
>
> Zones have been part of S10 since its FCS, so in general S10 is
> already zone-aware and does the right thing in most cases.  Commands
> that are zone-aware will continue to work as they do today in
> S10 native zones.  One set of commands which does require emulation
> is the S10 SVr4 packaging and patch commands.  Those commands are
> zone-aware and in some cases will check if they are running in the
> global zone and refuse to function if not.  If running in the global
> zone they will also attempt to look for other zones to operate on.
>
> The brand emulation interposes on the zone syscall and selectively
> provides emulation when the running command is one of the SVr4
> package or patch commands.  In these cases the emulation indicates
> that it is the global zone (zoneid 0) and various zone attributes,
> such as the zone brand itself, are emulated.  In all other cases
> the syscall is passed through so that the other S10 commands continue
> to behave as they do currently.  Because the solaris10 branded zones
> are whole-root zones, all packaging and patch operations will
> be successful, although the kernel components of the package or
> patch are not used.  This is exactly the same behavior as on the
> solaris8 and solaris9 branded zones.
>
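
To check my understanding of the selective zone(2) emulation above, here is a small sketch of the decision it describes. The command list is purely my assumption (the spec does not enumerate the SVr4 commands it matches), and s10_emulated_zoneid is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Zone ID of the global zone, as stated in the spec (zoneid 0). */
#define S10_GLOBAL_ZONEID 0

/*
 * Hypothetical list of SVr4 packaging and patch commands that get the
 * emulated view.  These names are illustrative only.
 */
static const char *const s10_pkg_cmds[] = {
    "pkgadd", "pkgrm", "patchadd", "patchrm"
};

/*
 * Return the zone ID the named command should see: the global zone ID
 * for packaging/patch commands, the real zone ID for everything else
 * (i.e. the syscall is passed through).
 */
int
s10_emulated_zoneid(const char *cmd, int real_zoneid)
{
    size_t i;
    size_t n = sizeof (s10_pkg_cmds) / sizeof (s10_pkg_cmds[0]);

    assert(cmd != NULL);

    for (i = 0; i < n; i++) {
        if (strcmp(cmd, s10_pkg_cmds[i]) == 0)
            return (S10_GLOBAL_ZONEID);
    }
    return (real_zoneid);
}
```

So a packaging command running in zone 7 would be told it is in zone 0, while any other command in the same zone would still see 7.
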
> One further consideration for zones is related to the p2v process.
> During p2v there may be zones on the original physical system.
> Since zones do not nest, p2v-ing these systems means that the zones
> themselves are not usable inside the branded zone.  This is detected
> when the zone is installed and a warning is issued indicating that
> any nested zones will not be usable and that the disk space could be
> recovered.  Those zones can be migrated ahead of time using the v2v
> feature described below.  In addition, a future project is planned
> which will assess a system prior to p2v and report any possible issues
> that may arise.  Detecting zones would be part of that report.
>
> === solaris10 Brand: What's Not Emulated ===
>
> This project does not make any changes to existing native zone
> limitations.  One point to note is that TX will continue to
> be incompatible with branded zones.  Customers using TX on S10
> systems will need to transition to a certified, native S.next TX
> solution.  Discussions with the TX team indicate that this is
> the normal behavior for users of TX, since the base OS itself must
> be certified for TX.
>
> === Versioning ===
>
> Because of the potential issues with compatibility of various releases
> of S10 hosted on differing releases of S.next, a basic versioning
> system is incorporated into the brand.  This versioning system works
> both ways.  That is, the brand emulation can check which version of
> S10 is being hosted in the zone and adjust the emulation accordingly.
> Likewise, future S10 updates which require specific emulation can
> indicate that a specific version of the emulation is required.  If
> necessary, they can also check if they are running in a branded zone
> and, if so, determine what version of emulation is available.  The
> initial release of the software won't need this versioning mechanism,
> but it is being included to cope with possible future enhancements to
> either S10 or S.next.
>
> If a change is made to S10 which requires an enhancement to the brand
> emulation library, it is expected that this change would be delivered
> in a S10 KU patch which provides components on both sides of the
> user/kernel boundary.  When the branded zone boots, the brand boot hook
> determines the minimal version of the KU that is installed in the zone
> to verify that the zone's release is supported (i.e. currently the
> minimal KU will be the one from S10u8).  It then makes the associated
> version (i.e. version 0 of the emulation) available as an attribute on
> the zone.  The brand library can then use this information to provide
> conditional emulation if needed.  Future projects that enhance the
> emulation for new features in S10 can add a check for a different KU
> version number which would then provide associated versions (e.g. 1, 2,
> etc.) to the brand library.
>
> If the KU version is not sufficient, future S10 projects may need to
> design some other version check for the brand to enable it to properly
> detect the S10 changes.  The ability to detect the KU version is
> already covered by the contract on the zone "update on attach"
> feature [8].
>
> The situation is more complicated for future changes within the S10
> code base which will require associated enhancements to the brand
> emulation.  There are two mechanisms being proposed.
>
> The first mechanism is that the future version of S10 can specify
> that it requires a minimal version of the brand emulation.  It does
> this by delivering a version number into the
> '/usr/lib/brand/solaris10/version' file on S10.  When this future
> version of S10 is p2v-ed into a solaris10 branded zone, the
> solaris10 brand will check for the presence of this file and if it
> exists, the brand will verify that the brand's version is greater than
> or equal to the version specified in the S10 file.  If not, then an
> error will be emitted and the zone p2v will fail, leaving the zone in
> the configured state.
>
> If the '/usr/lib/brand/solaris10/version' file is missing on S10, that
> indicates that the version of S10 is still compatible with the initial
> release of the solaris10 brand emulation.  The first time a project is
> backported to S10 which requires an enhancement to the emulation, this
> file must be created and the version number in the file will be bumped.
>
> This first mechanism is useful if a future S10 update is fundamentally
> incompatible with an older version of the S.next brand emulation.
>
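
The first mechanism above boils down to a simple comparison, sketched below. Assumptions on my side: the version file holds a single decimal integer (the spec does not define the format), the path is passed in so that in real use it would be the zone's copy of /usr/lib/brand/solaris10/version, and the name s10_version_ok is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Illustrative sketch of the minimal-version check done at p2v time.
 * Returns 1 if the brand emulation is new enough for the S10 image,
 * 0 if the p2v should fail.
 * Assumption: the file contains one decimal integer.
 */
int
s10_version_ok(const char *version_path, int brand_version)
{
    FILE *fp;
    int required;
    int ok;

    assert(version_path != NULL);

    fp = fopen(version_path, "r");
    if (fp == NULL) {
        /* A missing file means the initial emulation is sufficient. */
        return (errno == ENOENT);
    }

    /* The brand's version must be >= the version the image asks for. */
    ok = (fscanf(fp, "%d", &required) == 1 && brand_version >= required);
    (void) fclose(fp);
    return (ok);
}
```

In this sketch an unreadable or malformed file counts as a failure, which is a choice of mine rather than something the proposal specifies.
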
> The second mechanism allows projects that have been backported to S10
> to actually be brand aware.  A new zone attribute will be available
> indicating which version of the brand emulation is currently installed
> on the system.  For these future S10 updates, if they deliver a new
> feature which requires changes to the brand library, that S10 feature
> can also determine if it is running in a branded zone and if so, if the
> necessary emulation is available.  If the newer S10 update is running
> in a zone on an older version of S.next which does not provide the
> required emulation, the S10 feature can adjust its behavior in the
> appropriate manner.
>
> The existing getzoneid() and zone_getattr(ZONE_ATTR_BRAND) functions can
> be used by S10 code to determine if it is running in a non-global zone
> and if that zone is a 'solaris10' branded zone.  A new solaris10
> brand-specific zone attribute, S10_EMUL_VERSION_NUM, is defined.  The
> S10 feature can use the zone_getattr(S10_EMUL_VERSION_NUM) function to
> determine if the brand emulation supports the feature.  The getzoneid()
> and zone_getattr() functions are already used throughout the ON
> consolidation for code that is zone-aware.  These functions will continue
> to be consolidation private.
>
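
The second mechanism, as I read it, amounts to the decision sketched below. To keep it self-contained, the inputs are plain parameters standing in for the results of getzoneid(), zone_getattr(ZONE_ATTR_BRAND) and zone_getattr(S10_EMUL_VERSION_NUM); the enum and function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical outcomes for a brand-aware S10 feature. */
typedef enum {
    S10_FEATURE_NATIVE,     /* not running in a solaris10 branded zone */
    S10_FEATURE_EMULATED,   /* branded zone, emulation is new enough */
    S10_FEATURE_FALLBACK    /* branded zone, emulation is too old */
} s10_feature_mode_t;

/*
 * Decide how a backported S10 feature should behave.  The arguments
 * stand in for values obtained from the zone interfaces named in the
 * spec; needed_version is the emulation version the feature requires.
 */
s10_feature_mode_t
s10_feature_mode(int zoneid, const char *brand, int emul_version,
    int needed_version)
{
    assert(brand != NULL);

    /* Zone ID 0 is the global zone; other brands need no emulation. */
    if (zoneid == 0 || strcmp(brand, "solaris10") != 0)
        return (S10_FEATURE_NATIVE);

    if (emul_version >= needed_version)
        return (S10_FEATURE_EMULATED);

    /* Older S.next host: the feature must adjust its behaviour. */
    return (S10_FEATURE_FALLBACK);
}
```
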
> Engineers backporting features to a future S10 update will need to
> first determine if that feature requires enhancements to the solaris10
> brand library.  If so, they will then have to enhance the emulation
> in S.next and bump the emulation version number.  They can then either
> bump the minimal emulation version number in the
> /usr/lib/brand/solaris10/version file on S10 during the S10 backport or
> they can add the appropriate checks to the backported S10 code so that
> it can determine if the support is available in the brand library and
> change behavior accordingly.
>
> This obviously adds a great deal of complexity to projects backporting
> features to future S10 updates if those features require emulation to
> function correctly in the branded zone.  Ideally, projects requiring
> such enhancements to the brand emulation will not be backported.
> Perhaps the presence of the S10 brand on S.next may discourage projects
> from backporting since the brand provides S10 compatibility on S.next.
> Future projects which cross the user/kernel boundary and which request
> patch binding should be reviewed by the ARCs to determine if those
> projects must take the solaris10 brand into account.
>
> In addition to the above, any changes integrating into S.next which might
> impact the solaris10 brand will need to test the supported versions
> of S10 in the branded zone and make any needed changes to the solaris10
> emulation.
>
>
> Part 3: Archiving, Installation, p2v & v2v
> ____________________________
>
> The p2v process for the solaris10 brand is the same as for
> the solaris8, solaris9 and native [9] brands.  A contract will
> be included with this case for the flar command to explicitly
> call out the use of flash archives for migrating system images
> into zones.
>
> The v2v process for migrating S10 native zones to solaris10 branded
> zones will support the same archive formats as p2v.  This process will
> use the 'zoneadm attach' subcommand since that's the existing
> interface for migrating [5] zones from one system to another.  The
> solaris10 brand attach subcommand will be extended to accept the
> following options which correspond to the same options in the install
> subcommand.
>
>      -a {path} - specifies a path to an archive to unpack into the zone
>      -d {path} - specifies a path to a tree of files as the source for the
>                  installation.
>
> One issue with v2v of a S10 zone is that those zones can be sparse
> but the solaris10 branded zone must be whole root.  The current plan
> is that the zone must be readied on the source system.  This will
> mount any inherited-pkg-dirs and an archive can then be made of the
> readied zone.
>
> The p2v conversion during the installation of the zone will again
> be similar to the native p2v process [9].
>
> === Interface Table ===
>
>    The solaris10 brand seeks minor release binding.
>
>      Exported Interfaces                     Stability
>    ----------------------------------------------------------------------
>      "solaris10" brand name                  Committed
>      "SUNWsolaris10" brand template name     Committed
>
>      For the solaris10 brand
>          brand-specific install and
>          attach subcommand options           Committed
>          documented in this case
>
>      /usr/lib/brand/solaris10 directory      Committed
>
>      SUNWs10brandr, SUNWs10brandu packages   Committed
>
>      /usr/lib/brand/solaris10/version        Committed
>
>      getzoneid(), zone_getattr(),
>      ZONE_ATTR_BRAND and
>      S10_EMUL_VERSION_NUM attributes         Consolidation Private
>
>
>      Imported Interfaces                     Stability
>    ----------------------------------------------------------------------
>      brandz[2]                               Project Private
>
>      Nevada syscall traps documented above   Consolidation Private
>
>      flar(1M)                                Evolving
>                                              Contract included with this case
>
>
> REFERENCES
>
> 1. PSARC/2002/174 Virtualization and Namespace Isolation in Solaris
> 2. PSARC/2005/471 BrandZ: Support for non-native zones
> 3. PSARC/2007/350 Etude: Migration Technology
> 4. PSARC/2008/125 Etude Part Deux
> 5. PSARC/2006/030 Zone migration
> 6. PSARC/2006/366 Stack instances: Exclusive IP stack per zone
> 7. PSARC/2006/357 Crossbow - Network Virtualization and Resource Management
> 8. PSARC/2007/621 zone update on attach
> 9. PSARC/2008/766 native zones p2v
> _______________________________________________
> zones-discuss mailing list
> zones-discuss@opensolaris.org
