On 05/26/10 09:25 PM, Edward Pilatowicz wrote:
hey shawn,
thanks for looking this over. i've addressed and deleted all the s///
style comments below and my replies to the other issues are inline.
ed
On Mon, May 24, 2010 at 01:39:43PM -0700, Shawn Walker wrote:
On 03/ 7/10 12:23 PM, Edward Pilatowicz wrote:
hey all,
so in an effort to get pkg image-update to work with zones, i've been
prototyping support for linked images within pkg(5). i don't have a fully
functional prototype to pass around yet, but i have documented some
aspects of my prototype in an initial design doc. i've attached that doc
so that people can take a look, see what i'm thinking, provide
feedback, etc. all comments and criticism are welcome.
lines 10-34:
So are linked image types just *additional* types of images (e.g.
full, partial, etc.) or is this a subtype of an existing image type?
hm. so i didn't even realize that pkg had support for things like
partial and user images... looking at the code i see that
IMG_TYPE_PARTIAL and IMG_TYPE_USER are not implemented. so lucky for me
i don't have a lot of re-designing to do to integrate with these types
of images. ;)
User images are not "fully implemented", but they are implemented in the
sense that the image type is actually being used. Notably, the glassfish
team uses that functionality already. What's missing from the user
image type is the linked image capability.
my current prototype doesn't really match either description you've
given. i don't have any new IMG_TYPE_* types, and i'm also not
creating any new classes derived from the Image class. so far, i've
been thinking of linking as more of an attribute of two images, and the
link type is then an attribute of the linking.
If I understand you correctly then, you view the current image types as
a low-level image format thing while linking functionality is a more
generic image synchronisation mechanism?
lines 164-205: User linked images:
I'd like to make the observation that it seems like user and zone
linked images fall into two high level models:
- push type
This is roughly what you describe zone images as; a push-only model
where the parent tells any child images what the new state of the
parent image is after operations are completed.
- pull type
This is what you describe user images as currently; a pull-only
model where the user or some service is responsible for detecting
changes in a parent image and syncing the child user image.
I'd suggest that user images could be both push and pull. That is,
an administrator may find it useful to create a special set of
user images for creating isolated software stacks of specific
packages that can't normally co-exist. This is often seen with
software installed in /opt as an example.
correct. but user images are out of scope for the current project. i
talk about them as an example of what else is possible.
but note that user linked images are really an extension of default
linked images. (which are in scope for the project.) default linked
images just assume a push model and a user of root.
so i've been using the terminology export/import where you're saying
push/pull. would you like me to change the terminology i'm currently
using?
It's not my personal preference, but when I first saw export, import, it
was a bit confusing. I generally interpreted export to be a "save some
data here in a format something else can use" whereas 'export' in this
context is almost like going to File -> Export in an application, saving
the data, and then watching as it automatically starts up another
program and imports it. While convenient, the terminology left me a bit
surprised ;)
The short answer is, I don't know.
...
lines 375-384:
When you say "constraints data", you really mean "image state",
right? Or you imply "constraints data" in the more generic sense
(e.g. package versions, package state, package constraints)?
to dive into details, the "constraints data" maps to:
- a list of packages (which includes their full version numbers and
publishers) installed in the parent image. this list doesn't include
ALL the packages in the parent image, just those required to implement
the specified image syncing policy. this data could arguably be called
parent "image state" data, but it's not the complete state.
Although, in some cases, the packages required to complete the syncing
is the entire image state, right?
So this is really the full image state or some subset of it, correct?
- some linked image property values. linked image properties may have
different authoritative sources for different types of linked images.
for example, for zones the pkg synchronization policy will be coming
from the global zone. so in this case we need to tell the zone what
to keep in sync. while for user images, the image itself can decide
what it wants to keep in sync, so it doesn't need to have that property
exported to it. this doesn't really seem like parent "image state"
data.
lines 386-391:
I think /var/pkg/state/export /var/pkg/state/import or
/var/pkg/cache/state/export, /var/pkg/cache/state/import
would fit better with the on-disk format proposal. I don't
understand why 'master' is part of the pathname here since
there's no corresponding linked/child/* set of directories
in this proposal either.
master is there because i've gone through lots of iterations, some of
which used to have a "child" directory. my current prototype doesn't
have "child" directories, but it might have to come back.
the underlying problem is that the gz can't really safely write directly
to ngz filesystems. this makes exporting data to a running zone
difficult. in my current prototype i use fork() and chroot() to solve
this problem. another way to solve this problem would be to have
"child" directories in the master image that the gz writes into, and
those directories are read-only lofs mounted into each zone before that
zone is booted.
wrt location, i have no issues with moving the directories around to
/var/pkg/state or /var/pkg/cache. but i do like having "linked" in the
path name to designate this as linked image data.
lines 393-399:
This bit is really confusing to me:
- What does the linked image type and content policy stored here
indicate?
um. it indicates the type of linked image and the constraints on its
content?
so for example, in a zone the type would be zone. in a default image
the type would be default.
for the content policy, it would be whatever the user had configured.
in zones, the minimum content policy value would be to keep cip packages
in sync. (which will just be called minimum.) there could be other
content policies that would perhaps say "keep all packages in sync".
- How does that meaning change based on whether it is in the export
or import directory?
the meaning doesn't change. the import vs export directories are more
about who can write to the directories and where the data is coming
from. as mentioned previously, in some cases i thought that the export
directory might actually be a read-only lofs mount.
- What form is this information stored in and when does it get
stored?
the data is stored in private .xml files.
We've been avoiding using XML if possible due to the extra dependencies
it drags in and for other reasons.
Can you use JSON instead?
exported data gets written when:
- an operation on the parent image changes the parent image state. for
example, a pkg install/uninstall/image-update on the parent image.
- an explicit "pkg linked-export" is done.
- a linked image operation is initiated from the parent image. for
example, if you do "pkg linked-sync -l <linked-image-name>" then we'll
re-export the linked image data before starting the linked-sync
operation on the specified image.
i haven't implemented the import/pull yet so i can't really nail down
when that will occur, but my best guess is that any pkg operation on a
child image which is going to write to the image would take a write lock
on the image, update the import/pull data, and then start its
operation.
- It seems like images that you *push* changes to should simply get
their data from a temporary directory that contains the exported
information for the duration of the operation.
the exported/pushed constraint data really needs to persist across
operations. the reason for this is that not all linked image operations
are initiated from the parent image. for example, someone within the
zone might run pkg(1) to install some new software, and that command
should not allow installation of software which would violate the
constraints on the image. i tried to call this out in the "zones
requirements" section:
- all the constraints required to perform software management operations
within a non-global zone must be available within that zone.
It's really not going to be practical to have a parent image blindly
write out a *second* set of image state data every time an operation
happens. I say this since the image linking model being proposed here
allows images to link to parents without a parent's knowledge; that also
means that we'd always have to write out this second set of state data.
I'd like to work with you to figure out some way we can avoid that, or
alter the design where being able to link to a parent image isn't
possible without some sort of explicit enabling of linking functionality.
I'm not sure this should be stored in $IMGROOT/pkg. For images that *pull* changes
from the master image, realistically this information needs to be
generated on demand instead of being stored in $IMGROOT/pkg. If
the purpose of this directory is to simply cache information that
was imported or is being used to perform a sync operation, then
I'd say it should live under /var/pkg/cache/linked (to fit with
the on-disk proposal). Does this data need to be kept longer
than that?
for pull images this is debatable. i was imagining that there would be a
desire to be able to do operations on linked child images when the
parent image is not accessible. for example, if i had a user image in
my home directory, which is nfs accessible from any machine, and that
image was currently sync'd up with jurassic, should i be able to install
packages into that image on another machine and still have the linked
image constraints apply to that operation? if so the information needs
to be persisted within the image.
- I think that if you're exporting "constraint information" that it
really sounds like you should just be generating an incorporation
package since that's equivalent and fits with existing
functionality. In particular, I would anticipate pkg freeze
to be based on incorporations so that would fit better with what
you've proposed here.
well, kinda. i certainly could represent what's installed in the
gz/parent as a package that uses "require" and "incorporate"
dependencies. this is actually what i do, just all in memory. (when
pkg runs it reads the constraint information from disk and generates
packages based on it.)
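to sketch what that in-memory generation amounts to (the helper and
its inputs are hypothetical, not the actual prototype code; only the
depend action syntax follows pkg(5) manifests):

```python
# Build manifest lines that pin child packages to parent versions,
# roughly as described above. "incorporate" constrains the version of
# a package if it's installed; "require" forces it to be installed.
def constraint_manifest(synced_pkgs, required_pkgs):
    """Return pkg(5)-style depend actions for a constraint package."""
    lines = []
    for fmri in synced_pkgs:
        lines.append("depend fmri=%s type=incorporate" % fmri)
    for fmri in required_pkgs:
        lines.append("depend fmri=%s type=require" % fmri)
    return "\n".join(lines)

m = constraint_manifest(
    ["pkg:/SUNWcs@0.5.11,5.11-0.140"],
    ["pkg:/entire@0.5.11,5.11-0.140"])
print(m)
```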
now i'm going to guess that your next suggestion will be to do this but
just on disk. ie, write the constraint package into the child image and
into the child image's installed catalog. unfortunately i don't really
want to do this, and there are a few reasons.
- this requires that the parent image constantly update lots of data in
the child image. and as i've already mentioned, in the case of zones
this represents a security issue. we can't trust any data in the
child images, so reading in catalogs is a problem unless we've audited
all the python code involved with parsing catalogs. i'd really prefer
to avoid that.
I don't really follow this argument; none of our code has been audited
to that level of trust. Nor was I suggesting that the parent image
would be the one to perform these updates. I was only suggesting that
the parent image (which already generates the constraint data we have to
trust) simply generate that information in a *different* format which
the child would then cache and use on its own.
- also, with this approach there is no good way to deal with out of sync
images. ie, images that don't adhere to their sync policy. you can't
represent out-of-sync images in the installed catalog because you'd
have an invalid state, which leads to a very unhappy sat solver.
I don't follow; you could have that problem either way. Can you give an
example?
- If you're going to export state information for the parent image,
then that's what's contained in the /var/pkg/state/installed
directory. Specifically, just catalog.attrs and catalog.base.C.
i'm aware of the data stored in there. but i don't actually want to
export all that data. a zone should not know more about the global zone
than it needs to. the ngz needs to know about the software installed in
the gz that needs to be kept in sync with the gz, but it doesn't
necessarily need to know about all the monitoring and management
software installed in the gz.
Except, based on policy, we may be passing that entire set of data
anyway. So what's the issue?
so really the zone needs to know about a subset of the information in
catalog.attrs and catalog.base.C. now admittedly, i could represent
that data in the same format as the current catalog.attrs and
catalog.base.C files, but i haven't seen any advantage to doing that.
as i mentioned before, currently i just have an xml file that lists the
information, but i'm not wedded to that; it's more an implementation
detail.
The advantage of re-using the catalog format is avoiding introducing an
additional project private format and the fact that a lot of
optimisation work has been done to efficiently store catalog data in
that format.
...
lines 421-427:
Since a linked image can realistically only account for the last
sync'd state of a parent image, it seems like the parent image
could simply provide constraints as a dynamically generated
incorporation package (manifest) as part of the export/import
process. That manifest could then be stored locally and treated
exactly like a package normally would be without any special logic.
Doing so also makes it possible for the normal memory management
that the client api uses internally to not have to have special
logic to marshal this information to disk (possibly repeatedly).
possibly, but isn't manifest dependency information cached in the
catalog? if so this would require re-writing the catalog in zones which
requires reading the catalog in zones. as i've pointed out before, that
would be bad.
The catalog has to be read in zones; again, I'm not following the
security/bad logic. A catalog is used to track the state of the image,
if we don't trust catalogs, then the client is useless.
The only way to know the state of a child image is to read its catalog.
in case it isn't obvious, i'm really trying hard to keep data flowing in
one direction: from the parent image to the child. i'm also trying to
keep the data flow simple. having pkg traverse directories in a zone
image and read data from that zone isn't safe. i tried to call this out
early in the document in the "zones requirements" section:
- since zones are untrusted execution environments, global zone pkg(5)
operations should not be required to read data from a non-global zone.
ie, any data flow required to support linked images must be from a
global zone to a non-global zone, and never in the reverse direction.
I don't believe I've suggested anything that would require a gz to read
data from a ngz. I've only suggested *how* to export the data and the
process to use to import it.
also, there is one more thing to consider. not all the linked image
constraints on an image can be represented as a package. specifically,
package exclusions can not be adequately represented with exclude
dependencies. these need to be fed directly to the solver. if we
wanted to represent these in packages we'd have to invent a new
dependency action or attribute. i initially prototyped this and talked
to bart about it and he just recommended that i feed exclude
dependencies directly to the solver, which is what i'm currently doing.
Can you give an example of how exclude dependencies can't currently be
expressed adequately using depend actions?
lines 429-436:
This could be greatly simplified by simply stating that the special
system packages will not have a publisher. That is simpler and
works better than having a special string constant value for the
publisher. That also fits nicely with the transport framework
since a publisher is required to perform transport operations, so
simply checking for "if pfmri.publisher" is faster and we avoid
the memory usage for the publisher string and parsing. Every FMRI
normally has a publisher, so you can pretty much be guaranteed that
in any case where you'd care, you can rely on simply checking to see
if an FMRI has a publisher.
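a minimal illustration of the publisher check being suggested here; the
stand-in class below is not pkg's actual FMRI implementation, just a
sketch of the "no publisher means synthesized system package" idea:

```python
# Stand-in FMRI class (hypothetical, not pkg.fmri.PkgFmri) to show the
# "if pfmri.publisher" check described above.
class Fmri(object):
    def __init__(self, pkg_name, version, publisher=None):
        self.pkg_name = pkg_name
        self.version = version
        self.publisher = publisher  # None for synthesized system packages

def needs_transport(pfmri):
    # only FMRIs with a publisher correspond to retrievable packages
    return bool(pfmri.publisher)

real = Fmri("SUNWcs", "0.5.11", publisher="opensolaris.org")
synthetic = Fmri("system/linked-constraints", "1.0")
print(needs_transport(real), needs_transport(synthetic))
```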
sure. i was never wedded to the name "none", and if this can be made to
work, no problem. it's just that there's a lot of code that assumes a
publisher exists, so going this route might end up requiring more code
changes than defining a special publisher. (bart had suggested using an
invalid character in the publisher name to signify that it was
"special.")
I'd rather fix the code that assumes a publisher exists than propagate
that bad assumption. There's lots of code that works without a
publisher too.
The logic for just not having one at all is simple--it completely avoids
any potential collisions with an actual publisher and it ensures that
the behaviour that we want for these is generic and consistent and
doesn't rely on magic values.
...
lines 456-465:
Why would the first package be empty instead of just making this an
update from the out of sync version to the in-sync version of this
package? Or alternatively, an uninstall of the old one and an
install of the new one?
because as i alluded to before, you can't have an image with
inconsistent packaging installed. the sat solver can't transition you
to a valid state if your starting state is invalid. ie, if you have pkg
X installed and it depends on pkg Y.2, but pkg Y.1 is installed, you're
hosed and can't plan anything.
If the state is truly invalid, then how can we safely upgrade the image?
I'm totally lost here.
My understanding of what you had written was that you had one package
that represented the last known good set of constraints on the image,
another package with the new set of constraints on the image, and then
one to account for in-progress updates.
So, my belief was that the first package represented what was already
"installed" in the image, that is, its initial state. I don't see how
that could possibly be invalid to start with since it represents the
state of the child image after the last operation was performed.
My belief then was that the other two packages represented the state you
were upgrading to, which the solver should be able to handle, and if
they're invalid, that's because the parent image is in an invalid state,
at which point, none of this matters.
lines 481-491:
This seems difficult and fragile for several reasons:
- no guarantee that the operation in progress for the parent will
complete successfully
so what? this data is used for planning operations initiated from
within parent images. (any operation initiated directly on a linked
image can't really take into account ongoing changes in a parent image,
and eventually we'll need locking to enforce this.) for these types of
operations, we first do the planning across all the images and then
execute. so if the parent doesn't execute its plan for some reason we
won't update the children.
The "so what" part is that there's data exported that may not actually
reflect the state of the parent image, and so how can a child image
safely know what it should sync its state to? I was specifically
referring to the case where you have images that are linked to a parent
and the parent doesn't know about them.
- does this distinguish between an operation being *planned* in
the parent and one that is intended to be executed?
no. this package exists whenever an operation is being planned in a
parent image.
For the disconnected image sync case, this seems difficult then as I
think those types of children should probably not synchronise unless
it's the current state of the parent as opposed to a "proposed" state.
- this implies that a parent image would always have to export
(marshal?) its plan data during operations so that images that
are linked to the parent (that the parent doesn't know about)
can use this information
um. the parent can't really marshal data out to images it doesn't know
about. child images which the parent doesn't know about may be out of
sync wrt their content policy after any parent operation. subsequently,
this situation can be detected and corrected with a linked-audit and
linked-sync on the child image.
So then, you're saying that use case is one where we generate the
parent image state on demand instead of the active export we do
automatically when the parent knows what linked images it has?
...
lines 565-570:
independent-minimal:
- This is a bit confusing to me as it seems to overlap with the
li-content-policy property above. I see that you referenced
that, but what I don't understand is why this wouldn't *always*
be the case. Specifically, what's the real difference between
this value and "independent"? I would think that the linked
child image's content-policy would have to always be honoured.
so the values of this property need to change, probably to:
li-update-policy = { full | minimal }
full would be the same as the old independent and minimal would be the
same as the old independent-minimal. (there was a reason for the
"independent" bit, but that reason has been rationalized away.)
here's an example of the difference between full and minimal.
let's say "pkg image-update" is run in the global zone, and we make a
plan to update the gz from snv_140 to snv_141.
now, if li-update-policy = full, then we'll go to that child image and
do an image-update there as well. so we'll likely update the image from
snv_140 to snv_141. but also, since it's a zone and it may have
different publishers and software from the gz, we'll also check for
updates from those publishers. so if oracle is installed in that zone
and it comes from a special oracle publisher, we'll update that if
there are updates available.
now, if li-update-policy = minimal, then we'll go to that child image
and only update the minimum amount of software that is required to keep
us in sync with the changes happening in the parent.
so in the end, the content policy is always honored, but this just
determines how aggressive we are with child updates when an image-update
is done on a parent image.
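as a toy sketch of that distinction (the helper and the package names
are hypothetical, not prototype code): minimal only applies the updates
the parent constrains, while full also picks up any other updates
available to the child.

```python
# "minimal" syncs only the packages constrained by the parent;
# "full" additionally applies any other available child updates
# (e.g. from publishers the parent doesn't have).
def plan_child_updates(policy, constrained_updates, other_updates):
    if policy == "minimal":
        return list(constrained_updates)
    if policy == "full":
        return list(constrained_updates) + list(other_updates)
    raise ValueError("unknown li-update-policy: %s" % policy)

sync_only = plan_child_updates("minimal", ["SUNWcs@141"], ["oracle-db@11.2"])
everything = plan_child_updates("full", ["SUNWcs@141"], ["oracle-db@11.2"])
print(sync_only, everything)
```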
It's still confusing to me that the child can have a "full" sync policy
as well as the parent. I must be missing something still. Why wouldn't
you just set the full/minimal behaviour in the child image instead, or
why can't the child get the same behaviour without setting something in
the parent image? It feels like this is some sort of override.
...
Also, why require the -l option for specifying the identifier of
linked images? Our other subcommands simply accept positional
operands instead. Obviously you need to use an option for image-
create, but the other subcommands don't really need one.
so in prototyping i've settled on the following convention:
pkg linked-audit
- assume the current image is a child and audit it.
pkg linked-audit -a
- assume the current image is a parent and audit all
its children
pkg linked-audit -l foo
- assume the current image is a parent and audit its
child named "foo"
this convention will apply to the following subcommands:
linked-export
linked-sync
linked-detach
this convention (sans the -a option) will apply to the following
subcommands:
linked-property
linked-set-property
I'm still not thrilled with the idea of starting all of the subcommands
with linked-; I don't think image-update and image-create alone set the
examples for how all our subcommands should be named.
I guess the -l behaviour makes sense.
...
image-create
What characters are allowed in the linked image name?
It seems like being able to name linked images can lead to
naming collisions. I'd like to suggest that this is strictly
a human-readable "alias" for the image. And that you also change
images to have a unique "id" (a UUID specifically), so that in
the event that there is a naming collision, you can still
perform operations on an image using its unique ID.
Alternatively, you could leave it as "name", but I'd still like
to see images have a unique ID that we could rely on instead.
Again, I wonder if the name here is a property of the linked
image being created, or something that the master is recording.
If the name is a property recorded in the linked image, then
I'd also suggest that the parent may want to reference linked
images only by their unique id & path to allow the name to be
changed at any time.
my current thinking is that a fully qualified linked image name will be:
<linked-image-type>:<linked-image-name>
So in the event that disambiguation is required, the user can qualify
the name with the type, but it isn't actually part of the name, correct?
where the constraints on linked-image-name are type specific. if there
are no name collisions then a user can just specify <linked-image-name>.
if there are collisions a full name needs to be specified. so some
examples would be:
zones:<linked-image-name>
default:<linked-image-name>
user:<username>,<linked-image-name>
for zones, the linked-image-name would have to conform to the
restrictions on zone names.
for user images, the username would have to conform with what's
specified in passwd.4 as a valid username format.
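as a sketch of how such qualified names might be parsed (the accepted
type list and splitting rules here are assumptions based on the
examples above, not settled syntax):

```python
# Split a possibly-qualified linked image name into (type, name).
# The known type list is an assumption drawn from the examples above.
def parse_li_name(name, known_types=("zone", "zones", "default", "user")):
    """Return (type, name); type is None for an unqualified name."""
    if ":" in name:
        li_type, _, li_name = name.partition(":")
        if li_type not in known_types:
            raise ValueError("unknown linked image type: %s" % li_type)
        return li_type, li_name
    # unqualified: caller must resolve the type, assuming no collisions
    return None, name

print(parse_li_name("zone:myzone"))
print(parse_li_name("foo"))
```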
I don't think username should be in the image name. User images aren't
literally tied to a specific "user". Usernames can also be considered
security-sensitive information, so I'd rather not have them there.
If you're not saying that the username would be part of the actual name,
how would you determine what <username> should be to disambiguate?
It also seems this might be difficult in some environments given the ARC
change being proposed right now to allow usernames with more than 8
characters. In an NFS or mixed environment, this might cause difficulties.
for default images and for the user images, the linked-image-name would
probably have a conservative format similar to usernames and zones,
say:
- case sensitive
- first character must be alphabetic
- can contain alphanumerics plus hyphen (-), underscore (_), and
dot (.)
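the rules above reduce to a simple pattern; a sketch (the regex is my
rendering of the proposed constraints, not prototype code):

```python
import re

# First character alphabetic; then alphanumerics, hyphen, underscore,
# and dot; case sensitive (no re.IGNORECASE).
_LI_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9._-]*$")

def valid_li_name(name):
    return _LI_NAME.match(name) is not None

print(valid_li_name("my-zone.1"), valid_li_name("1bad"))
```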
wrt who's recording the name, that will vary based on the type of linked
image. for images that the parent is not aware of, obviously that
information will be recorded in the child. for zones that information
will be stored in zone configuration files.
wrt uuids, i'm not too keen on using them because the place where we
store information is distributed. for zone images, all our information
is stored in the zones configuration files. so there really is no
single authoritative place where we could store these uuids.
My suggestion for using uuids was that they are far less likely to
collide than the sort of textual names suggested above. The textual
names are great for mere mortals that have to identify these images on
the command line, but uuids are almost guaranteed to provide
disambiguation, while what has been suggested above is far less likely to.
Of course, one could argue that the probability of a naming collision is
low to begin with...
Cheers,
-Shawn
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss