Re: [pkg-discuss] initial linked images design doc

Edward Pilatowicz Thu, 27 May 2010 17:01:46 -0700

On Thu, May 27, 2010 at 11:17:29AM -0700, Shawn Walker wrote:
> On 05/26/10 09:25 PM, Edward Pilatowicz wrote:
> >On Mon, May 24, 2010 at 01:39:43PM -0700, Shawn Walker wrote:
> >>
> >>lines 10-34:
> >>     So are linked image types just *additional* types of images (e.g.
> >>     full, partial, etc.) or is this a subtype of an existing image type?
> >>
> >
> >hm.  so i didn't even realize that pkg had support for things like
> >partial and user images...  looking at the code i see that
> >IMG_TYPE_PARTIAL and IMG_TYPE_USER are not implemented.  so lucky for me
> >i don't have a lot of re-designing to do to integrate with these types
> >of images.  ;)
>
> User images are not "fully implemented", but they are implemented in
> the sense that image type is actually being used.  Notably, the
> glassfish team uses that functionality already.  What's missing from
> the user image type is the linked image capability.
>


i thought that the only difference between user images and all other
images was the place where the pkg metadata was stored.
".org.opensolaris,pkg" vs "/var/pkg".  if that's the extent of the
differences then i don't really care that much.


> >my current prototype doesn't really match either description you've
> >given.  i don't have any new IMG_TYPE_* types, and i also am not
> >creating any new classes derived from the Image class.  so far, i've
> >been thinking of linking as more of an attribute of two images, and the
> >link type is then an attribute of the linking.
>
> If I understand you correctly then, you view the current image types
> as a low-level image format thing while linking functionality is a
> more generic image synchronisation mechanism?
>

honestly, i didn't pay much attention to the existing image types so i
don't really have any view on them.  i do view the linking functionality
as a sync mechanism.

> >>lines 164-205: User linked images:
> >>     I'd like to make the observation that it seems like user and zone
> >>     linked images fall into two high level models:
> >>
> >>     - push type
> >>
> >>     This is roughly what you describe zone images as; a push-only model
> >>     where the parent tells any child images what the new state of the
> >>     parent image is after operations are completed.
> >>
> >>     - pull type
> >>
> >>     This is what you describe user images as currently; a pull-only
> >>     model where the user or some service is responsible for detecting
> >>     changes in a parent image and syncing the child user image.
> >>
> >>     I'd suggest that user images could be both push and pull.  That is,
> >>     an administrator may find it useful to create a special set of
> >>     user images for creating isolated software stacks of specific
> >>     packages that can't normally co-exist.  This is often seen with
> >>     software installed in /opt as an example.
> >>
> >
> >correct.  but user images are out of scope for the current project.  i
> >talk about them as an example of what else is possible.
> >
> >but note that user linked images are really an extension of default
> >linked images.  (which are in scope for the project.)  default linked
> >images just assume a push model and a user of root.
> >
> >so i've been using the terminology export/import where you're saying
> >push/pull.  would you like me to change the terminology i'm currently
> >using?
>
> It's not my personal preference, but when I first saw export,
> import, it was a bit confusing.  I generally interpreted export to
> be a "save some data here in a format something else can use"
> whereas 'export' in this context is almost like going to File ->
> Export in an application, saving the data, and then watching as it
> automatically starts up another program and imports it.  While
> convenient, the terminology left me a bit surprised ;)
>
> The short answer is, I don't know.
>

i don't have a strong preference. so i'll stick with export/import for
now and i can always s/export/push/ and s/import/pull/ later.

> >>lines 375-384:
> >>     When you say "constraints data", you really mean "image state",
> >>     right?  Or you imply "constraints data" in the more generic sense
> >>     (e.g. package versions, package state, package constraints)?
> >>
> >
> >to dive into details, the "constraints data" maps to:
> >
> >- a list of packages (which includes their full version numbers and
> >   publishers) installed in the parent image.  this list doesn't include
> >   ALL the packages in the parent image, just those required to implement
> >   the specified image syncing policy. this data could arguably be called
> >   the parent data "image state", but it's not the complete state.
>
> Although, in some cases, the packages required to complete the
> syncing is the entire image state, right?
>

depending on the policy, it could be.

> So this is really the full image state or some subset of it, correct?
>

it depends on how you define image state.  it's a list of fully
qualified package names that is equal to or a subset of what's installed in
the parent image.
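
as a rough illustration of "equal to or a subset of what's installed", here
is a minimal sketch.  it is not the prototype's code: the function name, the
policy names "all" and "minimal", and the per-package must-sync flag are all
assumptions made up for the example.

```python
def sync_list(installed, policy):
    """return the package list a parent would expose to a child.

    installed maps a fully qualified package name to True when that
    package must be kept in sync (a hypothetical flag).
    """
    if policy == "all":
        # expose everything installed in the parent
        return sorted(installed)
    if policy == "minimal":
        # expose only the packages the sync policy actually needs
        return sorted(name for name, must_sync in installed.items() if must_sync)
    raise ValueError("unknown policy: %s" % policy)


# made-up package names, for illustration only
parent = {
    "pkg://example.org/kernel@1.0": True,
    "pkg://example.org/monitoring@2.0": False,
}
minimal = sync_list(parent, "minimal")
everything = sync_list(parent, "all")
```

with these inputs the minimal list holds only the kernel package, while the
"all" list equals the parent's installed set.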

> >>     - What form is this information stored in and when does it get
> >>       stored?
> >>
> >
> >the data is stored in private .xml files.
>
> We've been avoiding using XML if possible due to the extra
> dependencies it drags in and for other reasons.
>
> Can you use JSON instead?
>

sure.  it's just an implementation detail.
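
for what it's worth, a JSON version could be as small as the sketch below.
the layout, the field names and the example package name are assumptions
invented for illustration, not the format the prototype writes.

```python
import json

# hypothetical layout; field names and the example fmri are made up
constraints = {
    "format": 1,
    "sync-policy": "minimal",
    "packages": [
        "pkg://example.org/foo@1.0,5.11-0.136:20100101T000000Z",
    ],
}


def dump_constraints(data):
    """serialize constraint data to a stable, human-readable JSON string."""
    return json.dumps(data, indent=2, sort_keys=True)


def load_constraints(text):
    """parse constraint data, rejecting formats this sketch doesn't know."""
    data = json.loads(text)
    if data.get("format") != 1:
        raise ValueError("unsupported constraint data format")
    return data


round_trip = load_constraints(dump_constraints(constraints))
```

only the standard library json module is used here, so this drags in no
extra dependencies.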

> >>     - It seems like images that you *push* changes to should simply get
> >>       their data from a temporary directory that contains the exported
> >>       information for the duration of the operation.  I'm not sure this
> >
> >the exported/pushed constraint data really needs to persist across
> >operations.  the reason for this is that not all linked image operations
> >are initiated from the parent image.  for example, someone within the
> >zone might run pkg(1) to install some new software, and that command
> >should not allow installation of software which would violate the
> >constraints on the image.  i tried to call this out in the "zones
> >requirements" section:
> >
> >- all the constraints required to perform software management operations
> >   within a non-global zone must be available within that zone.
>
> It's really not going to be practical to have a parent image blindly
> write out a *second* set of image state data every time an operation
> happens.  I say this since the image linking model being proposed
> here allows images to link to parents without a parent's knowledge,
> that also means that we'd always have to write out this second set
> of state data.
>

well, there are never two sets of data.  there is only one.  and the
parent just blindly updates that one.  eventually it will even do
locking before doing the updates.  ;)

i also don't understand your concern here.

for push based linked images, the parent must know about children so it
can push information to them.  for pull based children, the parent
doesn't have any knowledge of the children, so it will never write
anything to them.

> I'd like to work with you to figure out some way we can avoid that,
> or alter the design where being able to link to a parent image isn't
> possible without some sort of explicit enabling of linking
> functionality.
>

sure.  although i still don't understand why you would want this.  for
push based children you need to have write access to the parent image.
that's essentially "explicitly enabling".  for pull based children i
don't really see why you'd want to do this.  if i wanted to have a
partial user image that was synced up to whatever is installed on
jurassic, then since i have read access to jurassic i should be able to
do this.  i don't see why i'd have to ask the jurassic admins for
permissions first...

> >>     - I think that if you're exporting "constraint information" that it
> >>       really sounds like you should just be generating an incorporation
> >>       package since that's equivalent and fits with existing
> >>       functionality.  In particular, I would anticipate pkg freeze
> >>       to be based on incorporations so that would fit better with what
> >>       you've proposed here.
> >>
> >
> >well, kinda.  i certainly could represent what's installed in the
> >gz/parent as a package that uses "require" and "incorporate"
> >dependencies.  this is actually what i do, just all in memory.  (when
> >pkg runs it reads the constraint information from disk and generates
> >packages based on it.)
> >
> >now i'm going to guess that your next suggestion will be to do this but
> >just on disk.  ie, write the constraint package into the child image and
> >into the child images installed catalog.  unfortunately i don't really
> >want to do this, and there are a few reasons.
> >
> >- this requires that the parent image constantly update lots of data in
> >   the child image.  and as i've already mentioned, in the case of zones
> >   this represents a security issue.  we can't trust any data in the
> >   child images, so reading in catalogs is a problem unless we've audited
> >   all the python code involved with parsing catalogs.  i'd really prefer
> >   to avoid that.
>
> I don't really follow this argument; none of our code has been
> audited to that level of trust.  Nor was I suggesting that the
> parent image would be the one to perform these updates.  I was only
> suggesting that the parent image (which already generates the
> constraint data we have to trust) simply generate that information
> in a *different* format which the child would then cache and use on
> its own.
>

oh. well, what you're suggesting is what i do right now.  (sorry, my
description was based on the premise that you'd want me to update the
on-disk catalogs and manifests.)

> >- also, with this approach there is no good way to deal with out of sync
> >   images.  ie, images that don't adhere to their sync policy.  you can't
> >   represent out-of-sync images in the installed catalog because you'd
> >   have an invalid state, which leads to a very unhappy sat solver.
>
> I don't follow; you could have that problem either way.  Can you
> give an example?
>

what i'm describing above is the problem that i also mentioned below.
if an image is out-of-sync with its constraints, you can't represent
those constraints via an installed package.  if you do, then the sat
solver can't solve anything.


> >>     - If you're going to export state information for the parent image,
> >>       then that's what's contained in the /var/pkg/state/installed
> >>       directory.  Specifically, just catalog.attrs and catalog.base.C.
> >>
> >
> >i'm aware of the data stored in there.  but i don't actually want to
> >export all that data.  a zone should not know more about the global zone
> >than it needs to.  the ngz needs to know about the software installed in
> >the gz that needs to be kept in sync with the gz, but it doesn't
> >necessarily need to know about all the monitoring and management
> >software installed in the gz.
>
> Except, based on policy, we may be passing that entire set of data
> anyway.  So what's the issue?
>

we may expose everything, but in most cases we won't.  the issue is that
we shouldn't expose more information than is necessary.  the default sync
policy will be minimal which will only expose a small amount of
information.

> >so really the zone needs to know about a subset of the information in
> >catalog.attrs and catalog.base.C.  now admittedly, i could represent
> >that data in the same format as the current catalog.attrs and
> >catalog.base.C files, but i haven't seen any advantage to doing that.
> >as i mentioned before, currently i just have an xml file that lists the
> >information, but i'm not wedded to that, it's more an implementation
> >detail.
>
> The advantage of re-using the catalog format is avoiding introducing
> an additional project private format and the fact that a lot of
> optimisation work has been done to efficiently store catalog data in
> that format.
>

sure.  and if performance turns out to be an issue we can work on this
intermediate data exchange format as well.  hell, it could be the same
format as the current catalog.  (doesn't really matter to me.)

to put some real numbers to this, currently for my testing, i'm using
a snv_136 parent image with redistributable installed.  for linked
images with a minimum sync policy the current package sync list (which
is a text based xml file) is 34K.  if i decide to sync everything, then
that goes up to 138K.

> >>lines 421-427:
> >>     Since a linked image can realistically only account for the last
> >>     sync'd state of a parent image, it seems like the parent image
> >>     could simply provide constraints as a dynamically generated
> >>     incorporation package (manifest) as part of the export/import
> >>     process.  That manifest could then be stored locally and treated
> >>     exactly like a package normally would be without any special logic.
> >>     Doing so also makes it possible for the normal memory management
> >>     that the client api uses internally to not have to have special
> >>     logic to marshal this information to disk (possibly repeatedly).
> >>
> >
> >possibly, but isn't manifest dependency information cached in the
> >catalog?  if so this would require re-writing the catalog in zones which
> >requires reading the catalog in zones.  as i've pointed out before, that
> >would be bad.
>
> The catalog has to be read in zones; again, I'm not following the
> security/bad logic.  A catalog is used to track the state of the
> image, if we don't trust catalogs, then the client is useless.
>
> The only way to know the state of a child image is to read its catalog.
>

correct.  and operations which are initiated on a parent image should
never do this.  only pkg processes running at reduced privileges or in
special environments (zones or scratch zones) should access child
images.

> >in case it isn't obvious, i'm really trying hard keep data flowing in
> >one direction.  from the parent image to the child.  i'm also trying to
> >keep the data flow simple.  having pkg traverse directories in a zone
> >image and read data from that zone isn't safe.  i tried to call this out
> >early in the document in the "zones requirements" section:
> >
> >- since zones are an untrusted execution environment, global zone pkg(5)
> >   operations should not be required to read data from a non-global zone.
> >   ie, any data flow required to support linked images must be from a
> >   global zone to a non-global zone, and never in the reverse direction.
>
> I don't believe I've suggested anything that would require a ngz to
> read data from a gz.  I've only suggested *how* to export the data
> and the process to use to import it.
>

perhaps i'm not understanding your suggestions.  i have no problems
changing the data format used to export/push data to clients.  i don't
want to use the existing on-disk parent catalogs because in many cases
that exposes too much information. i'm perfectly fine with using the
catalog format to expose a subset of what's installed.  if you've
suggested something else then i'm sorry but i've failed to understand
it.

> >also, there is one more thing to consider.  not all the linked image
> >constraints on an image can be represented as a package.  specifically,
> >package exclusions can not be adequately represented with exclude
> >dependencies.  these need to be fed directly to the solver.  if we
> >wanted to represent these in packages we'd have to invent a new
> >dependency action or attribute.  i initially prototyped this and talked
> >to bart about it and he just recommended that i feed exclude
> >dependencies directly to the solver, which is what i'm currently doing.
>
> Can you give an example of how exclude dependencies can't currently
> be expressed adequately using depend actions?
>

ok.  but fyi, the cases where this can happen are complicated, and
unfortunately i erased the drawing on my whiteboard where i figured out
when it was impossible to do this with existing optional and exclude
dependencies.  so if my explanation below seems confusing it's probably
because i got it wrong and i'll need a couple more tries explaining it
to get it right...

the requirement to keep packages in sync is actually tied to specific
versions of a package.  so for example, say we are looking at packages
that utilize a "cip" packaging attribute that indicates they need to be
kept in sync between the gz and ngz.  pkg A.1 and A.3 may not have this
flag set, while pkg A.2 may have this flag set.

so, if the parent image doesn't have A.2 installed, then how can we
express (via the existing dependency actions) that the child is allowed
to have A.1 or A.3?  optional dependencies only allow us to say version X
or newer.  exclude dependencies allow us to say versions older than
version Y.  so if you look at different versions of a package as a
timeline, you can combine exclude and optional dependencies to allow you
to select one specific window of package versions from that timeline,
but you can't use them to allow non-sequential package versions from
within that timeline.

another complication is that optional and exclude dependencies do not
look at package timestamps.  they only consider package version numbers.

in my initial prototype i worked around this issue by adding a
"exactmatch" attribute to exclude dependencies.  if this attribute was
set i didn't apply the "older than" clause that is normally used by
exclude dependencies.  this flag also made exclude dependencies take the
timestamp into consideration.  bart convinced me that instead of adding
this new dependency attribute, it'd be better to simply feed the list of
excludes to the pkg solver and trim them from the solution space before
running the solver, which is what i'm currently doing.  for all other
constraints i'm using incorporate dependencies, which work down to
the specified precision, which normally includes a timestamp.
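
to make the timeline argument concrete, here is a small sketch.  it models
versions as plain integers (so it ignores timestamps and real version
syntax), and the helper names are made up; it is not the solver code.

```python
from itertools import product

versions = [1, 2, 3]   # stand-ins for A.1, A.2, A.3
wanted = {1, 3}        # child may have A.1 or A.3, but not A.2


def window(lo, hi):
    """versions allowed by an optional dep (>= lo) plus an exclude dep (< hi)."""
    return {v for v in versions if lo <= v < hi}


# try every possible pair of bounds; no single window yields exactly {1, 3}
bounds = range(0, 5)
expressible = any(window(lo, hi) == wanted for lo, hi in product(bounds, bounds))


def trim(candidates, excludes):
    """drop excluded versions from the solution space before solving."""
    return [v for v in candidates if v not in excludes]


remaining = trim(versions, {2})
```

every window is a contiguous slice of the timeline, so `expressible` comes
out False, while trimming version 2 directly leaves A.1 and A.3 available.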


> >>lines 429-436:
> >>     This could be greatly simplified by simply stating that the special
> >>     system packages will not have a publisher.  That is simpler and
> >>     works better than having a special string constant value for the
> >>     publisher.  That also fits nicely with the transport framework
> >>     since a publisher is required to perform transport operations, so
> >>     simply checking for "if pfmri.publisher" is faster and we avoid
> >>     the memory usage for the publisher string and parsing.  Every FMRI
> >>     normally has a publisher, so you can pretty much be guaranteed that
> >>     in any case where you'd care, you can rely on simply checking to see
> >>     if an FMRI has a publisher.
> >>
> >
> >sure.  i was never wedded to the name "none", and if this can be made to
> >work no problem.  it's just that there's a lot of code that assumes a
> >publisher exists, so going this route might end up requiring more code
> >changes than defining a special publisher.  (bart had suggested using an
> >invalid character in the publisher name to signify that it was
> >"special.")
>
> I'd rather fix the code that assumes a publisher exists than
> propagate that bad assumption.  There's lots of code that works
> without a publisher too.
>
> The logic for just not having one at all is simple--it completely
> avoids any potential collisions with an actual publisher and it
> ensures that the behaviour that we want for these is generic and
> consistent and doesn't rely on magic values.
>

so.  is it possible to have any other packages installed on a system
which have no publisher?  for example, if i install a package and then 

> >>lines 456-465:
> >>     Why would the first package be empty instead of just making this an
> >>     update from the out of sync version to the in-sync version of this
> >>     package?  Or alternatively, an uninstall of the old one and an
> >>     install of the new one?
> >>
> >
> >because as i alluded to before, you can't have an image with
> >inconsistent packaging installed.  the sat solver can't transition you
> >to a valid state if your starting state is invalid.  ie, if you have pkg
> >X installed and it depends on pkg Y.2, but pkg Y.1 is installed, you're
> >hosed and can't plan anything.
>
> If the state is truly invalid, then how can we safely upgrade the
> image?  I'm totally lost here.
>

the whole point is that we need to make sure we don't create an image
that is invalid/inconsistent.

> My understanding of what you had written was that you had one
> package that represented the last known good set of constraints on
> the image, another package with the new set of constraints on the
> image, and then one to account in progress updates.
>
> So, my belief was that the first package represented what was
> already "installed" in the image, that is, its initial state.  I
> don't see how that could possibly be invalid to start with since it
> represents the state of the child image after the last operation was
> performed.
>
> My belief then was that the other two packages represented the state
> you were upgrading to, which the solver should be able to handle,
> and if they're invalid, that's because the parent image is in an
> invalid state, at which point, none of this matters.
>

no.  let me try to re-state this.  we have three packages:

- constrai...@0,0-0 - this package is always marked as installed.  it
  represents the current constraints on an image iff that image is in
  sync with its current constraints.  if an image is NOT in sync with
  its current constraints, this package is empty.

- constrai...@0,0-1 - this package is never marked as installed. if an
  image is in sync with its current constraints, this package is empty.
  if an image is OUT of sync with its current constraints, this package
  represents the current constraints and installing this package will
  bring the image into sync with its current constraints.

- constrai...@0,0-2 - this package is never marked as installed.  the
  contents of this package represent the planned constraints on an
  image and installing this package will bring the image into sync with
  its planned constraints.

now.  here's an example.  say we have an out-of-sync image wrt its
current constraints.  the constrai...@0,0-0 package will be installed
but it will be empty.  so to sync this image we install the
constrai...@0,0-1 package.  this updates the image (if possible) to be
in sync with its current constraints.  of course, once this operation
is done if we list the package contents in the image we'll see that
constrai...@0,0-0 is still installed.  but once the image is in sync,
the contents of constrai...@0,0-0 will represent the constraints on the
image.
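
the three-package scheme above can be sketched roughly as follows.  the
function name and the dict layout are made up for illustration; the keys 0,
1 and 2 just mirror the -0, -1 and -2 package versions.

```python
def constraint_packages(current, planned, in_sync):
    """return the contents of the three constraint packages.

    current and planned are lists of constraint strings.  only
    package 0 is ever marked as installed.
    """
    return {
        # current constraints, but only when the image already meets them
        0: {"installed": True, "contents": list(current) if in_sync else []},
        # current constraints when out of sync; installing it syncs the image
        1: {"installed": False, "contents": [] if in_sync else list(current)},
        # planned constraints for an operation being planned in the parent
        2: {"installed": False, "contents": list(planned)},
    }


out_of_sync = constraint_packages(["foo@1"], ["foo@2"], in_sync=False)
synced = constraint_packages(["foo@1"], ["foo@2"], in_sync=True)
```

in the out-of-sync case package 0 is empty and package 1 carries the
current constraints; once the image is in sync the two swap roles, so the
installed package never describes a state the image doesn't satisfy.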

> >>line 481-491:
> >>     This seems difficult and fragile for several reasons:
> >>
> >>     - no guarantee that the operation in progress for the parent will
> >>       complete successfully
> >>
> >
> >so what?  this data is used for planning operations initiated from
> >within parent images.  (any operation initiated directly on a linked
> >image can't really take into account on-going changes in a parent image,
> >and eventually we'll need locking to enforce this.)  for these types of
> >operations, we first do the planning across all the images and then
> >execute.  so if the parent doesn't execute its plan for some reason we
> >won't update the children.
>
> The so what part is there's data exported that may not actually
> reflect the state of the parent image, and so how can a child image
> safely know what it should sync its state to?  I was specifically
> referring to the case where you have images that are linked to a
> parent and the parent doesn't know about them.
>

if the parent doesn't know about a child image, it will never export
data to it.

> >>     - does this distinguish between an operation being *planned* in
> >>       the parent and one that is intended to be executed?
> >>
> >
> >no.  this package exists whenever an operation is being planned in a
> >parent image.
>
> For the disconnected image sync case, this seems difficult then as I
> think those types of children should probably not synchronise unless
> it's the current state of the parent as opposed to a "proposed"
> state.
>

pull based children will never know about a "proposed" state in the
parent.  they will only ever be able to sync to a snapshot of an actual
parent state.

> >>     - this implies that a parent image would always have to export
> >>       (marshal?) its plan data during operations so that images that
> >>       are linked to the parent (that the parent doesn't know about)
> >>       can use this information
> >>
> >
> >um.  the parent can't really marshal data out to images it doesn't know
> >about.  child images which the parent doesn't know about may be out of
> >sync wrt their content policy after any parent operation.  subsequently,
> >this situation can be detected and corrected with a linked-audit and
> >linked-sync on the child image.
>
> So then, you're saying that usage case is one where we generate the
> parent image state on demand instead of the active export we do
> automatically when the parent knows what linked images it has?
>

yes.  parents won't know about pull based children.  so pull based
children will have to generate their own linkage data.

> >>
> >>lines 565-570:
> >>     independent-minimal:
> >>
> >>     - This is a bit confusing to me as it seems to overlap with the
> >>       li-content-policy property above.  I see that you referenced
> >>       that, but what I don't understand is why this wouldn't *always*
> >>       be the case.  Specifically, what's the real difference between
> >>       this value and "independent"?  I would think that the linked
> >>       child image's content-policy would have to always be honoured.
> >>
> >
> >so the values of this property need to change, probably to:
> >     li-update-policy = { full | minimal }
> >
> >full would be the same as the old independent and minimal would be the
> >same as the old independent-minimal.  (there was a reason for the
> >"independent" bit, but that reason has been rationalized away.)
> >
> >here's an example of the difference between full and minimal.
> >
> >let's say "pkg image-update" is run in the global zone, and we make a
> >plan to update the gz from snv_140 to snv_141.
> >
> >now, if li-update-policy = full, then we'll go to that child image and
> >do an image-update there as well.  so we'll likely update the image from
> >snv_140 to snv_141.  but also, since it's a zone and it may have
> >different publishers and software from the gz, we'll also check for
> >updates from those publishers.  so if oracle is installed in that zone
> >and it comes from a special oracle publisher, we'll update that if
> >there are updates available.
> >
> >now, if li-update-policy = minimal, then we'll go to that child image
> >and only update the minimum amount of software that is required to keep
> >us in sync with the changes happening in the parent.
> >
> >so in the end, the content policy is always honored, but this just
> >determines how aggressive we are with child updates when an image-update
> >is done on a parent image.
>
> It's still confusing to me that the child can have a "full" sync
> policy as well as the parent.  I must be missing something still.
> Why wouldn't you just set the full/minimal behaviour in the child
> image instead, or why can't the child get the same behaviour without
> setting something in the parent image?  It feels like this is some
> sort of override.
>

for any given type of image, the authoritative policy configuration can
only be stored in one place.

in the case of zones, the policy will be stored in a zones configuration
file in /etc/zones/, entirely outside the packaging system.  during a
push, this policy will get written into the child so operations on the
child can take it into account, but the child can't change that policy.

for default push linked images, the policy will probably be stored in
the parent in the cfg_cache file, but similarly to zones images, will
be pushed into child images, where once again it is read only.

for pull based linked images, the parent won't know about them so the
policy configuration will live in the child and can be changed in the
child as desired.
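
the three cases above boil down to a small table.  the sketch below just
restates them as data; the key names and the helper are made up, and the
cfg_cache location is only the "probably" mentioned above.

```python
# image kind -> (where the authoritative policy lives, may the child change it?)
POLICY_AUTHORITY = {
    "zone": ("zones configuration file in /etc/zones/", False),
    "default-push": ("parent image cfg_cache (probably)", False),
    "pull": ("child image", True),
}


def child_can_change_policy(kind):
    """true only when the child itself holds the authoritative policy."""
    return POLICY_AUTHORITY[kind][1]


writable = [kind for kind in POLICY_AUTHORITY if child_can_change_policy(kind)]
```

for both push cases the child only ever sees a read-only copy that was
written during a push.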

> >>     Also, why require the -l option for specifying the identifier of
> >>     linked images?  Our other subcommands simply accept positional
> >>     operands instead.  Obviously you need to use an option for image-
> >>     create, but the other subcommands don't really need one.
> >>
> >
> >so in prototyping i've settled on the following convention:
> >     pkg linked-audit
> >             - assume the current image is a child and audit it.
> >
> >     pkg linked-audit -a
> >             - assume the current image is a parent and audit all
> >               its children
> >
> >     pkg linked-audit -l foo
> >             - assume the current image is a parent and audit its
> >               child named "foo"
> >
> >this convention will apply to the following subcommands:
> >     linked-export
> >     linked-sync
> >     linked-detach
> >
> >this convention (sans the -a option) will apply to the following
> >subcommands:
> >     linked-property
> >     linked-set-property
>
> I'm still not thrilled with the idea of starting all of the
> subcommands with linked-; I don't think image-update and
> image-create alone set the examples for how all our subcommands
> should be named.
>
> I guess the -l behaviour makes sense.
>

sorry i was unclear, i'm fine with renaming them to the verb prefixed
form.  also, if there are too many commands that i'm introducing and
someone thinks it's polluting the command line namespace i'd even be
fine with introducing a new pkg-linked command.  this stuff doesn't
concern me as much as getting the actual design and implementation
right.  :)

> >>     image-create
> >>         What characters are allowed in the linked image name?
> >>
> >>         It seems like being able to name linked images can lead to
> >>         naming collisions.  I'd like to suggest that this is strictly
> >>         a human-readable "alias" for the image. And that you also change
> >>         images to have a unique "id" (a UUID specifically), so that in
> >>         the event that there is a naming collision, you can still
> >>         perform operations on an image using its unique ID.
> >>
> >>         Alternatively, you could leave it as "name", but I'd still like
> >>         to see images have a unique ID that we could rely on instead.
> >>         Again, I wonder if the name here is a property of the linked
> >>         image being created, or something that the master is recording.
> >>
> >>         If the name is a property recorded in the linked image, then
> >>         I'd also suggest that the parent may want to reference linked
> >>         images only by their unique id & path to allow the name to be
> >>         changed at any time.
> >>
> >
> >my current thinking is that a fully qualified linked image name will be:
> >     <linked-image-type>:<linked-image-name>
>
> So in the event that disambiguation is required, the user can
> qualify the name with the type, but it isn't actually part of the
> name, correct?
>

in the event where disambiguation is required the user would *have* to
specify the type.

> >where the constraints on linked-image-name are type specific.  if there
> >are no name collisions then a user can just specify <linked-image-name>.
> >if there are collisions a full name needs to be specified.  so some
> >examples would be:
> >     zones:<linked-image-name>
> >     default:<linked-image-name>
> >     user:<username>,<linked-image-name>
> >
> >for zones, the linked-image-name would have to conform to the
> >restrictions on zone names.
> >
> >for user images, the username would have to conform with what's
> >specified in passwd.4 as a valid username format.
>
> I don't think username should be in the image name.  User images
> aren't literally tied to a specific "user".  usernames can also be
> considered security-sensitive information, so I'd rather not have
> them there.
>

well, user linked images are out of scope for my current proposal, so i
don't feel compelled to nail down a naming strategy.  i was just more
listing one above as an example.

> If you're not saying that the username would be part of the actual
> name, how would you determine what <username> should be to
> disambiguate?
>

in general, if a name doesn't contain a ':' we initialize all the linked
image plugins and ask them if they have an image with the specified
name.  if multiple plugins claim the image then we generate an error and
the user must specify a fully qualified name.
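
a minimal sketch of that lookup, assuming each plugin can simply be modeled
as a set of the image names it knows about.  the function, the exception
and the example names are hypothetical, not the prototype's api.

```python
class AmbiguousName(Exception):
    """more than one linked image plugin claims the given short name."""


def resolve(name, plugins):
    """map a user supplied name to a (type, name) pair.

    plugins maps a linked image type (e.g. "zones") to the set of
    image names that plugin knows about.
    """
    if ":" in name:
        # fully qualified: <linked-image-type>:<linked-image-name>
        li_type, li_name = name.split(":", 1)
        if li_name in plugins.get(li_type, ()):
            return (li_type, li_name)
        raise KeyError(name)
    # short name: ask every plugin whether it has such an image
    claims = sorted(t for t, names in plugins.items() if name in names)
    if not claims:
        raise KeyError(name)
    if len(claims) > 1:
        raise AmbiguousName("%s is claimed by: %s" % (name, ", ".join(claims)))
    return (claims[0], name)


plugins = {"zones": {"web", "db"}, "default": {"web"}}
unique = resolve("db", plugins)
qualified = resolve("default:web", plugins)
```

here "db" resolves without a type, while a bare "web" raises AmbiguousName
and has to be given as zones:web or default:web.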

> It also seems this might be difficult in some environments given the
> ARC change being proposed right now to allow usernames with more
> than 8 characters.  In an NFS or mixed environment, this might cause
> difficulties.
>

i don't really foresee any problems, but once again, it's out of scope
anyway.  :)

> >for default images and for the user images, the linked-image-name would
> >probably have a conservative format similar to usernames and zones,
> >say:
> >     - case sensitive
> >     - first character must be alphabetic
> >     - can contain alphanumerics plus hyphen (-), underscore (_), and
> >       dot (.)
> >
> >wrt who's recording the name, that will vary based on the type of linked
> >image.  for images that the parent is not aware of, obviously that
> >information will be recorded in the child.  for zones that information
> >will be stored in zone configuration files.
> >
> >wrt uuids, i'm not too keen on using them because of the fact that the
> >place where we store information is distributed.  for zones images, all
> >our information is stored in the zones configuration files.  so there
> >really is no single authoritative place where we could store these uuids.
>
> My suggestions for using uuids was that they are far less likely to
> collide with the sort of textual names suggested above.  The textual
> names are great for mere mortals that have to identify these images
> on the command line, but uuids are almost guaranteed to provide
> disambiguation while what has been suggested above is far less
> likely.
>
> Of course, one could argue that the probability of a naming
> collision is low to begin with...
>

one could argue that, but i wouldn't.  i don't want to design a system
that can't deal with naming collisions, which are bound to happen
because the linked image name space, as currently proposed, is partially
out of the control of pkg.  for example, currently zone linked image
names match the zone name configured with zonecfg(1m).  so unless i want
to start aliasing names, or start including uuids everywhere, then at
least my current idea gives me a way to avoid collisions.

ed
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
