On 03/ 7/10 12:23 PM, Edward Pilatowicz wrote:
hey all,
so in an effort to get pkg image-update to work with zones, i've been
prototyping support for linked images within pkg(5). i don't have a fully
functional prototype to pass around yet, but i have documented some
aspects of my prototype in an initial design doc. i've attached that doc
so that people can take a look, see what i'm thinking, provide feed
back, etc. all comments and criticism are welcome.
lines 10-34:
So are linked image types just *additional* types of images (e.g.
full, partial, etc.) or is this a subtype of an existing image type?
lines 164-205: User linked images:
I'd like to make the observation that it seems like user and zone
linked images fall into two high level models:
- push type
This is roughly what you describe zone images as; a push-only model
where the parent tells any child images what the new state of the
parent image is after operations are completed.
- pull type
This is what you describe user images as currently; a pull-only
model where the user or some service is responsible for detecting
changes in a parent image and syncing the child user image.
I'd suggest that user images could be both push and pull. That is,
an administrator may find it useful to create a special set of
user images for creating isolated software stacks of specific
packages that can't normally co-exist. This is often seen with
software installed in /opt as an example.
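
To make the two models concrete, here is a minimal sketch. The class and attribute names are hypothetical and not part of pkg(5); an integer stands in for the parent's package state.

```python
from dataclasses import dataclass, field


@dataclass
class ParentImage:
    """Hypothetical parent image; `state` stands in for its package state."""
    state: int = 0
    push_children: list = field(default_factory=list)

    def update(self):
        # Push model: once the parent operation completes, the parent
        # tells each child it knows about what the new state is.
        self.state += 1
        for child in self.push_children:
            child.sync(self.state)


@dataclass
class ChildImage:
    """Hypothetical linked image that remembers the last state it synced to."""
    synced_state: int = 0

    def sync(self, parent_state):
        self.synced_state = parent_state

    def pull(self, parent):
        # Pull model: the child (or a user or service acting for it)
        # notices that the parent changed and syncs itself.
        if parent.state != self.synced_state:
            self.sync(parent.state)
            return True
        return False
```

A user image that supports both models would simply be registered in push_children and also remain free to call pull() on its own schedule.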
lines 253ff:
Indeed, instead of "system publisher", I'd prefer to see the text
"system repository" or "parent image repository" since that's what
the functionality really is.
lines 270-288:
I'm assuming that this complements / assumes implementation of
bug 15343?
line 298-299:
s/do not do any do any check/do not perform any checks/
line 299:
s/installed installed/installed/
lines 320-330:
It seems like it would be helpful if beadm permitted linking any
arbitrary ZFS filesystem to a given BE to enable this functionality.
In particular, I was thinking that it would be useful for
administrators to choose whether some filesystems remained
independent or stayed in sync with a given boot environment.
This could be particularly useful in the linked user image
management of isolated software stacks I mentioned above or
for the management of configuration data.
I agree that this enhancement is out of scope for this proposal,
but wanted to mention it.
lines 333-340:
I could be mistaken, but I seem to recall from my past experience
with translation and localisation that the terms parent/child
were strongly preferred in general end-user documentation over
master/slave. I'd ask someone in l10n about this and then only
use the terms they recommend.
line 365:
s/access the/access to the/
line 375:
s/mange/manage/
lines 375-384:
When you say "constraints data", you really mean "image state",
right? Or do you mean "constraints data" in the more generic sense
(e.g. package versions, package state, package constraints)?
lines 386-391:
I think /var/pkg/state/export, /var/pkg/state/import or
/var/pkg/cache/state/export, /var/pkg/cache/state/import
would fit better with the on-disk format proposal. I don't
understand why 'master' is part of the pathname here since
there's no corresponding linked/child/* set of directories
in this proposal either.
lines 393-399:
This bit is really confusing to me:
- What does the linked image type and content policy stored here
indicate?
- How does that meaning change based on whether it is in the export
or import directory?
- What form is this information stored in and when does it get
stored?
- It seems like images that you *push* changes to should simply get
their data from a temporary directory that contains the exported
information for the duration of the operation. I'm not sure this
should be stored in $IMGROOT/pkg. For images that *pull* changes
from the master image, realistically this information needs to be
generated on demand instead of being stored in $IMGROOT/pkg. If
the purpose of this directory is to simply cache information that
was imported or is being used to perform a sync operation, then
I'd say it should live under /var/pkg/cache/linked (to fit with
the on-disk proposal). Does this data need to be kept longer
than that?
- I think that if you're exporting "constraint information" that it
really sounds like you should just be generating an incorporation
package since that's equivalent and fits with existing
functionality. In particular, I would anticipate pkg freeze
to be based on incorporations so that would fit better with what
you've proposed here.
- If you're going to export state information for the parent image,
then that's what's contained in the /var/pkg/state/installed
directory. Specifically, just catalog.attrs and catalog.base.C.
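
As a rough illustration of that last point, exporting the parent's state could be little more than copying those two files. This is only a sketch: the function name and the destination directory are assumptions, and only the two file names and the state/installed location come from the comment above.

```python
import os
import shutil

# File names taken from the comment above; everything else is assumed.
STATE_FILES = ("catalog.attrs", "catalog.base.C")


def export_installed_state(img_root, dest_dir):
    """Copy the installed-state catalog files of the image rooted at
    img_root into dest_dir and return the list of copied paths."""
    src_dir = os.path.join(img_root, "var", "pkg", "state", "installed")
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for name in STATE_FILES:
        dst = os.path.join(dest_dir, name)
        # copy2 preserves timestamps, which helps a later staleness check.
        shutil.copy2(os.path.join(src_dir, name), dst)
        copied.append(dst)
    return copied
```

For a push operation, dest_dir could be a temporary directory that only lives for the duration of the operation.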
lines 401-407:
If we're going to go this route, I'd like to see a documented
serialisation format for the imageplan that can be used more
generically. It should also be JSON, not pickle-based. You
need a more general format anyway to fit with some of the
functionality described later on in this proposal for --runid.
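
To show the kind of thing I mean, here is a minimal sketch of a JSON plan file. The field names and the version marker are hypothetical; the point is only that the format is plain, versioned data rather than pickled objects.

```python
import json

PLAN_FORMAT_VERSION = 1  # hypothetical version marker


def dump_plan(path, operation, stage, fmri_changes):
    """Write a plan as JSON. fmri_changes is a list of (old, new) FMRI
    string pairs; None on either side means 'not installed'."""
    doc = {
        "version": PLAN_FORMAT_VERSION,
        "operation": operation,
        "stage": stage,
        "changes": [{"from": old, "to": new} for old, new in fmri_changes],
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(doc, f, indent=2, sort_keys=True)


def load_plan(path):
    """Read a plan back, rejecting versions this code does not understand."""
    with open(path, encoding="utf-8") as f:
        doc = json.load(f)
    if doc.get("version") != PLAN_FORMAT_VERSION:
        raise ValueError("unsupported plan version: %r" % doc.get("version"))
    return doc
```

Unlike a pickle, a file like this can be inspected by hand and read by a client that does not share the writer's class definitions.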
lines 421-427:
Since a linked image can realistically only account for the last
sync'd state of a parent image, it seems like the parent image
could simply provide constraints as a dynamically generated
incorporation package (manifest) as part of the export/import
process. That manifest could then be stored locally and treated
exactly like a package normally would be without any special logic.
Doing so also makes it possible for the normal memory management
that the client api uses internally to not have to have special
logic to marshal this information to disk (possibly repeatedly).
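
A sketch of what generating such a manifest might look like, assuming the parent's installed packages are available as FMRI strings. The function name and the package name used below are illustrative only; the set and depend lines follow the usual manifest action syntax.

```python
def incorporation_manifest(name, version, installed_fmris):
    """Return manifest text for an incorporation that constrains each
    FMRI in installed_fmris to the version installed in the parent."""
    lines = ["set name=pkg.fmri value=pkg:/%s@%s" % (name, version)]
    for fmri in sorted(installed_fmris):
        # One incorporate dependency per package installed in the parent.
        lines.append("depend type=incorporate fmri=%s" % fmri)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(incorporation_manifest(
        "parent-image-constraints", "1.0",
        ["pkg:/SUNWcs@0.5.11", "pkg:/SUNWzone@0.5.11"]), end="")
```

The child image would then store and evaluate this manifest like any other package manifest.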
lines 429-436:
This could be greatly simplified by simply stating that the special
system packages will not have a publisher. That is simpler and
works better than having a special string constant value for the
publisher. That also fits nicely with the transport framework
since a publisher is required to perform transport operations, so
simply checking for "if pfmri.publisher" is faster and we avoid
the memory usage for the publisher string and parsing. Every FMRI
normally has a publisher, so you can pretty much be guaranteed that
in any case where you'd care, you can rely on simply checking to see
if an FMRI has a publisher.
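
Roughly, the check would reduce to the following. This is a sketch using a simplified stand-in parser, not the real pkg.fmri code, and is_system_package is a hypothetical helper name.

```python
from typing import NamedTuple, Optional


class Fmri(NamedTuple):
    """Simplified stand-in for a parsed FMRI (not the pkg.fmri class)."""
    publisher: Optional[str]
    name: str
    version: Optional[str]


def parse_fmri(text):
    """Parse 'pkg://publisher/name@version' or 'pkg:/name@version'."""
    publisher = None
    if text.startswith("pkg://"):
        publisher, _, rest = text[len("pkg://"):].partition("/")
    elif text.startswith("pkg:/"):
        rest = text[len("pkg:/"):]
    else:
        rest = text
    name, _, version = rest.partition("@")
    return Fmri(publisher or None, name, version or None)


def is_system_package(pfmri):
    # Under the suggestion above, a package with no publisher is one of
    # the special system packages; no magic publisher string is needed.
    return not pfmri.publisher
```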
line 442:
I wonder if one of the existing reserved namespaces that were
proposed in May 2008 [1] could be used instead? In particular,
feature, cluster, metacluster, or service? At the least, I'd
like to see the name be a bit more specific than
"linkedimage/constraints"; perhaps "parent-image-constraints"?
I'm uncertain if the site/ namespace could be used here as that
would seem to fit nicely with the purpose of this package.
lines 456-465:
Why would the first package be empty instead of just making this an
update from the out of sync version to the in-sync version of this
package? Or alternatively, an uninstall of the old one and an
install of the new one?
lines 481-491:
This seems difficult and fragile for several reasons:
- no guarantee that the operation in progress for the parent will
complete successfully
- does this distinguish between an operation being *planned* in
the parent and one that is intended to be executed?
- this implies that a parent image would always have to export
(marshal?) its plan data during operations so that images that
are linked to the parent (that the parent doesn't know about)
can use this information
lines 526-543:
Why are "in-core" packages defined using a package attribute
instead of an incorporation? In particular, forcing this to
be a property doesn't seem very flexible since the definition
of what an in-core package is could be drastically different
depending on the use case. Is the intent that this property
is only used in special dynamically generated packages?
I think I'd rather see a list of incorporations that were used
to define the sync policy. That would also allow pkg freeze
policies in the parent image to be applied to the child image
more easily.
It also seems like the property values could be a bit more
user friendly. I'd suggest:
sync-cip -> minimum
sync-all -> exact
superset -> possible
lines 565-570:
independent-minimal:
- This is a bit confusing to me as it seems to overlap with the
li-content-policy property above. I see that you referenced
that, but what I don't understand is why this wouldn't *always*
be the case. Specifically, what's the real difference between
this value and "independent"? I would think that the linked
child image's content-policy would have to always be honoured.
lines 572-579:
Will administrators be able to leave images "linked" but simply put
them into a "disabled" state of some sort? I can definitely see
administrators wanting to temporarily defer updating a linked image
because of problems without preventing update of the parent image.
I don't think simply unlinking it is the right solution as that
means you lose the location information (which is valuable). This
is not for zones obviously as you can simply detach those, which I
assume simply puts them into an equivalent offline state or they're
automatically skipped even though they remain linked.
lines 589-595:
While this list fits the current definition and model of client
operation execution, this will be changing in the near future. In
particular, there may be multiple data retrieval phases, which
means that you can't rely on this model to determine what can and
cannot be done during different parts of an operation. More on
that below in my comments for 610-636.
As I mentioned above, I'd like to see a documented format for the
plan serialization, and this may also be a good time to somewhat
adjust how the client prepares and executes a plan to simplify the
processes involved. Since what you've proposed here is basically
marshalling an imageplan at different phases, I think this needs
to be more generic so that we can use this functionality for low
or restricted resource environments as well (which zones may be).
line 626: s/equilivant/equivalent/
lines 610-636:
Instead of making these project private interfaces and to make them
more generally useful, I'd like to see them become a bit more
generic with the hope that they'd eventually be suitable for
end-users. In particular, we already have a few RFEs open for
being able to only perform the download portion of an operation,
etc. With that in mind, I'd suggest the following changes:
runid -> --plan
Path to a pkg(5) plan file.
plan -> --stage
Specifying --stage without --runid
Another question is where the plan gets stored when you run it
with the --stage option? --runid seems too magical since it
requires that the plan be stored within $IMGROOT, and I'd like
for the plan to be retrievable from anywhere. If that's not
desirable, can you expound on why?
Finally, I'd also like to see the stages changed a bit to be more
general in light of my earlier comments about multiple download
phases and to be a bit more user friendly:
default
I'm assuming this implies that if --plan-id *is* specified,
then the client will resume from the last point in the
plan and then just continue until completion of the
operation.
pkgs -> evaluate
Just evaluates the operation (may trigger metadata
retrieval in the v0 repository case). This is enough
to determine what will be installed but not the size of
the operation (disk space) or how many actions will be
involved.
actions -> prepare
Prepares for package content retrieval and operation
execution. This will retrieve package metadata (manifests).
download
Retrieve package content required to execute operation and
exit.
execute
Execute the operation.
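
To illustrate how the stages and the resume behaviour could fit together, here is a minimal sketch. The stage names are the ones suggested above; the enum, the function, and its parameters are hypothetical.

```python
from enum import IntEnum


class Stage(IntEnum):
    """The four stages suggested above, in execution order."""
    EVALUATE = 1
    PREPARE = 2
    DOWNLOAD = 3
    EXECUTE = 4


def stages_to_run(last_completed=None, stop_after=Stage.EXECUTE):
    """Return the stages still to be run.

    last_completed is the last stage recorded in a saved plan (None for
    a fresh operation); stop_after is the stage requested via --stage.
    """
    start = 1 if last_completed is None else last_completed + 1
    return [s for s in Stage if start <= s <= stop_after]
```

With no arguments this gives the default behaviour (run everything); stages_to_run(stop_after=Stage.DOWNLOAD) covers the "download only" RFEs, and passing last_completed resumes a saved plan from where it stopped.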
lines 643-784:
I think verb oriented subcommands are easier to remember and type;
that also fits with our existing subcommand naming pattern.
Also, why require the -l option for specifying the identifier of
linked images? Our other subcommands simply accept positional
operands instead. Obviously you need to use an option for image-
create, but the other subcommands don't really need one.
So with the above in mind:
linked-list -> list-linked
linked-sync -> sync-linked
linked-unlink -> unlink-image
No corresponding "link-image" subcommand? I'm aware that
you account for it at image-create, but I wondered whether
linking after image creation is also possible?
linked-property -> linked-property
linked-set-property -> set-linked-property
Are these subcommands managing properties of the parent image
that record information about a child (linked) image? Or are
they reading and manipulating the general image properties of
the child (linked) image?
image-create
What characters are allowed in the linked image name?
It seems like being able to name linked images can lead to
naming collisions. I'd like to suggest that this is strictly
a human-readable "alias" for the image. And that you also change
images to have a unique "id" (a UUID specifically), so that in
the event that there is a naming collision, you can still
perform operations on an image using its unique ID.
Alternatively, you could leave it as "name", but I'd still like
to see images have a unique ID that we could rely on instead.
Again, I wonder if the name here is a property of the linked
image being created, or something that the master is recording.
If the name is a property recorded in the linked image, then
I'd also suggest that the parent may want to reference linked
images only by their unique id & path to allow the name to be
changed at any time.
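
A small sketch of the id-plus-alias idea, assuming the parent keeps a simple in-memory registry. The class and method names are hypothetical.

```python
import uuid


class LinkedImageRegistry:
    """Hypothetical record of linked images kept by a parent image."""

    def __init__(self):
        # image id (UUID string) -> {"path": ..., "alias": ...}
        self._images = {}

    def add(self, path, alias=None):
        image_id = str(uuid.uuid4())
        self._images[image_id] = {"path": path, "alias": alias}
        return image_id

    def rename(self, image_id, alias):
        # The alias is only a label, so it can change at any time.
        self._images[image_id]["alias"] = alias

    def lookup(self, key):
        """Resolve key as an image id first, then as an alias."""
        if key in self._images:
            return key
        matches = [i for i, rec in self._images.items()
                   if rec["alias"] == key]
        if len(matches) == 1:
            return matches[0]
        if not matches:
            raise KeyError(key)
        raise ValueError("alias %r is ambiguous; use the image id" % key)
```

Because every operation can fall back to the id, a name collision becomes an error message rather than an unmanageable image.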
lines 766-771:
In the past, when I've suggested global options like this, it's been
suggested that they be moved to the subcommands they actually apply
to instead. For example, I doubt this option really applies to the
refresh, info, or list subcommands of pkg(1). I'd also like to
suggest that the option name be --ignore-linked or --skip-linked
and can be specified multiple times.
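
For illustration, a per-subcommand, repeatable option could look like the sketch below. It uses argparse purely for brevity and is not how the pkg(1) client parses its options; the program name is made up, and only the option and subcommand names come from the discussion above.

```python
import argparse

parser = argparse.ArgumentParser(prog="pkg-sketch")
sub = parser.add_subparsers(dest="subcommand")

# The option is attached only to a subcommand it actually applies to.
update = sub.add_parser("image-update")
update.add_argument(
    "--ignore-linked", action="append", default=[], metavar="IMAGE",
    help="linked image to skip; may be given more than once")

# A subcommand such as refresh does not accept the option at all.
sub.add_parser("refresh")

if __name__ == "__main__":
    args = parser.parse_args(
        ["image-update", "--ignore-linked", "zone1",
         "--ignore-linked", "zone2"])
    print(args.ignore_linked)
```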
Cheers,
-Shawn
[1] http://markmail.org/message/qep43eehttsyc5w7