Hi Karen,
This document is pretty comprehensive and complete. Here are my
comments, submitted late with your permission:
Section 1:
Readers may find it useful to make the connection that DC can be seen as
an "installer" in the sense that it assembles a target image.
Section 2:
In the same vein as for section 1, instead of saying "executing an
installation", does saying "executing an image-build" make more sense,
as that includes DC?
Section 5.2:
Not sure it adds value to the doc to list the whole class here. Maybe
the method signatures with a brief (e.g. 1-line) description?
6.3.4:
checkpoint_obj: I concur with other respondents that this would be
better called checkpoint_class or checkpoint_class_name.
args: I think this has to be a list, but the doc doesn't say that
explicitly. Also, if there is only one arg, does it have to be
specified as a one-item list? Is an empty list OK?
checkpoint log level: This paragraph is confusing to me. Which two
log levels are different from each other? Do you mean the application
wants to use a different log level than specified in this argument?
Isn't it the application that calls register_checkpoint when it sets up
the engine? Why would a keyword arg be needed if the log level is
specified here already? Since each checkpoint is registered separately,
each can already have its own level.
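To make the args and log level questions concrete, here is a rough
sketch of the behavior I would expect. Everything in it is my own
assumption rather than what the doc specifies: the Engine class is a
stand-in, and the name, checkpoint_class_name and log_level parameters
are hypothetical.

```python
import logging


class Engine:
    """Hypothetical stand-in for the engine, not the interface in the doc."""

    def __init__(self):
        self.checkpoints = []

    def register_checkpoint(self, name, checkpoint_class_name, args=None,
                            log_level=logging.INFO):
        # Assumption: args is always a list. None and [] both mean "no args".
        if args is None:
            args = []
        if not isinstance(args, list):
            raise TypeError("args must be a list, even for a single argument")
        # Each registration carries its own log level, so no extra keyword
        # argument is needed later to vary the level per checkpoint.
        self.checkpoints.append({
            "name": name,
            "class": checkpoint_class_name,
            "args": list(args),
            "log_level": log_level,
        })
```

With this reading, a single argument is passed as a one-item list, an
empty list is fine, and each checkpoint gets its level at registration.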
6.6.1: cleanup_checkpoint(): I would change the name to
cleanup_checkpoints() since it cleans up all checkpoints that have been
executed in the current run, not just one.
7.1.1: So to be sure I understand, a checkpoint can be interactive and
can register or change subsequent checkpoints based on input, right?
7.2:
- I think the first sentence is trying to say that some limitations
exist because ZFS snapshots are used. Is this correct?
- In the first bullet, ZFS and data cache snapshots are mentioned. Is
the data cache snapshot also ZFS? If not, wouldn't it be free of the ZFS
limitations? If it is ZFS, how can the second bullet be true?
7.3.2.2:
- Termites -> terminates (have you been talking to Sue, lately?)
- Lead sentence talks of finding out which checkpoints are resumable,
but the first bullet talks about registering checkpoints, which is
different. Perhaps for the lead sentence you mean something like this:
"For an application that terminates, a subsequent invocation of the
application might want to resume. That application would have to do one
of the following to establish resumable checkpoints:"
7.4:
- So the DataObjectCache snapshot will be stored in multiple places? It
will be in /tmp (or /var/run or wherever) as well as stored as part of
the ZFS dataset? If there is no ZFS dataset, the DOC snapshot in "/tmp"
will be used?
- Last PP: It says "the engine verifies the DOC is empty before rolling
back to a DOC snapshot." Wouldn't the normal case be that the DOC isn't
empty on resume? (See 7.4.1 #3.) If so, no rollbacks would ever
occur. I'm missing something here...
7.5: resume_execute_checkpoint Description PP: Won't the rollback be to
the state *before* the named checkpoint is completed, rather than *after*?
10.1: I'm not sure a standardized machine is either needed or feasible.
(Eventually that machine would become obsolete and unavailable; then
what?) I suggest creating a program against which checkpoint times can
be standardized. For example, regardless of the machine the test
program runs on, let's say it will take 1000 units of time to run. On
the same machine it will take checkpoint One an amount of X units of
time to run. Then when you run on a faster machine, both test and
checkpoint programs will run proportionately faster. (I know I'm
oversimplifying this: different kinds of ops, e.g. CPU-intensive vs.
network-intensive, run faster or slower on different machines. But this
is only meant to get an approximation. If some of every kind of op is
built into the test program, it will be better normalized across machines.)
Then each checkpoint could return its number of time units to perform
its task, and have a method inside it to return the % done.
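Here is a minimal sketch of that idea, just to show the arithmetic. All
names are hypothetical, the reference workload is a CPU-only placeholder
(a real one would mix in disk and network ops), and 1000 is simply the
figure from my example above.

```python
import time

REFERENCE_UNITS = 1000  # the test program is defined to cost 1000 units


def reference_workload():
    """Placeholder for the standard test program (CPU-only here)."""
    total = 0
    for i in range(200_000):
        total += i * i
    return total


def seconds_per_unit(workload=reference_workload, clock=time.perf_counter):
    """Time the test program; return how many seconds one unit is worth
    on this machine."""
    start = clock()
    workload()
    elapsed = clock() - start
    return elapsed / REFERENCE_UNITS


def to_units(elapsed_seconds, unit_seconds):
    """Convert a measured checkpoint duration into machine-independent units."""
    return elapsed_seconds / unit_seconds


def percent_done(units_completed, units_total):
    """Percentage of a checkpoint's work finished, capped at 100."""
    if units_total <= 0:
        return 100.0
    return min(100.0, 100.0 * units_completed / units_total)
```

A checkpoint would be timed once on any machine, converted with
to_units(), and that unit count would be what it reports. At run time
the engine would calibrate with seconds_per_unit() and use
percent_done() for progress.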
General: Tomorrow when I'm back in the office, I'll turn over my
hard copy, which has grammatical corrections, etc., since as my officemate
you are conveniently located :) .
Thanks,
Jack