Generally OK with your responses, but more comments on this one:
On 06/10/10 07:41 PM, Karen Tung wrote:
Hi Dave,
Thank you for reviewing the document. Please see my responses inline.
On 06/09/10 09:09, Dave Miner wrote:
...
I'm disappointed that the methodology for determining checkpoint
progress weighting is still TBD. I'd thought this was one of the
things we were trying to sort out in prototyping, but perhaps I
assumed too much. When can we expect this to be specified?
The prototype confirmed that we can have each of the checkpoints report
its own weight, and that the engine can use these weights to normalize the
progress reported by the checkpoints. We also showed in the prototype that
we can use the logger for progress reporting.
I don't have plans to work on the methodology in detail in the short
term. It would involve specifying exactly which machine, with what
configuration, should be used as the standard, and also providing the
mapping between a performance number generated from that machine and the
weight. To do this accurately, I think it would take more research and
experimentation to determine what would work for most cases. If we have
the code in the engine to accept and interpret the weights provided by
checkpoints, then when we eventually have the methodology in place we can
just change the value returned by the get_performance_estimate() function
in the checkpoints, which should have very minimal impact. In the
meantime, we can have the checkpoints return the "guessed" weights like
we do now.
I don't think we need to define a mapping of a performance number to a
weight. The weights are relative to each other in a sequence, not to
some absolute standard, so all you should need from the checkpoint is a
number that you can then compute as a ratio against the sum of all the
estimates. I think you can define a configuration that's fairly easily
available (T2000 LDOM with 1 GB of memory or something) and go.
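Roughly, all the engine would need to do is something like this (sketch
only, names are illustrative):

    def normalize_estimates(checkpoints):
        # Each checkpoint's share of overall progress is its estimate
        # divided by the sum of the estimates in the sequence.
        estimates = [cp.get_performance_estimate() for cp in checkpoints]
        total = float(sum(estimates))
        return [est / total for est in estimates]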
Dave
In section 11, we seem to be eliminating the ability that DC currently
has to continue in spite of errors in a particular checkpoint. Has
this been discussed with the DC team?
I forgot about the ability to continue in spite of errors. That
functionality should be included in the engine. I will provide an
interface in the InstallEngine class for the application to indicate that
it wants to continue despite errors. By default, we will not continue if
there's an error.
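As a rough sketch of what that interface could look like (the
stop_on_error name and the shape of the loop are not final, just an
illustration):

    class InstallEngine(object):
        def __init__(self, stop_on_error=True):
            # By default, execution stops at the first failed checkpoint;
            # an application can opt in to continuing despite errors.
            self.stop_on_error = stop_on_error
            self._checkpoints = []

        def execute_checkpoints(self):
            errors = []
            for cp in self._checkpoints:
                try:
                    cp.execute()
                except Exception as err:
                    errors.append((cp, err))
                    if self.stop_on_error:
                        break
            return errors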
Finally, a moderately out-there question that is admittedly not part
of the existing requirements list: what if we wanted to make
checkpoints executable in parallel at some point in the future? Would
we look at a tree model, rather than linear list, for checkpoint
specification, or something else? Is there anything else in this
design that would hinder that possibility (or things we could easily
modify now to help allow it more easily)? An existing case where I
might want to do this right now is to generate USB images at the same
time as an ISO in DC, rather than generating a USB from an ISO
exclusively (the current behavior). It's also the case that many of
the existing ICTs could probably be run in parallel since they
generally wouldn't conflict.
In order to support this, we would probably need to change how
checkpoints are registered, probably by adding more arguments to the
register_checkpoint() function to specify which checkpoints can be
executed together.
The InstallEngine object currently stores the checkpoints, in the order
they are to be executed, as a list. Since Python allows storing a list
within a list, for the case of executing checkpoints in parallel we can
store all the checkpoints that are meant to run in parallel in a sub-list
inside the "execution list". At execution time, we walk down the
"execution list" and run either a single checkpoint or a group of
checkpoints.
For example, if our execution list contains the following:
A, (B, C, D), E, (F, G)
First, Checkpoint A will be executed by itself.
Then, Checkpoints B, C, and D will execute at the same time.
Then, Checkpoint E will execute.
Finally, Checkpoints F and G will be executed at the same time.
So, I don't think the current design would hinder that possibility.
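To make that concrete, the engine's execution loop might walk such a
nested list along these lines (threads and names are illustrative only,
not part of the design):

    import threading

    def run_execution_list(execution_list):
        # execution_list mixes single checkpoints and sub-lists of
        # checkpoints, e.g. [A, [B, C, D], E, [F, G]]; each sub-list
        # is run in parallel before moving on to the next entry.
        for entry in execution_list:
            if isinstance(entry, list):
                threads = [threading.Thread(target=cp.execute) for cp in entry]
                for t in threads:
                    t.start()
                for t in threads:
                    t.join()
            else:
                entry.execute()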
Thank you again for taking the time to review the install design doc.
--Karen