Hi Karen,

This is a really good design document. I do have some comments/questions. I tried not to repeat others comments, but if I did, simply refer me to your response emails.

Section 3:
Data Object Cache is a singleton. All components access the DataObjectCache directly without going through the engine. The engine initialize the DataObjectCache. No other components
should initialize the DataObjectCache.
I don't believe this is a correct assumption. As a result of it not being a singleton, it is likely to change some of the interfaces you need to provide in the Engine.

Even though the engine runs the checkpoints, it does not report their progress. Applications that wants to monitor progress should register one or more progress receiver with the logger. Checkpoints report their progress directly to the logger. The engine helps to normalize the progress information for the logger. The install logger publishes progress information received
from the checkpoints to all registered progress receivers.

So, in looking at your uml diagram it looks like the logger has to call to the engine for the normalize_progress() method. Why then do we not have consumers log using the engine and have the engine normalize it before sending it to the progress logger? It seems backwards to me that the logger has to call the engine to normalize this data, then send the normalized progress to the progress receiver.

Section 6:
Other things we need to allow for with regard to interaction with the engine are: -Getting DOC object handle for consumers. this is based on the DOC not being a singleton.
-Getting a list of checkpoints registered.
-unregister checkpoints

Section 6.2.1:
loglevel: I thought that we wanted to be able to set log level per checkpoint as well? Is this still the case? If it is, why do we have a log level for the Engine as well?

Debug: so, if set to false, when are the DOC snapshots removed from /tmp? My question is how do we know if we want them removed if we are planning on stopping and resuming?

6.3.2: ordering of registered checkpoints
Why can't an application insert a new checkpoint in front of an already executed checkpoint? It seems as if this happens, and we resume from this newly inserted checkpoint, then any subsequent checkpoint has to be re-run, regardless of its current state. Which means the engine must cleanup the previous run checkpoint's snapshots. I can see that it might be useful to allow an application, or user, to specify a new checkpoint that doesn't follow in order of the last executed checkpoint, but before with the intent of starting from the newly inserted checkpoint.

Section 6.3.4:
I would think that the input args to register_checkpoint could be a dictionary or something like that,rather than individual args. Much like an nvlist allows for extensibility in the design, we could do something similar here, in case we ever want or need to change the parameters for registering a checkpoint. I am referring specifically to the checkpoint_name, module_path, checkpoint_obj, insert_before and log_level args.

Section 6.5.1: cancel_checkpoint():
It seems that with python's threading model, that is there is no kill method for threads, that the cancel function is going to perhaps set a flag in the checkpoint object, and the execute() method on that object will have to poll and wait on a condition to stop themselves. I would think we would have to spell out in your interface description what has to be done for both the execute() and cancel() methods so implementers know how to make use of the cancel functionality.

Also, since it is up to the engine to decide which thread to cancel, how can the consumer know where it can resume, since it may not know what was currently executing?

A general comment on the format of this design doc.. I found it hard to figure out what objects the methods were associated with because of the way you laid out the document. Perhaps describing the install engine class at first, with all the named methods, and any other classes such as the checkpoint class and all its methods, and then describing them would make it easier to understand.

Section 7.4: resume_execute_checkpoint, using a dictionary or something similar as input args like suggested for execute_checkpoint() would make this more extensible, imo.

Section 8: Creation of the checkpoint data subtree.. in my proposed schema the checkpoints would be stored at the time, in the desired system state part of the DOC, at the time we read these from the manifest. how does this work in terms of your storing the checkpoints later if they were executed successfully only? Maybe not an issue, just trying to understand why we need this Engine private data.

thanks,
sarah


On 05/28/10 04:12 PM, Karen Tung wrote:
Hi,

The draft install engine design doc is ready for review.
I pushed the doc to the caiman-docs repo.  You can
find them under the install_engine directory in
ssh://[email protected]/hg/caiman/caiman-docs

If you don't want to clone the gate, you can also access
the file directly here:

http://cvs.opensolaris.org/source/xref/caiman/caiman-docs/install_engine

You will find 2 files in the directory:
* engine-design-doc.odt is the design document
* engine-sequence-diag-0528.png is the sequence diagram inserted into chapter 14 of the document. You might want to see the bigger image directly.

Please send your feedback by 6/11/2010.

If you plan to review the document, please send me an email privately,
and preferably also let me know when you will complete the review, so
I can plan accordingly and ensure the document is reviewed thoroughly.

Thanks,

--Karen
_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss

_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss

Reply via email to