[caiman-discuss] Summary of Install Engine discussion

Karen Tung Thu, 17 Jun 2010 16:37:42 -0700

Here's a summary of discussions we had this morning
on the Install Execution Engine design.


1) Application's usage of InstallEngine.execute_checkpoints() and threading:

- If an application calls execute_checkpoints() with
a callback function, execute_checkpoints() will return after
all checkpoints are instantiated.  When the thread executing
the checkpoints completes, the callback function provided
by the application will be called.

- If an application calls execute_checkpoints() without
providing a callback function, execute_checkpoints() will not
return until all checkpoints are executed.
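
As a rough illustration of the two calling modes above, here is a minimal
Python sketch. SketchEngine is a hypothetical stand-in, not the real
InstallEngine; the checkpoint shape (plain callables), the arguments passed
to the callback, and the returned thread object are all assumptions made
only for this example.

```python
import threading


class SketchEngine:
    """Hypothetical stand-in for InstallEngine; not the real class."""

    def __init__(self, checkpoints):
        # Assumption: each checkpoint is a zero-argument callable.
        self.checkpoints = list(checkpoints)

    def _run_all(self):
        return [checkpoint() for checkpoint in self.checkpoints]

    def execute_checkpoints(self, callback=None):
        if callback is None:
            # Blocking mode: return only after every checkpoint has run.
            return self._run_all()

        # Non-blocking mode: run in a worker thread, then call the callback.
        def worker():
            callback(self._run_all())

        thread = threading.Thread(target=worker)
        thread.start()
        # Returned only so this sketch can join(); an assumption.
        return thread


def run_demo():
    engine = SketchEngine([lambda: "first", lambda: "second"])

    # No callback: the call blocks until all checkpoints are done.
    blocking_results = engine.execute_checkpoints()

    # With a callback: the call returns early, results arrive later.
    received = []
    done = threading.Event()

    def on_complete(results):
        received.append(results)
        done.set()

    thread = engine.execute_checkpoints(callback=on_complete)
    done.wait(timeout=5)
    thread.join(timeout=5)
    return blocking_results, received
```

In the callback case the application is free to keep servicing its UI while
the worker thread runs.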

2) Canceling a checkpoint

- It's the application's responsibility to set up a signal handler to process
signals such as Control-C.

- When the engine receives a cancel_checkpoints() request, it will
call the cancel() function of the checkpoint that's executing.

- The default implementation for the AbstractCheckpoint.cancel() function
will be to set a threading.Event variable.
Checkpoints that do not override the default cancel() implementation should
check the value of this variable using the is_set() function and, once it
is set, perform the necessary cleanup and exit.

- Checkpoints that do not want to use the default cancel() implementation can
override it with their own implementation when they subclass the
AbstractCheckpoint object.
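
The cancel behavior described in this section could look roughly like the
sketch below. SketchCheckpoint is a hypothetical stand-in for
AbstractCheckpoint; the execute() loop, the cleanup() method and the
attribute names are assumptions for illustration only.

```python
import threading


class SketchCheckpoint:
    """Hypothetical stand-in for AbstractCheckpoint."""

    def __init__(self, name):
        self.name = name
        self.cleaned_up = False
        self._cancel_requested = threading.Event()

    def cancel(self):
        # Default behavior described above: just set the event.
        self._cancel_requested.set()

    def execute(self, steps=3):
        completed = 0
        for _ in range(steps):
            # Poll the flag between units of work.
            if self._cancel_requested.is_set():
                self.cleanup()
                break
            completed += 1
        return completed

    def cleanup(self):
        self.cleaned_up = True


class CustomCancelCheckpoint(SketchCheckpoint):
    """A subclass that overrides cancel() with its own implementation."""

    def __init__(self, name):
        super().__init__(name)
        self.cancel_note = None

    def cancel(self):
        # Extra checkpoint-specific work (hypothetical), then reuse the
        # default flag so execute() still stops.
        self.cancel_note = "custom cancel for " + self.name
        super().cancel()
```

A checkpoint that is never cancelled runs all of its steps; once cancel()
has been called, the next poll of is_set() triggers cleanup and an early
exit.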

3) stop-on-error

- If stop-on-error is false, the engine will continue executing all
checkpoints despite exceptions from one or more of the checkpoints.

- DOC and/or ZFS snapshots will be taken after each of the checkpoints
is executed, despite the exception(s). If the application wants to
resume at a previously failed checkpoint and the stop-on-error flag is
false, the application is allowed to resume at that checkpoint if other
resume requirements are met.
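
A minimal sketch of the stop-on-error loop, under stated assumptions: the
(name, callable) checkpoint shape and the run_checkpoints() helper are
hypothetical, a "snapshot" is modeled as just recording the checkpoint
name, and the behavior when stop-on-error is true (stop immediately, no
snapshot for the failed checkpoint) is an assumption not covered in this
summary.

```python
def run_checkpoints(checkpoints, stop_on_error=True):
    """Run (name, callable) pairs; return (snapshots, errors)."""
    snapshots = []
    errors = []
    for name, func in checkpoints:
        try:
            func()
        except Exception as exc:
            errors.append((name, exc))
            if stop_on_error:
                # Assumption: stop right away, skipping the snapshot.
                break
        # Reached after success, or after a failure when
        # stop_on_error is false: a snapshot is still taken.
        snapshots.append(name)
    return snapshots, errors


def ok():
    pass


def boom():
    raise RuntimeError("simulated checkpoint failure")


PLAN = [("a", ok), ("b", boom), ("c", ok)]
```

With stop_on_error=False the failed checkpoint "b" still has a snapshot, so
a later resume at "b" remains possible if the other resume requirements are
met.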


4) AbstractCheckpoint.get_progress_estimate()

- This function will return the number of seconds it takes to execute
the checkpoint, as measured by the wall clock on a standardized machine.

- Developers who do not have access to the standardized machine, or who
are working after the standardized machine has become obsolete, can run
one of the existing checkpoints that performs a similar operation to their
checkpoint on any available machine, and use that as guidance for the
approximate number of seconds it takes to run the newly developed
checkpoint.
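
One way to take such a wall-clock measurement is sketched below. The helper
names, the placeholder workload and the rounding policy are all
assumptions; nothing here is part of the engine design.

```python
import time


def measure_wall_clock_seconds(run_checkpoint):
    """Return elapsed wall-clock seconds for one call of run_checkpoint."""
    start = time.monotonic()
    run_checkpoint()
    return time.monotonic() - start


def similar_checkpoint():
    # Placeholder workload standing in for an existing, similar checkpoint.
    time.sleep(0.05)


def rough_estimate():
    # Assumption: report whole seconds, never less than one.
    elapsed = measure_wall_clock_seconds(similar_checkpoint)
    return max(1, int(round(elapsed)))
```

The resulting number is only a guide for what a new checkpoint's
get_progress_estimate() might return.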

5) Keith's question about using Error Service module (errsvc) for storing
exceptions raised by the checkpoints, instead of storing the exceptions
as a list.

- The Error Service module is suitable and can be used with some
modifications.

- The ErrorInfo object can be used to store the exception.
The mod_id in the ErrorInfo object can be used for storing the name of
the checkpoint that raised the exception.

- As currently implemented, the ErrorInfo object only accepts
"integer" and "string" as the error data type.  It needs to be modified to
accept an "object", which will be used for storing the exception raised
by the checkpoint.
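
To make the proposal concrete, here is a toy sketch of the idea. It does
NOT use the real errsvc API: SketchErrorInfo, its store() method and
record_failure() are hypothetical names invented for this example. Only
the mod_id field and the idea of storing the exception object come from
the discussion above.

```python
class SketchErrorInfo:
    """Toy stand-in; NOT the real errsvc ErrorInfo class."""

    def __init__(self, mod_id):
        # mod_id holds the name of the checkpoint that raised.
        self.mod_id = mod_id
        self.error_data = {}

    def store(self, key, value):
        # Accepts any object, which is the modification proposed above.
        self.error_data[key] = value


def record_failure(errors, checkpoint_name, func):
    """Run func(); on failure, append a SketchErrorInfo to errors."""
    try:
        func()
    except Exception as exc:
        info = SketchErrorInfo(checkpoint_name)
        info.store("exception", exc)
        errors.append(info)


def failing_checkpoint():
    raise ValueError("simulated failure")
```

An application could then look up failures by checkpoint name through
mod_id and inspect the original exception object.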

6) After a checkpoint completes successfully, the engine will always send
a progress update to the logger on the overall percentage complete.  This
allows accurate progress to be reported even if a checkpoint does not report
intermediate progress.
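
The summary does not say how the overall percentage is computed. One
plausible approach, shown here purely as an assumption, is to weight each
checkpoint by its get_progress_estimate() value. The overall_percent()
helper and the dictionary shape are hypothetical.

```python
def overall_percent(estimates, completed):
    """Percent complete, weighted by estimated seconds per checkpoint.

    estimates: dict mapping checkpoint name to estimated seconds.
    completed: iterable of names of checkpoints that have finished.
    """
    total = sum(estimates.values())
    if total == 0:
        return 0.0
    done = sum(estimates[name] for name in completed)
    return 100.0 * done / total


# Hypothetical estimates, in seconds.
ESTIMATES = {"a": 10, "b": 30, "c": 60}
```

Under this scheme, finishing "a" and "b" reports 40 percent even if neither
checkpoint reported any intermediate progress of its own.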

Please let me know if you have any questions or comments on this summary.

Thanks,

--Karen


_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss
