Karen,
Is there anything in your current spec that states
that you do not allow
execute_checkpoint()
followed by
resume_execute_checkpoint() ?
(I see a comment that only one call to resume_execute_checkpoint()
is allowed per invocation, but I don't see anything saying that it must
occur before execute_checkpoint().)
Isn't the implication of your example below simply this:
you should add a clarification to the spec, stating that when users
run resume_execute_checkpoint(), they must be aware that persistent
cache data from other checkpoints previously executed in this
invocation will be lost?
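To make the two options concrete, here is a minimal sketch. It is not the real engine: the class EngineSketch, its "strict" flag, and the error and warning texts are all invented for illustration. Only the two method names come from this thread. With strict=True the engine rejects a resume once execution has started; otherwise it allows the resume but warns that persistent data from checkpoints already run in this invocation will be discarded.

```python
import warnings


class EngineSketch:
    """Hypothetical stand-in for the engine; not the real implementation."""

    def __init__(self, strict=False):
        self.strict = strict      # True: forbid resume after execution started
        self.executed = False     # has execute_checkpoint() been called yet?
        self.resumed = False      # has a resume already been requested?

    def execute_checkpoint(self):
        # The real engine would run the registered checkpoints here.
        self.executed = True

    def resume_execute_checkpoint(self, name):
        # Rule already noted in the spec: one resume per invocation.
        if self.resumed:
            raise RuntimeError("only one resume is allowed per invocation")
        if self.executed:
            if self.strict:
                # Option 1: disallow resume once execution has started.
                raise RuntimeError("resume requested after execution started")
            # Option 2: allow it, but tell the caller what is lost.
            warnings.warn(
                "resuming at %s discards persistent data from checkpoints "
                "already executed in this invocation" % name
            )
        self.resumed = True
```

The second option is the "clarification only" route: behaviour stays the same and the caller is simply told what the rollback costs.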
- Dermot
On 06/04/10 16:58, Karen Tung wrote:
Hi Darren,
Thank you for sending out the example below.
The example you provided below indeed works OK.
However, to support ManifestParser being a checkpoint
that is executed before any resume request,
the engine would need to allow a resume_execute() request
after execution has started.
The following example illustrates the problem of the
engine allowing resume after execution has started
in the general case.
Run 1 of the application successfully runs checkpoints A, B, C, D.
The persistent DOC will have information about these.
Run 2 of the application registers checkpoints A1, B1, C1 and requests
the engine to run all of them. They all run successfully, and the
persistent DOC now has info about A1, B1, C1. Then the application
registers checkpoints A, B, C, D and wants to resume at checkpoint B.
The engine rolls back to checkpoint B and loses all information
about A1, B1, C1.
As you can see, if we allow any "resume" request after
execute_checkpoint() has run,
we will run into problems.
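The scenario above can be played out with a toy model. Everything here is an assumption made for illustration: ToyEngine is not the real engine, the persistent DOC is reduced to a dict, and a snapshot is taken to be a copy of the DOC made just before a checkpoint runs. Only the method names execute_checkpoint() and resume_execute_checkpoint() come from this thread.

```python
class ToyEngine:
    """Hypothetical model: the persistent DOC is a dict, snapshots are copies."""

    def __init__(self, snapshots=None):
        self.persistent_doc = {}                 # data kept across checkpoints
        self.snapshots = dict(snapshots or {})   # name -> DOC copy from before it ran
        self.registered = []                     # checkpoints waiting to run

    def register(self, *names):
        self.registered.extend(names)

    def execute_checkpoint(self):
        for name in self.registered:
            self.snapshots[name] = dict(self.persistent_doc)
            self.persistent_doc[name] = "done"
        self.registered = []

    def resume_execute_checkpoint(self, name):
        # Rolling back REPLACES the current DOC with the saved one, so
        # anything written earlier in this invocation is gone.
        self.persistent_doc = dict(self.snapshots[name])
        self.registered = self.registered[self.registered.index(name):]
        self.execute_checkpoint()


# Run 1: A, B, C, D complete and leave snapshots behind.
run1 = ToyEngine()
run1.register("A", "B", "C", "D")
run1.execute_checkpoint()

# Run 2: A1, B1, C1 run first, then the application resumes at B.
run2 = ToyEngine(snapshots=run1.snapshots)
run2.register("A1", "B1", "C1")
run2.execute_checkpoint()
before = sorted(run2.persistent_doc)    # ['A1', 'B1', 'C1']
run2.register("A", "B", "C", "D")
run2.resume_execute_checkpoint("B")
after = sorted(run2.persistent_doc)     # ['A', 'B', 'C', 'D']
```

In this model the information about A1, B1, C1 disappears at the moment of the rollback, which is the loss described above.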
Thanks,
--Karen
On 06/ 4/10 06:21 AM, Darren Kenny wrote:
Hi,
Just thought that I'd mention this example after talking with Dermot
after yesterday's meeting...
I believe that one of the concerns was that on resumption of DC, when we
need to call ManifestParser again, the data currently in the persistent
data tree would conflict with this, since it would record that
ManifestParser had already been run.
I don't think that this is the case, unless you reload a snapshot, and you
cannot do that until after you've run ManifestParser, since you need to do
that to get the location of DC's work dir (/rpool/dc) from the manifest...
So I don't believe this is a problem, because each time you run ManifestParser
you will be starting with an empty DOC (unless the Application puts something
in there, of course).
I've tried to do this "visually" below...
Hope that resolves the issue being referred to, but if not, please feel free
to provide me with a specific scenario that I can try to work through.
Thanks,
Darren.
===========================================================
FIRST RUN
Add ManifestParser Checkpoint
+--------------------------------------------------------------+
|
| DOC:
|   Persistent Data               Volatile Data
|
|   - Completed Checkpoints       - Checkpoints to Run
|     - EMPTY                       - ManifestParser
|
+--------------------------------------------------------------+
Call Engine.execute()
Add Application-specific checkpoints.
+--------------------------------------------------------------+
|
| DOC:
|   Persistent Data               Volatile Data
|
|   - Completed Checkpoints       - Checkpoints to Run
|     - ManifestParser              - TargetDiscovery
|                                   - TI
|                                   - TransferIPS
|                                   - Transfer(s)
|                                   - Finalizer(s)
|
|   - DC Workdir = /rpool/dc
|
+--------------------------------------------------------------+
#Run up to TransferIPS
Engine.execute(end=TransferIPS)
+--------------------------------------------------------------+
|
| DOC:
|   Persistent Data               Volatile Data
|
|   - Completed Checkpoints       - Checkpoints to Run
|     - ManifestParser              - TargetDiscovery
|     - TargetDiscovery             - TI
|     - TI                          - TransferIPS
|     - TransferIPS                 - Transfer(s)
|                                   - Finalizer(s)
|
|   - DC Workdir = /rpool/dc
|
+--------------------------------------------------------------+
===========================================================
Now if we stop DC, and then attempt to resume:
===========================================================
+--------------------------------------------------------------+
|
| DOC:
|   Persistent Data               Volatile Data
|
|   - Completed Checkpoints       - Checkpoints to Run
|     - EMPTY                       - ManifestParser
|
+--------------------------------------------------------------+
Call Engine.execute()
Add Application-specific checkpoints.
Look for resume-able checkpoints using DC Workdir info...
+--------------------------------------------------------------+
|
| DOC:
|   Persistent Data               Volatile Data
|
|   - Completed Checkpoints       - Checkpoints to Run
|     - ManifestParser              - TargetDiscovery
|                                   - TI
|                                   - TransferIPS
|                                   - Transfer(s)
|                                   - Finalizer(s)
|
|   - DC Workdir = /rpool/dc
|
+--------------------------------------------------------------+
Found one; we want to resume from Transfer, so reload the snapshot
from the last run:
+--------------------------------------------------------------+
|
| DOC:
|   Persistent Data               Volatile Data
|
|   - Completed Checkpoints       - Checkpoints to Run
|     - ManifestParser              - TargetDiscovery
|     - TargetDiscovery             - TI
|     - TI                          - TransferIPS
|     - TransferIPS                 - Transfer(s)
|                                   - Finalizer(s)
|
|   - DC Workdir = /rpool/dc
|
+--------------------------------------------------------------+
#Now we run until the end:
Engine.execute()
===========================================================
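The two runs drawn above can also be written as a small sketch. This is only an illustration under stated assumptions: run_dc(), workdir_store, and the idea that a snapshot is saved after each checkpoint completes are all invented here, and the work dir is hard-coded where the real ManifestParser would read it from the manifest. The checkpoint names and /rpool/dc are taken from the diagrams.

```python
APP_CHECKPOINTS = ["TargetDiscovery", "TI", "TransferIPS", "Transfer", "Finalizer"]


def run_dc(workdir_store, end=None, resume_from=None):
    """Toy model of one DC invocation.

    workdir_store is a hypothetical dict standing in for the on-disk work
    directories: work dir path -> {checkpoint name: list of completed
    checkpoints saved right after that checkpoint finished}.
    """
    # Every invocation starts with an empty persistent DOC.
    doc = {"completed": [], "workdir": None}

    # ManifestParser always runs first; only it reveals the work dir.
    doc["completed"].append("ManifestParser")
    doc["workdir"] = "/rpool/dc"
    snapshots = workdir_store.setdefault(doc["workdir"], {})

    to_run = list(APP_CHECKPOINTS)
    if resume_from is not None:
        idx = to_run.index(resume_from)
        if idx > 0:
            # Reload the state saved after the preceding checkpoint. It
            # already lists ManifestParser, so nothing conflicts.
            doc["completed"] = list(snapshots[to_run[idx - 1]])
        to_run = to_run[idx:]

    for name in to_run:
        doc["completed"].append(name)
        snapshots[name] = list(doc["completed"])
        if name == end:
            break
    return doc


store = {}
first = run_dc(store, end="TransferIPS")        # first run, stop after TransferIPS
second = run_dc(store, resume_from="Transfer")  # later run, resume from Transfer
```

The point of the sketch is the ordering: the snapshot can only be found once ManifestParser has supplied the work dir, and at that moment the fresh DOC holds nothing except ManifestParser's own result.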
_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss