Hi Karen,

Do we really support this?

I would think that if the base set of checkpoints changes between runs, it's not
valid to resume anything out of sequence - once you've inserted checkpoints
before one already run, you need to step back to the common point at least
before a resume could be done.

That's a basic premise of resuming things I would think, and we should detect
this before allowing such a resume.
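
To make the check concrete, here is a rough sketch of the kind of test I mean.
It is not the engine's actual API: the function names and the plain
list-of-names representation of checkpoints are made up for illustration. The
idea is that a resume point is only acceptable if every checkpoint registered
before it matches, in order, what the previous run recorded as completed.

```python
def common_prefix_len(completed, registered):
    """Count how many leading checkpoints the two sequences share."""
    n = 0
    for done, planned in zip(completed, registered):
        if done != planned:
            break
        n += 1
    return n


def can_resume(completed, registered, resume_from):
    """Return True if resuming at `resume_from` stays within the shared prefix.

    completed   -- checkpoint names recorded by the previous run, in order
    registered  -- checkpoint names registered for this run, in order
    resume_from -- name of the checkpoint the application wants to resume at

    All three are hypothetical stand-ins for whatever the engine and the
    persistent DOC really store.
    """
    if resume_from not in registered:
        return False
    idx = registered.index(resume_from)
    # Every checkpoint before the resume point must have completed in the
    # previous run, in the same order; otherwise the saved state is stale.
    return idx <= common_prefix_len(completed, registered)


# Same checkpoints as the previous run: resuming at B is fine.
print(can_resume(["A", "B", "C", "D"], ["A", "B", "C", "D"], "B"))

# A1, B1, C1 inserted ahead of A..D: resuming at B must be refused.
print(can_resume(["A", "B", "C", "D"],
                 ["A1", "B1", "C1", "A", "B", "C", "D"], "B"))
```

With a check like this, the second case (checkpoints inserted ahead of ones
already run) is rejected up front, and the only valid restart is from the
common point, which here is the very beginning.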

Am I wrong here?

Thanks,

Darren.

On 06/ 4/10 04:58 PM, Karen Tung wrote:
> Hi Darren,
> 
> Thank you for sending out the example below.
> The example you provided below indeed works OK.
> 
> However, supporting ManifestParser as a checkpoint that is
> executed before any resume request means that the engine
> will need to allow a resume_execute() request after
> execution has started.
> 
> The following example illustrates the problem of the
> engine allowing resume after execution has started
> in the general case.
> 
> Run 1 of the application successfully runs checkpoints A, B, C, D.
> The persistent DOC will have information about these.
> 
> Run 2 of the application registers checkpoints A1, B1, C1 and requests
> the engine to run all of them.  They all run successfully, and the
> persistent DOC now has info about A1, B1, C1.  Then the application
> registers checkpoints A, B, C, D, and wants to resume at checkpoint B.
> The engine rolls back to checkpoint B and loses all information
> about A1, B1, C1.
> 
> As you can see, if we allow any "resume" request after
> execute_checkpoint() has run, we will run into problems.
> 
> Thanks,
> 
> --Karen
> 
> On 06/ 4/10 06:21 AM, Darren Kenny wrote:
>> Hi,
>>
>> Just thought that I'd mention this example after talking with Dermot
>> after yesterday's meeting...
>>
>> I believe that one of the concerns was that on resumption of DC, when we need
>> to call ManifestParser again, the data currently in the Persistent data
>> tree would conflict with this since it would contain the fact that
>> ManifestParser was already run.
>>
>> I don't think that this is the case, unless you reload a snapshot, and you
>> cannot do that until after you've run ManifestParser - since you need to do
>> that to get the location of DC's work dir (/rpool/dc) from the manifest...
>>
>> So I don't believe this is a problem because each time you run ManifestParser
>> you will be starting with an empty DOC (unless the Application puts something
>> in there of course).
>>
>> I've tried to do this "visually" below...
>>
>> Hope that resolves the issue being referred to, but if not, please feel free
>> to provide me with a specific scenario that I can try to work through.
>>
>> Thanks,
>>
>> Darren.
>>
>>
>>
>> ===========================================================
>>      FIRST RUN
>>
>>      Add ManifestParser Checkpoint
>>
>>        +--------------------------------------------------------------+
>>        |
>>        | DOC:
>>        |   Persistent Data               Volatile Data
>>        |
>>        |   - Completed Checkpoints       - Checkpoints to Run
>>        |     - EMPTY                       - ManifestParser
>>        |
>>        +--------------------------------------------------------------+
>>
>>      Call Engine.execute()
>>      Add Application-specific checkpoints.
>>
>>        +--------------------------------------------------------------+
>>        |
>>        | DOC:
>>        |   Persistent Data               Volatile Data
>>        |
>>        |   - Completed Checkpoints       - Checkpoints to Run
>>        |     - ManifestParser              - TargetDiscovery
>>        |                                   - TI
>>        |                                   - TransferIPS
>>        |                                   - Transfer(s)
>>        |                                   - Finalizer(s)
>>        |
>>        |                                 - DC Workdir = /rpool/dc
>>        |
>>        +--------------------------------------------------------------+
>>
>>
>>      #Run up to TransferIPS
>>      Engine.execute(end=TransferIPS)
>>
>>        +--------------------------------------------------------------+
>>        |
>>        | DOC:
>>        |   Persistent Data               Volatile Data
>>        |
>>        |   - Completed Checkpoints       - Checkpoints to Run
>>        |     - ManifestParser              - TargetDiscovery
>>        |     - TargetDiscovery             - TI
>>        |     - TI                          - TransferIPS
>>        |     - TransferIPS                 - Transfer(s)
>>        |                                   - Finalizer(s)
>>        |
>>        |                                 - DC Workdir = /rpool/dc
>>        |
>>        +--------------------------------------------------------------+
>>
>> ===========================================================
>>
>> Now if we stop DC, and then attempt to resume:
>>
>> ===========================================================
>>        +--------------------------------------------------------------+
>>        |
>>        | DOC:
>>        |   Persistent Data               Volatile Data
>>        |
>>        |   - Completed Checkpoints       - Checkpoints to Run
>>        |     - EMPTY                       - ManifestParser
>>        |
>>        +--------------------------------------------------------------+
>>
>>      Call Engine.execute()
>>      Add Application-specific checkpoints.
>>
>>      Look for resume-able checkpoints using DC Workdir info...
>>
>>        +--------------------------------------------------------------+
>>        |
>>        | DOC:
>>        |   Persistent Data               Volatile Data
>>        |
>>        |   - Completed Checkpoints       - Checkpoints to Run
>>        |     - ManifestParser              - TargetDiscovery
>>        |                                   - TI
>>        |                                   - TransferIPS
>>        |                                   - Transfer(s)
>>        |                                   - Finalizer(s)
>>        |
>>        |                                 - DC Workdir = /rpool/dc
>>        |
>>        +--------------------------------------------------------------+
>>
>>      Found one; we want to resume from Transfer, so re-load the snapshot from
>>      the last run:
>>
>>        +--------------------------------------------------------------+
>>        |
>>        | DOC:
>>        |   Persistent Data               Volatile Data
>>        |
>>        |   - Completed Checkpoints       - Checkpoints to Run
>>        |     - ManifestParser              - TargetDiscovery
>>        |     - TargetDiscovery             - TI
>>        |     - TI                          - TransferIPS
>>        |     - TransferIPS                 - Transfer(s)
>>        |                                   - Finalizer(s)
>>        |
>>        |                                 - DC Workdir = /rpool/dc
>>        |
>>        +--------------------------------------------------------------+
>>
>>      #Now we run until the end:
>>      Engine.execute()
>>
>> ===========================================================
>>
>>
>>    
> 
_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss
