On 10/28/2016 04:24 PM, Tom Pantelis wrote:
> 
> 
> On Fri, Oct 28, 2016 at 10:00 AM, Robert Varga <n...@hq.sk
> <mailto:n...@hq.sk>> wrote:
> 
>     On 10/28/2016 08:00 AM, he.yu...@zte.com.cn
>     <mailto:he.yu...@zte.com.cn> wrote:
>     > so we need to make the DataTreeCandidateTip.getTipRoot() API in
>     > yangtools public to get the last tx's root
>     >
>     > All of the above is not related to commit phase, the overall process is
>     > as follows
>     >
>     >
>     > Tx1.prepare() ---> Tx1.candidate
>     >                    Tx1 persist ReplicatedLogEntry
>     >                    Tx1 add to pipelineTransactions
>     >                    Tx2.prepare(Tx1.candidate) ----------> Tx2.candidate
> 
>     Actually the flow should be:
> 
>     TipProducingDataTree dataTree;
>     DataTreeCandidateTip tx1Candidate = dataTree.prepare(tx1);
>     persist tx1Candidate
>     DataTreeCandidateTip tx2Candidate = tx1Candidate.prepare(tx2);
>     persist tx2Candidate
>     dataTree.commit(tx1Candidate);
>     dataTree.commit(tx2Candidate);
> 
>     All of the accounting needed will occur inside the DataTree
>     implementation without leaking tipRoot. The API has been explicitly
>     designed for this use case -- it just has not been implemented yet,
>     because appendAndPersist() is synchronous and the surrounding code
>     assumes that once the candidate has been persisted, it has also been
>     committed.
> 
> 
> I'm not clear on the last statement. Which surrounding code are you
> referring to? appendAndPersist doesn't directly commit the entry. After
> persistence completes, it replicates and commit occurs on ApplyState
> once consensus is achieved. For single-node of course the replicate part
> is skipped and ApplyState is sent immediately.

Sorry about that, I was writing off the top of my head and did not
properly introduce the context I had in mind.

My previous text is written from the point of view of ShardDataTree,
which acts as an intermediary between frontend (DistributedDataStore et
al.) and the Raft journal (via Shard and RaftActor).

When I say 'persist' and 'persist completes', I really mean 'ShardDataTree
calls Shard.persistPayload()' and 'Shard invokes
ShardDataTree.applyReplicatedPayload()', respectively.

I did this because it abstracts away the details of whether persistence
is enabled and whether we are a single node -- ShardDataTree does not
care about either.

Having completed a context switch to this topic: yes, parts of the
persist process are asynchronous -- I had not realized that before.
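
To make the pipelining concrete, here is a toy model of the flow quoted
above, assuming persist completion arrives later as a callback. It is only
a sketch: ToyDataTree and Candidate are made-up names, the "tree" is a
plain map, and none of this is the yangtools TipProducingDataTree API. The
point is that each prepare() builds on the newest uncommitted candidate
and that commits must happen in prepare order.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Toy stand-in for a data tree. Hypothetical, NOT the yangtools API. */
final class ToyDataTree {
    /** A prepared but uncommitted transaction and the state it would produce. */
    static final class Candidate {
        final String txId;
        final Map<String, String> resultingState;

        private Candidate(String txId, Map<String, String> resultingState) {
            this.txId = txId;
            this.resultingState = resultingState;
        }
    }

    private Map<String, String> committed = new HashMap<>();
    private final Deque<Candidate> pending = new ArrayDeque<>();

    /** Prepare on top of the newest pending candidate (the tip), or on the committed state. */
    Candidate prepare(String txId, Consumer<Map<String, String>> modification) {
        Map<String, String> base = pending.isEmpty() ? committed : pending.peekLast().resultingState;
        Map<String, String> next = new HashMap<>(base);
        modification.accept(next);
        Candidate candidate = new Candidate(txId, next);
        pending.addLast(candidate);
        return candidate;
    }

    /** Commit strictly in prepare order, e.g. from a "persist completed" callback. */
    void commit(Candidate candidate) {
        if (pending.peekFirst() != candidate) {
            throw new IllegalStateException("out-of-order commit of " + candidate.txId);
        }
        pending.removeFirst();
        committed = candidate.resultingState;
    }

    /** Defensive copy of the committed state. */
    Map<String, String> committedState() {
        return new HashMap<>(committed);
    }
}
```

With this, tx2 can be prepared (and handed to persist) while tx1 is still
waiting for its own persist to complete; tx2 already sees tx1's writes,
and nothing outside the tree needs to know about the tip root.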

>     For this to work correctly in the face of failures and recovery, there
>     need to be further persist events and frontend replies need to be sent
>     as follows:
> 
>     - once candidate persist completes, notify frontend of precommit success,
>       wait for commit message (or shortcut in directCommit case)
> 
> 
> So you propose to couple persistence in the pre-commit phase. Sounds
> reasonable.

Both phases, actually. There would be two callouts to
Shard.persistPayload(), with two different payloads:
- one at precommit time, which contains the DataTreeCandidate
- one at commit time, which contains only the TransactionIdentifier
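
As a rough illustration of the two callouts, the payloads could look
something like the sketch below. All class names here are hypothetical
(they are not existing controller classes), the String id stands in for
TransactionIdentifier, the byte[] stands in for a serialized
DataTreeCandidate, and the Consumer stands in for Shard.persistPayload().

```java
import java.util.Objects;
import java.util.function.Consumer;

/** Hypothetical base class for both journal payloads. */
abstract class ToyPayload {
    final String transactionId; // stands in for TransactionIdentifier

    ToyPayload(String transactionId) {
        this.transactionId = Objects.requireNonNull(transactionId);
    }
}

/** Persisted at precommit time: carries the candidate data. */
final class PrecommitPayload extends ToyPayload {
    final byte[] serializedCandidate; // stands in for a serialized DataTreeCandidate

    PrecommitPayload(String transactionId, byte[] serializedCandidate) {
        super(transactionId);
        this.serializedCandidate = serializedCandidate.clone();
    }
}

/** Persisted at commit time: identifier only, no data. */
final class CommitPayload extends ToyPayload {
    CommitPayload(String transactionId) {
        super(transactionId);
    }
}

/** Issues the two persist callouts; the Consumer is an assumed stand-in for Shard.persistPayload(). */
final class TwoPhasePersister {
    private final Consumer<ToyPayload> persistPayload;

    TwoPhasePersister(Consumer<ToyPayload> persistPayload) {
        this.persistPayload = Objects.requireNonNull(persistPayload);
    }

    void onPrecommit(String transactionId, byte[] serializedCandidate) {
        persistPayload.accept(new PrecommitPayload(transactionId, serializedCandidate));
    }

    void onCommit(String transactionId) {
        persistPayload.accept(new CommitPayload(transactionId));
    }
}
```

The commit record stays tiny because the data has already been journaled
with the precommit record.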

>     - once commit request arrives (or 3PC commit timer expires):
> 
> 
> Isn't this where we would replicate? 
>  
> 
>         dataTree.commit()
> 
> 
> This would occur on ApplyState?
>  
> 
>         persist tx commit record (only identifier, no data)
> 
> 
> Don't we already do this via ApplyJournalEntries?

Well, we only have a single journal entry for each transaction. We need
to have two.
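
To show why the second entry matters, here is a minimal sketch of what
journal replay could look like with two records per transaction. The
names (ToyJournalEntry, ToyRecovery) are invented for illustration and
the data is just a String. A transaction is applied only once its commit
record is seen; what to do with a precommit that never got a commit
record is a policy decision this sketch deliberately leaves open.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical journal entry: either a precommit (with data) or a commit (id only). */
final class ToyJournalEntry {
    enum Kind { PRECOMMIT, COMMIT }

    final Kind kind;
    final String txId;
    final String data; // null for COMMIT entries

    private ToyJournalEntry(Kind kind, String txId, String data) {
        this.kind = kind;
        this.txId = txId;
        this.data = data;
    }

    static ToyJournalEntry precommit(String txId, String data) {
        return new ToyJournalEntry(Kind.PRECOMMIT, txId, data);
    }

    static ToyJournalEntry commit(String txId) {
        return new ToyJournalEntry(Kind.COMMIT, txId, null);
    }
}

/** Replays a journal, applying data only for transactions that have a commit record. */
final class ToyRecovery {
    final List<String> applied = new ArrayList<>();
    final Map<String, String> unresolved = new LinkedHashMap<>();

    void replay(List<ToyJournalEntry> journal) {
        for (ToyJournalEntry entry : journal) {
            if (entry.kind == ToyJournalEntry.Kind.PRECOMMIT) {
                // Remember the data, but do not apply it yet.
                unresolved.put(entry.txId, entry.data);
            } else {
                String data = unresolved.remove(entry.txId);
                if (data == null) {
                    throw new IllegalStateException("commit without precommit: " + entry.txId);
                }
                applied.add(data);
            }
        }
    }
}
```

After replaying [precommit tx1, precommit tx2, commit tx1], only tx1's
data is applied and tx2 is left in the unresolved map.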

Hope this makes it a bit clearer.

Thanks,
Robert

_______________________________________________
controller-dev mailing list
controller-dev@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/controller-dev