Edward,

One of my colleagues managed to track down the paper. It turns out that the diagram is the same as the one in the Blueprint that Richard Elling and I wrote! Having said that, I originally got it from some of the internal design documents created by the subsequent authors of the code.
More answers inline, though I'll have to leave it to some of my colleagues who know these internals better to provide more detail.

Regards,

Tim

---

On 12/02/08 02:20, yang wrote:
> Hi all,
> The "sun cluster white paper" have a picture in the desicription. I have some
> question,from the picture , I can't figure out what is the process of replica.
> question1: in step 1, what is the "transfer commit"of"request + transfer
> commit" means, and what is the "+" means.

The '+' simply means 'and'. So this is "request and transfer commit".

> question2: in step 6,why use the dot line and send confirm to secondary? in
> my mind , the client should send it to primary ,because secondary is offline
> from the client's aspect.

If I remember correctly, the solid lines are synchronous operations and the dotted lines are asynchronous operations. So step 6 confirms to the secondary that the transaction has been completed and that it no longer needs to 'remember' anything about it. I guess this would free up any memory held.

> question3:in the step 5, why the primary need "+ confirm" ? replay means
> confirm already.

The reply from the primary to the client is the 'commit'. It confirms that the transaction has been completed.

> based on my understanding, i think the replica will work like this:
> 1/client send a request to primary.(+transfer commit means need commit,yet i
> don't know why transfer commit yet not only commit?)
> 2/primary do checkpoint and then send it to secondary.
> 3/secondary receive checkpoint and make confirm to primary
> 4/primary receive commit and then continue to response to the reply, and keep
> the force log for the steps.
> 5/after finish , send client a reply with a flag to tell client to response
> the confirm to himself or secondary(no link to primary,do primary don't need
> the confirm. how client send confirm to secondary ,and secondary is
> offline,so why send to secondary?)
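To make the flow concrete, here is a minimal Python sketch of the sequence as it is described in this thread. It is only an illustration, not the Sun Cluster code: the names `Primary`, `Secondary`, `run_transaction` and the `deferred` list are hypothetical, steps 2 to 4 follow the reading given in the question rather than anything confirmed here, and the asynchronous (dotted line) confirm is modelled simply as work queued to run later.

```python
from dataclasses import dataclass, field


@dataclass
class Secondary:
    # Transaction state the secondary "remembers" until it is told
    # that the transaction has completed (step 6).
    pending: dict = field(default_factory=dict)

    def checkpoint(self, txn_id, state):
        """Steps 2-3: store the primary's checkpoint and acknowledge it."""
        self.pending[txn_id] = state
        return True

    def confirm(self, txn_id):
        """Step 6: the transaction is complete, so forget its state."""
        self.pending.pop(txn_id, None)


@dataclass
class Primary:
    secondary: Secondary
    store: dict = field(default_factory=dict)

    def request(self, txn_id, key, value):
        """Step 1 onwards: handle a client request synchronously."""
        # Checkpoint to the secondary and wait for its acknowledgement.
        if not self.secondary.checkpoint(txn_id, (key, value)):
            raise RuntimeError("checkpoint was not acknowledged")
        self.store[key] = value
        # Step 5: the reply to the client is the 'commit'.
        return "commit"


def run_transaction(primary, secondary, txn_id, key, value, deferred):
    """Client side: synchronous request, then a deferred confirm."""
    reply = primary.request(txn_id, key, value)
    # Assumption: the asynchronous confirm is modelled as a queued task.
    deferred.append(lambda: secondary.confirm(txn_id))
    return reply


secondary = Secondary()
primary = Primary(secondary)
deferred = []

reply = run_transaction(primary, secondary, 1, "k", "v", deferred)
# Before the confirm arrives, the secondary still holds the state.
held_before_confirm = dict(secondary.pending)

# Later, the asynchronous confirms are delivered.
for task in deferred:
    task()
```

In this sketch the secondary keeps the checkpointed state until the confirm is delivered, which is the "free up any memory held" effect mentioned above. Whether the real implementation releases state this way is an assumption.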
>
> In my mind , I think this picture only fit for one special kind of
> service.yet I really do need to know whether the process is right and what is
> the impaction from the picture should be.
>
> the checkpoint struction in code shows it contain
> replica::ckpt_seq_t minseq;
> replica::ckpt_seq_t value;
> So I am wonder how to use this two values?
> Thanks
> Edward

--
Tim Read
Staff Engineer
Solaris Availability Engineering
Sun Microsystems Ltd
Springfield
Linlithgow
EH49 7LR
Phone: +44 (0)1506 672 684
Mobile: +44 (0)7802 212 137