Edward,

I'm not able to provide any more information here, as this involves sections of the code that I am not familiar with.
I will try to get one of my more knowledgeable colleagues to post an answer.

Regards,

Tim
---

On 12/03/08 00:44, Brilliant wrote:
> Hi Tim,
> Thank you first.
> Following your answers to my questions, I have some other ideas about
> them. Please see them below.
> Thanks
> Edward
>
> ----- Original Message -----
> *From:* Tim Read - Staff Engineer Solaris Availability Engineering
> <mailto:Tim.Read at Sun.COM>
> *To:* yang <mailto:yanggongming at huawei.com>
> *Cc:* ha-clusters-discuss at opensolaris.org
> <mailto:ha-clusters-discuss at opensolaris.org>
> *Sent:* Tuesday, December 02, 2008 10:08 PM
> *Subject:* Re: [ha-clusters-discuss] Replica Process discussion
>
> Edward,
>
> One of my colleagues managed to track down the paper. It turns out that
> the diagram is the same as the one in the Blueprint that Richard Elling
> and I wrote! Having said that, I got it originally from some of the
> internal design documents created by the subsequent authors of the code.
>
> More answers inline, though I'll have to leave it to some of my
> colleagues who know more about the details of these internals to
> provide more detail.
>
> Regards,
>
> Tim
> ---
>
> On 12/02/08 02:20, yang wrote:
>
> Hi all,
>
> The "sun cluster white paper" has a picture in the description.
> I have some questions. From the picture, I can't figure out what
> the replica process is.
>
> question1: in step 1, what does the "transfer commit" of "request +
> transfer commit" mean, and what does the "+" mean?
>
> The '+' simply means 'and'. So this is "request and transfer commit".
> ---> Why does the client need transfer commit, and what is the 'transfer
> commit' for? It is the first request msg.
> What does 'transfer commit' mean, and is it different from
> 'commit'?
> I prefer to explain the "+ transfer" as "this message needs a
> commit from the remote endpoint".
>
>
> question2: in step 6, why use the dotted line and send confirm to
> secondary?
> In my mind, the client should send it to the primary,
> because the secondary is offline from the client's point of view.
>
> If I remember correctly, the solid lines are synchronous operations,
> the dotted lines are asynchronous operations. So step 6 is to
> confirm to the secondary that the transaction has been completed and
> they don't need to 'remember' anything about it. This would free up
> any memory held, I guess.
>
> --> The client doesn't know anything about the secondary application,
> so why does the client need to send the commit?
> I think the client receives the commit request, then the client will
> send the commit to the primary,
> and the primary will send a copy to the secondary. So this msg needs
> the server and the replica to interact.
> I still don't know why the message is called the "forgot msg".
>
>
> question3: in step 5, why does the primary need "+ confirm"?
> The reply already means confirm.
>
> The reply from the primary to the client is the 'commit'. It is
> confirming that the transaction has been completed.
> ---> The transaction isn't finished, because after receiving the commit
> from the secondary, it does a force log and writes to disk. So I still
> think "+" means it needs a confirm or something else.
> And from the code, the 'need confirm' is taken from the message received.
>
>
> Based on my understanding, I think the replica will work like this:
>
> 1/ client sends a request to primary. (+ transfer commit means need
> commit, yet I don't know why "transfer commit" and not only "commit")
>
> 2/ primary does a checkpoint and then sends it to secondary.
>
> 3/ secondary receives the checkpoint and confirms to primary.
>
> 4/ primary receives the commit and then continues to respond with the
> reply, and keeps the force log for the steps.
>
> 5/ after finishing, primary sends the client a reply with a flag to tell
> the client to send the confirm to itself or to secondary (no link to
> primary, so primary doesn't need the confirm. How does the client send
> the confirm to secondary when secondary is offline, and so why send it
> to secondary?)
>
> In my mind, I think this picture only fits one special kind
> of service, yet I really do need to know whether the process is right
> and what the implications of the picture should be.
>
> The checkpoint structure in the code shows it contains
>
> replica::ckpt_seq_t minseq;
>
> replica::ckpt_seq_t value;
>
> So I wonder how to use these two values?
> -- I don't know what the values mean, and what is recorded in the
> checkpoint.
>
>
> Thanks
>
> Edward
>
> --
>
> Tim Read
> Staff Engineer
> Solaris Availability Engineering
> Sun Microsystems Ltd
> Springfield
> Linlithgow
> EH49 7LR
>
> Phone: +44 (0)1506 672 684
> Mobile: +44 (0)7802 212 137
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> NOTICE: This email message is for the sole use of the intended
> recipient(s) and may contain confidential and privileged information.
> Any unauthorized review, use, disclosure or distribution is prohibited.
> If you are not the intended recipient, please contact the sender by
> reply email and destroy all copies of the original message.
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
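
For readers following the thread, the exchange discussed in the quoted questions (request, checkpoint to the secondary, acknowledgement, forced log, reply, then an asynchronous confirmation that lets the secondary forget the transaction) can be sketched as a toy model. This is only an illustration of Tim's reading of the diagram, not the Sun Cluster implementation: the class names, method names, and the `need_confirm` key are all hypothetical, and the meaning of `minseq` and `value` is left open exactly as in the thread.

```python
from dataclasses import dataclass, field


@dataclass
class Secondary:
    # State the secondary must remember until it is told the transaction is done.
    pending: dict = field(default_factory=dict)

    def checkpoint(self, txn_id, state):
        # Steps 2-3 (assumed): store the checkpoint and acknowledge it.
        self.pending[txn_id] = state
        return "commit"

    def forget(self, txn_id):
        # Step 6 (assumed): the transaction completed, so drop the state.
        self.pending.pop(txn_id, None)


@dataclass
class Primary:
    secondary: Secondary
    log: list = field(default_factory=list)

    def handle(self, txn_id, payload):
        # Step 2: checkpoint to the secondary before doing the work (synchronous).
        ack = self.secondary.checkpoint(txn_id, payload)
        if ack != "commit":
            raise RuntimeError("secondary did not acknowledge the checkpoint")
        # Step 4: record the operation (stands in for the forced log write).
        self.log.append((txn_id, payload))
        # Step 5: reply, flagging that a confirmation is still expected.
        return {"txn_id": txn_id, "result": payload.upper(), "need_confirm": True}


class Client:
    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        self.deferred = []  # confirmations queued for asynchronous delivery

    def request(self, txn_id, payload):
        # Step 1: synchronous request to the primary.
        reply = self.primary.handle(txn_id, payload)
        if reply["need_confirm"]:
            self.deferred.append(txn_id)  # dotted line: sent later, not now
        return reply["result"]

    def flush_confirms(self):
        # Step 6: deliver the queued confirmations to the secondary.
        while self.deferred:
            self.secondary.forget(self.deferred.pop(0))


if __name__ == "__main__":
    secondary = Secondary()
    client = Client(Primary(secondary), secondary)
    client.request(1, "write x")
    print(sorted(secondary.pending))  # state still remembered by the secondary
    client.flush_confirms()
    print(sorted(secondary.pending))  # state released after the confirmation
```

In this sketch the client holds a direct reference to the secondary, which is the very point Edward questions; how the real framework routes that confirmation is not answered in the thread.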