Hi Tim,
Thank you for your reply.
Based on your answers to my questions, I have some further thoughts about them.
Please see them below.
Thanks
Edward
  ----- Original Message ----- 
  From: Tim Read - Staff Engineer Solaris Availability Engineering 
  To: yang 
  Cc: ha-clusters-discuss at opensolaris.org 
  Sent: Tuesday, December 02, 2008 10:08 PM
  Subject: Re: [ha-clusters-discuss] Replica Process discussion


  Edward,

  One of my colleagues managed to track down the paper. It turns out that 
  the diagram is the same as the one in the Blueprint that Richard Elling 
  and I wrote! Having said that, I got it originally from some of the 
  internal design documents created by the subsequent authors of the code.

  More answers inline, though I'll have to leave it to some of my 
  colleagues who know more about the details of these internals to provide 
  more detail.

  Regards,

  Tim
  ---

  On 12/02/08 02:20, yang wrote:
  > Hi all,
  > The "Sun Cluster white paper" has a picture in its description. I have
  > some questions: from the picture, I can't figure out how the replica
  > process works.
  > question1: in step 1, what does the "transfer commit" in "request +
  > transfer commit" mean, and what does the "+" mean?

  The '+' simply means 'and'. So this is "request and transfer commit".
  ---> Why does the client need a transfer commit, and what is the
  'transfer commit' for? It is the first request message.
  What does 'transfer commit' mean, and is it different from 'commit'?
  I would prefer to read "+ transfer commit" as "this message needs a commit
  from the remote endpoint".

  > question2: in step 6, why is a dotted line used, and why is the confirm
  > sent to the secondary? In my mind, the client should send it to the
  > primary, because the secondary is offline from the client's point of view.

  If I remember correctly, the solid lines are synchronous operations, the 
  dotted lines are asynchronous operations. So the step 6 is to confirm to 
  the secondary that the transaction has been completed and they don't need 
  to 'remember' anything about it. This would free up any memory held I guess.

  --> The client doesn't know anything about the secondary application, so
  why does the client need to send the commit?
  I think the client receives the commit request and then sends the commit
  to the primary, and the primary sends a copy to the secondary. So this
  message requires the server and the replica to interact.
  I still don't know why the message is called the "forget msg".

  > question3: in step 5, why does the primary need "+ confirm"? The reply
  > already means confirm.

  The reply from the primary to the client is the 'commit'. It is 
  confirming that the transaction has been completed.
  ---> The transaction isn't finished, because after receiving the commit
  from the secondary, it forces the log and writes to disk. So I still think
  the + means that a confirm (or something similar) is needed.
  And in the code, the "need confirm" setting is taken from the message
  received.

  > Based on my understanding, I think the replica works like this:
  > 1/ The client sends a request to the primary. ("+ transfer commit" means
  > a commit is needed, but I don't know why it is "transfer commit" rather
  > than just "commit".)
  > 2/ The primary makes a checkpoint and then sends it to the secondary.
  > 3/ The secondary receives the checkpoint and confirms to the primary.
  > 4/ The primary receives the commit, then continues processing to produce
  > the reply, and keeps the forced log for these steps.
  > 5/ After finishing, the primary sends the client a reply with a flag
  > telling the client to send the confirm back to it or to the secondary.
  > (The picture shows no link to the primary, so doesn't the primary need
  > the confirm? How does the client send the confirm to the secondary when
  > the secondary is offline, and why send it to the secondary at all?)
  > 
  > In my mind, this picture fits only one particular kind of service. But I
  > really need to know whether this process is right and what the picture
  > implies.
  > 
  > The checkpoint structure in the code shows that it contains
  > replica::ckpt_seq_t minseq;
  > replica::ckpt_seq_t value;
  > So I wonder how these two values are used.
  -- I don't know what the values mean, or what is recorded in the checkpoint.
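
  To make the flow under discussion concrete, here is a minimal Python sketch
  of the steps as they are described in this thread: request to the primary,
  checkpoint to the secondary, acknowledgement, reply with a confirm flag, and
  the later confirm that lets the secondary forget the transaction. This is
  not Sun Cluster code. Every name in it (Primary, Secondary, handle, forget,
  needs_confirm, client_call) is hypothetical, the asynchronous step 6 is
  modelled as an ordinary function call, and it says nothing about what
  minseq and value actually hold.

```python
from dataclasses import dataclass, field


@dataclass
class Secondary:
    # Hypothetical store: transaction id -> checkpointed state.
    checkpoints: dict = field(default_factory=dict)

    def checkpoint(self, txn_id, state):
        """Steps 2-3: remember the checkpoint and acknowledge it."""
        self.checkpoints[txn_id] = state
        return True

    def forget(self, txn_id):
        """Step 6: the transaction is complete, so drop its checkpoint."""
        self.checkpoints.pop(txn_id, None)


@dataclass
class Primary:
    secondary: Secondary
    log: list = field(default_factory=list)

    def handle(self, txn_id, request):
        """Steps 1-5 as seen by the primary."""
        # Checkpoint to the secondary and wait for its acknowledgement.
        if not self.secondary.checkpoint(txn_id, request):
            raise RuntimeError("secondary did not acknowledge checkpoint")
        # Record the operation locally (stands in for the forced log).
        self.log.append((txn_id, request))
        # Reply to the client and ask it to confirm afterwards.
        return {"txn_id": txn_id, "result": f"done:{request}",
                "needs_confirm": True}


def client_call(primary, secondary, txn_id, request):
    """Client side: send the request, then confirm to the secondary."""
    reply = primary.handle(txn_id, request)
    if reply["needs_confirm"]:
        # Asynchronous in the diagram; a direct call in this sketch.
        secondary.forget(txn_id)
    return reply
```

  In this model the secondary holds the checkpoint only between the
  primary's checkpoint call and the client's confirm, which matches Tim's
  point that the confirm frees whatever the secondary was remembering.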

  > Thanks
  > Edward

  -- 

  Tim Read
  Staff Engineer
  Solaris Availability Engineering
  Sun Microsystems Ltd
  Springfield
  Linlithgow
  EH49 7LR

  Phone: +44 (0)1506 672 684
  Mobile: +44 (0)7802 212 137

  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  NOTICE: This email message is for the sole use of the intended 
  recipient(s) and may contain confidential and privileged information. 
  Any unauthorized review, use, disclosure or distribution is prohibited. 
  If you are not the intended recipient, please contact the sender by 
  reply email and destroy all copies of the original message.

  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~