lengyuexuexuan opened a new pull request, #1909:
URL: https://github.com/apache/incubator-pegasus/pull/1909

   ### What problem does this PR solve? 
   #1880
   #1856
   
   ### What is changed and how does it work?
   As for #1856: 
   When the Go client is writing to a partition and the replica node core dumps, 
the request ends after a timeout without the client updating its configuration. 
In this case, the only way to recover is to restart the Go client. 
   
   In this PR, the client automatically updates the table configuration 
when a replica core dumps.
   Testing showed that the replica error is 
"context.DeadlineExceeded" when the replica core dumps.
   
   
https://github.com/apache/incubator-pegasus/blob/41141c11c36930a19da727fd25a4876bd56f76a6/go-client/pegasus/table_connector.go#L705-L706
   
   Therefore, when the client meets this error, it updates the 
configuration automatically.
   The failed request itself is not retried. The configuration is updated only 
after a timeout has occurred, so a retry issued before the update completes 
would still fail. There is also the risk of infinite retries. 
   It is therefore better to return the request error directly to the user and 
let the user try again.
   
   
   
   
   As for #1880:
   When the client sends the RPC message 
"RPC_CM_QUERY_PARTITION_CONFIG_BY_INDEX" to a meta server that is not the 
primary, the meta server returns a response that forwards the client to the 
primary meta server. 
   
   Based on this behavior, even if the primary meta server is not in the 
client configuration, the client can still connect to it by using the 
forwarded address.
   
   In this PR, we implement this function through the following steps.
   
   1. Parse the response and check whether its errno is 
ERR_FORWARD_TO_OTHERS. If so, extract the primary meta server address 
from it.  
   
https://github.com/apache/incubator-pegasus/blob/41141c11c36930a19da727fd25a4876bd56f76a6/go-client/session/meta_call.go#L166-L177
   2. Check whether the address is already in the client 
configuration. If it is, nothing more needs to be done. Otherwise, establish a 
connection and pull the configuration directly from the primary meta server.
   
https://github.com/apache/incubator-pegasus/blob/41141c11c36930a19da727fd25a4876bd56f76a6/go-client/session/meta_call.go#L118-L138
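   
   The two steps above can be sketched as follows. This is a simplified 
illustration under assumptions, not the code in meta_call.go: the 
`queryConfigResponse` type, its `forwardAddr` field, and both helper functions 
are hypothetical, and the addresses are example values.

```go
package main

import "fmt"

// errForwardToOthers mirrors the errno named in the PR description.
const errForwardToOthers = "ERR_FORWARD_TO_OTHERS"

// queryConfigResponse is a hypothetical, flattened view of the meta response.
type queryConfigResponse struct {
	errno       string
	forwardAddr string // assumed field: primary meta address in a forward
}

// forwardTarget implements step 1: it returns the primary meta address
// only when the response is a forward that carries an address.
func forwardTarget(resp queryConfigResponse) (string, bool) {
	if resp.errno != errForwardToOthers || resp.forwardAddr == "" {
		return "", false
	}
	return resp.forwardAddr, true
}

// needsNewSession implements the check in step 2: it reports whether addr
// is absent from the addresses the client was configured with.
func needsNewSession(configured []string, addr string) bool {
	for _, a := range configured {
		if a == addr {
			return false
		}
	}
	return true
}

func main() {
	configured := []string{"127.0.0.1:34602", "127.0.0.1:34603"}
	resp := queryConfigResponse{errno: errForwardToOthers, forwardAddr: "127.0.0.1:34601"}

	if addr, ok := forwardTarget(resp); ok && needsNewSession(configured, addr) {
		// The real client would dial addr here and query it directly.
		fmt.Println("connect to primary meta server at", addr)
	}
}
```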
   
   Note that IP addresses and sessions do not have a one-to-one 
correspondence, because an IP address may be unavailable. 
   This is why, even when the primary meta server is present in the client 
configuration, curllead cannot be used as an index into the metaIPAddrs array.
   
https://github.com/apache/incubator-pegasus/blob/41141c11c36930a19da727fd25a4876bd56f76a6/go-client/session/meta_call.go#L123-L128
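   
   A small sketch of why the index cannot be shared between the two arrays. 
It assumes that no session is created for an unavailable address; the 
`metaSession` type and the `dialAll` and `findSession` helpers are 
hypothetical and only illustrate the mismatch.

```go
package main

import "fmt"

// metaSession is a hypothetical stand-in for a connection to one meta server.
type metaSession struct{ addr string }

// dialAll creates a session for each reachable address only, so the result
// can be shorter than addrs and its indexes can shift.
func dialAll(addrs []string, reachable func(string) bool) []*metaSession {
	var sessions []*metaSession
	for _, a := range addrs {
		if reachable(a) {
			sessions = append(sessions, &metaSession{addr: a})
		}
	}
	return sessions
}

// findSession looks a session up by address instead of by index.
// It returns -1 when no session exists for addr.
func findSession(sessions []*metaSession, addr string) int {
	for i, s := range sessions {
		if s.addr == addr {
			return i
		}
	}
	return -1
}

func main() {
	addrs := []string{"127.0.0.1:34601", "127.0.0.1:34602", "127.0.0.1:34603"}
	// Pretend the second address is unavailable.
	sessions := dialAll(addrs, func(a string) bool { return a != "127.0.0.1:34602" })

	lead := findSession(sessions, "127.0.0.1:34603")
	fmt.Println("index in sessions:", lead)         // position among live sessions
	fmt.Println("address at addrs[lead]:", addrs[lead]) // a different server
}
```

   Comparing by address avoids treating a session index as an address index 
when the two arrays have different lengths.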
   
   
   ##### Tests 
   
   - Unit test
   - Manual test (add detailed scripts or steps below)
   1. Start onebox without adding the primary meta server to the Go client 
configuration.
   2. While the Go client is writing data to a partition, kill the 
replica process.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]