littlepangdi opened a new pull request, #1444: URL: https://github.com/apache/incubator-pegasus/pull/1444
### What problem does this PR solve?
<!--add issue link with summary if exists-->
fix #1385

### What is changed and how does it work?

Expected behavior: when a replica server restarts, the client should be able to reconnect to the node once it is back, without considering the client configuration update (reselecting the primary replica). The client configuration update is a separate issue that I'd like to work on in another PR. This PR focuses on why the client cannot reconnect to a restarted server.

1. The SDK uses several loops to monitor `rpc.conn`. When the server is closed, `loopForDialing` keeps retrying the dial until it connects to the server (that is, once the server has restarted). Two new loops, `loopForRequest` and `loopForResponse`, are then created to handle requests. However, `loopForRequest` does not return when its corresponding `loopForResponse` returns because of IsNetworkClosed (EOF), since the latter only returns nil and does not shut down `tom`. As a result, there are more live `loopForRequest` loops than `loopForResponse` loops in this case.
2. The SDK does not retry on timeout errors. However, the timeout error is wrapped incorrectly here: https://github.com/apache/incubator-pegasus/blob/d16e65cd88b4fb6113d5e6a99234b5be527f28da/go-client/session/session.go#L338. As a result, every request such as `multiset` keeps retrying and never recovers.

### Checklist
<!--REMOVE the items that are not applicable-->

##### Tests
<!-- At least one of them must be included.
-->
- Unit test
- Integration test
- Manual test (add detailed scripts or steps below)
- No code

##### Code changes
- Has exported function/method change
- Has exported variable/fields change
- Has interface methods change
- Has persistent data change

##### Side effects
- Possible performance regression
- Increased code complexity
- Breaking backward compatibility

##### Related changes
- Need to cherry-pick to the release branch
- Need to update the documentation
- Need to be included in the release note
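
For reviewers, the loop lifecycle problem in item 1 can be illustrated with a minimal sketch. It is not the go-client code: the `session` type, its fields, and `runUntilClosed` are hypothetical, and a `context` cancellation stands in for whatever shutdown mechanism the real session uses. The point it shows is that the response loop must signal its sibling request loop when the connection closes, otherwise the request loop stays alive.

```go
package main

import (
	"context"
	"io"
	"sync"
	"time"
)

// session is a hypothetical stand-in for one connection's pair of loops.
type session struct {
	ctx    context.Context
	cancel context.CancelFunc
	reqs   chan string
	wg     sync.WaitGroup
}

func newSession() *session {
	ctx, cancel := context.WithCancel(context.Background())
	return &session{ctx: ctx, cancel: cancel, reqs: make(chan string)}
}

// loopForResponse reads until the simulated connection fails.
// On any read error (including EOF) it cancels the shared context,
// so the sibling request loop is told to exit as well.
func (s *session) loopForResponse(read func() error) {
	defer s.wg.Done()
	for {
		if err := read(); err != nil {
			s.cancel() // without this call, loopForRequest would never return
			return
		}
	}
}

// loopForRequest drains queued requests until the session is cancelled.
func (s *session) loopForRequest() {
	defer s.wg.Done()
	for {
		select {
		case <-s.ctx.Done():
			return
		case <-s.reqs:
			// a real client would write the request to the connection here
		}
	}
}

// runUntilClosed starts both loops, simulates a peer that closes the
// connection after readsBeforeEOF successful reads, and reports whether
// both loops exited within one second.
func runUntilClosed(readsBeforeEOF int) bool {
	s := newSession()
	n := 0
	read := func() error {
		if n >= readsBeforeEOF {
			return io.EOF
		}
		n++
		return nil
	}
	s.wg.Add(2)
	go s.loopForRequest()
	go s.loopForResponse(read)

	done := make(chan struct{})
	go func() {
		s.wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(time.Second):
		return false
	}
}
```

Calling `runUntilClosed(3)` reports whether both loops exited after the simulated EOF. Removing the `s.cancel()` call reproduces the leak described above: the response loop returns, but the request loop keeps waiting.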
