ninsmiracle commented on issue #2114:
URL: https://github.com/apache/incubator-pegasus/issues/2114#issuecomment-2345197032

   When I checked the app status:
   ```
   [replicas]
   pidx  ballot  replica_count  primary  secondaries  
   0     0       0/3            -        []           
   1     0       0/3            -        []           
   2     0       0/3            -        []           
   3     0       0/3            -        [] 
   ```
   We can see that not even the primary was established, so I checked the 
replica server log for this replica:
   ```
   D2024-09-11 15:32:08.284 (1726039928284328485 93977) replica.io-thrd.93977: 
server_negotiation.cpp:40:start(): 
SERVER_NEGOTIATION(CLIENT=10.xxx.xx.1:41689): start negotiation
   D2024-09-11 15:32:08.284 (1726039928284356416 93977) replica.io-thrd.93977: 
network.cpp:696:on_server_session_accepted(): server session accepted, 
remote_client = 10.xxx.xx.1:41689, current_count = 1
   D2024-09-11 15:32:08.284 (1726039928284364187 93977) replica.io-thrd.93977: 
network.cpp:701:on_server_session_accepted(): ip session inserted, 
remote_client = 10.xxx.xx.1:41689, current_count = 1
   W2024-09-11 15:32:08.289 (1726039928289020684 93993) 
replica.default10.04006f170001004b: server_negotiation.cpp:137:do_challenge(): 
SERVER_NEGOTIATION(CLIENT=10.xxx.xx.1:41689): negotiation failed, with err = 
ERR_UNKNOWN, msg = ERR_UNKNOWN
   D2024-09-11 15:32:08.289 (1726039928289039571 93993) 
replica.default10.04006f170001004b: 
network.cpp:738:on_server_session_disconnected(): session 10.xxx.xx1:41689 
disconnected, the total client sessions count remains 0
   D2024-09-11 15:32:08.289 (1726039928289046389 93993) 
replica.default10.04006f170001004b: 
network.cpp:744:on_server_session_disconnected(): client ip 10.xxx.xx.1:41689 
has no more session to this server
   E2024-09-11 15:32:08.289 (1726039928289069504 93975) replica.io-thrd.93975: 
asio_rpc_session.cpp:96:operator()(): asio read from 10.xxx.xx.1:41689 failed: 
Operation canceled
   ```
   
   I think the key message is `server_negotiation.cpp:137:do_challenge(): 
SERVER_NEGOTIATION(CLIENT=10.xxx.xx.1:41689): negotiation failed, with err = 
ERR_UNKNOWN, msg = ERR_UNKNOWN`
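
To find every occurrence of this message in a replica server log, a small filter like the one below may help. It is only a sketch: the regular expression is derived from the single log line quoted above, it assumes each entry sits on one line in the real log file (the wrapping here comes from the email), and the names `FAILED_NEGOTIATION` and `failed_negotiations` are made up for illustration.

```python
import re

# Pattern inferred from the one "negotiation failed" line quoted above;
# other log variants may need a different expression.
FAILED_NEGOTIATION = re.compile(
    r"SERVER_NEGOTIATION\(CLIENT=(?P<client>[^)]+)\): "
    r"negotiation failed, with err = (?P<err>\w+), msg = (?P<msg>.*)"
)


def failed_negotiations(lines):
    """Yield (client, err, msg) for each failed-negotiation log line."""
    for line in lines:
        match = FAILED_NEGOTIATION.search(line)
        if match:
            yield (
                match.group("client"),
                match.group("err"),
                match.group("msg").strip(),
            )
```

It can be fed any iterable of lines, for example an open log file: `for client, err, msg in failed_negotiations(open(path)): ...`, where `path` is whatever log file you are inspecting.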
   
   In my opinion, it failed at the `SASL_INITIATE` step of the SASL 
negotiation between the meta server and the replica server.
   ```
          client                              server
           | ---    SASL_LIST_MECHANISMS     --> |
           | <--  SASL_LIST_MECHANISMS_RESP  --- |
           | ---    SASL_SELECT_MECHANISMS   --> |
           | <-- SASL_SELECT_MECHANISMS_RESP --- |
           |                                     |
           | ---       SASL_INITIATE         --> |
           |                                     |
           | <--       SASL_CHALLENGE        --- |
           | ---     SASL_CHALLENGE_RESP     --> |
           |                                     |
           |               .....                 |
           |                                     |
           | <--       SASL_CHALLENGE        --- |
           | ---     SASL_CHALLENGE_RESP     --> |
           |                                     |
           |                                     |
           | <--         SASL_SUCC           --- |
           |                                     |
           |                                     |
           | ---         RPC_CALL           ---> |
           | <--         RPC_RESP           ---- |
   ```
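
To make the message ordering in the diagram concrete, here is a toy model of the server side. It is a sketch under assumptions, not the Pegasus implementation: the message names come from the diagram, while `FAILED`, `ServerNegotiation`, `replay`, and the `sasl_step` callback (a stand-in for the real SASL library, returning `"continue"`, `"done"`, or anything else for an error) are hypothetical.

```python
from enum import Enum, auto


class Msg(Enum):
    # Message names taken from the diagram above.
    SASL_LIST_MECHANISMS = auto()
    SASL_LIST_MECHANISMS_RESP = auto()
    SASL_SELECT_MECHANISMS = auto()
    SASL_SELECT_MECHANISMS_RESP = auto()
    SASL_INITIATE = auto()
    SASL_CHALLENGE = auto()
    SASL_CHALLENGE_RESP = auto()
    SASL_SUCC = auto()
    FAILED = auto()  # hypothetical marker, not part of the diagram


# Client message the server expects next, keyed by the last reply it sent.
EXPECTED = {
    None: Msg.SASL_LIST_MECHANISMS,
    Msg.SASL_LIST_MECHANISMS_RESP: Msg.SASL_SELECT_MECHANISMS,
    Msg.SASL_SELECT_MECHANISMS_RESP: Msg.SASL_INITIATE,
    Msg.SASL_CHALLENGE: Msg.SASL_CHALLENGE_RESP,
}


class ServerNegotiation:
    def __init__(self, sasl_step):
        self._sasl_step = sasl_step  # hypothetical stand-in for the SASL library
        self.last_sent = None

    def on_message(self, msg):
        """Consume one client message and return the server's reply."""
        if EXPECTED.get(self.last_sent) is not msg:
            # Out-of-order message, or a message after SUCC/FAILED.
            self.last_sent = Msg.FAILED
        elif msg is Msg.SASL_LIST_MECHANISMS:
            self.last_sent = Msg.SASL_LIST_MECHANISMS_RESP
        elif msg is Msg.SASL_SELECT_MECHANISMS:
            self.last_sent = Msg.SASL_SELECT_MECHANISMS_RESP
        else:
            # SASL_INITIATE and SASL_CHALLENGE_RESP both advance the SASL exchange.
            outcome = self._sasl_step()
            self.last_sent = {
                "continue": Msg.SASL_CHALLENGE,
                "done": Msg.SASL_SUCC,
            }.get(outcome, Msg.FAILED)
        return self.last_sent


def replay(outcomes):
    """Run a well-behaved client against the model; return the server replies."""
    steps = iter(outcomes)
    server = ServerNegotiation(lambda: next(steps))
    replies = [
        server.on_message(Msg.SASL_LIST_MECHANISMS),
        server.on_message(Msg.SASL_SELECT_MECHANISMS),
        server.on_message(Msg.SASL_INITIATE),
    ]
    while replies[-1] is Msg.SASL_CHALLENGE:
        replies.append(server.on_message(Msg.SASL_CHALLENGE_RESP))
    return replies
```

In this model, `replay(["continue", "done"])` ends with `Msg.SASL_SUCC`, while `replay(["error"])` stops with `Msg.FAILED` right after `SASL_INITIATE`, which is the kind of failure suspected here.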
   
   
![image](https://github.com/user-attachments/assets/9294dbe6-b4c8-4123-9701-f36e6f022ac1)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
