BewareMyPower opened a new pull request #15009:
URL: https://github.com/apache/pulsar/pull/15009


   ### Motivation
   
   Recently we found C++ client might not refresh the OAuth2 token after
   the reconnection. It's because when Pulsar proxy is enabled, the
   `AuthResponse` might be sent by proxy, which leads to a disconnection
   from broker side. See 
https://github.com/apache/pulsar/blob/94cc46a66fbe322472dbb657803d21320e59079c/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java#L687
   
   Then, proxy will return a `ServiceNotReady` error to client as the
   result of topic lookup. However, in this case, C++ client only completes
   the future of lookup with `ResultConnectError`. The `ClientConnection`
   object is still cached in the pool and will be used for followed
   interactions.
   
   In addition, the `AuthChallenge` request from broker might never be sent
   to client when proxy is enabled. So there is no chance for client to
   refresh the token unless a new connection is established.
   
   ### Modifications
   
   - Like what Java client does, add `checkServerError` method to
     `ClientConnection`, which closes the socket for `ServiceNotReady`
     error. Here we don't call `close` directly just not to interrupt the
     execution of `handleIncomingCommand`. `close` will be called in
     `handleRead` because the `err` will be
     `boost::asio::error::bad_descriptor`.
   - To ensure `ClientConnection::close` can remove itself from the
     connection pool, add `remove` method in `ConnectionPool` to remove a
     connection by address. Then hold a reference to the pool in
     `ClientConnection`. At the beginning of `close()`, remove the
     connection from pool.
   
   ### Verifying this change
   
   It's hard to mock the `ServiceNotReady` case in unit test. I have tested
   this patch in my private env and let the OAuth2 token expire quickly. We
   can see the following logs (hosts are hidden):
   
   ```
   2022-04-03 13:07:55.616 ERROR [0x700006480000] ClientConnection:1002 | 
[<local-ip>:62059 -> <remote-ip>:6651] Failed lookup req_id: 136 error: 
ServiceUnitNotReady msg: writeAddress(..) failed: Broken pipe
   2022-04-03 13:07:55.616 ERROR [0x700006480000] ClientImpl:443 | Error 
getting topic partitions metadata: ServiceUnitNotReady
   2022-04-03 13:07:55.617 ERROR [0x700006480000] ClientConnection:599 | 
[<local-ip>:62059 -> <remote-ip>:6651] Read operation failed: Bad file 
descriptor
   2022-04-03 13:07:55.617 INFO  [0x700006480000] ConnectionPool:119 | Removing 
connection from pool for <proxy-url> use_count: 3 @ 0x7f7d3680ec00
   2022-04-03 13:07:55.617 INFO  [0x700006480000] ClientConnection:1568 | 
[<local-ip>:62059 -> <remote-ip>:6651] Connection closed
   2022-04-03 13:07:55.617 INFO  [0x700006480000] ClientConnection:265 | 
[<local-ip>:62059 -> <remote-ip>:6651] Destroyed connection
   2022-04-03 13:07:56.618 INFO  [0x113d8d600] ClientConnection:191 | [<none> 
-> <proxy-url>] Create ClientConnection, timeout=10000
   2022-04-03 13:08:00.841 INFO  [0x113d8d600] ConnectionPool:97 | Created 
connection for <proxy-url>
   2022-04-03 13:08:01.348 INFO  [0x700006480000] ClientConnection:377 | 
[<local-ip>:62552 -> <remote-ip>:6651] Connected to broker
   ```
   
   We can see a new connection (with a new port) was established after
   handling the `ServiceUnitNotReady` error.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to