swuferhong commented on issue #2110: URL: https://github.com/apache/fluss/issues/2110#issuecomment-3621798870
> I understand this question as follows:
>
> 1. Deploy Fluss in Kubernetes
> 2. Use a Flink lookup join to query the Fluss table
> 3. Perform a rolling upgrade of the Fluss cluster (Pod rebuild)
> 4. The client keeps trying to connect and keeps failing until the Flink task times out completely, and then the Flink task fails
>
> I think it could be like this:
>
> 1. When the request times out, actively disconnect the link, then listen to ZooKeeper changes in real time, and optimize TCP keepalive
>    Probably:
>    Pod deletion -> ZooKeeper notification -> Client updates metadata -> Immediately disconnect old connection -> Next request goes directly to the new IP
>
> [@swuferhong](https://github.com/swuferhong) If you think this plan is feasible, you can assign this task to me.

Hi, @buvb. I'll look into the detailed root cause first.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
