Hi all,

Thank you very much for all the suggestions and comments. Here are some further thoughts on this feature. This is also an RFC :)
My original idea was to write a simple implementation of tickle ACK that would cover most users' needs. I now realize that real HA deployments are quite complicated, and I'm not sure the solution we discussed above can really achieve that goal.

One problem is whether the users who really need this feature will generally have shared storage or a configured DRBD. If most of them do, I think we can go forward. If not, we can consider implementing this by calling the Corosync/openAIS API to sync the TCP connection information across the cluster. That is a little expensive, since we only want to sync information related to this one feature. And if a single service group (I mean not the clone scenario, like a cluster IP) is the most common use case, we needn't send the TCP connection information to all the other nodes, since only the one node that takes over the IP will use it. But certainly the Corosync API or an openAIS service can do the job, especially in the scenario where the user doesn't have cluster-visible storage.

The other problem is that we poll the established TCP connections once per monitor interval, so the result is not very precise: connections may have changed within one interval. An event-driven mechanism would be ideal. When the TCP connections change, the kernel notifies user space, and a user-space daemon then handles the notification. It seems tcp_diag can provide this function on the kernel side, so we would just need to write the user-space program that talks to tcp_diag. (Is that so?) I'm going to investigate this, and if it is feasible, I'd like to implement it. If anyone knows more about tcp_diag, has another idea for implementing the event-driven mechanism, or thinks there is no need to try this, please comment :)

So there is another way to implement this feature (openAIS API + tcp_diag). It is a little complicated but should be more precise. Right?
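For reference, the polling approach can be sketched roughly as below. This is only an illustration, not the resource agent code: the function names (decode_addr, established_connections, diff_snapshots) are made up for this example, it reads text in the /proc/net/tcp format (IPv4 only), and it assumes a little-endian host when decoding the hex addresses. Comparing two snapshots shows which connections appeared or disappeared between two monitor runs; anything that opens and closes inside one interval is never seen, which is the imprecision described above.

```python
import socket
import struct

TCP_ESTABLISHED = "01"  # state code for ESTABLISHED in /proc/net/tcp


def decode_addr(hex_addr):
    """Turn 'HHHHHHHH:PPPP' from /proc/net/tcp into (ip, port).

    Assumption: little-endian host, so the IPv4 bytes appear reversed.
    """
    ip_hex, port_hex = hex_addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def established_connections(proc_text, service_ip):
    """Return the set of (local, remote) pairs established on service_ip."""
    conns = set()
    for line in proc_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 4 or fields[3] != TCP_ESTABLISHED:
            continue
        local = decode_addr(fields[1])
        remote = decode_addr(fields[2])
        if local[0] == service_ip:
            conns.add((local, remote))
    return conns


def diff_snapshots(old, new):
    """Return (added, removed) connections between two polls."""
    return new - old, old - new
```

A monitor run would read /proc/net/tcp, call established_connections() with the service IP, and write the result to wherever the takeover node can read it (shared storage, DRBD, or a cluster message).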
I would still like to implement the simple approach first, and I hope it can meet most users' needs. I would really appreciate your input (especially on what most production environments look like and what admins complain about most).

Thanks,
Jiaju
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/