> Yes I'm agree but it doesn't cover all cases too, please take a look at
> the following bug report
> http://pgfoundry.org/pipermail/pgpool-hackers/2011-January/000525.html

Yes, I have read it. Your scenario leads to a so-called "split brain",
where there are two (or more) primary nodes. According to you, the
condition to reproduce the problem is:

>> You have to use an existing ip address /host running postmaster like an
>> other slave. This host must not have a wal writer process.

This does not seem to be a very common case, does it? Also, "split
brain" can occur even more easily:

- Node 0 (primary) is taken down by the administrator
- Node 1 is automatically promoted to the new primary
- The stupid administrator decides to fail back node 0
- Now you have two primary nodes (split brain)!

I believe there are even more cases that could cause split brain. Your
scenario is just one of them. So unless your particular case is the
worst one and happens frequently, let's leave find_primary_node() as
it is.

> We need to fix that, any idea ? I've attached a video for demonstration
> in the last thread response.

One idea is to have the failover script wait N seconds for the
promoting standby, expecting it to become a "true" primary (you can
check this by issuing "show transaction_read_only"). If it does not,
the script issues pg_ctl to shut down the standby that failed to
promote.

Probably we should have something like a "pgpool-shutdown-postmaster()"
function to shut down PostgreSQL. This would make writing failover
scripts a lot easier than using pg_ctl. It would also reduce the
security risk, since using pg_ctl requires ssh access from the host
where pgpool is running to the host where PostgreSQL is running.

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
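
P.S. To make the "wait N seconds, then shut down" idea concrete, here is
a rough sketch of what such a failover helper might look like. It is
only an illustration, not an existing pgpool feature: the function
names, the "postgres" database, the default timeout and the use of ssh
are all assumptions of mine. It relies only on "SHOW
transaction_read_only" returning "on" while the node is still a
standby, and on "pg_ctl stop" being reachable over ssh.

```python
import subprocess
import time


def is_read_only(host, port=5432):
    """Return True while the backend still reports itself as read-only.

    Assumption: psql can connect to a database named "postgres" on the
    target host without prompting for a password.
    """
    result = subprocess.run(
        ["psql", "-h", host, "-p", str(port), "-d", "postgres",
         "-At", "-c", "SHOW transaction_read_only"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip() == "on"


def shutdown_via_ssh(host, datadir):
    """Stop the postmaster with pg_ctl over ssh (hypothetical helper)."""
    subprocess.run(
        ["ssh", host, "pg_ctl", "-D", datadir, "-m", "fast", "stop"],
        check=True,
    )


def wait_for_promotion(check_read_only, shutdown, timeout=30.0, interval=1.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll until the node accepts writes, or shut it down after `timeout`.

    check_read_only: callable returning True while the node is a standby.
    shutdown: callable invoked once if the node never becomes primary.
    Returns True if the node was promoted, False if it was shut down.
    """
    deadline = clock() + timeout
    while True:
        try:
            if not check_read_only():
                return True  # node now accepts writes: a "true" primary
        except Exception:
            pass  # node not reachable yet; keep polling until the deadline
        if clock() >= deadline:
            break
        sleep(interval)
    shutdown()
    return False
```

In a failover script this might be called as
wait_for_promotion(lambda: is_read_only(new_primary),
lambda: shutdown_via_ssh(new_primary, datadir)), where new_primary and
datadir come from the failover command arguments. Note that the
shutdown step still needs ssh, which is exactly the dependency a
built-in function would remove.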
