[HACKERS] pg_stat_replication when standby is unreachable

2013-05-28 Thread Abhishek Rai
Hello Postgres gurus, I'm writing a thin clustering layer on top of Postgres using the synchronous replication feature. The goal is to enable HA and survive permanent loss of a single node. Using an external coordinator (Zookeeper), one of the nodes is elected as the primary. The primary node

Re: [HACKERS] pg_stat_replication when standby is unreachable

2013-05-29 Thread Abhishek Rai
On Wed, May 29, 2013 at 9:16 AM, Peter Eisentraut pete...@gmx.net wrote: On 5/28/13 9:42 PM, Abhishek Rai wrote: Detecting primary health is easy. But what is the best way to know if the standby is live? Since this is not a hot-standby, I cannot send queries to it. Then how do you

Re: [HACKERS] pg_stat_replication when standby is unreachable

2013-05-29 Thread Abhishek Rai
On Wed, May 29, 2013 at 9:14 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Abhishek Rai abhishek...@gmail.com writes: SELECT * from pg_stat_replication(); I've noticed that when I terminate the standby (cleanly or through kill -9), the result of above function goes from 1 row

Re: [HACKERS] pg_stat_replication when standby is unreachable

2013-05-29 Thread Abhishek Rai
I looked a bit more into the code and it appears to me that the following are true: - A separate wal sender process is created on the primary side for each connected standby. - The wal sender process terminates (walsender.c / WalSndLoop) when there is an error writing to the standby's socket. -
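
A minimal sketch of the primary-side liveness check discussed in this thread, assuming the rows of pg_stat_replication have already been fetched on the primary and converted to dictionaries keyed by column name. The function name standby_is_live and the standby name "standby1" are hypothetical and used only for illustration; "state" and "application_name" are columns of the pg_stat_replication view.

```python
from typing import Iterable, Mapping, Optional


def standby_is_live(rows: Iterable[Mapping[str, object]],
                    application_name: Optional[str] = None) -> bool:
    """Return True if at least one matching row reports a streaming wal sender.

    rows: rows of pg_stat_replication as dictionaries (assumed to be fetched
          elsewhere; no database connection is made here).
    application_name: optional filter on the standby's application_name.
    """
    for row in rows:
        # Skip rows belonging to other standbys when a filter is given.
        if application_name is not None and row.get("application_name") != application_name:
            continue
        # A wal sender that has caught up with the standby reports "streaming".
        if row.get("state") == "streaming":
            return True
    return False


if __name__ == "__main__":
    sample = [{"application_name": "standby1", "state": "streaming"}]
    print(standby_is_live(sample, "standby1"))
    print(standby_is_live([], "standby1"))
```

Since a row disappears only once its wal sender process exits, a standby that is unreachable but whose socket has not yet reported a write error may still be listed, so a check like this may lag behind the actual failure.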