Hello Alexey,

On Tue, Mar 04, 2014 at 10:27:42PM +0700, Alexey Medvedchikov wrote:
> Hello,
>
> I'm maintain postgresql cluster with streaming replication for php-based
> webapp. And for a few days I'm trying to get rid of errors in my setup:
>
>       Application server               DB server
> | PHP -> pgbouncer -> haproxy | -> | postgresql |
>
> pgbouncer pools connections from php (session-based) and haproxy
> load-balance and failovering 3 backend postgresql servers. Every ~10 min
> haproxy drops connection and pgbouncer reports:
>
> application logs: failed to execute the SQL statement: SQLSTATE[08P01]:
> <<Unknown error>>: 7 ERROR: server conn crashed?
> syslog: Mar 4 22:16:12 app1 pgbouncer[15572]: C-0x1d0c130:
> mydb/pgsql@unix:5432
> Pooler Error: server conn crashed?
>
> When I remove haproxy from this setup or change balancer to any other
> tcp-balancer: balancer-ng for example everything works fine!
>
> I tried almost everything I can imagine:
> - changing connection between php, pgbouncer and haproxy to tcp/ip or
> unix-socket
> - changing timeouts, conn lifetimes, keepalive, addition tcp options, pool
> modes

What values did you set your timeouts to? Your config shows very short
ones compared to the 10 minutes you're talking about:

> timeout connect 5s
> timeout client 2s
> timeout server 10s

In practice, the connection will be cut after 2 seconds of inactivity, so
what I suspect is that, depending on your load, some connections remain
idle and expire. Also, 2s is very short: it's shorter than a TCP retransmit
on packet loss (3s), so you can easily trigger the timeout due to a lost
ACK.

If you want your connections to last long enough, you should use larger
timeouts (at least as large as the connection lifetime you're expecting).

Willy
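
For illustration only, here is a minimal sketch of what larger timeouts
might look like in the haproxy configuration. The 15m values are an
assumption, chosen only because they exceed the ~10 minute interval
mentioned above. They are not taken from the original config, and the
right values depend on how long pgbouncer is expected to keep its server
connections idle.

```
defaults
    mode tcp
    # Time allowed to establish the TCP connection to a backend;
    # unchanged from the quoted config.
    timeout connect 5s
    # Maximum inactivity on the client side (pgbouncer -> haproxy).
    # Hypothetical value, longer than the expected idle connection life.
    timeout client  15m
    # Maximum inactivity on the server side (haproxy -> postgresql).
    # Hypothetical value, set equal to the client timeout here.
    timeout server  15m
```

With settings along these lines, an idle pooled connection would only be
closed by haproxy after 15 minutes without traffic in either direction,
rather than after 2 seconds.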

