Hello Alexey,

On Tue, Mar 04, 2014 at 10:27:42PM +0700, Alexey Medvedchikov wrote:
> Hello,
> 
> I maintain a postgresql cluster with streaming replication for a php-based
> webapp. For a few days I've been trying to get rid of errors in my setup:
> 
>        Application server            DB server
> | PHP -> pgbouncer -> haproxy | -> | postgresql |
> 
> pgbouncer pools connections from php (session-based) and haproxy
> load-balances and handles failover for 3 backend postgresql servers. Every
> ~10 min haproxy drops a connection and pgbouncer reports:
> 
> application logs: failed to execute the SQL statement: SQLSTATE[08P01]:
> <<Unknown error>>: 7 ERROR:  server conn crashed?
> syslog: Mar  4 22:16:12 app1 pgbouncer[15572]: C-0x1d0c130:
> mydb/pgsql@unix:5432
> Pooler Error: server conn crashed?
> 
> When I remove haproxy from this setup or change the balancer to any other
> tcp-balancer (balancer-ng, for example), everything works fine!
> 
> I tried almost everything I can imagine:
> - changing connection between php, pgbouncer and haproxy to tcp/ip or
> unix-socket
> - changing timeouts, conn lifetimes, keepalive, additional tcp options, pool
> modes

How long did you set your timeouts? Your config shows very short ones
compared to the 10 minutes you're talking about:

>         timeout connect 5s
>         timeout client 2s
>         timeout server 10s

In practice, the connection will be cut after 2 seconds of inactivity, so
what I suspect is that, depending on your load, some connections remain
idle and expire.

Also, 2s is very short: it's shorter than a TCP retransmit on packet loss
(3s), so you can easily trigger the timeout due to a lost ACK.

If you want your connections to last long enough, you should use larger
timeouts (at least as large as the connection lifetime you're expecting).
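
As a rough sketch only, something like the following might be a starting
point. The 1h value is purely an assumption for illustration (it is not
taken from your setup); pick whatever matches the longest idle period your
pooled connections are expected to survive:

        # illustrative values, to be adjusted to the real connection lifetime
        timeout connect 5s
        # assumed: idle pooled connections may live up to one hour
        timeout client 1h
        timeout server 1h

Keeping the client and server timeouts equal is the simplest choice for
plain TCP proxying, since the connection is idle on both sides at the same
time.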

Willy

