Hello,
I maintain a PostgreSQL cluster with streaming replication for a PHP-based
webapp, and for a few days I have been trying to get rid of errors in my setup:
      Application server                DB server
| PHP -> pgbouncer -> haproxy | -> | postgresql |
pgbouncer pools connections from PHP (session-based), and haproxy
load-balances and handles failover across 3 backend PostgreSQL servers.
Every ~10 min haproxy drops a connection and pgbouncer reports:
application logs: failed to execute the SQL statement: SQLSTATE[08P01]:
<<Unknown error>>: 7 ERROR: server conn crashed?
syslog: Mar 4 22:16:12 app1 pgbouncer[15572]: C-0x1d0c130: mydb/pgsql@unix:5432 Pooler Error: server conn crashed?
When I remove haproxy from this setup, or replace it with any other
TCP balancer (balancer-ng, for example), everything works fine!
I have tried almost everything I can imagine:
- switching the connections between PHP, pgbouncer and haproxy between
TCP/IP and unix sockets
- changing timeouts, connection lifetimes, keepalive, additional TCP options,
pool modes
- downgrading to older versions of pgbouncer and haproxy
- reducing the number of TIME_WAIT sockets by switching other components
to unix sockets where possible
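
One more check that might help narrow this down is to measure how long a completely idle TCP connection through the balancer survives, independent of PHP and pgbouncer. The sketch below is only an illustration: the function name `idle_survival` is made up for this post, and the address 127.0.0.1:6432 is the haproxy listener from the config further down. Note that a raw connection that never sends a startup packet is also subject to PostgreSQL's `authentication_timeout` (1 minute by default), so only idle periods below that are meaningful.

```python
import socket
import time


def idle_survival(host, port, idle_seconds, connect_timeout=5.0):
    """Open a TCP connection, stay silent for idle_seconds, then report
    whether the connection is still open (True) or was closed (False).

    Hypothetical helper for illustration; not part of any tool above.
    """
    with socket.create_connection((host, port), timeout=connect_timeout) as sock:
        time.sleep(idle_seconds)
        sock.setblocking(False)
        try:
            data = sock.recv(1)
        except BlockingIOError:
            return True   # nothing to read: the connection is still open
        except ConnectionError:
            return False  # the peer reset or aborted the connection
        return data != b""  # b"" means the peer closed the connection


if __name__ == "__main__":
    # Assumed target: the haproxy "pgsql" listener from the config below.
    for seconds in (1, 3, 5, 15, 30):
        alive = idle_survival("127.0.0.1", 6432, seconds)
        print(f"idle {seconds:>2}s -> {'open' if alive else 'closed'}")
```

Running the same probe directly against one of the backends (or through the other balancer) would show whether the idle cutoff comes from haproxy or from somewhere else.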
Any ideas what to look for?
software versions:
php5 5.4.23
pgbouncer 1.5.4
haproxy 1.5dev22
pgbouncer config:
==============
[databases]
* = host=127.0.0.1 port=6432
[pgbouncer]
syslog = 1
pidfile = /var/run/postgresql/pgbouncer.pid
listen_addr = 127.0.0.1
listen_port = 5432
unix_socket_dir = /var/run/postgresql
listen_backlog = -1
auth_type = trust
auth_file = /etc/pgbouncer/userlist.txt
admin_users =
stats_users = pgsql
pool_mode = session
server_reset_query = DISCARD ALL;
ignore_startup_parameters = application_name
server_check_query = select 1
server_check_delay = 10
max_client_conn = 5120
default_pool_size = 16
reserve_pool_size = 0
reserve_pool_timeout = 0
log_connections = 0
log_disconnections = 0
log_pooler_errors = 1
server_lifetime = 1200
server_idle_timeout = 60
query_timeout = 0
client_login_timeout = 60
==============
haproxy config:
==============
defaults
    option splice-auto
    option tcpka
    timeout connect 5s
    timeout client 2s
    timeout server 10s

listen stats :18080
    mode http
    stats enable
    stats uri /

listen pgsql 127.0.0.1:6432
    maxconn 3000
    mode tcp
    balance roundrobin
    option tcp-smart-accept
    option tcp-smart-connect
    option pgsql-check user postgres
    server slave1 10.0.0.1:5432
    server slave2 10.0.0.2:5432
    server slave3 10.0.0.3:5432
==============
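
Since the two configs above carry several timeouts, here is a small sketch that puts them side by side. It is a hypothesis check rather than a diagnosis: in haproxy, `timeout client` and `timeout server` are inactivity timeouts, so any of them that is shorter than pgbouncer's `server_idle_timeout` could in principle close a pooled connection while pgbouncer still considers it usable. The helper names `to_seconds` and `shorter_than_pooler` are made up for this post; the values are copied from the configs above.

```python
def to_seconds(value):
    """Convert an haproxy time value such as '2s', '500ms' or '1m' to seconds.

    A bare number is treated as milliseconds, haproxy's default unit.
    """
    units = {"us": 1e-6, "ms": 1e-3, "s": 1, "m": 60, "h": 3600, "d": 86400}
    # Two-letter suffixes are checked first so '500ms' is not read as '500m' + 's'.
    for suffix in ("us", "ms", "s", "m", "h", "d"):
        if value.endswith(suffix):
            return float(value[: -len(suffix)]) * units[suffix]
    return float(value) / 1000


def shorter_than_pooler(haproxy_timeouts, pooler_idle_seconds):
    """Return the haproxy timeouts (in seconds) below the pooler's idle timeout."""
    return {
        name: to_seconds(raw)
        for name, raw in haproxy_timeouts.items()
        if to_seconds(raw) < pooler_idle_seconds
    }


# Values taken from the haproxy and pgbouncer configs in this post.
haproxy_timeouts = {"timeout client": "2s", "timeout server": "10s"}
server_idle_timeout = 60  # pgbouncer, seconds

if __name__ == "__main__":
    for name, seconds in shorter_than_pooler(haproxy_timeouts, server_idle_timeout).items():
        print(f"{name} = {seconds}s is shorter than server_idle_timeout = {server_idle_timeout}s")
```

Whether this mismatch actually explains the drops is an open question, since the errors show up roughly every 10 minutes rather than every few seconds.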