Hello,

I maintain a PostgreSQL cluster with streaming replication for a PHP-based
webapp, and for a few days I have been trying to get rid of errors in my setup:

       Application server            DB server
| PHP -> pgbouncer -> haproxy | -> | postgresql |

pgbouncer pools connections from PHP (session-based), and haproxy handles
load balancing and failover across 3 backend PostgreSQL servers. Roughly
every 10 minutes haproxy drops a connection and pgbouncer reports:

application log:
failed to execute the SQL statement: SQLSTATE[08P01]: <<Unknown error>>: 7 ERROR:  server conn crashed?

syslog:
Mar  4 22:16:12 app1 pgbouncer[15572]: C-0x1d0c130: mydb/pgsql@unix:5432 Pooler Error: server conn crashed?

When I remove haproxy from this setup, or replace it with any other
TCP balancer (balancer-ng, for example), everything works fine!

I have tried almost everything I can think of:
- switching the connections between PHP, pgbouncer and haproxy between
TCP/IP and unix sockets
- changing timeouts, connection lifetimes, keepalive, additional TCP
options, pool modes
- downgrading to older versions of pgbouncer and haproxy
- reducing the number of TIME_WAIT sockets by switching other components
to unix sockets where possible
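
To compare the balancers while changing timeouts, it may help to measure
how long an idle TCP connection survives through each one. Below is a
minimal sketch in Python. The function name seconds_until_peer_close is
hypothetical and not part of any of the tools above, and the probe only
opens a raw TCP connection and sends nothing, so it does not speak the
PostgreSQL protocol and the backend itself may also close the connection.

```python
import socket
import time


def seconds_until_peer_close(host, port, max_wait=30.0):
    """Connect, send nothing, and time how long the peer keeps the socket open.

    Returns the elapsed seconds until the peer closes the connection,
    or None if it is still open after max_wait seconds.
    """
    with socket.create_connection((host, port), timeout=max_wait) as sock:
        start = time.monotonic()
        deadline = start + max_wait
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return None
            sock.settimeout(remaining)
            try:
                data = sock.recv(4096)
            except socket.timeout:
                return None                      # still open at the deadline
            except ConnectionResetError:
                return time.monotonic() - start  # peer closed with a reset
            if not data:
                return time.monotonic() - start  # orderly close by the peer
            # Any data received is ignored; keep waiting for the close.
```

Assuming the haproxy listener from the config below, calling
seconds_until_peer_close("127.0.0.1", 6432) would probe the path through
haproxy; pointing it at another balancer on the same port gives a number
to compare against.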

Any ideas what to look for?

software versions:
php5 5.4.23
pgbouncer 1.5.4
haproxy 1.5dev22

pgbouncer config:
==============
[databases]
* = host=127.0.0.1 port=6432

[pgbouncer]
syslog = 1
pidfile = /var/run/postgresql/pgbouncer.pid
listen_addr = 127.0.0.1
listen_port = 5432
unix_socket_dir = /var/run/postgresql
listen_backlog = -1
auth_type = trust
auth_file = /etc/pgbouncer/userlist.txt
admin_users =
stats_users = pgsql
pool_mode = session
server_reset_query = DISCARD ALL;
ignore_startup_parameters = application_name
server_check_query = select 1
server_check_delay = 10
max_client_conn = 5120
default_pool_size = 16
reserve_pool_size = 0
reserve_pool_timeout = 0
log_connections = 0
log_disconnections = 0
log_pooler_errors = 1
server_lifetime = 1200
server_idle_timeout = 60
query_timeout = 0
client_login_timeout = 60
==============

haproxy config:
==============
defaults
        option splice-auto
        option tcpka
        timeout connect 5s
        timeout client 2s
        timeout server 10s

listen stats :18080
        mode http
        stats enable
        stats uri /

listen pgsql 127.0.0.1:6432
        maxconn 3000
        mode tcp
        balance roundrobin
        option tcp-smart-accept
        option tcp-smart-connect
        option pgsql-check user postgres
        server slave1 10.0.0.1:5432
        server slave2 10.0.0.2:5432
        server slave3 10.0.0.3:5432
==============
