Besides the useless error message from pcp_child (it seems someone believed that EOF would set some error number in the global errno variable; I will fix this anyway), it looks to me as if the sockets are going dead. I suspect a network stack bug could cause this, but I'm not sure.

One thing you might want to try is changing this:

backend_hostname0 = 'db1.xxx.xxx'

to:

backend_hostname0 = ''

This will make pgpool use a UNIX domain socket for the communication channel to PostgreSQL, rather than TCP/IP. It may or may not affect the problem you have, since the network code in the kernel will be different. (I assume you are running pgpool on db1.xxx.xxx.)
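For reference, the relevant part of pgpool.conf would then look something like this. This is only a sketch based on the config you posted; I am assuming the PostgreSQL instance on db1 creates its socket file in backend_socket_dir (for example via unix_socket_directory in postgresql.conf), so please verify that on your side:

# was: backend_hostname0 = 'db1.xxx.xxx'
backend_hostname0 = ''
backend_port0 = 5433
backend_weight0 = 0.0

# with an empty hostname, pgpool looks for the backend's UNIX socket
# (.s.PGSQL.5433) in this directory
backend_socket_dir = '/usr/local/pgpool'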
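As for the "pcp_read() failed. reason: Success" message itself, here is a minimal sketch of the underlying issue (illustration only, not pgpool's actual pcp_read code): read() reports EOF by returning 0 and does not set errno, so formatting strerror(errno) at that point just prints whatever stale value happens to be there, typically "Success".

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static ssize_t read_and_report(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);

    if (n < 0)
    {
        /* a real error: errno is meaningful here */
        fprintf(stderr, "read() failed. reason: %s\n", strerror(errno));
        return -1;
    }
    if (n == 0)
    {
        /* EOF: read() does not set errno, so strerror(errno) would show
         * a stale value such as "Success" -- report EOF explicitly */
        fprintf(stderr, "unexpected EOF on socket\n");
        return -1;
    }
    return n;
}

int main(void)
{
    char buf[64];
    int fds[2];

    /* a pipe whose write end is already closed delivers immediate EOF,
     * much like a peer that has closed its connection */
    if (pipe(fds) != 0)
        return 1;
    close(fds[1]);

    read_and_report(fds[0], buf, sizeof(buf));
    close(fds[0]);
    return 0;
}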
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> Has anyone else run into this:
>
> My pgpool instance runs without problems for days on end and then suddenly
> stops responding to all requests.
> At the same moment, one of my three backend db hosts becomes completely
> inaccessible.
> Pgpool will not respond to shutdown, or even kill, and must be kill -9'd.
> Once all pgpool processes are out of the way, the inaccessible postgres
> server once again becomes responsive.
> I restart pgpool and everything works properly for a few more days.
>
> At the moment the problem occurs, pgpool's log output, which typically
> consists of just connection logging, turns into a steady stream of this:
> Nov 5 11:33:18 s...@obfuscated pgpool: 2009-11-05 11:33:18 ERROR: pid 12811: pcp_child: pcp_read() failed. reason: Success
> These errors show up sporadically in my pgpool logs all the time but don't
> appear to have any adverse effects until the whole thing takes a dive.
> I would desperately like to know what this error message is trying to tell
> me.
>
> I have not been able to correlate any given query/connection/process to the
> timing of the outages.
> Sometimes they happen at peak usage periods, sometimes they happen in the
> middle of the night.
>
> I experienced this problem using pgpool-II v1.3 and have recently upgraded
> to pgpool-II v2.2.5 but am still seeing the same issue.
>
> It may be relevant to point out that I am running pgpool on one of the
> machines that is also acting as a postgres backend, and it is always the
> postgres instance on the pgpool host that locks up.
> This morning I moved the pgpool instance onto another one of the postgres
> backend hosts in an effort to see whether the cohabitation of pgpool and
> postgres is causing problems, whether there is simply an issue with postgres
> on that host, or whether this is just a coincidence.
> I likely won't gain anything from this test for a day or more.
>
> Also relevant is that I am running Mammoth Replicator and am only using
> pgpool for connection load balancing and high availability.
>
> Below is my pgpool.conf.
>
> Any thoughts appreciated.
>
> -steve crandell
>
>
> #
> # pgpool-II configuration file sample
> # $Header: /cvsroot/pgpool/pgpool-II/pgpool.conf.sample,v 1.4.2.3 2007/10/12 09:15:02 y-asaba Exp $
>
> # Host name or IP address to listen on: '*' for all, '' for no TCP/IP
> # connections
> #listen_addresses = 'localhost'
> listen_addresses = '10.xxx.xxx.xxx'
>
> # Port number for pgpool
> port = 5432
>
> # Port number for pgpool communication manager
> pcp_port = 9898
>
> # Unix domain socket path. (The Debian package defaults to
> # /var/run/postgresql.)
> socket_dir = '/usr/local/pgpool'
>
> # Unix domain socket path for pgpool communication manager.
> pcp_socket_dir = '/usr/local/pgpool'
>
> # Unix domain socket path for the backend. Debian package defaults to
> # /var/run/postgresql!
> backend_socket_dir = '/usr/local/pgpool'
>
> # pgpool communication manager timeout. 0 means no timeout, but
> # strongly not recommended!
> pcp_timeout = 10
>
> # Number of pre-forked child processes
> num_init_children = 32
>
> # Number of connection pools allowed for a child process
> max_pool = 4
>
> # If idle for this many seconds, child exits. 0 means no timeout.
> child_life_time = 30
>
> # If idle for this many seconds, connection to PostgreSQL closes.
> # 0 means no timeout.
> #connection_life_time = 0
> connection_life_time = 30
>
> # If child_max_connections connections were received, child exits.
> # 0 means no exit.
> # change
> child_max_connections = 0
>
> # Maximum time in seconds to complete client authentication.
> # 0 means no timeout.
> authentication_timeout = 60
>
> # Logging directory (more accurately, the directory for the PID file)
> logdir = '/usr/local/pgpool'
>
> # Replication mode
> replication_mode = false
>
> # Set this to true if you want to avoid deadlock situations when
> # replication is enabled. There will, however, be a noticeable performance
> # degradation. A workaround is to set this to false and insert a /*STRICT*/
> # comment at the beginning of the SQL command.
> replication_strict = false
>
> # When replication_strict is set to false, there will be a chance for
> # deadlocks. Set this to nonzero (in milliseconds) to detect this
> # situation and resolve the deadlock by aborting the current session.
> replication_timeout = 5000
>
> # Load balancing mode, i.e., all SELECTs except in a transaction block
> # are load balanced. This is ignored if replication_mode is false.
> # change
> load_balance_mode = true
>
> # If there's a data mismatch between master and secondary,
> # start degeneration to stop replication mode
> replication_stop_on_mismatch = false
>
> # If true, replicate SELECT statements when load balancing is disabled.
> # If false, they are only sent to the master node.
> # change
> replicate_select = true
>
> # Semicolon-separated list of queries to be issued at the end of a session
> reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'
>
> # If true, print a timestamp on each log line.
> print_timestamp = true
>
> # If true, operate in master/slave mode.
> # change
> master_slave_mode = true
>
> # If true, cache connection pool.
> connection_cache = false
>
> # Health check timeout. 0 means no timeout.
> health_check_timeout = 20
>
> # Health check period. 0 means no health check.
> health_check_period = 0
>
> # Health check user
> health_check_user = 'nobody'
>
> # If true, automatically lock the table on INSERT statements to keep SERIAL
> # data consistent. An /*INSERT LOCK*/ comment has the same effect. A
> # /*NO INSERT LOCK*/ comment disables the effect.
> insert_lock = false
>
> # If true, ignore leading white space in each query while pgpool judges
> # whether the query is a SELECT so that it can be load balanced. This
> # is useful for certain APIs such as DBI/DBD, which are known to add an
> # extra leading white space.
> ignore_leading_white_space = false
>
> # If true, print all statements to the log. Like the log_statement option
> # in PostgreSQL, this allows for observing queries without engaging in full
> # debugging.
> log_statement = false
>
> # If true, incoming connections will be printed to the log.
> # change
> log_connections = true
>
> # If true, the hostname will be shown in ps status. Also shown in the
> # connection log if log_connections = true.
> # Be warned that this feature adds the overhead of a hostname lookup.
> log_hostname = false
>
> # if non 0, run in parallel query mode
> parallel_mode = false
>
> # if non 0, use query cache
> enable_query_cache = 0
>
> # set pgpool2 hostname
> pgpool2_hostname = ''
>
> # system DB info
> #system_db_hostname = 'localhost'
> #system_db_port = 5432
> #system_db_dbname = 'pgpool'
> #system_db_schema = 'pgpool_catalog'
> #system_db_user = 'pgpool'
> #system_db_password = ''
>
> # backend_hostname, backend_port, backend_weight
> # here are examples
> backend_hostname0 = 'db1.xxx.xxx'
> backend_port0 = 5433
> backend_weight0 = 0.0
>
> backend_hostname1 = 'db2.xxx.xxx'
> backend_port1 = 5433
> backend_weight1 = 0.4
>
> backend_hostname2 = 'db3.xxx.xxx'
> backend_port2 = 5433
> backend_weight2 = 0.6
>
> # - HBA -
>
> # If true, use pool_hba.conf for client authentication. In pgpool-II
> # 1.1, the default value is false. The default value will be true in
> # 1.2.
> enable_pool_hba = false

_______________________________________________
Pgpool-general mailing list
[email protected]
http://pgfoundry.org/mailman/listinfo/pgpool-general
