I think this is normal. We can easily reproduce your problem:

1) set num_init_children = 1
2) connect to pgpool via psql
3) fire up more psql sessions
4) psql(3) will be "frozen" until psql(2) disconnects its session

This behavior is perfectly expected. Is this what you meant?
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> I've done extensive testing since my last message.
> The problem appears to be that I am getting more connections to pgpool than
> I have num_init_children.
> Per my tests, as soon as pgpool gets connection num_init_children + 1, it
> locks up and takes at least one of the backend nodes with it.
>
> Can anyone else confirm this?
> I'd like to make sure it is not just my particular configuration causing
> the issue.
>
> In the meantime I have simply increased the num_init_children parameter
> significantly in an effort to stay well ahead of the number of incoming
> connections and this appears to be working.
>
> -s
>
> On Sun, Nov 8, 2009 at 3:23 AM, Tatsuo Ishii <[email protected]> wrote:
>
> > I would like to know what the condition of pgpool is. What does ps
> > show for the pgpool processes? Even better, can you attach a debugger
> > to one of the pgpool processes and get a backtrace?
> > --
> > Tatsuo Ishii
> > SRA OSS, Inc. Japan
> >
> > > I've tried setting the local backend_hostname = ''
> > > The same problems are occurring.
> > > Pgpool has actually failed something like 4 separate times today,
> > > all but one of them using this local socket configuration.
> > >
> > > Any other thoughts?
> > >
> > > thx
> > > -s
> > >
> > > On Thu, Nov 5, 2009 at 10:52 PM, Tatsuo Ishii <[email protected]> wrote:
> > >
> > > > > Actually I was running pgpool on db2 (backend_hostname1) and am
> > > > > now running it on db3 (backend_hostname2).
> > > > > I have actually suspected that pgpool might be opting for some
> > > > > sort of socket connection to the local instance of postgres
> > > > > instead of using the TCP/IP connection parameters in an effort
> > > > > to speed things up.
> > > > >
> > > > > I have done my best to ensure that pgpool has completely
> > > > > separate socket directories, but it wouldn't be hard for pgpool
> > > > > to find a local postgres socket if it wanted to. If I end up
> > > > > with another outage and this time db3 is the postgres instance
> > > > > that locks up, I'll be fairly certain that this is the problem,
> > > > > but for the moment I can only speculate.
> > > > >
> > > > > I'm assuming you're suggesting I set backend_hostname0 = ''
> > > > > because it is already weighted to 0.0 anyway?
> > > >
> > > > No. Because I thought you were running pgpool on db1. '' means
> > > > force pgpool to use a UNIX domain socket. So if you are running
> > > > pgpool on db3, you could set:
> > > >
> > > > backend_hostname2 = ''
> > > >
> > > > > I have db1 (backend_hostname0) weighted to 0.0 in an effort to
> > > > > direct all selects to the two slave hosts (db2 and db3) but
> > > > > still benefit from pgpool intelligently sending writes to db1.
> > > > > db1 is the mammoth master host and needs all available i/o to
> > > > > deal with writes.
> > > > > My understanding is that this is how "master_slave_mode = true"
> > > > > works. Writes are always directed to backend_hostname0.
> > > > >
> > > > > If I need to reevaluate that thinking, please advise, but that
> > > > > has been working for me for months now.
> > > > >
> > > > > thx
> > > > > -s
> > > > >
> > > > > On Thu, Nov 5, 2009 at 9:26 PM, Tatsuo Ishii <[email protected]> wrote:
> > > > >
> > > > > > Besides the useless error message from pcp_child (it seems
> > > > > > someone believed that EOF will set some error number in the
> > > > > > global errno variable; I will fix this anyway), for me it
> > > > > > seems the socket files are going dead. I suspect some network
> > > > > > stack bug could cause this, but I'm not sure.
> > > > > > One thing you might want to try is changing this:
> > > > > >
> > > > > > backend_hostname0 = 'db1.xxx.xxx'
> > > > > >
> > > > > > to:
> > > > > >
> > > > > > backend_hostname0 = ''
> > > > > >
> > > > > > This will make pgpool use a UNIX domain socket for the
> > > > > > communication channel to PostgreSQL, rather than TCP/IP. It
> > > > > > may or may not affect the problem you have, since the network
> > > > > > code in the kernel will be different.
> > > > > >
> > > > > > (I assume you are running pgpool on db1.xxx.xxx)
> > > > > > --
> > > > > > Tatsuo Ishii
> > > > > > SRA OSS, Inc. Japan
> > > > > >
> > > > > > > Has anyone else run into this:
> > > > > > >
> > > > > > > My pgpool instance runs without problems for days on end and
> > > > > > > then suddenly stops responding to all requests.
> > > > > > > At the same moment, one of my three backend db hosts becomes
> > > > > > > completely inaccessible.
> > > > > > > Pgpool will not respond to shutdown, or even kill, and must
> > > > > > > be kill -9'd.
> > > > > > > Once all pgpool processes are out of the way, the
> > > > > > > inaccessible postgres server once again becomes responsive.
> > > > > > > I restart pgpool and everything works properly for a few
> > > > > > > more days.
> > > > > > >
> > > > > > > At the moment the problem occurs, pgpool's log output, which
> > > > > > > typically consists of just connection logging, turns into a
> > > > > > > steady stream of this:
> > > > > > > Nov 5 11:33:18 s...@obfuscated pgpool: 2009-11-05 11:33:18
> > > > > > > ERROR: pid 12811: pcp_child: pcp_read() failed. reason: Success
> > > > > > > These errors show up sporadically in my pgpool logs all the
> > > > > > > time but don't appear to have any adverse effects until the
> > > > > > > whole thing takes a dive.
> > > > > > > I would desperately like to know what this error message is
> > > > > > > trying to tell me.
> > > > > > >
> > > > > > > I have not been able to correlate any given
> > > > > > > query/connection/process to the timing of the outages.
> > > > > > > Sometimes they happen at peak usage periods, sometimes they
> > > > > > > happen in the middle of the night.
> > > > > > >
> > > > > > > I experienced this problem using pgpool-II v1.3 and have
> > > > > > > recently upgraded to pgpool-II v2.2.5 but am still seeing
> > > > > > > the same issue.
> > > > > > >
> > > > > > > It may be relevant to point out that I am running pgpool on
> > > > > > > one of the machines that is also acting as a postgres
> > > > > > > backend, and it is always the postgres instance on the
> > > > > > > pgpool host that locks up.
> > > > > > > This morning I moved the pgpool instance onto another one of
> > > > > > > the postgres backend hosts to see whether the cohabitation
> > > > > > > of pgpool and postgres is causing problems, whether there is
> > > > > > > simply an issue with postgres on that host, or whether this
> > > > > > > is just a coincidence.
> > > > > > > I likely won't gain anything from this test for a day or
> > > > > > > more.
> > > > > > >
> > > > > > > Also relevant is that I am running mammoth replicator and am
> > > > > > > only using pgpool for connection load balancing and high
> > > > > > > availability.
> > > > > > >
> > > > > > > Below is my pgpool.conf.
> > > > > > >
> > > > > > > Any thoughts appreciated.
> > > > > > >
> > > > > > > -steve crandell
> > > > > > >
> > > > > > > #
> > > > > > > # pgpool-II configuration file sample
> > > > > > > # $Header: /cvsroot/pgpool/pgpool-II/pgpool.conf.sample,v 1.4.2.3
> > > > > > > 2007/10/12 09:15:02 y-asaba Exp $
> > > > > > >
> > > > > > > # Host name or IP address to listen on: '*' for all, '' for
> > > > > > > # no TCP/IP connections
> > > > > > > #listen_addresses = 'localhost'
> > > > > > > listen_addresses = '10.xxx.xxx.xxx'
> > > > > > >
> > > > > > > # Port number for pgpool
> > > > > > > port = 5432
> > > > > > >
> > > > > > > # Port number for pgpool communication manager
> > > > > > > pcp_port = 9898
> > > > > > >
> > > > > > > # Unix domain socket path. (The Debian package defaults to
> > > > > > > # /var/run/postgresql.)
> > > > > > > socket_dir = '/usr/local/pgpool'
> > > > > > >
> > > > > > > # Unix domain socket path for pgpool communication manager.
> > > > > > > pcp_socket_dir = '/usr/local/pgpool'
> > > > > > >
> > > > > > > # Unix domain socket path for the backend. (The Debian
> > > > > > > # package defaults to /var/run/postgresql!)
> > > > > > > backend_socket_dir = '/usr/local/pgpool'
> > > > > > >
> > > > > > > # pgpool communication manager timeout. 0 means no timeout,
> > > > > > > # but that is strongly not recommended!
> > > > > > > pcp_timeout = 10
> > > > > > >
> > > > > > > # Number of pre-forked child processes
> > > > > > > num_init_children = 32
> > > > > > >
> > > > > > > # Number of connection pools allowed for a child process
> > > > > > > max_pool = 4
> > > > > > >
> > > > > > > # If idle for this many seconds, child exits. 0 means no
> > > > > > > # timeout.
> > > > > > > child_life_time = 30
> > > > > > >
> > > > > > > # If idle for this many seconds, connection to PostgreSQL
> > > > > > > # closes. 0 means no timeout.
> > > > > > > #connection_life_time = 0
> > > > > > > connection_life_time = 30
> > > > > > >
> > > > > > > # If child_max_connections connections were received, child
> > > > > > > # exits. 0 means no exit.
> > > > > > > # change
> > > > > > > child_max_connections = 0
> > > > > > >
> > > > > > > # Maximum time in seconds to complete client authentication.
> > > > > > > # 0 means no timeout.
> > > > > > > authentication_timeout = 60
> > > > > > >
> > > > > > > # Logging directory (more accurately, the directory for the
> > > > > > > # PID file)
> > > > > > > logdir = '/usr/local/pgpool'
> > > > > > >
> > > > > > > # Replication mode
> > > > > > > replication_mode = false
> > > > > > >
> > > > > > > # Set this to true if you want to avoid deadlock situations
> > > > > > > # when replication is enabled. There will, however, be a
> > > > > > > # noticeable performance degradation. A workaround is to set
> > > > > > > # this to false and insert a /*STRICT*/ comment at the
> > > > > > > # beginning of the SQL command.
> > > > > > > replication_strict = false
> > > > > > >
> > > > > > > # When replication_strict is set to false, there will be a
> > > > > > > # chance for deadlocks. Set this to nonzero (in
> > > > > > > # milliseconds) to detect this situation and resolve the
> > > > > > > # deadlock by aborting the current session.
> > > > > > > replication_timeout = 5000
> > > > > > >
> > > > > > > # Load balancing mode, i.e., all SELECTs except in a
> > > > > > > # transaction block are load balanced. This is ignored if
> > > > > > > # replication_mode is false.
> > > > > > > # change
> > > > > > > load_balance_mode = true
> > > > > > >
> > > > > > > # If there's a data mismatch between master and secondary,
> > > > > > > # start degeneration to stop replication mode.
> > > > > > > replication_stop_on_mismatch = false
> > > > > > >
> > > > > > > # If true, replicate SELECT statement when load balancing is
> > > > > > > # disabled. If false, it is only sent to the master node.
> > > > > > > # change
> > > > > > > replicate_select = true
> > > > > > >
> > > > > > > # Semicolon-separated list of queries to be issued at the
> > > > > > > # end of a session
> > > > > > > reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'
> > > > > > >
> > > > > > > # If true, print timestamp on each log line.
> > > > > > > print_timestamp = true
> > > > > > >
> > > > > > > # If true, operate in master/slave mode.
> > > > > > > # change
> > > > > > > master_slave_mode = true
> > > > > > >
> > > > > > > # If true, cache connection pool.
> > > > > > > connection_cache = false
> > > > > > >
> > > > > > > # Health check timeout. 0 means no timeout.
> > > > > > > health_check_timeout = 20
> > > > > > >
> > > > > > > # Health check period. 0 means no health check.
> > > > > > > health_check_period = 0
> > > > > > >
> > > > > > > # Health check user
> > > > > > > health_check_user = 'nobody'
> > > > > > >
> > > > > > > # If true, automatically lock table with INSERT statements
> > > > > > > # to keep SERIAL data consistency. An /*INSERT LOCK*/
> > > > > > > # comment has the same effect. A /*NO INSERT LOCK*/ comment
> > > > > > > # disables the effect.
> > > > > > > insert_lock = false
> > > > > > >
> > > > > > > # If true, ignore leading white space of each query while
> > > > > > > # pgpool judges whether the query is a SELECT so that it can
> > > > > > > # be load balanced. This is useful for certain APIs such as
> > > > > > > # DBI/DBD, which are known to add an extra leading white
> > > > > > > # space.
> > > > > > > ignore_leading_white_space = false
> > > > > > >
> > > > > > > # If true, print all statements to the log. Like the
> > > > > > > # log_statement option to PostgreSQL, this allows for
> > > > > > > # observing queries without engaging in full debugging.
> > > > > > > log_statement = false
> > > > > > >
> > > > > > > # If true, incoming connections will be printed to the log.
> > > > > > > # change
> > > > > > > log_connections = true
> > > > > > >
> > > > > > > # If true, hostname will be shown in ps status. Also shown
> > > > > > > # in connection log if log_connections = true.
> > > > > > > # Be warned that this feature adds the overhead of a
> > > > > > > # hostname lookup.
> > > > > > > log_hostname = false
> > > > > > >
> > > > > > > # If non 0, run in parallel query mode
> > > > > > > parallel_mode = false
> > > > > > >
> > > > > > > # If non 0, use query cache
> > > > > > > enable_query_cache = 0
> > > > > > >
> > > > > > > # Set pgpool2 hostname
> > > > > > > pgpool2_hostname = ''
> > > > > > >
> > > > > > > # System DB info
> > > > > > > #system_db_hostname = 'localhost'
> > > > > > > #system_db_port = 5432
> > > > > > > #system_db_dbname = 'pgpool'
> > > > > > > #system_db_schema = 'pgpool_catalog'
> > > > > > > #system_db_user = 'pgpool'
> > > > > > > #system_db_password = ''
> > > > > > >
> > > > > > > # backend_hostname, backend_port, backend_weight
> > > > > > > # here are examples
> > > > > > > backend_hostname0 = 'db1.xxx.xxx'
> > > > > > > backend_port0 = 5433
> > > > > > > backend_weight0 = 0.0
> > > > > > >
> > > > > > > backend_hostname1 = 'db2.xxx.xxx'
> > > > > > > backend_port1 = 5433
> > > > > > > backend_weight1 = 0.4
> > > > > > >
> > > > > > > backend_hostname2 = 'db3.xxx.xxx'
> > > > > > > backend_port2 = 5433
> > > > > > > backend_weight2 = 0.6
> > > > > > >
> > > > > > > # - HBA -
> > > > > > >
> > > > > > > # If true, use pool_hba.conf for client authentication. In
> > > > > > > # pgpool-II 1.1, the default value is false. The default
> > > > > > > # value will be true in 1.2.
> > > > > > > enable_pool_hba = false

_______________________________________________
Pgpool-general mailing list
[email protected]
http://pgfoundry.org/mailman/listinfo/pgpool-general
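[Editor's note] Tatsuo's UNIX-domain-socket suggestion, applied to the backend pgpool is co-located with (db3 in the later messages), amounts to this change in pgpool.conf. Hostnames and ports are the ones quoted in the thread; the empty string tells pgpool to reach that backend over a UNIX socket instead of TCP/IP:

```
# Before: pgpool connects to the local backend over TCP/IP
#backend_hostname2 = 'db3.xxx.xxx'

# After: '' makes pgpool use a UNIX domain socket for this backend
backend_hostname2 = ''
backend_port2 = 5433
backend_weight2 = 0.6
```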
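[Editor's note] The blocking behavior described at the top of the thread (with num_init_children = 1, a second psql session freezes until the first disconnects) can be modeled with a toy sketch. This is not pgpool code; it just illustrates the pooling semantics, with the pre-forked child slots modeled as a semaphore:

```python
# Toy model (not pgpool source): num_init_children behaves like a fixed
# pool of pre-forked child processes; an extra client blocks until a
# slot frees up.
import threading
import time

NUM_INIT_CHILDREN = 1                    # mirrors num_init_children = 1
pool = threading.Semaphore(NUM_INIT_CHILDREN)
events = []

def client(name, disconnect):
    pool.acquire()                       # connect: claim a child slot (blocks if none free)
    events.append(name + " connected")
    disconnect.wait()                    # session stays open until told to end
    events.append(name + " disconnected")
    pool.release()                       # slot returns to the pool

end1, end2 = threading.Event(), threading.Event()
t1 = threading.Thread(target=client, args=("psql(2)", end1))
t2 = threading.Thread(target=client, args=("psql(3)", end2))
t1.start(); time.sleep(0.2)              # psql(2) grabs the only slot
t2.start(); time.sleep(0.2)              # psql(3) blocks in acquire()
assert events == ["psql(2) connected"]   # psql(3) is still "frozen"
end1.set(); t1.join()                    # psql(2) disconnects...
time.sleep(0.2)                          # ...and only then does psql(3) get a slot
end2.set(); t2.join()
print(events)
```

The takeaway matches Tatsuo's explanation: this is expected behavior, and the fix is simply raising num_init_children above the peak number of concurrent clients.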
