Cool Thanks I will try that in a bit. Indeed right after I sent the previous email I tried pgbench with only 1 client and it did work but anything higher did not. Really appreciate the help here
Marcelo Linux/Solaris System Administrator [EMAIL PROTECTED] http://www.zeroaccess.org On Sep 18, 2008, at 9:51 PM, Tatsuo Ishii wrote: > First of all you need to set -c 1. This is due to the architecure of > pgbench (it tries to establish all the connections before proceeding > in your case 5, which gives pgpool virtually no chance to block new > connections from clients). Another point is, you might want to give > some sleep time between pgbench sessions. I use following script for > demonstration. > > \set naccounts 100000 * :scale > \setrandom delta -5000 5000 > \setrandom aid 1 :naccounts > UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid; > SELECT pg_sleep(0.5); > > then run pgbench as follows: > > pgbench -d -p 5432 -C -t 10000 -f pgbench.sql test > > Hope this helps, > -- > Tatsuo Ishii > SRA OSS, Inc. Japan > >> Thanks for the suggestion but that did not appear to solve the issue. >> >> I started pgpool with the following timout related options hoping it >> would help out but no go there. >> >> child_life_time = 30 >> connection_life_time = 10 >> client_idle_limit = 0 >> recovery_timeout = 1200 >> >> * I have not set "lient_idle_limit" because that will also close idle >> in transaction connections and >> on production I cannot allow that. >> >> >> Basically stage 1 runs just fine and then when it starts stage 2 it >> takes forever for the checkpoint to start .. >> The checkpoint itself takes about 3 minutes which I think it's >> still a >> lot (ok that I'm using a VM with not a lot of resources so that may >> be >> misleading) >> >> >> So, again I have started a pgbench process and stopped db3 and then >> used pcp_recovery_node to have that node recovered. >> >> pgbench -C -h 10.1.100.213 -p 5432 -c 5 -t 200 -U postgres pgbench >> >> >> LOG: db1 backend (where stage1/stage2 scripts run from) >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> ===================================================================== >> .... >> global/2845 >> global/2846 >> global/2847 >> global/pg_auth >> global/pg_control >> global/pg_database >> global/pgstat.stat >> pg_clog/ >> pg_clog/0000 >> pg_multixact/ >> pg_multixact/members/ >> pg_multixact/members/0000 >> pg_multixact/offsets/ >> pg_multixact/offsets/0000 >> pg_subtrans/ >> pg_subtrans/0000 >> pg_tblspc/ >> pg_twophase/ >> >> sent 11648717 bytes received 16540 bytes 158710.98 bytes/sec >> total size is 172764067 speedup is 14.81 >> building file list ... done >> 000000030000000000000048 >> pg_stop_backup >> ---------------- >> 0/488C002C >> (1 row) >> >> >> sent 16779407 bytes received 42 bytes 4794128.29 bytes/sec >> total size is 16777216 speedup is 1.00 >> building file list ... done >> 000000030000000000000048.002F255C.backup >> >> sent 414 bytes received 42 bytes 912.00 bytes/sec >> total size is 251 speedup is 0.55 >> 2008-09-18 16:26:22 CDT LOG: checkpoint starting: time >> 2008-09-18 16:28:22 CDT LOG: checkpoint complete: wrote 774 buffers >> (25.2%); 0 transaction log file(s) added, 0 removed, 0 recycled; >> write=119.383 s, sync=0.209 s, total=119.644 s >> >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> ===================================================================== >> * it's now 16:50 PM and stage 2 still seems stuck ! >> >> >> So if I do a Ctrl -C on the pgbench process that I started then the >> CHECKPOINT on stage 2 finishes and then recovery is finished. >> It looks like for some reason pgpool is not able to close the >> connections on the backends and therefore stage 2 gets stuck. >> Am I missing some other timeout or parameter ? >> >> >> PGpool-II LOG: >> >> Sep 18 16:21:55 debian-db1 pgpool: 2008-09-18 16:21:55 LOG: pid >> 6876: starting recovering node 2 >> Sep 18 16:22:00 debian-db1 pgpool: 2008-09-18 16:22:00 LOG: pid >> 6876: CHECKPOINT in the 1st stage done >> Sep 18 16:22:00 debian-db1 pgpool: 2008-09-18 16:22:00 LOG: pid >> 6876: starting recovery command: "SELECT pgpool_recovery('copy-base- >> backup', 'debian-db4', '/var/lib/po >> stgresql/8.3/main')" >> Sep 18 16:23:34 debian-db1 pgpool: 2008-09-18 16:23:34 LOG: pid >> 6876: 1st stage is done >> Sep 18 16:23:34 debian-db1 pgpool: 2008-09-18 16:23:34 LOG: pid >> 6876: starting 2nd stage >> Sep 18 16:41:19 debian-db1 pgpool: 2008-09-18 16:41:19 LOG: pid >> 6958: ProcessFrontendResponse: failed to read kind from frontend. >> frontend abnormally exited >> Sep 18 16:41:19 debian-db1 pgpool: 2008-09-18 16:41:19 LOG: pid >> 6955: ProcessFrontendResponse: failed to read kind from frontend. >> frontend abnormally exited >> Sep 18 16:41:19 debian-db1 pgpool: 2008-09-18 16:41:19 LOG: pid >> 6919: ProcessFrontendResponse: failed to read kind from frontend. >> frontend abnormally exited >> Sep 18 16:41:19 debian-db1 pgpool: 2008-09-18 16:41:19 LOG: pid >> 6908: ProcessFrontendResponse: failed to read kind from frontend. >> frontend abnormally exited >> Sep 18 16:41:21 debian-db1 pgpool: 2008-09-18 16:41:21 LOG: pid >> 6876: all connections from clients have been closed >> Sep 18 16:41:21 debian-db1 pgpool: 2008-09-18 16:41:21 LOG: pid >> 6876: CHECKPOINT in the 2nd stage done >> Sep 18 16:41:21 debian-db1 pgpool: 2008-09-18 16:41:21 LOG: pid >> 6876: starting recovery command: "SELECT >> pgpool_recovery('pgpool_recovery_pitr', 'debian-db4', '/var/lib/ >> postgresql/8.3/main')" >> Sep 18 16:41:38 debian-db1 pgpool: 2008-09-18 16:41:38 LOG: pid >> 6876: 2 node restarted >> Sep 18 16:41:38 debian-db1 pgpool: 2008-09-18 16:41:38 LOG: pid >> 6876: send_failback_request: fail back 2 th node request from pid >> 6876 >> Sep 18 16:41:38 debian-db1 pgpool: 2008-09-18 16:41:38 LOG: pid >> 6876: recovery done >> >> >> >> Any help much appreciated >> >> Marcelo >> Linux/Solaris System Administrator >> http://www.zeroaccess.org >> >> >> On Sep 10, 2008, at 8:34 PM, Tatsuo Ishii wrote: >> >>>> Hi, >>>> >>>> I have recently setup a VM environment to test out an online >>>> recovery >>>> process which works great. Basically I have the following. >>>> >>>> pgpool-II 2.1 server: >>>> 1 - pgpool >>>> >>>> 3 postgresql 8.3.3 servers: >>>> 1 - db1 (backend_node0) >>>> 2 - db2 (backend_node1) >>>> 3 - db3 (backend_node2) >>>> >>>> The 3rd PG server is kept in detached status so that I can use it >>>> for >>>> online recovery tests. >>>> I have created the scripts for 1st/2nd stage and also the >>>> pg_remote_start script. >>>> I can use pcp_recovery_node to bring a new node online (3rd PG >>>> server) >>>> or recover one of the existing ones without any issues. >>>> >>>> Now, until today all my previous tests on performing online >>>> recovery >>>> involved calling pcp_recovery_node and during that time no clients >>>> were using the database servers through pgpool of course. There was >>>> no >>>> activity going on on the database servers at all. So everything >>>> worked great, pgpool went through 1st and 2nd stage and then called >>>> the remote start and finished the recovery process. So server got >>>> online and sync'd. >>>> >>>> Then, today I tried to do the same but right before calling >>>> pcp_recovery_node I started a pgbench process pointing to the >>>> pgpool >>>> server to create some activity on the database. I was under the >>>> impression that during 2nd stage pgpool would perhaps start to >>>> queue >>>> some of the transactions along with not allowing new clients to >>>> connect to it. Allowing the 2nd stage to occur and then bring the >>>> new >>>> node online. Once the online recovery was finished, pgpool would go >>>> through its queue and send those transactions to all nodes. >>>> Is that not the case ? Cause basically it went through 1st stage >>>> and >>>> then pcp_recovery_node timed out. >>> >>> Perhaps you need -C option to pgbench? >>> -- >>> Tatsuo Ishii >>> SRA OSS, Inc. Japan >> _______________________________________________ Pgpool-general mailing list [email protected] http://pgfoundry.org/mailman/listinfo/pgpool-general
